Transforming and Analyzing Data with AWS Glue & Athena

Details

Provider: CodeSignal
Pricing: Free Certificate
Languages: English
Certificate: Certificate Available
Effort: 3 hours
Sessions: Self-Paced
Level: Intermediate

Found in

Part of: Data Engineering on AWS

Overview

Build ETL pipelines with AWS Glue and PySpark, convert raw JSON to Parquet, and run fast analytics with Amazon Athena. Learn to manage data across raw, processed, and curated zones and automate workflows to deliver business-ready insights from your data lake.

Syllabus

Unit 1: Preparing Data for AWS

Complete the S3 Folder Structure
Upload JSON Data to S3

Unit 2: Understanding the ETL Script

Initialize Your First ETL Script
Extract Data from S3 Storage
Handle Missing Values in Data
Complete Your ETL Data Pipeline
Deploy ETL Script to Cloud Storage

Unit 3: Creating Glue ETL Jobs

Complete Your First Glue Job
Start Your First Job Run
Monitor Your Glue Job Runs
Check and Debug Your Glue Job Logs
Verify Your Parquet Output Files

Unit 4: Cataloging Data with Glue Crawler

Configure Your First Glue Crawler
Start Your Glue Crawler
Monitor Crawler Until Completion

Unit 5: Querying Data with Athena

Fix Your First Athena Query
Filter Overdue Books with SQL
Find Unique Patrons with DISTINCT
Handle Athena Query Failures Gracefully
Analyze Top Revenue Generating Genres

Unit 6: Aggregating Data with Glue

Adding Multiple Aggregation Functions
Changing Grouping Strategy for Business Insights
Debugging Broken Aggregation Functions
