Transforming and Analyzing Data with AWS Glue & Athena

via CodeSignal

Overview

Build ETL pipelines with AWS Glue and PySpark, convert raw JSON to Parquet, and run fast analytics with Amazon Athena. Learn to manage data across raw, processed, and curated zones and automate workflows to deliver business-ready insights from your data lake.
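As a rough illustration of the raw, processed, and curated zones mentioned above, the sketch below builds S3 object keys for each zone. The zone names come from the overview; the `dataset/dt=YYYY-MM-DD/filename` layout, the `zone_key` helper, and the `loans` dataset name are assumptions made for this example, not the course's actual structure.

```python
from datetime import date

# Zone names are taken from the course overview; everything else is hypothetical.
ZONES = ("raw", "processed", "curated")


def zone_key(zone: str, dataset: str, filename: str, day: date) -> str:
    """Build an S3 object key of the form '<zone>/<dataset>/dt=<date>/<filename>'."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return f"{zone}/{dataset}/dt={day.isoformat()}/{filename}"


# Raw JSON lands in the raw zone; the converted Parquet goes to the processed zone.
raw_key = zone_key("raw", "loans", "loans.json", date(2024, 1, 15))
processed_key = zone_key("processed", "loans", "part-0000.parquet", date(2024, 1, 15))
```

Keeping the zone as the first path segment lets each stage of a pipeline read from one prefix and write to the next without touching the others.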

Syllabus

  • Unit 1: Preparing Data for AWS
    • Complete the S3 Folder Structure
    • Upload JSON Data to S3
  • Unit 2: Understanding the ETL Script
    • Initialize Your First ETL Script
    • Extract Data from S3 Storage
    • Handle Missing Values in Data
    • Complete Your ETL Data Pipeline
    • Deploy ETL Script to Cloud Storage
  • Unit 3: Creating Glue ETL Jobs
    • Complete Your First Glue Job
    • Start Your First Job Run
    • Monitor Your Glue Job Runs
    • Check and Debug Your Glue Job Logs
    • Verify Your Parquet Output Files
  • Unit 4: Cataloging Data with Glue Crawler
    • Configure Your First Glue Crawler
    • Start Your Glue Crawler
    • Monitor Crawler Until Completion
  • Unit 5: Querying Data with Athena
    • Fix Your First Athena Query
    • Filter Overdue Books with SQL
    • Find Unique Patrons with DISTINCT
    • Handle Athena Query Failures Gracefully
    • Analyze Top Revenue Generating Genres
  • Unit 6: Aggregating Data with Glue
    • Adding Multiple Aggregation Functions
    • Changing Grouping Strategy for Business Insights
    • Debugging Broken Aggregation Functions
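
The syllabus covers handling missing values (Unit 2) and applying multiple aggregation functions per group (Unit 6). A minimal plain-Python sketch of that logic follows; it is a stand-in for illustration only, since a Glue job would typically express the same steps with PySpark's `fillna` and `groupBy(...).agg(...)`. The records, the field names `genre` and `fee`, and the default fill values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; field names are illustrative, not from the course.
records = [
    {"genre": "Fiction", "fee": 2.5},
    {"genre": "Fiction", "fee": None},
    {"genre": "History", "fee": 4.0},
    {"genre": None, "fee": 1.0},
]


def clean(rows):
    """Fill missing values: 'Unknown' for a missing genre, 0.0 for a missing fee."""
    cleaned = []
    for row in rows:
        genre = row.get("genre") or "Unknown"
        fee = row.get("fee")
        cleaned.append({"genre": genre, "fee": 0.0 if fee is None else fee})
    return cleaned


def aggregate_by_genre(rows):
    """Compute count, total, and average fee for each genre."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["genre"]].append(row["fee"])
    return {
        genre: {"count": len(fees), "total": sum(fees), "avg": sum(fees) / len(fees)}
        for genre, fees in groups.items()
    }


summary = aggregate_by_genre(clean(records))
```

Changing the grouping strategy amounts to swapping the key used in `groups[...]`, for example grouping by a patron identifier instead of genre.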
