What you'll learn:
- Understand Data Engineering (Volume 1) on AWS using S3, Redshift, Athena and Hive
- Know Redshift, S3 and Athena up to Level 350+ with HANDS-ON
- Production-level projects and hands-on exercises that give candidates on-the-job-style training
- Get access to datasets of 100 GB - 200 GB in size and practice with them
- Learn Python for Data Engineering with HANDS-ON (Functions, Arguments, OOP (class, object, self), Modules, Packages, Multithreading, file handling etc.)
- Learn SQL for Data Engineering with HANDS-ON (Database objects, CASE, Window Functions, CTE, CTAS, MERGE, Materialized View etc.)
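To give a flavour of the Python topics listed above, here is a minimal sketch of OOP (class, object, self) combined with multithreading; the class and method names are illustrative, not taken from the course material:

```python
import threading

# Illustrative example only: a small class whose instances count records,
# updated safely from several threads at once.
class RecordCounter:
    def __init__(self):
        self.count = 0
        self._lock = threading.Lock()

    def add(self, n):
        # `self` refers to the instance; the lock keeps updates thread-safe
        with self._lock:
            self.count += n

counter = RecordCounter()
threads = [threading.Thread(target=counter.add, args=(10,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.count)  # 4 threads x 10 = 40
```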
This is Volume 1 of the Data Engineering course on AWS. This course gives you detailed explanations of AWS Data Engineering services such as S3 (Simple Storage Service), Redshift, Athena, Hive, Glue Data Catalog, and Lake Formation. It delves into the data warehouse (consumption and storage) layer of the Data Engineering pipeline. In Volume 2, I will showcase Data Processing (Batch and Streaming) services.
You will get opportunities to do hands-on work using large datasets (100 GB - 300 GB or more of data). Moreover, this course provides hands-on exercises that match real-time scenarios such as Redshift query performance tuning, streaming ingestion, Window functions, ACID transactions, the COPY command, Distribution & Sort keys, WLM, row-level and column-level security, Athena partitioning, Athena WLM etc.
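As one small taste of the Window-function hands-on mentioned above, the sketch below ranks rows per group with ROW_NUMBER(); it uses Python's built-in sqlite3 rather than Redshift or Athena, and the table and column names are made up for illustration:

```python
import sqlite3

# Hypothetical sales table; demonstrates a ROW_NUMBER() window function
# of the kind practiced in the Redshift/Athena exercises.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100), ("east", 300), ("west", 200)])

rows = conn.execute("""
    SELECT region, amount,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM sales
""").fetchall()
print(rows)  # each region's rows ranked by amount, highest first
```

The same OVER (PARTITION BY ... ORDER BY ...) syntax carries over to Redshift and Athena, where these functions are run against much larger datasets.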
Some other highlights:
Covers data modelling: Normalization & ER Diagrams for OLTP systems; Dimensional modelling for OLAP/DWH systems.
Data modelling hands-on.
Other technologies covered - EC2, EBS, VPC and IAM.
This is Part 1 (Volume 1) of the full data engineering course. In Part 2 (Volume 2), I will be covering the following topics:
Spark (Batch and Stream processing using AWS EMR, AWS Glue ETL, GCP Dataproc)
Kafka (on AWS & GCP)
Flink
Apache Airflow
Apache Pinot
AWS Kinesis and more.