This learning path prepares you for a career as a Google Cloud Data Engineer. It opens with preparation for the Google Cloud Professional Data Engineer exam, using diagnostic questions across the key exam domains, and then covers data engineering fundamentals. From there, you'll build data lakes and warehouses using BigQuery, BigLake, and Apache Iceberg, and design both batch and streaming data pipelines with Dataflow, Serverless Spark, Pub/Sub, and Cloud Composer. Three dedicated courses cover serverless data processing with Dataflow, from foundations through pipeline development and operations. The path closes with two courses on Gemini in BigQuery, covering productivity features and sentiment analysis of customer reviews.
Syllabus
- Preparing for your Professional Data Engineer Journey
- Prepare for the Google Cloud Professional Data Engineer certification with diagnostic questions covering migration, storage, analytics, and automation.
- Introduction to Data Engineering on Google Cloud
- Learn the data engineering role on Google Cloud. Explore data sources, storage solutions, ETL/ELT architectures, BigQuery, Dataform, and Dataproc.
- Build Data Lakes and Data Warehouses on Google Cloud
- Build modern data lakehouses on Google Cloud using BigQuery, Cloud Storage, Apache Iceberg, BigLake, federated queries, and data governance tools.
- Build Batch Data Pipelines on Google Cloud
- Design and operate batch data pipelines on Google Cloud using Dataflow, Serverless Spark, Cloud Composer, and data validation techniques.
- Build Streaming Data Pipelines on Google Cloud
- Explore streaming data architectures on Google Cloud with Pub/Sub, Managed Kafka, Dataflow, and BigQuery for real-time data processing.
- Serverless Data Processing with Dataflow: Foundations
- Master Apache Beam and Dataflow foundations including portability, Runner v2, Shuffle Service, Streaming Engine, IAM, quotas, and security.
- Serverless Data Processing with Dataflow: Develop Pipelines
- Develop data pipelines with Apache Beam and Dataflow. Cover transforms, windowing, I/O connectors, schemas, state APIs, Beam SQL, and notebooks.
- Serverless Data Processing with Dataflow: Operations
- Operate Dataflow pipelines in production. Learn monitoring, logging, troubleshooting, performance tuning, CI/CD, reliability, and templates.
- Boost Productivity with Gemini in BigQuery
- Use Gemini AI to boost your productivity in BigQuery. Explore data, accelerate code development, and discover visualization workflows.
- Work with Gemini Models in BigQuery
- Work with Gemini AI models in BigQuery for sentiment analysis. Analyze customer reviews using SQL and Python notebooks with Gemini.
Taught by
Google Cloud