Overview

This five-week, accelerated online specialization gives participants a hands-on introduction to designing and building data processing systems on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data, and carry out machine learning. The course covers structured, unstructured, and streaming data.

This course teaches the following skills:
• Design and build data processing systems on Google Cloud Platform
• Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
• Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
• Derive business insights from extremely large datasets using Google BigQuery
• Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
• Enable instant insights from streaming data

This class is intended for developers who are responsible for:
• Extracting, loading, transforming, cleaning, and validating data
• Designing pipelines and architectures for data processing
• Creating and maintaining machine learning and statistical models
• Querying datasets, visualizing query results, and creating reports

By enrolling in this specialization you agree to the Qwiklabs Terms of Service as set out in the FAQ and located at: https://qwiklabs.com/terms_of_service

Syllabus

Course 1: Build Data Lakes and Data Warehouses on Google Cloud
Course 2: Build Batch Data Pipelines on Google Cloud
Course 3: Build Streaming Data Pipelines on Google Cloud
Course 4: Smart Analytics, Machine Learning, and AI on Google Cloud

Courses

8 hours 11 minutes

In this course, you will get hands-on experience working through the real-world challenges of building streaming data pipelines. The primary focus is on managing continuous, unbounded data with Google Cloud products.
6 hours 36 minutes

Incorporating machine learning into data pipelines increases the ability to extract insights from data. This course covers ways machine learning can be included in data pipelines on Google Cloud. For use cases that need little to no customization, this course covers AutoML. For more tailored machine learning capabilities, it introduces Notebooks and BigQuery machine learning (BigQuery ML). It also covers how to productionize machine learning solutions by using Vertex AI.
10 hours 3 minutes

The two key components of any data pipeline are data lakes and warehouses. This course highlights use cases for each type of storage and dives into the available data lake and warehouse solutions on Google Cloud in technical detail. It also describes the role of a data engineer and the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. This is the first course of the Data Engineering on Google Cloud series. After completing this course, enroll in the Building Batch Data Pipelines on Google Cloud course.
10 hours 53 minutes

In this intermediate course, you will learn to design, build, and optimize robust batch data pipelines on Google Cloud. Moving beyond fundamental data handling, you will explore large-scale data transformations and efficient workflow orchestration, essential for timely business intelligence and critical reporting. Get hands-on practice using Dataflow for Apache Beam and Serverless for Apache Spark (Dataproc Serverless) for implementation, and tackle crucial considerations for data quality, monitoring, and alerting to ensure pipeline reliability and operational excellence. A basic knowledge of data warehousing, ETL/ELT, SQL, Python, and Google Cloud concepts is recommended.