Overview
Coursera Flash Sale
40% Off Coursera Plus for 3 Months!
Grab it
This five-week, accelerated online specialization provides participants a hands-on introduction to designing and building data processing systems on Google Cloud Platform. Through a combination of presentations, demos, and hand-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data and carry out machine learning. The course covers structured, unstructured, and streaming data.
This course teaches the following skills:
• Design and build data processing systems on Google Cloud Platform
• Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
• Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
• Derive business insights from extremely large datasets using Google BigQuery
• Train, evaluate and predict using machine learning models using Tensorflow and Cloud ML
• Enable instant insights from streaming data
This class is intended for developers who are responsible for:
• Extracting, Loading, Transforming, cleaning, and validating data
• Designing pipelines and architectures for data processing
• Creating and maintaining machine learning and statistical models
• Querying datasets, visualizing query results and creating reports
>>> By enrolling in this specialization you agree to the Qwiklabs Terms of Service as set out in the FAQ and located at: https://qwiklabs.com/terms_of_service
Syllabus
- Course 1: Build Data Lakes and Data Warehouses on Google Cloud
- Course 2: Build Batch Data Pipelines on Google Cloud
- Course 3: Build Streaming Data Pipelines on Google Cloud
- Course 4: Smart Analytics, Machine Learning, and AI on Google Cloud
Courses
-
In this course you will get hands-on in order to work through real-world challenges faced when building streaming data pipelines. The primary focus is on managing continuous, unbounded data with Google Cloud products.
-
Incorporating machine learning into data pipelines increases the ability to extract insights from data. This course covers ways machine learning can be included in data pipelines on Google Cloud. For little to no customization, this course covers AutoML. For more tailored machine learning capabilities, this course introduces Notebooks and BigQuery machine learning (BigQuery ML). Also, this course covers how to productionalize machine learning solutions by using Vertex AI.
-
The two key components of any data pipeline are data lakes and warehouses. This course highlights use-cases for each type of storage and dives into the available data lake and warehouse solutions on Google Cloud in technical detail. Also, this course describes the role of a data engineer, the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. This is the first course of the Data Engineering on Google Cloud series. After completing this course, enroll in the Building Batch Data Pipelines on Google Cloud course.
-
In this intermediate course, you will learn to design, build, and optimize robust batch data pipelines on Google Cloud. Moving beyond fundamental data handling, you will explore large-scale data transformations and efficient workflow orchestration, essential for timely business intelligence and critical reporting. Get hands-on practice using Dataflow for Apache Beam and Serverless for Apache Spark (Dataproc Serverless) for implementation, and tackle crucial considerations for data quality, monitoring, and alerting to ensure pipeline reliability and operational excellence. A basic knowledge of data warehousing, ETL/ELT, SQL, Python, and Google Cloud concepts is recommended.
Taught by
Google Cloud Training