Building Modern Data Applications Using Databricks Lakehouse

Packt via Coursera

Overview

In today’s data-driven world, building scalable and efficient data applications is crucial for staying ahead in business and technology. This course explores the power of Databricks Lakehouse, a unified platform for managing and analyzing large volumes of data, and guides you through the essential skills for creating modern data applications.

Throughout the course, you’ll learn to work with Delta Live Tables (DLT) for data transformation, management, and quality assurance. You’ll also dive deep into Databricks’ Unity Catalog for enhanced governance, data lineage, and location management. Hands-on experience deploying and maintaining DLT pipelines with Terraform prepares you for real-world data infrastructure challenges.

The course combines theoretical understanding with practical, real-world applications. You’ll gain a robust set of skills in data pipeline management, governance, and monitoring, preparing you to build production-grade data applications on Databricks Lakehouse. Designed for professionals who want to deepen their expertise in modern data architecture, it suits data engineers, data scientists, and IT professionals who want to leverage Databricks to solve real-world data problems.

Syllabus

  • An Introduction to Delta Live Tables
    • In this section, we explore real-time data pipelines with Delta Live Tables (DLT), analyze Delta Lake architecture, and design scalable streaming solutions for lakehouse environments (sketch below).
  • Applying Data Transformations Using Delta Live Tables
    • In this section, we cover ingesting data with DLT, applying changes, and configuring pipelines for scalability (sketch below).
  • Managing Data Quality Using Delta Live Tables
    • In this section, we explore implementing data quality expectations in DLT pipelines, validating data integrity with temporary datasets, and quarantining poor-quality data for correction (sketch below).
  • Scaling DLT Pipelines
    • In this section, we cover scaling DLT pipelines through cluster optimization, autoscaling, and Delta Lake techniques (sketch below).
  • Mastering Data Governance in the Lakehouse with Unity Catalog
    • In this section, we explore implementing data governance in a lakehouse using Unity Catalog, focusing on access controls, data discovery, and lineage tracking for compliance and security (sketch below).
  • Managing Data Locations in Unity Catalog
    • In this section, we cover managing data storage locations in Unity Catalog with secure governance and access control (sketch below).
  • Viewing Data Lineage Using Unity Catalog
    • In this section, we explore data lineage in Unity Catalog, tracing origins, visualizing transformations, and identifying dependencies to ensure data integrity and proactive issue detection (sketch below).
  • Deploying, Maintaining, and Administering DLT Pipelines Using Terraform
    • In this section, we cover deploying and managing DLT pipelines using Terraform in Databricks.
  • Leveraging Databricks Asset Bundles to Streamline Data Pipeline Deployment
    • In this section, we explore Databricks Asset Bundles (DABs) for streamlining data pipeline deployment, emphasizing GitHub integration, version control, and cross-team collaboration.
  • Monitoring Data Pipelines in Production
    • In this section, we explore monitoring data pipelines using Databricks, focusing on health, performance, and data quality. Techniques include DBSQL alerts and webhook triggers for real-time issue resolution (sketch below).
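
Code Sketches

To make the syllabus concrete, the sketches below illustrate several of the techniques it names. They are minimal, hedged examples rather than course material: every table name, path, group, and ID is hypothetical, and the snippets assume a Databricks environment where spark is predefined. First, a minimal Delta Live Tables pipeline in Python, with a bronze table ingested via Auto Loader and a silver table derived from it:

    import dlt
    from pyspark.sql.functions import col

    # Bronze: incrementally ingest raw JSON files with Auto Loader.
    # The volume path and schema are hypothetical.
    @dlt.table(comment="Raw orders ingested from cloud storage.")
    def orders_raw():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/default/landing/orders")
        )

    # Silver: a cleaned streaming table derived from the bronze table.
    @dlt.table(comment="Orders with positive amounts only.")
    def orders_clean():
        return dlt.read_stream("orders_raw").where(col("amount") > 0)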
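
For change data capture, DLT's apply_changes API merges an ordered change feed into a target streaming table. A sketch, assuming a hypothetical upstream dataset customers_cdc_feed keyed by customer_id and ordered by updated_at:

    import dlt
    from pyspark.sql.functions import col

    # Target table that apply_changes will maintain.
    dlt.create_streaming_table("customers")

    dlt.apply_changes(
        target="customers",
        source="customers_cdc_feed",    # hypothetical CDC source dataset
        keys=["customer_id"],           # key used to match change events to rows
        sequence_by=col("updated_at"),  # resolves out-of-order change events
        stored_as_scd_type=1,           # keep only the latest row per key
    )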
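
Data quality in DLT is declared through expectations, which can warn, drop offending rows, or fail the pipeline. A sketch showing a warn-only expectation, a drop expectation, and a quarantine table built from the inverse predicate:

    import dlt
    from pyspark.sql.functions import col

    @dlt.table(comment="Orders that passed validation.")
    @dlt.expect("valid_timestamp", "event_time IS NOT NULL")  # warn only; rows are kept
    @dlt.expect_or_drop("positive_amount", "amount > 0")      # failing rows are dropped
    def orders_validated():
        return dlt.read_stream("orders_raw")

    # Quarantine pattern: capture the failing rows for later correction.
    @dlt.table(comment="Orders that failed the amount check.")
    def orders_quarantine():
        return dlt.read_stream("orders_raw").where(col("amount") <= 0)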
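
Scaling a DLT pipeline is configured mostly through its settings rather than code. The dict below mirrors the shape of the autoscaling block in a pipeline's JSON settings; the field names follow the pipeline settings schema, and the values are illustrative:

    # Illustrative DLT pipeline settings expressed as a Python dict.
    pipeline_settings = {
        "name": "orders_pipeline",
        "clusters": [
            {
                "label": "default",
                "autoscale": {
                    "min_workers": 1,
                    "max_workers": 5,
                    "mode": "ENHANCED",  # enhanced autoscaling for streaming workloads
                },
            }
        ],
    }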
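
Unity Catalog governance is expressed largely as SQL grants against the catalog, schema, and table hierarchy. A sketch with hypothetical catalog, schema, and group names:

    # Grant a group read access down the Unity Catalog hierarchy.
    spark.sql("GRANT USE CATALOG ON CATALOG main TO `data_analysts`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA main.sales TO `data_analysts`")
    spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `data_analysts`")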
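
Data locations are registered as external locations backed by a storage credential, then governed with grants like any other securable. A sketch with a hypothetical S3 bucket and credential name:

    # Register a governed storage location and grant read access to a group.
    spark.sql("""
        CREATE EXTERNAL LOCATION IF NOT EXISTS sales_landing
        URL 's3://acme-sales-landing/'
        WITH (STORAGE CREDENTIAL acme_s3_credential)
    """)
    spark.sql("GRANT READ FILES ON EXTERNAL LOCATION sales_landing TO `data_engineers`")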
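
Lineage can be browsed in Catalog Explorer or queried directly, assuming the lineage system tables are enabled in the workspace. A sketch that lists the upstream tables feeding a hypothetical target:

    # Find upstream dependencies of a table via the lineage system table.
    lineage = spark.sql("""
        SELECT DISTINCT source_table_full_name, entity_type
        FROM system.access.table_lineage
        WHERE target_table_full_name = 'main.sales.orders_clean'
    """)
    lineage.show(truncate=False)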
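
Finally, pipeline health can be monitored by querying the DLT event log and wiring the resulting queries into DBSQL alerts or webhooks. The sketch below uses the event_log table-valued function (available in recent Databricks runtimes) with a hypothetical pipeline ID:

    # Pull per-flow data quality metrics from the pipeline's event log.
    quality = spark.sql("""
        SELECT timestamp,
               details:flow_progress.data_quality.dropped_records AS dropped_records
        FROM event_log("1234-abcd-5678")
        WHERE event_type = 'flow_progress'
        ORDER BY timestamp DESC
    """)
    quality.show(truncate=False)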

Taught by

Packt - Course Instructors
