- Learn about Spark Structured Streaming, ways to optimize it, and how to use it to populate destination objects
At the end of this module, you're able to:
- Understand Spark Structured Streaming.
- Apply techniques to optimize Structured Streaming.
- Handle late-arriving or out-of-order events (see the sketch after this list).
- Set up real-time sources for incremental processing.
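A minimal sketch of what this module covers, assuming a Delta source table named `events` with `event_time` and `event_type` columns (both hypothetical names); the watermark is the standard Structured Streaming mechanism for bounding how long Spark waits for late or out-of-order events:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import window, col

spark = SparkSession.builder.appName("incremental-demo").getOrCreate()

# Read only new rows from the source table on each trigger.
events = spark.readStream.table("events")

# The 10-minute watermark tells Spark how long to wait for late-arriving
# events before finalizing each 5-minute window's aggregate.
counts = (events
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("event_type"))
    .count())

# Append finalized windows to a destination table; the checkpoint lets
# the stream resume exactly where it left off after a restart.
query = (counts.writeStream
    .outputMode("append")
    .option("checkpointLocation", "/tmp/checkpoints/event_counts")
    .toTable("event_counts"))

query.awaitTermination()
```

Events arriving more than 10 minutes late are dropped from finalized windows, trading completeness for bounded state.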
- Learn about Structured Streaming with Delta Live Tables
At the end of this module, you're able to:
- Use event-driven architectures with Delta Live Tables.
- Ingest streaming data.
- Achieve data consistency and reliability (see the sketch after this list).
- Scale streaming workloads with Delta Live Tables.
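A hedged sketch of a two-step Delta Live Tables pipeline in the spirit of this module; the landing path `/mnt/raw/events/` and the column names are placeholders, and the code only runs inside a DLT pipeline, where `dlt` and `spark` are provided:

```python
import dlt

# Bronze: ingest raw files incrementally with Auto Loader.
@dlt.table(comment="Raw events ingested incrementally")
def events_bronze():
    return (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/events/"))

# Silver: an expectation drops malformed rows, one way DLT maintains
# data consistency and reliability in a streaming pipeline.
@dlt.table(comment="Validated events")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")
def events_silver():
    return dlt.read_stream("events_bronze").select(
        "event_id", "event_type", "event_time")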
- Optimize performance with Spark and Delta Live Tables in Azure Databricks
In this module, you learn how to:
- Use serverless compute and parallelism with Delta Live Tables.
- Perform cost-based optimization and query tuning.
- Use Change Data Capture (CDC); see the sketch after this list.
- Apply enhanced autoscaling capabilities.
- Implement observability and data quality metrics.
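For the CDC item, a hedged sketch using DLT's `apply_changes` API; the source table, key, ordering column, and operation values are assumptions:

```python
import dlt
from pyspark.sql.functions import expr, col

# Raw change feed with inserts, updates, and deletes, plus an ordering
# column to resolve out-of-order changes (all names hypothetical).
@dlt.table
def customers_cdc_raw():
    return spark.readStream.table("cdc_source.customers_changes")

# Declare the target table, then let DLT merge the change feed into it.
dlt.create_streaming_table("customers")

dlt.apply_changes(
    target="customers",
    source="customers_cdc_raw",
    keys=["customer_id"],              # match rows on the business key
    sequence_by=col("sequence_num"),   # pick the latest change per key
    apply_as_deletes=expr("operation = 'DELETE'"),
)
```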
- Implement CI/CD workflows in Azure Databricks
In this module, you learn how to:
- Implement version control and Git integration.
- Perform unit testing and integration testing (see the sketch after this list).
- Manage your environment and configuration.
- Implement rollback and roll-forward strategies.
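A minimal pytest sketch for the unit-testing item; `transform_orders` and its columns are hypothetical stand-ins for pipeline logic you would factor out of a notebook into a module so CI can test it:

```python
import pytest
from pyspark.sql import SparkSession

def transform_orders(df):
    """Transformation under test: keep only paid orders."""
    return df.filter(df.status == "paid")

@pytest.fixture(scope="session")
def spark():
    # A small local session is enough for CI runners.
    return (SparkSession.builder
        .master("local[1]")
        .appName("unit-tests")
        .getOrCreate())

def test_transform_keeps_only_paid_orders(spark):
    df = spark.createDataFrame(
        [("o1", "paid"), ("o2", "cancelled")],
        ["order_id", "status"])
    result = transform_orders(df)
    assert [r.order_id for r in result.collect()] == ["o1"]
```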
- Automate workloads with Azure Databricks Jobs
In this module, you learn how to:
- Implement job scheduling and automation.
- Optimize workflows with parameters.
- Handle dependency management.
- Implement error handling and retry mechanisms (see the sketch after this list).
- Explore best practices and guidelines.
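For the error-handling item, a plain-Python retry sketch with exponential backoff to wrap around a flaky step inside a job task; Databricks Jobs also offers per-task retry settings, so this is an in-code complement, and `ingest_batch` is a hypothetical function:

```python
import time

def run_with_retries(task, max_attempts=3, base_delay=5):
    """Run task(); on failure wait 5s, 10s, 20s, ... between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # let the final failure surface to the job run
            wait = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {wait}s")
            time.sleep(wait)

# Usage (ingest_batch is a placeholder for your own step):
# run_with_retries(lambda: ingest_batch("/mnt/raw/2024-01-01"))
```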
- Manage data privacy and governance with Azure Databricks
At the end of this module, you're able to:
- Implement data encryption techniques.
- Manage access controls.
- Implement data masking and anonymization (see the sketch after this list).
- Use compliance frameworks and secure data sharing.
- Use data lineage and metadata management.
- Roll out governance automation.
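A sketch of the masking item using built-in PySpark functions; the table and column names are assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import sha2, col, concat, lit

spark = SparkSession.builder.getOrCreate()

customers = spark.table("crm.customers")  # hypothetical PII table

masked = (customers
    # A one-way hash keeps the column joinable without exposing the email.
    .withColumn("email_hash", sha2(col("email"), 256))
    # Partial masking keeps only the last four digits of the phone number.
    .withColumn("phone_masked",
                concat(lit("***-***-"), col("phone").substr(-4, 4)))
    .drop("email", "phone"))

masked.write.mode("overwrite").saveAsTable("crm.customers_masked")
```

In practice you would pair this with Unity Catalog access controls so only the masked table is broadly readable.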
- Use SQL Warehouses in Azure Databricks
In this module, you'll learn how to:
- Create and configure SQL Warehouses in Azure Databricks.
- Create databases and tables.
- Create queries and dashboards (see the sketch after this list).
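A sketch of querying a SQL Warehouse from Python via the `databricks-sql-connector` package; the hostname, HTTP path, token, and table are placeholders:

```python
from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT order_id, total FROM sales.orders LIMIT 10")
        for row in cursor.fetchall():
            print(row)
```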
- Run Azure Databricks Notebooks with Azure Data Factory
In this module, you'll learn how to:
- Describe how Azure Databricks notebooks can be run in a pipeline.
- Create an Azure Data Factory linked service for Azure Databricks.
- Use a Notebook activity in a pipeline.
- Pass parameters to a notebook (see the sketch after this list).
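A sketch of the parameter-passing item from the notebook side; it runs only in Databricks, where `dbutils` and `spark` are predefined, and the parameter and table names are assumptions. ADF's Notebook activity supplies the value through its base parameters:

```python
# Declare a widget with a default so the notebook also runs interactively;
# a value passed from ADF overrides the default.
dbutils.widgets.text("process_date", "2024-01-01")
process_date = dbutils.widgets.get("process_date")

df = spark.read.table("raw.events").where(f"event_date = '{process_date}'")
df.write.mode("append").saveAsTable("staging.events_clean")

# Return a value that ADF can read from the activity's output.
dbutils.notebook.exit(f"processed {df.count()} rows for {process_date}")
```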
Syllabus
- Perform incremental processing with Spark Structured Streaming
- Introduction
- Set up real-time data sources for incremental processing
- Optimize Delta Lake for incremental processing in Azure Databricks
- Handle late data and out-of-order events in incremental processing
- Monitoring and performance tuning strategies for incremental processing in Azure Databricks
- Exercise - Real-time ingestion and processing with Delta Live Tables with Azure Databricks
- Module assessment
- Summary
- Implement streaming architecture patterns with Delta Live Tables
- Introduction
- Event-driven architectures with Delta Live Tables
- Ingest data with structured streaming
- Maintain data consistency and reliability with structured streaming
- Scale streaming workloads with Delta Live Tables
- Exercise - End-to-end streaming pipeline with Delta Live Tables
- Module assessment
- Summary
- Optimize performance with Spark and Delta Live Tables
- Introduction
- Optimize performance with Spark and Delta Live Tables
- Perform cost-based optimization and query tuning
- Use change data capture (CDC)
- Use enhanced autoscaling
- Implement observability and data quality metrics
- Exercise - Optimize data pipelines for better performance in Azure Databricks
- Module assessment
- Summary
- Implement CI/CD workflows in Azure Databricks
- Introduction
- Implement version control and Git integration
- Perform unit testing and integration testing
- Manage and configure your environment
- Implement rollback and roll-forward strategies
- Exercise - Implement CI/CD workflows
- Module assessment
- Summary
- Automate workloads with Azure Databricks Jobs
- Introduction
- Implement job scheduling and automation
- Optimize workflows with parameters
- Handle dependency management
- Implement error handling and retry mechanisms
- Explore best practices and guidelines
- Exercise - Automate data ingestion and processing
- Module assessment
- Summary
- Manage data privacy and governance with Azure Databricks
- Introduction
- Implement data encryption techniques in Azure Databricks
- Manage access controls in Azure Databricks
- Implement data masking and anonymization in Azure Databricks
- Use compliance frameworks and secure data sharing in Azure Databricks
- Use data lineage and metadata management
- Implement governance automation in Azure Databricks
- Exercise - Practice the implementation of Unity Catalog
- Module assessment
- Summary
- Use SQL Warehouses in Azure Databricks
- Introduction
- Get started with SQL Warehouses
- Create databases and tables
- Create queries and dashboards
- Exercise - Use a SQL Warehouse in Azure Databricks
- Module assessment
- Summary
- Run Azure Databricks Notebooks with Azure Data Factory
- Introduction
- Understand Azure Databricks notebooks and pipelines
- Create a linked service for Azure Databricks
- Use a Notebook activity in a pipeline
- Use parameters in a notebook
- Exercise - Run an Azure Databricks Notebook with Azure Data Factory
- Module assessment
- Summary