
Data Modeling and Lakehouse Architecture with SQL

via Coursera

Overview

You will design and implement enterprise-grade data models, from traditional star schemas to modern lakehouse architectures. This comprehensive course equips you with the skills to build cost-effective, scalable data solutions that drive business intelligence and analytics.

You'll gain hands-on experience creating dimensional models with surrogate keys, optimizing database schemas through partitioning and clustering, and implementing slowly changing dimensions for historical data tracking. The course also covers advanced topics such as semantic metrics layers, multi-cluster warehouse architectures, and open-source table formats for data lakes.

What makes this course unique is its end-to-end approach to modern data architecture. You'll work with real-world scenarios, from analyzing storage costs to designing data ingestion pipelines that span from raw files to analytics-ready tables. By the end of the course, you'll confidently architect data solutions that balance performance, cost, and scalability, which are essential skills for senior data engineering and architecture roles in today's data-driven organizations.

Syllabus

  • Analyze Snowflake Schema Redundancies
    • You will examine existing snowflake schemas to pinpoint performance bottlenecks caused by redundant lookup paths and develop systematic approaches for identifying optimization opportunities.
  • Apply Star-Schema Dimensional Modeling
    • You will construct optimized star-schema dimensional models with proper fact and dimension table structures, implementing surrogate keys and design patterns that maximize query performance for analytical workloads.
  • Create Semantic Metrics Layer
    • You will develop standardized semantic metrics layers that ensure consistent business logic across analytics platforms, eliminate metric drift, and provide a unified source of truth for enterprise reporting.
  • Apply Partitioning and Clustering Strategies
    • You will implement advanced partitioning and clustering techniques using SQL DDL commands to optimize query performance for large-scale datasets.
  • Analyze Normalization vs Performance Trade-offs
    • You will evaluate database normalization levels against query performance requirements to make strategic denormalization decisions for optimizing analytical workloads.
  • Create Entity-Relationship Diagrams
    • You will design and document comprehensive Entity-Relationship diagrams that effectively communicate complex data structures and relationships for enterprise data systems.
  • Implement Data Pipelines for Historical Changes
    • You will build automated SCD Type 2 pipelines using MERGE statements and window functions to preserve historical data integrity in enterprise environments.
  • Analyze Storage and Compute Cost Trends
    • You will conduct comprehensive cost analysis of data lifecycle patterns to develop strategic archiving recommendations that balance storage economics with business value.
  • Create Multi-Cluster Warehouse Architecture
    • You will design scalable multi-cluster data warehouse architectures that isolate workloads for optimal performance while implementing comprehensive cost control and resource management policies.
  • External Table Configuration Mastery
    • You will implement external table configurations that enable direct querying of file-based datasets in cloud storage.
  • Open-Source Table Format Analysis
    • You will develop analytical frameworks to evaluate and compare the technical capabilities of Delta Lake, Apache Iceberg, and Apache Hudi for specific business requirements.
  • Data Ingestion Pipeline Implementation
    • You will architect and implement automated data ingestion pipelines that orchestrate data movement across medallion architecture zones within lakehouse platforms.
  • Project: Data Modeling and Lakehouse Architecture with SQL
    • You will design and implement a comprehensive data lakehouse architecture that integrates dimensional modeling, schema optimization, cost management, and multi-format data ingestion. This project synthesizes advanced SQL skills to create a production-ready data engineering solution.
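To give a flavor of the star-schema dimensional modeling covered above, here is a minimal sketch of a fact table joined to dimension tables through integer surrogate keys. It uses SQLite purely for illustration; the table and column names are hypothetical and not taken from the course materials.

```python
import sqlite3

# Minimal star-schema sketch: one fact table linked to dimension
# tables via integer surrogate keys. Names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_sk INTEGER PRIMARY KEY,   -- surrogate key
    customer_id TEXT,                  -- natural/business key
    region      TEXT
);
CREATE TABLE dim_date (
    date_sk  INTEGER PRIMARY KEY,
    date_iso TEXT
);
CREATE TABLE fact_sales (
    customer_sk INTEGER REFERENCES dim_customer(customer_sk),
    date_sk     INTEGER REFERENCES dim_date(date_sk),
    amount      REAL
);
INSERT INTO dim_customer VALUES (1, 'C-100', 'EMEA'), (2, 'C-200', 'APAC');
INSERT INTO dim_date VALUES (20240101, '2024-01-01');
INSERT INTO fact_sales VALUES (1, 20240101, 50.0), (2, 20240101, 75.0);
""")

# A typical analytical query: aggregate the fact table, slicing by a
# dimension attribute through a single join per dimension.
rows = conn.execute("""
    SELECT d.region, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_customer d ON d.customer_sk = f.customer_sk
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()
print(rows)  # -> [('APAC', 75.0), ('EMEA', 50.0)]
```

The surrogate keys keep the fact table narrow and decouple it from changes in the natural business keys, which is what makes slowly changing dimensions possible later on.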
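The SCD Type 2 pipelines mentioned in the syllabus preserve history by closing out the current version of a row and inserting a new one. Cloud warehouses typically express this in a single MERGE statement; SQLite has no MERGE, so this sketch splits the same logic into an UPDATE plus an INSERT. The schema and values are hypothetical.

```python
import sqlite3

# SCD Type 2 sketch: keep full history by closing the current row and
# inserting a new version. Warehouses usually do this with one MERGE;
# SQLite lacks MERGE, so the logic is an UPDATE followed by an INSERT.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer_scd (
    customer_id TEXT,
    region      TEXT,
    valid_from  TEXT,
    valid_to    TEXT,      -- NULL means "current version"
    is_current  INTEGER
);
INSERT INTO dim_customer_scd VALUES ('C-100', 'EMEA', '2023-01-01', NULL, 1);
""")

def apply_scd2_change(conn, customer_id, new_region, change_date):
    """Close the current row and open a new version for a changed attribute."""
    cur = conn.execute("""
        UPDATE dim_customer_scd
        SET valid_to = ?, is_current = 0
        WHERE customer_id = ? AND is_current = 1 AND region <> ?
    """, (change_date, customer_id, new_region))
    if cur.rowcount:  # only insert a new version if something actually changed
        conn.execute(
            "INSERT INTO dim_customer_scd VALUES (?, ?, ?, NULL, 1)",
            (customer_id, new_region, change_date))

apply_scd2_change(conn, 'C-100', 'APAC', '2024-06-01')
history = conn.execute("""
    SELECT region, valid_from, valid_to, is_current
    FROM dim_customer_scd WHERE customer_id = 'C-100'
    ORDER BY valid_from
""").fetchall()
print(history)
# -> [('EMEA', '2023-01-01', '2024-06-01', 0), ('APAC', '2024-06-01', None, 1)]
```

The `valid_from`/`valid_to` interval columns are what let analytical queries reconstruct the dimension as it looked on any past date.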
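The idea behind a semantic metrics layer, also listed above, is to define each business metric's expression exactly once and generate queries from those definitions, so every report uses identical logic. This is a toy sketch of that pattern; the metric names, table names, and helper function are invented for illustration.

```python
# Semantic metrics layer sketch: a single registry of metric
# definitions, compiled into SQL so reports cannot drift apart.
# All names here are illustrative only.
METRICS = {
    "total_revenue":   "SUM(amount)",
    "order_count":     "COUNT(*)",
    "avg_order_value": "SUM(amount) / COUNT(*)",
}

def build_query(metric, table, group_by=None):
    """Compile a metric definition into a SQL string."""
    expr = METRICS[metric]  # single source of truth for the formula
    if group_by:
        return (f"SELECT {group_by}, {expr} AS {metric} "
                f"FROM {table} GROUP BY {group_by}")
    return f"SELECT {expr} AS {metric} FROM {table}"

sql = build_query("total_revenue", "fact_sales", group_by="region")
print(sql)
# -> SELECT region, SUM(amount) AS total_revenue FROM fact_sales GROUP BY region
```

Production metrics layers (dbt's semantic layer, LookML, and similar tools) add joins, time grains, and access control on top of this same core idea: one definition, many queries.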

Taught by

Professionals from the Industry

