Data Pipeline Automation in Snowflake

via DataCamp

Overview

Load, automate, and optimize data pipelines in Snowflake using COPY INTO, Snowpipe, streams, tasks, dynamic tables, and query performance tools.

Master the tools and techniques for building reliable, automated data pipelines in Snowflake. You'll start by learning how to ingest data at scale: configuring stages and file formats, loading data with COPY INTO, and choosing between batch loading, Snowpipe, and Snowpipe Streaming for different latency and volume requirements.

From there, you'll build end-to-end pipeline orchestration skills: capturing row-level changes with streams, chaining multi-step workflows with task DAGs, and creating declarative, auto-refreshed pipelines with dynamic tables. You'll also learn when to reach for external and Iceberg tables for multi-engine and cloud-native scenarios.

The second half of the course sharpens your querying and transformation skills: extracting and unnesting semi-structured JSON from VARIANT columns, applying grouping extensions and window functions for advanced analytics, and encapsulating reusable logic in UDFs and stored procedures.

Finally, you'll tackle query performance: reading Snowflake's Query Profile to pinpoint bottlenecks, selecting the right optimization tool (Search Optimization, Query Acceleration Service, clustering keys, or materialized views), and writing cache-friendly SQL that avoids the common anti-patterns that silently degrade performance.

Syllabus

  • Data Loading, Ingestion, and Connectivity
    • Master Snowflake's data ingestion pipeline — from staging files through COPY INTO, Snowpipe, and Snowpipe Streaming — and connect Snowflake to the broader data ecosystem through connectors, drivers, and export tools.
  • Pipeline Orchestration and Data Objects
    • Build robust data pipelines using Snowflake-native orchestration — streams for change capture, tasks for scheduling, dynamic tables for declarative transformation, and external and Iceberg tables for open-format storage.
  • Querying and Transforming Data
    • Query semi-structured JSON with dot notation and FLATTEN, aggregate data with GROUPING SETS, ROLLUP, and CUBE, apply window functions, and encapsulate reusable logic in UDFs and stored procedures.
  • Query Performance and Optimization
    • Profile queries in Snowsight, identify bottlenecks with QUERY_HISTORY, apply Search Optimization, Query Acceleration Service, Automatic Clustering, and Materialized Views, and write SQL that benefits from Snowflake's three caching layers.

Taught by

Emily Melhuish
