Columnar Storage and Query Optimization

Edureka via Coursera

Overview

Every data professional writes SQL queries, but few understand why some queries take seconds and others take minutes on the same data. The answer lies beneath the surface: in how data is stored, how query engines read that data, and how columnar formats like Parquet fundamentally change the game for analytics performance. This course gives you that understanding.

You will start from the foundations: how computers store and read data, how SQL operations access data internally, and what distinguishes row-based storage from columnar storage. From there, you will explore modern columnar formats (Parquet, ORC), work with DuckDB as your primary analytics query engine, and learn to read execution plans to diagnose exactly where queries slow down. Each concept is reinforced through hands-on demonstrations that you can follow along on your own setup.

By the end of this course, you’ll be able to:

  • Explain how computers store data, distinguish between row-based and columnar storage, and identify when columnar formats provide a performance advantage.
  • Work with Parquet and ORC file formats, compare them to CSV, and query columnar data using DuckDB.
  • Read and interpret SQL query execution plans using EXPLAIN, and diagnose performance bottlenecks in analytical workloads.
  • Apply real-world query optimization techniques including column pruning, filter pushdown, partitioning, data skipping, and before-vs-after performance comparison.

This course is designed for a diverse audience: Data Analysts who want to understand why their queries are slow, junior Data Engineers building foundational storage knowledge, BI Professionals moving into performance engineering or platform roles, and SQL Developers who want to go beyond writing queries to understanding how queries execute internally.

Basic computer literacy is helpful. No prior SQL experience is required, though familiarity with basic statements will help you move faster.

Stop guessing why queries are slow. Start understanding storage, execution, and optimization, and build the foundational skills every modern data team needs.
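
To give a feel for the ideas named above, here is a minimal Python sketch of row-based versus columnar layout, column pruning, and data skipping. It is not taken from the course materials: the table, the function names, and the chunk size are all hypothetical, and the model is a toy that only counts how many values each approach has to read. Real formats such as Parquet and ORC are far more elaborate.

```python
# Toy in-memory model for illustration only; all names here are hypothetical.
ROWS = [
    {"order_id": i, "region": "EU" if i % 2 == 0 else "US", "amount": i * 10}
    for i in range(8)
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column (columnar layout)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_row_layout(rows):
    """Row layout: each whole row is loaded even though one field is needed."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)  # every field of the row is touched
        total += row["amount"]
    return total, values_read


def sum_amount_column_layout(columns):
    """Column pruning: only the 'amount' column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


def build_chunk_stats(values, chunk_size):
    """Record (start, min, max) per chunk. A real columnar file would store
    such statistics as metadata at write time; here they are computed on demand."""
    stats = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        stats.append((start, min(chunk), max(chunk)))
    return stats


def sum_amount_at_least(columns, threshold, chunk_size=4):
    """Data skipping: chunks whose max is below the threshold are never read."""
    amounts = columns["amount"]
    total = 0
    values_read = 0
    for start, _low, high in build_chunk_stats(amounts, chunk_size):
        if high < threshold:
            continue  # the statistics prove no value in this chunk can match
        chunk = amounts[start:start + chunk_size]
        values_read += len(chunk)
        total += sum(v for v in chunk if v >= threshold)
    return total, values_read


COLUMNS = to_columns(ROWS)
print(sum_amount_row_layout(ROWS))        # (280, 24)
print(sum_amount_column_layout(COLUMNS))  # (280, 8)
print(sum_amount_at_least(COLUMNS, 40))   # (220, 4)
```

Each function returns the result together with the number of values it read, so the same total costs 24 reads in the row layout and 8 in the columnar one, and the filtered query reads only one of the two chunks.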

Syllabus

  • Foundations of Data Storage and SQL for Analytics
    • This module introduces how data is stored and organized in computer systems using files, tables, rows, and columns. It explains how SQL is used to access and manipulate data and how databases process read operations. The module also compares row-based and column-based storage to show how different storage models affect query performance.
  • Columnar Storage in Modern Industry Systems
    • This module explains how columnar storage is used in modern data systems and data warehouses for efficient analytics. It introduces common columnar formats and tools used in industry, and demonstrates how techniques like compression, data skipping, and partitioning improve query performance.
  • Query Engines and SQL Processing Systems
    • This module introduces query engines and SQL tools used by analysts and engineers to process data. It explains how SQL queries are executed internally and how query plans represent the steps a system takes to run a query. The module also compares different query engines to understand why some systems perform faster for analytical workloads.
  • Query Optimization Concepts and Best Practices
    • This module explains why query optimization matters for processing data efficiently and speeding up slow queries. It introduces practical SQL optimization techniques such as filtering, column pruning, and efficient aggregations. The module also demonstrates real-world optimization workflows using industry tools to compare query performance before and after optimization.
  • Course Wrap-Up Assessment
    • This module consolidates key concepts from data storage, SQL querying, query execution, and optimization. It evaluates understanding through structured assessments and practical query analysis scenarios. It serves as a final checkpoint to assess readiness for real-world data querying and performance optimization tasks.
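
The before-and-after plan comparison mentioned in the syllabus can be sketched in a few lines of Python. This is an assumption-laden stand-in rather than the course's own demonstration: it uses SQLite (through Python's built-in `sqlite3` module) instead of DuckDB so that it runs without extra installs, the `sales` table and index name are hypothetical, and the optimization shown is an index, which is a row-store technique. DuckDB's `EXPLAIN` output looks different, but the habit of reading the plan before and after a change is the same.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable 'detail' text of each step in SQLite's plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Hypothetical table, created in memory so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(i, "EU" if i % 2 == 0 else "US", i * 10) for i in range(1000)],
)

query = "SELECT SUM(amount) FROM sales WHERE region = 'EU'"

# Before: with no index available, the plan reports a full scan of the table.
plan_before = plan_details(conn, query)

# After: once an index on the filter column exists, the plan reports a search.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = plan_details(conn, query)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, so the useful signal is the shift from a scan step to a search step that names the index.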

Taught by

Edureka
