Associate Data Engineer in Databricks

via DataCamp

Overview

In this track, you'll build the core data engineering skills used on Databricks. You'll start with SQL fundamentals: querying, aggregations, joins, window functions, common table expressions, and data manipulation. From there, you'll move into the Databricks platform, learn how to run analytics with Databricks SQL, and pick up the Python and Git basics needed to work in notebooks. You'll then explore the Lakehouse architecture, table management, and governance, before moving into scalable processing with PySpark and Spark SQL. The track concludes with hands-on practice building and troubleshooting transformation pipelines in Databricks.

Syllabus

  • Introduction to SQL
    • Learn how to create and query relational databases using SQL in just two hours.
  • Intermediate SQL
    • With hands-on practice queries at every step, this course teaches you to analyze data using your own SQL code.
  • Joining Data in SQL
    • Level up your SQL knowledge and learn to join tables together, apply relational set theory, and work with subqueries.
  • Data Manipulation in SQL
    • Master the complex SQL queries necessary to answer a wide variety of data science questions and prepare robust data sets for analysis in PostgreSQL.
  • Introduction to Databricks
    • Learn about the Databricks Lakehouse platform and how it can modernize data architectures and improve data management processes.
  • Introduction to Databricks SQL
    • Learn Databricks SQL for data engineering, analytics, and real-time data workflows in the lakehouse architecture.
  • Introduction to Python
    • Master the basics of data analysis with Python in just four hours. This online course will introduce the Python interface and explore popular packages.
  • Introduction to Git
    • Discover the fundamentals of Git for version control in your software and data projects.
  • Introduction to Databricks Lakehouse
    • Explore the Databricks Lakehouse - from medallion architecture and clusters to governance, sharing, and deployment.
  • Data Management in Databricks
    • Learn data management in Databricks with Delta Lake, including ACID transactions, schema enforcement, and security.
  • Introduction to PySpark
    • Master PySpark to handle big data with ease: learn to process, query, and optimize massive datasets for powerful analytics!
  • Introduction to Spark SQL in Python
    • Learn how to manipulate data and create machine learning feature sets in Spark using SQL in Python.
  • Data Transformation with Spark SQL in Databricks
    • Build end-to-end data pipelines - from cleaning and aggregation to streaming and orchestration.

Taught by

Hugo Bowne-Anderson, Mona Khalil, Mark Plutowski, Izzy Weber, Jasmin Ludolf, Yassin Alabdeen, Yusuf Saber, Maham Khan, Kevin Barlow, Smriti Mishra, George Boorman, Iason Prassides, Ben Schmidt, Disha Mukherjee, and Gang Wang
