Microsoft

Data Processing, Exploratory Analysis and Visualization

Microsoft via Coursera

Overview

This course introduces distributed computing frameworks and big data visualization techniques. Learners will explore MapReduce, work with Apache Spark, implement transformations with PySpark, and use Spark SQL for large-scale analysis. The course concludes with building compelling dashboards and reports using Power BI for actionable business insights.

By the end of this course, you will be able to:

  • Explain distributed computing and MapReduce concepts
  • Process large datasets using Apache Spark and PySpark
  • Apply Spark SQL for advanced queries and transformations
  • Create dashboards and visualizations using Power BI

Tools & Software: Apache Spark, PySpark, Azure Databricks, Power BI

Skills: Distributed computing, Data analysis, PySpark, Spark SQL, Data visualization

Syllabus

  • Distributed Computing and MapReduce Concepts
    • Distributed Computing and MapReduce Concepts explores the foundational principles that enable modern organizations to process massive datasets that have outgrown the limits of single-machine computing. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll examine how data is broken into parallel tasks and executed across clusters of machines, how the Map, shuffle, and Reduce phases work together, and how common MapReduce patterns—such as counting, filtering, joining, and aggregation—solve practical big data problems efficiently and at scale.
  • Apache Spark Architecture and Fundamentals
    • Apache Spark Architecture and Fundamentals provides a comprehensive introduction to the distributed processing engine that revolutionized big data analytics by overcoming traditional MapReduce limitations. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll examine Spark's core components, including the driver, executors, and cluster manager, explore how in-memory processing delivers dramatic performance improvements, and learn to configure and manage Spark clusters and applications for efficient large-scale data processing.
  • Data Processing with PySpark RDDs and DataFrames
    • Data Processing with PySpark RDDs and DataFrames focuses on practical data processing using PySpark, the Python API for Apache Spark. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll implement data processing operations using both RDDs and DataFrames, develop transformation pipelines, apply common data cleaning and preparation techniques, and optimize PySpark code for better performance across enterprise-scale big data scenarios.
  • Advanced Data Processing with Spark SQL
    • Advanced Data Processing with Spark SQL introduces Spark SQL as a powerful interface for structured data processing in distributed environments. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll master SQL operations at scale, from basic queries to complex analytical operations, learn to create and manage temporary views and tables, and optimize query performance for production workloads that would overwhelm traditional database systems.
  • Data Visualization for Big Data with Power BI
    • Data Visualization for Big Data with Power BI introduces comprehensive visualization techniques specifically designed for big data environments using Microsoft Power BI. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll learn to connect Power BI to various big data sources, create effective visualizations for large datasets, build interactive dashboards that enable self-service analytics, and implement best practices for handling performance challenges when visualizing massive datasets.
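
The Map, shuffle, and Reduce phases named in the first syllabus module can be illustrated with a small word count, the "counting" pattern that module mentions. The sketch below is not taken from the course materials: it uses plain Python on a single machine, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions chosen for illustration. On a real cluster, each phase would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all values by key so each key ends up in one place.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: combine the values collected for one key into a single result.
    return key, sum(values)


def word_count(lines):
    # Run the three phases in sequence; a framework would distribute them.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, two "records" of text.
counts = word_count(["big data big clusters", "data at scale"])
print(counts)
```

Because each line is mapped independently and each key is reduced independently, the map and reduce steps are the parts a distributed framework can spread across machines; the shuffle is the step that moves data between them.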

Taught by

Microsoft
