Mastering Big Data: Spark, Scala, Kafka, Hadoop,Hive & More

via Udemy

Overview

Complete Hands-On Developer Guide to Modern Big Data Tools like Hadoop, Hive, Spark, Scala, Kafka, NiFi, HBase and more

What you'll learn:
  • Understand the architecture of Hadoop
  • Understand file formats and learn to choose the right format for a given use case
  • Develop applications on a local system and then deploy them to production
  • Parameterize the code and make it production ready
  • Import data from a MySQL database into HDFS using Sqoop. Export data from HDFS to MySQL. Get a deep understanding of Sqoop
  • Query and analyze data effectively using Hive. Get a strong understanding of Hive
  • Learn Scala - one of the top programming languages
  • Learn basic, intermediate, and advanced Spark concepts, which are in high demand in the market
  • Work with complex data and learn how to process it effectively
  • Learn Cassandra and integrate it with Spark
  • Learn HBase and integrate it with Spark
  • Learn Apache NiFi
  • Work with Spark Streaming - Learn about Kafka and how it integrates with Spark
  • Get a good understanding of an end-to-end big data pipeline
  • Interview Kit - Hive, Hadoop, Scala and Spark

Big Data is not difficult because of tools — it is difficult because engineers don’t understand how the pieces fit together.

Most courses teach commands.
This course teaches engineering thinking.

This is a complete, end-to-end learning path where you will build data pipelines the same way they are built in real companies — starting from fundamentals and gradually moving into performance tuning, troubleshooting, and production deployment.

Instead of studying isolated topics, you will understand why each technology exists, when to use it, and how the technologies integrate into a real system.

By the end of this course you will be able to design, build, debug and optimize large-scale data workflows confidently.

What you will learn

• Understand the Big Data ecosystem and how modern data platforms are structured
• Work with distributed storage and processing systems from the ground up
• Build batch and streaming pipelines and integrate multiple data sources
• Design schemas and choose correct storage formats based on use case
• Develop applications using industry-relevant programming practices
• Move data between relational and distributed systems
• Process complex datasets and optimize performance
• Deploy applications to a cluster and make them production-ready
• Troubleshoot failures and analyze performance bottlenecks
• Prepare for real Big Data engineering interviews

Why this course is different

• Focus on understanding instead of memorizing commands
• Covers complete workflow — development → debugging → deployment
• Teaches practical decision-making used in real projects
• Includes troubleshooting and performance tuning (often missing from other courses)


Syllabus

  • Introduction to the course
  • Introduction to the Big Data World
  • Setting up Cluster and doing hands on with Hadoop
  • Sqoop
  • Hive
  • Installation for Spark and Scala
  • Let's learn Scala
  • Introduction to Spark
  • Spark RDDs
  • Spark DataFrames
  • Spark Advance
  • Productionalizing your Code
  • Complex Data Processing
  • NOSQL Databases
  • Apache NIFI
  • Working with Streaming Data
  • Extra

Taught by

Deesa Technologies

Reviews

4.5 rating at Udemy based on 341 ratings
