- Introduction to the Open source Analytics Offering
At the end of this module, you will understand:
- What HDInsight is
- How HDInsight works
- When to use HDInsight
- Choose the correct HDInsight Configuration to build open source analytics solutions
At the end of this module, you will understand:
- The correct HDInsight configuration options.
- Decision criteria for selecting the correct HDInsight configuration option.
- How to analyze a scenario and map it to an HDInsight configuration option.
- Cost optimization strategies for HDInsight clusters.
- Creating and configuring an HDInsight cluster.
In this module, you will:
- Create an HDInsight Spark Cluster.
- Execute queries on an HDInsight Spark Cluster.
- Monitor an HDInsight Spark Cluster.
- Learn how to fix common provisioning issues.
- Run Petabyte level OSS NoSQL databases with HDInsight HBase
- Introduction
- Use HDInsight HBase clusters
- Describe HBase Architecture Patterns
- Exercise - Provisioning an HDInsight HBase cluster
- Exercise - Run benchmarks in HBase
- Understand HBase Best Practices
- Summary
- Knowledge Check
- Perform advanced streaming data transformations with Apache Spark and Kafka in Azure HDInsight
At the end of this module, you will understand:
- When to use Apache Spark and Kafka with HDInsight.
- Spark Structured Streaming.
- The architecture of a Kafka and Spark solution.
- How to provision HDInsight, create a Kafka producer, and stream Kafka data to a Jupyter notebook.
- How to replicate data to a secondary cluster.
- Perform Zero ETL analytics with HDInsight Interactive Query
In this module, you will:
- Identify appropriate scenarios for deploying HDInsight Interactive Query clusters.
- Learn about architectural patterns.
- Deploy a cluster for a real-estate app and query the data.
- Integrate Apache Spark and Hive LLAP queries using the Hive Warehouse Connector.
- Create a large-scale interactive query dashboard to evaluate real estate values and locations.
- Learn how to manage enterprise security in HDInsight.
- Introduction
- Describe HDInsight security areas
- Implement Network Security
- Understand Operating system security
- Manage Application/Middleware security
- Implement Data Access security
- Knowledge Check
- Summary
Syllabus
- Introduction to the Open source Analytics Offering
- Introduction
- What is HDInsight?
- How does HDInsight work?
- When to use HDInsight
- Module assessment
- Summary
- Choose the correct HDInsight Configuration to build open source analytics solutions
- Introduction
- HDInsight configuration options
- Decision criteria for selecting the correct HDInsight configuration option
- Analyze a scenario and map it to an HDInsight configuration option
- Cost optimization strategies for HDInsight clusters
- Module assessment
- Summary
- Creating and configuring an HDInsight cluster
- Introduction
- Creating an HDInsight cluster
- Exercise - Create an HDInsight cluster via the Azure portal
- Opening a Jupyter Notebook on HDInsight Spark cluster
- Exercise - Execute queries on HDInsight Spark cluster
- Enable monitoring of HDInsight jobs
- Common provisioning issues
- Exercise - Monitor an HDInsight cluster
- Summary
- Knowledge check
- Run Petabyte level OSS NoSQL databases with HDInsight HBase
- Introduction
- Describe Apache HBase
- Explain HDInsight HBase clusters architecture and application patterns
- Improve the write and read performance of HBase clusters
- Determine migration and high availability strategies in HDInsight HBase
- Use Apache Phoenix on HDInsight HBase
- Determine HDInsight HBase cluster performance
- Perform benchmarking in HBase
- Module assessment
- Summary
- Perform advanced streaming data transformations with Apache Spark and Kafka in Azure HDInsight
- Introduction
- Use HDInsight Spark and Kafka
- Stream data with Apache Kafka
- Describe Spark structured streaming
- Create a Kafka and Spark architecture
- Exercise - Provision HDInsight to perform advanced streaming data transformations
- Exercise - Create the Kafka producer
- Exercise - Stream Kafka data to a Jupyter notebook and window the data
- Replicate data to a secondary cluster
- Module assessment
- Summary
- Perform Zero ETL analytics with HDInsight Interactive Query
- Introduction
- When should you use HDInsight Interactive Query?
- HDInsight interactive queries
- Exercise - Provision HDInsight to perform ad hoc analytics
- Exercise - Upload and query data in HDInsight
- Integrate Apache Spark and Hive LLAP queries
- Create a large-scale interactive query dashboard for evaluating real estate trends
- Summary
- Knowledge check
- Manage enterprise security in HDInsight
- Introduction
- Describe HDInsight security areas
- Implement Network security
- Understand operating system security
- Manage application/middleware security
- Implement data access security
- Module assessment
- Summary