Explore data pipeline observability using OpenLineage, enhancing reliability and auditability. Learn how metadata collection enables understanding of data flow and dependencies across teams and technologies.
Explore conversational analytics with Apache Spark, enabling natural language queries for data insights. Learn about NLP-to-SQL translation and its integration with dialog management and visualization.
Optimize legacy Spark applications for Spark 3.0: leverage Adaptive Query Execution, columnar processing, GPU acceleration, and ML integration. Includes concrete examples and lessons from modernizing analytics workloads.
Techniques to create configurable, maintainable data pipelines. Learn to externalize configurations, validate inputs, and leverage Scala features for robust, easily deployable ETL processes.
Discover how Databricks simplifies data science with a unified platform for analytics workloads, from data preparation to predictive analytics, enhancing collaboration and productivity at scale.
Discover sensitive data in Apache Spark and Databricks using BigID's platform. Learn to scale discovery, label data, and apply necessary guardrails for compliance and risk mitigation in data analytics.
Explore Lyft's hybrid Apache Spark architecture using YARN and Kubernetes, addressing scalability challenges and optimizing workload management for improved reliability and efficiency.
Optimize Spark SQL jobs with parallel and asynchronous I/O techniques, including file-level and row-group-level parallel reads, asynchronous spill, and Parquet column families, resulting in a 5-30% performance improvement.
Explore Databricks SQL Analytics: set up SQL endpoints, query workspaces, create dashboards, and connect BI tools. Learn how this tool enhances the Lakehouse architecture for SQL-savvy data analysts.
Explore lightweight data validation for Spark using Fugue and Pandera, enabling partition-specific rules and efficient big data processing without compromising on functionality.
Designing an intelligent, event-driven data platform for EFSA's digital transformation, enhancing transparency in food safety risk assessment through Azure and Databricks technologies.
Explore advanced techniques for efficient large-scale language model training on GPU clusters, including parallelism methods, novel scheduling, and performance optimization to achieve unprecedented model sizes and training speeds.
Explore how GSK uses knowledge graphs and Apache Spark to build a massive medical database, enabling advanced drug and vaccine discovery through innovative data querying and machine learning techniques.
Explore Delight, a free monitoring dashboard for Apache Spark. Learn to troubleshoot and optimize data engineering pipelines using system metrics and Spark information for improved performance and cost-effectiveness.
Explore Apache Spark 3.1's new features: ANSI SQL compliance, streaming enhancements, Python improvements, and performance optimizations for faster, easier, and smarter data processing.