
Production ML with Hugging Face

Pragmatic AI Labs via Coursera

Overview

Learn to deploy ML models to production using the Sovereign Rust Stack—a pure Rust implementation with zero Python runtime dependencies. This hands-on course teaches you to work with three critical model formats (GGUF, SafeTensors, APR), implement MLOps pipelines with CI/CD and observability, and deploy models across GPU, CPU, WebAssembly, and edge targets.

Through real-world projects including a Python-to-Rust transpiler (Depyler), browser-based speech recognition (Whisper.apr), and LLM inference benchmarking (Qwen), you'll master format conversion, cryptographic model signing, and performance optimization. The course culminates in a capstone project deploying Qwen2.5-Coder across all three formats with benchmarking.

What makes this course unique: instead of relying on Python frameworks, you'll build with production-grade Rust tooling that compiles to native binaries and WebAssembly. Learn to run sub-millisecond inference in browsers, bundle models into executables, and achieve 2x performance gains over standard tools. Ideal for ML engineers and software developers ready to move beyond notebooks into production deployment.
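To give a flavor of the format work described above, here is a minimal, hedged sketch (not course material) of telling GGUF and SafeTensors files apart by their leading bytes: GGUF files open with the ASCII magic `GGUF`, while a SafeTensors file begins with an 8-byte little-endian length followed by a JSON header. The APR format is the course's own and is not publicly specified here, so it is left out.

```rust
// Sketch: sniff a model file's format from its first bytes.
// Assumptions: only the public GGUF magic and SafeTensors layout are used;
// APR detection is omitted because its on-disk layout isn't documented here.

fn detect_format(bytes: &[u8]) -> &'static str {
    // GGUF: four-byte ASCII magic "GGUF" at offset 0.
    if bytes.starts_with(b"GGUF") {
        return "GGUF";
    }
    // SafeTensors: u64 little-endian header length, then a JSON object ('{').
    if bytes.len() >= 9 {
        let header_len = u64::from_le_bytes(bytes[..8].try_into().unwrap());
        if header_len > 0 && bytes[8] == b'{' {
            return "SafeTensors";
        }
    }
    "unknown"
}

fn main() {
    // GGUF magic followed by a little-endian version field.
    assert_eq!(detect_format(b"GGUF\x03\x00\x00\x00"), "GGUF");

    // Minimal SafeTensors-style prefix: header length 2, then "{}".
    let mut st = 2u64.to_le_bytes().to_vec();
    st.extend_from_slice(b"{}");
    assert_eq!(detect_format(&st), "SafeTensors");

    println!("format checks passed");
}
```

A real converter would go on to parse the GGUF key-value metadata or the SafeTensors JSON header; this sketch stops at identification.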

Syllabus

  • Model Formats
    • Understanding ML model formats and the Sovereign AI Stack. Learn GGUF, SafeTensors, and APR formats for different deployment targets.
  • MLOps Foundations
    • Production infrastructure for ML systems. This module covers the essential MLOps practices needed to deploy and maintain ML models in production environments. Learn how to implement CI/CD pipelines specifically designed for ML workflows, set up comprehensive observability with logs, metrics, and traces, apply cryptographic model signing for supply chain security, and choose optimal deployment patterns based on your infrastructure requirements.
  • Project Showcase
    • Real-world projects built with the Sovereign AI Stack. This module demonstrates practical applications through three production projects: Depyler (a Python-to-Rust transpiler with self-improving ML), Whisper.apr (speech-to-text in browser and CLI), and the APR ecosystem tools. Learn how to build self-improving systems using compiler-in-the-loop training, deploy speech recognition to resource-constrained environments, and leverage the full APR toolchain for model conversion and inference.
  • Capstone Project
    • Final project deploying Qwen2.5-Coder-0.5B across all three model formats. Students demonstrate mastery of format conversion, CLI deployment, server deployment, and performance benchmarking.
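The MLOps module above mentions cryptographic model signing for supply chain security. As an illustrative, dependency-free sketch of the verification flow (not the course's implementation), the example below uses a simple FNV-1a hash as a stand-in; a production pipeline would use a real cryptographic scheme such as an Ed25519 signature over a SHA-256 digest. All names here are hypothetical.

```rust
// Sketch of a model-artifact integrity check. FNV-1a is NOT cryptographic --
// it stands in for a signed digest so the example needs no external crates.

fn fnv1a(data: &[u8]) -> u64 {
    // Standard 64-bit FNV-1a offset basis and prime.
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Check that a downloaded model matches the digest published alongside it.
fn verify_artifact(model_bytes: &[u8], expected_digest: u64) -> bool {
    fnv1a(model_bytes) == expected_digest
}

fn main() {
    let model = b"pretend these are model weights";
    // In a real pipeline the publisher computes and signs this at release time.
    let digest = fnv1a(model);

    assert!(verify_artifact(model, digest));
    assert!(!verify_artifact(b"tampered weights", digest));
    println!("integrity checks passed");
}
```

The design point is the same regardless of the hash: the deployment step refuses to serve any artifact whose digest does not match the one recorded (and, in practice, signed) at build time.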

Taught by

Noah Gift
