Overview
Learn to build, deploy, and scale AI applications using Hugging Face—the platform powering over 2.5 million machine learning models used by Google, Meta, Microsoft, and thousands of organizations worldwide.
This hands-on specialization takes you from navigating the Hugging Face Hub to building multi-modal AI systems that process text, images, and audio. You'll master the Transformers library, learn to evaluate and select models for production use, fine-tune pre-trained models on custom datasets, and deploy your work to the Hub for others to use.
Through realistic role-play scenarios—including a startup investor demo and healthcare document triage system—you'll apply these skills to solve authentic industry problems. Whether you're building chatbots, content analyzers, transcription systems, or computer vision applications, this specialization provides the practical foundation you need to ship AI-powered products using open-source tools.
All projects run locally on your hardware (CPU, NVIDIA GPU, or Apple Silicon), requiring no cloud API costs—a critical skill for cost-conscious teams and privacy-sensitive applications.
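The hardware flexibility described above usually comes down to a small device-selection step at the start of each project. A minimal sketch, assuming PyTorch as the backend (the default for the Transformers library); the function name is illustrative, and the code falls back to CPU when torch is unavailable:

```python
def pick_device() -> str:
    """Return the best locally available compute device.

    Prefers an NVIDIA GPU (CUDA), then Apple Silicon (MPS),
    and falls back to plain CPU so the code runs anywhere.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    except ImportError:
        pass  # torch not installed: CPU-only environment
    return "cpu"

print(pick_device())
```

With the Transformers library, the returned string can typically be passed as the `device` argument when constructing a `pipeline`.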
Syllabus
- Course 1: Hugging Face Hub and Ecosystem Fundamentals
- Course 2: Fine-Tuning Transformers with Hugging Face
- Course 3: Large Language Models with Hugging Face
- Course 4: Advanced Fine-Tuning in Rust
- Course 5: Production ML with Hugging Face
Courses
- Master the essential skills to build production-ready applications powered by large language models. You'll learn to control text generation with precision using sampling parameters and stopping criteria, design effective prompts with chat templates for instruction-tuned models, build retrieval-augmented generation (RAG) pipelines that enable LLMs to access external knowledge, and extract structured data through constrained generation and function calling. What makes this course unique is its hands-on approach to practical LLM application development. You'll work directly with popular open-source models like Llama, Mistral, and Phi, progressing from basic text generation to sophisticated agent systems. Unlike theoretical courses, you'll build real systems—a semantic search engine with sentence-transformers, a complete RAG-powered question-answering pipeline, and tool-using agents that can execute functions based on LLM reasoning. Whether you're developing chatbots, automating information extraction, or building AI assistants, this course equips you with battle-tested patterns and techniques used in production LLM systems. You'll gain the confidence to choose the right approach for your use case and the skills to implement it reliably using the Hugging Face ecosystem.
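The retrieval step of a RAG pipeline can be illustrated without running any model at all: embed documents and a query as vectors, rank by cosine similarity, and pass the top hits to the LLM as context. A toy sketch using bag-of-words counts as a stand-in for real sentence-transformers embeddings; the corpus and function names are invented for illustration:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a sentence-transformers model: bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query; keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "The Hub hosts pretrained transformer models.",
    "Llama and Mistral are open-source language models.",
    "Audio pipelines can transcribe speech to text.",
]
print(retrieve("which open-source language models exist?", docs))
```

In a real pipeline, `embed` would call a sentence-transformers model's `encode` method, and the retrieved passages would be prepended to the prompt before generation.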
- Master the complete fine-tuning pipeline—from transformer internals to production deployment—using memory-efficient techniques that run on consumer hardware. This course transforms you from someone who uses large language models into someone who customizes them. You'll learn to fine-tune 7-billion parameter models on a laptop GPU using QLoRA, which reduces memory requirements from 56GB to just 4GB through intelligent quantization and low-rank adaptation. What sets this course apart is its rigorous, scientific approach. You'll apply Popperian falsification methodology throughout: instead of asking "does my model work?", you'll systematically try to break it. This skeptical mindset—testing tokenization edge cases, running rank ablation studies, and validating corpus quality through six falsification categories—builds the critical thinking skills that separate production-ready engineers from those who ship fragile systems. By course end, you'll confidently: calculate VRAM requirements and select appropriate hardware; trace inference through the six-step transformer pipeline; configure LoRA rank to match task complexity; build quality training corpora using AST extraction; and publish datasets to HuggingFace with proper splits and documentation. Everything is built on a sovereign AI stack and runs locally with no external dependencies—true ML independence.
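The 56GB-to-4GB figure can be sanity-checked with back-of-the-envelope arithmetic. A sketch under simplifying assumptions (all tensors in fp16 for full fine-tuning, activation memory ignored, adapter overhead approximated at half a gigabyte; real mixed-precision setups that keep fp32 optimizer states need even more):

```python
PARAMS = 7e9  # 7-billion-parameter model

# Full fine-tuning, everything in fp16:
# 2 B weights + 2 B gradients + 2 B + 2 B Adam moments = 8 bytes/param.
full_ft_gb = PARAMS * 8 / 1e9

# QLoRA: the frozen base model is quantized to 4 bits (0.5 bytes/param);
# only tiny low-rank adapters are trained, adding roughly ~0.5 GB for
# adapter weights, gradients, and optimizer states at typical ranks.
qlora_gb = PARAMS * 0.5 / 1e9 + 0.5

print(full_ft_gb, qlora_gb)  # → 56.0 4.0
```

This is why a 7B model that is untouchable on a laptop GPU under full fine-tuning fits comfortably once the base weights are quantized and frozen.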
- Learn to deploy ML models to production using the Sovereign Rust Stack—a pure Rust implementation with zero Python runtime dependencies. This hands-on course teaches you to work with three critical model formats (GGUF, SafeTensors, APR), implement MLOps pipelines with CI/CD and observability, and deploy models across GPU, CPU, WebAssembly, and edge targets. Through real-world projects including a Python-to-Rust transpiler (Depyler), browser-based speech recognition (Whisper.apr), and LLM inference benchmarking (Qwen), you'll master format conversion, cryptographic model signing, and performance optimization. The course culminates in a capstone project deploying Qwen2.5-Coder across all three formats with benchmarking. What makes this course unique: instead of relying on Python frameworks, you'll build with production-grade Rust tooling that compiles to native binaries and WebAssembly. Learn to run sub-millisecond inference in browsers, bundle models into executables, and achieve 2x performance gains over standard tools. Ideal for ML engineers and software developers ready to move beyond notebooks into production deployment.
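Much of the format-conversion work described above is driven by one simple trade-off: quantization width versus checkpoint size. A rough sketch of the arithmetic, ignoring format metadata overhead; the parameter count and format pairings are illustrative assumptions, not a statement about any specific model:

```python
def model_file_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk checkpoint size, ignoring metadata overhead."""
    return n_params * bits_per_weight / 8 / 1e9

n = 1.5e9  # e.g. a hypothetical 1.5B-parameter coder model
for label, bits in [("fp16", 16), ("8-bit quantized", 8), ("4-bit quantized", 4)]:
    print(f"{label}: {model_file_gb(n, bits):.2f} GB")
```

Halving the bit width halves what must be downloaded, stored, and paged through memory, which is why 4-bit formats dominate edge and in-browser deployment targets.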
- Master the Hugging Face ecosystem—the leading open-source platform for machine learning. This hands-on course teaches you to discover, evaluate, and deploy pre-trained models for text, image, and audio tasks without training from scratch. You'll learn to navigate the Hugging Face Hub to find models among 500,000+ options, read model cards to make informed selections, and understand licensing for commercial use. Through practical exercises, you'll build inference pipelines using the Transformers library, process datasets efficiently with streaming for large-scale data, and deploy models across different hardware (NVIDIA GPUs, Apple Silicon, CPU). The course culminates in building a multi-modal content analyzer that classifies text sentiment, categorizes images, transcribes audio, and generates captions—demonstrating how modern ML practitioners leverage pre-trained models to solve real problems quickly. Designed for developers and data scientists who want to accelerate their ML workflows, this course provides the foundation for fine-tuning and deploying Hugging Face models in production environments. All exercises use real-world scenarios from healthcare, fintech, and media industries.
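Model selection on the Hub is largely a filtering problem: narrow 500,000+ entries by task and license, then rank by a signal such as downloads. A toy sketch over invented metadata (in practice the `huggingface_hub` library's `list_models` function performs this kind of filtering server-side; every model entry below is fictional):

```python
models = [  # invented stand-ins for Hub model-card metadata
    {"id": "org/bert-sentiment", "task": "text-classification",
     "license": "apache-2.0", "downloads": 120_000},
    {"id": "org/research-clf", "task": "text-classification",
     "license": "cc-by-nc-4.0", "downloads": 300_000},
    {"id": "org/whisper-tiny", "task": "automatic-speech-recognition",
     "license": "mit", "downloads": 90_000},
]

# Licenses generally considered safe for commercial use.
COMMERCIAL_OK = {"apache-2.0", "mit", "bsd-3-clause"}

def shortlist(models, task):
    """Keep commercially usable models for a task, most-downloaded first."""
    ok = [m for m in models
          if m["task"] == task and m["license"] in COMMERCIAL_OK]
    return sorted(ok, key=lambda m: m["downloads"], reverse=True)

print([m["id"] for m in shortlist(models, "text-classification")])
```

Note that the most-downloaded candidate is excluded here: its non-commercial license rules it out regardless of popularity, which is exactly the model-card judgment call the course trains.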
Taught by
Alfredo Deza, Liam Parker, and Noah Gift