Overview
LLM Engineering That Works is an advanced, multi-course professional certificate designed to prepare you to build production-grade AI systems. The program combines five long courses, 19 short courses (including career development), and four add-on modules to cover the end-to-end lifecycle of large language models, from prompt design and model tuning to retrieval, evaluation, and deployment. You'll learn to build robust, ethical, and cost-efficient LLM solutions through hands-on projects that mirror real-world workflows. By completing this program, you'll gain the skills needed to design, implement, monitor, and improve scalable LLM-enabled applications across industries.
Who this is for: Experienced ML engineers, software engineers, data scientists, and AI practitioners who seek hands-on, production-focused expertise in LLMs and AI systems. A strong programming background and familiarity with ML concepts are recommended.
Syllabus
- Course 1: Production AI Model Development and Ethics
- Course 2: Building Reliable LLM Systems
- Course 3: Testing and Refining LLM Applications
- Course 4: Designing Production LLM Architectures
- Course 5: Evaluating LLM Performance and Efficiency
- Course 6: Advancing Your Career in Production AI
Courses
- Course 6: Advancing Your Career in Production AI
This course prepares you to leverage your advanced skills in model development, optimization, and AI ethics to pursue senior career opportunities. You will learn how to strategically position yourself for roles such as Senior Machine Learning Engineer, MLOps Engineer, and AI Ethics Specialist by building a portfolio that showcases your end-to-end project experience and preparing for advanced technical interviews focused on system design and responsible AI.
- Course 2: Building Reliable LLM Systems
Building Reliable LLM Systems is a comprehensive course for AI practitioners looking to move beyond basic models and create production-grade applications. While getting an LLM to generate text is easy, ensuring a consistently accurate, relevant, and trustworthy output is a significant engineering challenge. This course provides a systematic framework for tackling the entire lifecycle of LLM reliability. You will start by learning to quantitatively evaluate model performance using a suite of lexical and semantic metrics, such as BLEU, ROUGE-L, and cosine similarity. You’ll dive deep into debugging, using log analysis and data manipulation to uncover the root causes of critical failures, such as hallucinations, by correlating them with retrieval system performance. The course emphasizes statistical rigor, teaching you to design and analyze A/B tests, apply hypothesis testing, and calculate confidence intervals to prove the significance of your optimizations. Finally, you’ll optimize the foundational data layers, learning to tune SQL queries and vector search parameters to achieve the perfect balance between recall and latency.
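To make the evaluation ideas above concrete, here is a minimal sketch of two building blocks the course works with: cosine similarity between embedding vectors, and a confidence interval for a metric's mean. The vectors and the normal-approximation interval are simplified stand-ins, not the course's actual tooling:

```python
import math

def cosine_similarity(a, b):
    # Semantic similarity between two embedding vectors: dot product
    # divided by the product of their norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mean_confidence_interval(samples, z=1.96):
    # Approximate 95% CI for a metric's mean (normal approximation);
    # used to judge whether an optimization's gain is significant.
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half = z * math.sqrt(var / n)
    return mean - half, mean + half

# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```

If the confidence intervals of a baseline and a candidate overlap heavily, the observed improvement may not be significant, which is exactly the kind of judgment the A/B-testing material formalizes.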
- Course 4: Designing Production LLM Architectures
This course is for ML engineers, solutions architects, and senior developers who build the robust infrastructure powering large language models. It teaches you how to design, deploy, and maintain the complex, interconnected systems required for scalable, resilient, and cost-effective LLM applications in the real world. You will learn to think like an architect, starting with foundational design choices. Using sequence diagrams and structured analysis, you will compare synchronous and asynchronous architectures and evaluate the critical trade-offs between self-hosting open-source models and using managed APIs, considering total cost of ownership, latency, and data privacy. The course then dives deep into building for resilience and scale, applying the 12-factor app methodology to design stateless, configurable microservices. You'll learn to analyze multi-region deployment strategies for fault tolerance and to use container orchestration manifests like Helm to deploy scalable applications capable of handling production workloads. Finally, you'll master the data backbone of your system by designing automated data pipelines with tools like Airflow and learning to manage the complexities of schema evolution.
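The self-host-versus-managed-API trade-off above can be sketched as a back-of-the-envelope cost model. All figures below are hypothetical, purely for illustration; the course's actual TCO analysis also weighs latency and data privacy, which this sketch omits:

```python
def monthly_api_cost(requests_per_month, avg_tokens_per_request, price_per_1k_tokens):
    # Managed-API spend scales linearly with token volume.
    return requests_per_month * avg_tokens_per_request / 1000 * price_per_1k_tokens

def monthly_selfhost_cost(gpu_hourly_rate, num_gpus, ops_overhead):
    # Self-hosting is roughly a fixed cost: GPUs run around the clock
    # (24 h x 30 days), plus an operations overhead for the team.
    return gpu_hourly_rate * num_gpus * 24 * 30 + ops_overhead

# Hypothetical workload and prices:
api = monthly_api_cost(2_000_000, 800, 0.002)    # 3200.0 (USD/month)
selfhost = monthly_selfhost_cost(2.5, 4, 1500)   # 8700.0 (USD/month)
print(api, selfhost)
```

The structural point survives the made-up numbers: API cost grows with traffic while self-hosting is near-fixed, so there is a break-even volume above which self-hosting wins on cost alone.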
- Course 5: Evaluating LLM Performance and Efficiency
This comprehensive course is for product managers, ML engineers, and technical leads responsible for transforming LLM concepts into reliable, cost-effective production services. In today's AI-driven landscape, building a functional model is only the beginning. You will learn the complete framework for measuring, documenting, and optimizing LLM applications to ensure that they deliver real business value efficiently and consistently. The course begins by grounding you in product-centric development, teaching you to create a clear Product Requirements Document (PRD) that defines scope, MVP features, and success metrics. You'll evaluate features against acceptance criteria to identify gaps and validate user requirements. You will evaluate Zero-Shot, Few-Shot, and Chain-of-Thought prompt patterns and develop runbooks for vector index management. You will learn to analyze compute-spend reports to propose concrete cost-reduction strategies, such as model quantization, and use value-stream mapping to identify and eliminate inefficiencies in your development and release pipelines.
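The three prompt patterns compared above differ mainly in how the template is assembled. A minimal sketch (the template wording is an assumption, not the course's exact prompts):

```python
def zero_shot(task, query):
    # Zero-Shot: state the task and ask directly.
    return f"{task}\n\nInput: {query}\nOutput:"

def few_shot(task, examples, query):
    # Few-Shot: prepend labeled examples so the model infers the pattern.
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {query}\nOutput:"

def chain_of_thought(task, query):
    # Chain-of-Thought: invite step-by-step reasoning before the answer.
    return f"{task}\n\nInput: {query}\nLet's think step by step."

prompt = few_shot("Classify the sentiment.",
                  [("great service", "positive"), ("broken on arrival", "negative")],
                  "works as advertised")
print(prompt)
```

Evaluating these patterns against the same acceptance criteria makes the accuracy-versus-token-cost trade-off explicit: Few-Shot and Chain-of-Thought consume more tokens per call, which feeds directly into the compute-spend analysis.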
- Course 1: Production AI Model Development and Ethics
This comprehensive program provides end-to-end training on the production machine learning lifecycle, designed to take your models from experiment to deployment. You’ll progress from applying feature engineering pipelines with scikit-learn and selecting models through rigorous evaluation, to optimizing PyTorch models with custom training loops and advanced diagnostics. Finally, you will master the principles of responsible AI by creating model cards and auditing systems for ethical compliance. By the end of this course, you will be able to build, tune, and deploy efficient, reliable, and ethical AI solutions. These skills are essential for ML engineers who develop and maintain robust, production-grade machine learning systems.
- Course 3: Testing and Refining LLM Applications
This course is designed for software engineers and ML practitioners aiming to advance from building LLM prototypes to deploying robust, production-grade AI systems. In the real world, a reliable application requires more than a clever prompt; it demands a rigorous software engineering foundation to ensure its testability, maintainability, and safety. This course provides that critical toolkit. You will learn to apply Test-Driven Development (TDD) to methodically build and refactor LLM-powered microservices, ensuring that your code is clean and verifiable from day one. To safeguard your applications, you will create sophisticated behavioral test suites that enforce safety policies and prevent undesirable outputs. You'll go a step further by using mutation testing to evaluate the quality of your own tests, ensuring that your safety guardrails are truly effective. The course also dives into the MLOps lifecycle, teaching you to version datasets and models with DVC, track experiment results on platforms like W&B, and make data-driven decisions about the models to promote. Finally, you will learn to automate your entire testing and evaluation workflow using powerful Python scripts, thereby preparing your application for seamless integration into a CI/CD pipeline.
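A behavioral test suite like the one described above asserts on what the system's output does, not how it was produced. A toy sketch, where the blocked-term guardrail and its policy list are hypothetical stand-ins for a real safety policy:

```python
# Hypothetical guardrail: the component a behavioral test suite targets.
BLOCKED_TERMS = {"password", "ssn"}  # illustrative policy list

def violates_policy(output: str) -> bool:
    # Flag model outputs that leak terms the safety policy forbids.
    text = output.lower()
    return any(term in text for term in BLOCKED_TERMS)

def test_guardrail_blocks_sensitive_output():
    assert violates_policy("Here is the admin password: hunter2")

def test_guardrail_allows_benign_output():
    assert not violates_policy("The capital of France is Paris.")

test_guardrail_blocks_sensitive_output()
test_guardrail_allows_benign_output()
```

Mutation testing then probes these tests themselves: mutate the guardrail (say, flip `any` to `all`) and check that at least one test fails; a surviving mutant means the safety net has a hole.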
Taught by
Industry professionals