Learn to build production-grade LLM systems using AWS Bedrock, local inference toolchains, and systematic quality evaluation. You will explore retrieval-augmented generation (RAG) on AWS, configure Bedrock knowledge bases with S3 data sources for document-grounded responses, and build Rust applications that interact with Bedrock model APIs. The course covers tokenization fundamentals, multi-model architectures for routing requests to appropriate foundation models, and the Bedrock knowledge agent workflow from data ingestion to response generation.

You will compile llama.cpp with hardware-specific optimization flags, work with the GGUF file format for quantized model distribution, and deploy Qwen 2.5 Coder as a local coding assistant on AWS GPU instances. The local LLM toolchain module demonstrates Amdahl's law applied to parallel compilation, Bedrock provisioned throughput for dedicated model capacity, and prompt evaluation in the Bedrock console.

You will use the UV package manager for Python dependency management in LLM projects and explore Amazon Q Developer for AI-assisted code generation and documentation. The course also covers SageMaker Canvas for no-code ML development, including dataset preparation and AutoML training.

By completing this course, you will be able to design RAG pipelines on AWS, run optimized local LLM inference with llama.cpp, and evaluate LLM output quality for production deployments. The sketches below illustrate several of these building blocks.
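The course builds its Bedrock clients in Rust; as a language-neutral sketch of the same knowledge-base query flow, here is a minimal Python example using boto3's bedrock-agent-runtime client. The knowledge base ID and model ARN are placeholders, not values from the course.

```python
import boto3

# Bedrock Agent Runtime exposes combined knowledge-base retrieval + generation.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Placeholder IDs: substitute your own knowledge base ID and model ARN.
response = client.retrieve_and_generate(
    input={"text": "What does our deployment runbook say about rollbacks?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

# The generated answer is grounded in documents ingested from the S3 data source.
print(response["output"]["text"])
```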
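Tokenization is easiest to grasp by running a tokenizer and inspecting its output. A minimal sketch using Hugging Face's AutoTokenizer with the Qwen 2.5 Coder tokenizer; the tooling choice is an assumption, as the course may use a different library.

```python
from transformers import AutoTokenizer

# Load the tokenizer shipped with Qwen 2.5 Coder from the Hugging Face Hub.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

text = 'fn main() { println!("hello"); }'
ids = tok.encode(text)

# Token count, not character count, drives context-window budgeting
# and per-token pricing on hosted model APIs.
print(len(ids))
print(tok.convert_ids_to_tokens(ids[:8]))
```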
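A multi-model architecture ultimately reduces to mapping a request class to a model ID before invocation. A hypothetical sketch against Bedrock's InvokeModel API; the route names and model IDs are illustrative, not the course's.

```python
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Hypothetical routing table: a cheaper model for ordinary chat,
# a stronger model for code-heavy requests.
ROUTES = {
    "chat": "anthropic.claude-3-haiku-20240307-v1:0",
    "code": "anthropic.claude-3-5-sonnet-20240620-v1:0",
}

def route(task: str, prompt: str) -> str:
    # Anthropic models on Bedrock use the Messages request format.
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    })
    resp = client.invoke_model(modelId=ROUTES[task], body=body)
    payload = json.loads(resp["body"].read())
    return payload["content"][0]["text"]

print(route("code", "Write a Rust function that reverses a string."))
```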
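Once llama.cpp is compiled and a GGUF quantization of Qwen 2.5 Coder is downloaded, local inference takes only a few lines. A sketch using the llama-cpp-python bindings (an assumption; the course may drive the llama.cpp binaries directly), with a placeholder model path.

```python
from llama_cpp import Llama

# Load a quantized GGUF file; n_gpu_layers=-1 offloads all layers to the
# GPU on an accelerated instance (requires a CUDA-enabled build).
llm = Llama(
    model_path="qwen2.5-coder-7b-instruct-q4_k_m.gguf",  # placeholder path
    n_ctx=4096,
    n_gpu_layers=-1,
)

out = llm.create_completion(
    "Write a Python function that checks whether a string is a palindrome.",
    max_tokens=256,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```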
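Amdahl's law puts a hard ceiling on what parallel compilation can buy: if a fraction p of the build parallelizes across n cores, speedup is bounded by the serial remainder.

```latex
S(n) = \frac{1}{(1 - p) + p/n}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

With p = 0.9 and n = 16 cores, S(16) = 1 / (0.1 + 0.05625) = 6.4; no core count pushes the build past the 10x ceiling set by its serial 10%.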