
Zero To Mastery

AI Engineering: Fine-Tuning LLMs (with QLoRA, AWS, and Open Source)

via Zero To Mastery Path

Overview

Master the in-demand AI skill businesses are hiring for: building and deploying customized LLMs. Learn to fine-tune open-source LLMs on proprietary data and deploy your customized models using AWS SageMaker and Streamlit.
  • Fine-tune open-source LLMs for custom business purposes
  • Deploy and scale models for enterprise purposes using AWS SageMaker and Streamlit
  • Understand and implement QLoRA from theory to code
  • Learn to preprocess proprietary datasets with chunking, tokenization, and attention masking
  • Monitor training and performance to ensure optimal business results
  • Manage cloud resources and optimize for cost
  • Apply advanced AI engineering techniques including quantization and more

Syllabus

  •   Introduction
    • Course Introduction (What We're Building)
    • Exercise: Meet Your Classmates and Instructor
    • Course Resources
    • ZTM Plugin + Understanding Your Video Player
    • Set Your Learning Streak Goal
  •   Setting up our AWS Account
    • Signing in to AWS
    • Creating an IAM User
    • Using our new IAM User
    • What To Do In Case You Get Hacked!
  •   Setting Up the AWS SageMaker Environment
    • Creating a SageMaker Domain
    • Logging in to our SageMaker Environment
    • Introduction to JupyterLab
    • Let's Have Some Fun (+ More Resources)
  •   Gathering, Chunking, Tokenizing and Uploading our Dataset
    • SageMaker Sessions, Regions, and IAM Roles
    • Examining Our Dataset from HuggingFace
    • Tokenization and Word Embeddings
    • HuggingFace Authentication with SageMaker
    • Applying the Templating Function to our Dataset
    • Attention Masks and Padding
    • Star Unpacking with Python
    • Chain Iterator, List Constructor and Attention Mask example with Python
    • Understanding Batching
    • Slicing and Chunking our Dataset
    • Creating our Custom Chunking Function
    • Tokenizing our Dataset
    • Running our Chunking Function
    • Understanding the Entire Chunking Process
    • Uploading the Training Data to AWS S3
    • Course Check-In
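The chunking, batching, and attention-mask lectures in this section can be illustrated in plain Python. This is a simplified sketch of the idea only, not the course's actual code — the real pipeline uses a HuggingFace tokenizer and datasets:

```python
from itertools import chain

def chunk_tokens(token_ids, chunk_size):
    """Flatten a batch of token-id lists with a chain iterator (star
    unpacking), then slice the stream into fixed-size chunks, dropping
    the final partial chunk -- a common causal-LM preprocessing step."""
    flat = list(chain(*token_ids))
    usable = (len(flat) // chunk_size) * chunk_size
    return [flat[i:i + chunk_size] for i in range(0, usable, chunk_size)]

def pad_with_mask(ids, max_len, pad_id=0):
    """Right-pad a sequence to max_len and build the matching attention
    mask: 1 marks a real token, 0 marks padding the model should ignore."""
    pad = max_len - len(ids)
    return ids + [pad_id] * pad, [1] * len(ids) + [0] * pad
```

For example, `chunk_tokens([[1, 2, 3], [4, 5, 6, 7]], 3)` yields two chunks of three ids, and `pad_with_mask([7, 8], 4)` returns the padded ids alongside the mask `[1, 1, 0, 0]`.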
  •   Understanding LoRA and Setting up HuggingFace Estimator
    • Setting Up Hyperparameters for the Training Job
    • Creating our HuggingFace Estimator in SageMaker
    • Introduction to Low-Rank Adaptation (LoRA)
    • LoRA Numerical Example
    • LoRA Summarization and Cost Saving Calculation
    • (Optional) Matrix Multiplication Refresher
    • Understanding LoRA Programmatically Part 1
    • Understanding LoRA Programmatically Part 2
    • Unlimited Updates
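The cost-saving argument behind LoRA, covered in the numerical-example and cost-calculation lectures above, comes down to a parameter count: a rank-r update ΔW = B·A needs far fewer trainable values than the full dense update. A quick sketch (the 4096 dimensions and rank 8 are illustrative values, not the course's):

```python
def lora_param_counts(d_in, d_out, r):
    """Trainable parameters in a full dense weight update versus a
    rank-r LoRA update: ΔW = B @ A with B of shape (d_out, r) and
    A of shape (r, d_in)."""
    full = d_in * d_out
    lora = r * (d_in + d_out)
    return full, lora

# A single 4096x4096 projection at rank 8:
full, lora = lora_param_counts(4096, 4096, 8)
ratio = lora / full  # rank-8 adapters train well under 1% of the full update
```

Here `full` is 16,777,216 parameters versus 65,536 for the adapters, about 0.4% — which is why LoRA fine-tuning fits on far cheaper GPUs.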
  •   Improving Training Speed with Bfloat16
    • Bfloat16 vs Float32
    • Comparing Bfloat16 vs Float32 Programmatically
    • Implement a New Life System
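The bfloat16-vs-float32 comparison can be sketched without any ML library: bfloat16 keeps float32's 8-bit exponent (same range) but only 7 mantissa bits (less precision), so a crude conversion is just dropping the low 16 bits. Real hardware rounds to nearest rather than truncating, so treat this as a simplified illustration:

```python
import struct

def to_bfloat16(x):
    """Approximate a float32 -> bfloat16 conversion by truncating the
    float32 bit pattern to its top 16 bits. The exponent survives intact;
    only mantissa precision is lost."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

to_bfloat16(1.0)      # exactly representable, unchanged
to_bfloat16(3.14159)  # mantissa precision lost: becomes 3.140625
```

The unchanged range is what makes bfloat16 attractive for training: gradients that would overflow float16 still fit, while the reduced precision halves memory traffic versus float32.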
  •   Setting up the QLoRA Training Script with Mixed Precision & Double Quantization
    • Setting up Imports and Libraries for the Train Script
    • Argument Parsing Function Part 1
    • Argument Parsing Function Part 2
    • Understanding Trainable Parameters Caveats
    • Introduction to Quantization
    • Identifying Trainable Layers for LoRA
    • Setting up Parameter-Efficient Fine-Tuning
    • Implement LoRA Configuration and Mixed Precision Training
    • Understanding Double Quantization
    • Creating the Training Function Part 1
    • Creating the Training Function Part 2
    • Exercise: Imposter Syndrome
    • Finishing our SageMaker Script
    • Gaining Access to Powerful GPUs with AWS Quotas
    • Final Fixes Before Training
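The quantization lectures in this section rest on one idea: map floats to small integers plus a per-block scale, trading precision for memory. A minimal symmetric absmax example (illustrative only — QLoRA itself uses 4-bit NF4 buckets via bitsandbytes, with the scales themselves double-quantized):

```python
def absmax_quantize(xs, bits=8):
    """Symmetric absmax quantization: choose one float scale so the
    largest magnitude maps to the edge of the signed-integer range,
    then store rounded integers plus that single scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(x) for x in xs) / qmax
    return [round(x / scale) for x in xs], scale

def dequantize(qs, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in qs]
```

Double quantization pushes this one level further: the per-block scales are themselves quantized, shaving a few more bits per parameter off the 4-bit base model that the LoRA adapters train on top of.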
  •   Running our Fine Tuning Script for our LLM
    • Starting our Training Job
    • Inspecting the Results of our Training Job and Monitoring with CloudWatch
  •   Deploying our Fine Tuned LLM
    • Deploying our LLM to a SageMaker Endpoint
    • Testing our LLM in SageMaker Locally
    • Creating the Lambda Function to Invoke our Endpoint
    • Creating an API Gateway to Deploy the Model Through the Internet
    • Implementing our Streamlit App
    • Streamlit App Correction
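The deployment flow in this section (SageMaker endpoint → Lambda → API Gateway → Streamlit) can be sketched as a minimal Lambda handler. The endpoint name and payload shape below are assumptions for illustration (the payload follows the HuggingFace text-generation containers; other serving stacks differ), and the `runtime` parameter is an injection hook for testing — a deployed Lambda would just call boto3:

```python
import json

def build_payload(prompt, max_new_tokens=256):
    """JSON body in the shape the HuggingFace inference containers expect."""
    return json.dumps({"inputs": prompt,
                       "parameters": {"max_new_tokens": max_new_tokens}})

def lambda_handler(event, context, runtime=None):
    """Forward the prompt from an API Gateway POST body to a SageMaker
    endpoint and return the generation to the caller."""
    if runtime is None:
        import boto3  # imported lazily so the module loads without AWS
        runtime = boto3.client("sagemaker-runtime")
    body = json.loads(event["body"])
    response = runtime.invoke_endpoint(
        EndpointName="my-finetuned-llm",  # hypothetical endpoint name
        ContentType="application/json",
        Body=build_payload(body["prompt"]),
    )
    return {"statusCode": 200, "body": response["Body"].read().decode()}
```

A Streamlit front end then only needs to POST `{"prompt": ...}` to the API Gateway URL; keeping the endpoint behind Lambda means the AWS credentials never reach the browser.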
  •   Cleaning up Resources
    • Congratulations and Cleaning up AWS Resources
  •   Where To Go From Here?
    • Thank You!
    • Review This Course!
    • Become An Alumni
    • Learning Guideline
    • ZTM Events Every Month
    • LinkedIn Endorsements

Taught by

Patrik Szepesi
