The Dark Side of AI: Jailbreaking, Injections, Hallucinations & more

via Zero To Mastery

Overview

Step over to the dark side and learn about the vulnerabilities, exploits, and unintended behaviors that affect AI models such as LLMs, through hands-on prompting and exercises.
  • What jailbreaking a model involves and how to do it yourself
  • The vulnerabilities inherent to models, including prompt and data leakage
  • The risks of exposing LLMs to proprietary or sensitive data
  • The toxicity and bias built into different models
  • Real-world tests using ChatGPT, DeepSeek, and other models
  • An experiment in steering an LLM's neurons to prevent hallucinations

Syllabus

  •   Introduction
    • Welcome to The Dark Side (Intro to Guardrails and Jailbreaking)
    • Exercise: Meet Your Classmates and Instructor
    • Course Resources
  •   The Dark Side of AI
    • Jailbreak! (The DAN Prompt)
    • Exercise: Create Your Own Jailbreak
    • Many Shot Jailbreaking
    • Prompt Injections - Part 1
    • Prompt Injections - Part 2
    • Thinking Like LLMs - Multi-Modal Injection
    • Leaking - Part 1 (Prompt Leaking)
    • Leaking - Part 2 (Data Leaking)
    • Exposure
    • Poisoning
    • Toxicity
    • Hallucinations
    • Thinking Like LLMs - Big vs Small
    • Challenge: Conduct Your Own Mechanistic Interpretability Research on Hallucinations
    • Challenge Instructions
    • Leaderboard: Mechanistic Interpretability
    • The Model Card
    • Model Cards Deep Dive
    • Exercise: Explore the Model Card for GPT-o3-mini and Learn Something New!
  •   Where To Go From Here?
    • Let's Keep Learning Together!
    • Review This Byte!

Taught by

Scott Kerr
