Bypassing LLM Guardrails - Anti-Spotlighting and Best of N Attacks

Donato Capitella via YouTube

Classroom Contents

  1. 00:00 - Introduction
  2. 02:12 - Get LLM Webmail Up and Running
  3. 03:42 - Initialize Spikee's Workspace
  4. 05:07 - Baseline Spikee's Prompt Injection Test
  5. 10:07 - Enable Guardrails: System Message + Spotlighting
  6. 15:25 - Spikee's Anti-Spotlighting Attack
  7. 28:17 - Prompt Injection Filters: Azure Prompt Shields / Meta Prompt Guard
  8. 35:32 - "Best-of-N" Attack to Bypass Prompt Filtering
  9. 42:32 - Summary of Results
