YouTube videos curated by Class Central.
Classroom Contents
Advanced Data Prep and Visualization Techniques for Fine-tuning LLMs
- 1 0:00 Advanced Data Preparation Techniques
- 2 0:33 Video Overview
- 3 1:52 Synthetic Dataset Generation Goals
- 4 3:48 Synthetic Data Generation Pipeline
- 5 5:34 Document Ingestion Approaches, e.g. PDF to Markdown: comparing markitdown, marker, and Gemini
- 6 13:44 Chunking Approaches and Trade-offs
- 7 22:45 Question-Answer Pair Generation Approaches
- 8 31:56 Q-A pair visualization with embeddings or tags, and how to choose a model for synthetic data generation
- 9 44:29 How to Create an Evaluation Dataset: Best Practice
- 10 54:41 Preview of the upcoming fine-tuning video