SDG Hub - An Open-Source Toolkit for Synthetic Data Generation and LLM Customization

DevConf via YouTube

Overview

Explore SDG Hub, an open-source toolkit developed at Red Hat for customizing language models through synthetic data generation, in this 36-minute conference talk. Learn what synthetic data means in the context of large language models and how it enables model customization for specific use cases. Examine SDG Hub's core architectural components (prompts, blocks, and flows) and see how to compose, extend, or modify pipelines to fit a particular task. Review strategies for selecting a teacher model suited to the use case, such as reasoning or translation. Follow two real-world examples: building a document-grounded skill with pre-built pipelines, and customizing a reasoning model by creating new blocks, prompts, and flows while integrating custom teacher models. Watch a live demonstration of the new SDG Hub graphical user interface, which lets non-experts visually build and manage their own synthetic data pipelines.
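To make the prompt, block, and flow vocabulary concrete, here is a minimal conceptual sketch in plain Python. It is not SDG Hub's actual API: the `Prompt`, `Block`, and `Flow` classes, the `echo_teacher` stand-in, and every field name are hypothetical, chosen only to show how a prompt template, a teacher-model call, and a chain of blocks could fit together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Sample = Dict[str, str]
# Hypothetical teacher interface: prompt text in, generated text out.
Teacher = Callable[[str], str]


@dataclass
class Prompt:
    """A text template filled in from the fields of one sample."""
    template: str

    def render(self, sample: Sample) -> str:
        return self.template.format(**sample)


@dataclass
class Block:
    """One pipeline step: render a prompt, call the teacher, store the output."""
    name: str
    prompt: Prompt
    teacher: Teacher
    output_key: str

    def run(self, samples: List[Sample]) -> List[Sample]:
        return [
            {**s, self.output_key: self.teacher(self.prompt.render(s))}
            for s in samples
        ]


@dataclass
class Flow:
    """An ordered list of blocks; each block sees the previous block's output."""
    blocks: List[Block]

    def run(self, samples: List[Sample]) -> List[Sample]:
        for block in self.blocks:
            samples = block.run(samples)
        return samples


def echo_teacher(prompt: str) -> str:
    # Stand-in for a real teacher model call (assumption for illustration).
    return f"[generated from: {prompt}]"


flow = Flow(blocks=[
    Block("question", Prompt("Write a question about: {document}"),
          echo_teacher, "question"),
    Block("answer", Prompt("Answer '{question}' using: {document}"),
          echo_teacher, "answer"),
])

result = flow.run([{"document": "SDG Hub overview"}])
```

In this sketch, swapping the teacher model or adding a step means passing a different callable or appending another `Block`, which is the kind of composition the talk describes at a high level.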

Syllabus

SDG_Hub: An Open-Source Toolkit for Synthetic Data Generation & LLM Customization - DevConf.US 2025

Taught by

DevConf
