Overview
Ready to explore the exciting world of generative AI and large language models (LLMs)? This IBM course, part of the Generative AI Engineering Essentials with LLMs Professional Certificate, gives you the practical skills to harness generative AI across industries.
Designed for data scientists, ML engineers, and AI enthusiasts, this course teaches you to differentiate between generative AI architectures and models, such as recurrent neural networks (RNNs), transformers, generative adversarial networks (GANs), variational autoencoders (VAEs), and diffusion models. You’ll also discover how LLMs, such as generative pretrained transformers (GPT) and bidirectional encoder representations from transformers (BERT), power real-world language tasks.
Get hands-on with tokenization techniques using NLTK, spaCy, and Hugging Face, and build efficient data pipelines with PyTorch data loaders to prepare models for training.
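To give a flavor of what tokenization involves, here is a minimal, library-free sketch of the core idea the course covers with NLTK, spaCy, and Hugging Face: splitting raw text into tokens, building a vocabulary, and mapping text to model-ready integer IDs. The tokenizer, special tokens, and sample corpus below are illustrative assumptions, not course materials.

```python
import re

def tokenize(text):
    # Simplified word-level tokenizer: lowercase, then split into
    # words and individual punctuation marks
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(corpus, specials=("<pad>", "<unk>")):
    # Assign an integer ID to each special token, then to each new
    # token in corpus order
    vocab = {tok: i for i, tok in enumerate(specials)}
    for sentence in corpus:
        for tok in tokenize(sentence):
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    # Map tokens to IDs, falling back to <unk> for out-of-vocabulary words
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in tokenize(text)]

corpus = ["Generative AI is transforming industries.", "LLMs power language tasks."]
vocab = build_vocab(corpus)
ids = encode("Generative models power new tasks.", vocab)
```

Library tokenizers such as BertTokenizer go further (subword splitting, attention masks), but the pipeline shape, text in and integer IDs out, is the same.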
A basic understanding of Python and PyTorch, along with familiarity with machine learning and neural networks, is helpful but not mandatory. Enroll today and get ready to launch your journey into generative AI!
Syllabus
- Generative AI Architecture
- In this module, you will learn about the significance of generative AI and how it is transforming various fields through content generation, code creation, and image synthesis. You will explore key generative AI architectures, such as generative adversarial networks (GANs), variational autoencoders (VAEs), diffusion models, and transformers, and understand the differences in their training approaches. You’ll also examine how large language models (LLMs) like generative pretrained transformers (GPT) and bidirectional encoder representations from transformers (BERT) are applied in building NLP-based applications. Finally, through a hands-on lab, you will create a simple chatbot using the Hugging Face transformers library and get introduced to essential tools and libraries used in generative AI development.
- Data Preparation for LLMs
- In this module, you will learn how to prepare data for training large language models (LLMs) by implementing tokenization and building data loaders. You will explore different tokenization methods and understand how tokenizers convert raw text into model-ready input. You will implement tokenization using libraries such as NLTK, spaCy, BertTokenizer, and XLNetTokenizer. Additionally, you will learn the role of data loaders in the training pipeline and use the DataLoader class in PyTorch to create a data loader with a custom collate function that processes batches of text. These practical skills are essential for building efficient NLP pipelines for LLM training. In addition, supporting materials, such as a cheat sheet and glossary, will reinforce your learning.
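The data-loader step described above can be sketched with PyTorch's `DataLoader` and a custom `collate_fn` that pads variable-length token sequences into uniform batches. The dataset class, sample token IDs, and padding value below are illustrative assumptions, not the course's lab code.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class TextDataset(Dataset):
    """Wraps pre-tokenized sequences (lists of token IDs) as tensors."""
    def __init__(self, sequences):
        self.sequences = sequences

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return torch.tensor(self.sequences[idx], dtype=torch.long)

def collate_batch(batch):
    # Custom collate function: record true lengths, then pad every
    # sequence to the longest one in this batch (padding ID 0 assumed)
    lengths = torch.tensor([len(seq) for seq in batch])
    padded = torch.nn.utils.rnn.pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths

# Toy token-ID sequences of different lengths
data = [[5, 3, 9], [7, 2], [4, 8, 6, 1]]
loader = DataLoader(TextDataset(data), batch_size=2, shuffle=False,
                    collate_fn=collate_batch)
batches = list(loader)
```

Without the custom `collate_fn`, the default collation would fail on unequal sequence lengths; padding per batch is what makes variable-length text batchable for training.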
Taught by
Joseph Santarcangelo and Roodra Pratap Kanwar