Overview

In this Google DeepMind course, you will learn the fundamentals of language models and gain a high-level understanding of the machine learning development pipeline. You will consider the strengths and limitations of traditional n-gram models and advanced transformer models. Practical coding labs will help you develop insights into how machine learning models work and how they can be used to generate text and identify patterns in language. Through real-world case studies, you will build an understanding of how research engineers work. Drawing on these insights, you will identify problems that you wish to tackle in your own community and consider how to use machine learning responsibly to address them within a global and local context.

Syllabus

Introduction to the language modeling problem

In this module, you will explore the power of language models and their real-world applications. Starting with a manual method for modeling language, you will investigate the role that probabilities and randomness play in next-word prediction. You will also consider the course learning objectives and how to study most effectively.
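
As a rough illustration of how probabilities and randomness combine in next-word prediction, the sketch below samples a next word from a small probability table. The table, the words in it, and the function name sample_next_word are hypothetical and chosen only for illustration; they are not taken from the course labs.

```python
import random

# Hypothetical probabilities for the word that follows "the cat sat on the".
# These numbers are made up for illustration.
next_word_probs = {"mat": 0.6, "sofa": 0.3, "moon": 0.1}

def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [sample_next_word(next_word_probs, rng) for _ in range(1000)]

# Count how often each word was drawn; likelier words should appear more often.
counts = {w: samples.count(w) for w in next_word_probs}
print(counts)
```

Because the choice is random, an unlikely word such as "moon" can still be drawn occasionally, which is one reason generated text varies from run to run.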

From n-grams to transformers

In this module, you will move beyond the manual method and explore how n-grams can be used to tokenize data. You will investigate how probabilities can be calculated to begin identifying language patterns. You will then build your own n-gram model using a small dataset and examine its limitations. Furthermore, you will consider the process researchers undertake when approaching real-world problems through the lens of Google DeepMind’s AlphaFold project. Finally, you will reflect on your own values and those of your community, as well as the role AI systems play in making decisions that involve ethical choices.
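
The idea of calculating probabilities from n-gram counts can be sketched with a bigram (n = 2) model. This is a minimal example under stated assumptions: the toy corpus, the whitespace tokenization, and the helper name next_word_probs are hypothetical and are not the course's own lab code.

```python
from collections import Counter, defaultdict

# A tiny, made-up corpus; splitting on spaces is a simplifying assumption.
corpus = "the cat sat on the mat . the cat ate the fish ."
tokens = corpus.split()

# Count how often each word follows each preceding word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate P(next word | prev) from relative bigram frequencies."""
    counts = bigram_counts.get(prev, Counter())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_probs("the"))
print(next_word_probs("dog"))  # "dog" never appears in the corpus
```

The empty result for a word that never appears in the corpus hints at one limitation of n-gram models: they cannot predict from contexts they have not seen.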

Transformer models

In this module, you will experiment with more sophisticated transformer models and evaluate how they perform in comparison to n-gram models. You will take a deeper dive into the anatomy of language models and their core components. You will continue reflecting on the role that values play in guiding which technical problems you choose to solve. Specifically, you will consider the Ubuntu moral system and compare its characteristics with moral values prevalent in Europe and North America. Finally, you will design a values framework for guiding LLM development in your local community.

Training a model

In this module, you will contextualise the process of building language models within the machine learning development pipeline. You will preprocess your dataset and learn how to prepare it for training a transformer model. You will then train your own language model and evaluate its performance.
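
One common way to prepare text for training a language model is to map tokens to integer IDs and slice them into input and target windows, where each target is the input shifted by one position. The sketch below assumes word-level tokens and a context length of 3 purely for illustration; the course labs may use a different tokenizer, library, and window size.

```python
# A tiny, made-up dataset; word-level tokenization is a simplifying assumption.
text = "the cat sat on the mat"
tokens = text.split()

# Build a vocabulary that maps each distinct word to an integer ID.
vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)))}
ids = [vocab[word] for word in tokens]

# Slice the ID sequence into (inputs, targets) pairs.
# Each target sequence is the input sequence shifted one token to the right,
# so the model learns to predict the next token at every position.
context_length = 3  # hypothetical value chosen for this example
examples = []
for start in range(len(ids) - context_length):
    inputs = ids[start : start + context_length]
    targets = ids[start + 1 : start + context_length + 1]
    examples.append((inputs, targets))

for inputs, targets in examples:
    print(inputs, "->", targets)
```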

Challenge

In this module, you will consider the specific benefits that transformer LLMs can bring about for different sectors in your local context. You will then explore what makes a good problem statement before developing your own problem statement for a challenge around language models that you have identified in your community.

Continue your journey

In this module, you will have the opportunity to consult additional resources and further reading to investigate the topics you have covered in more detail. Finally, you will consider your next steps and how you can build on what you have learned in the course.