Overview

In this Google DeepMind course you will discover the mechanisms of the transformer architecture. You will investigate how transformer language models process prompts to make context-sensitive next-token predictions. Through practical activities you will explore the attention mechanism, visualize attention weights, and encounter advanced concepts like masked attention and multi-head attention. You will also learn the other techniques needed to build neural networks that work well as language models. Finally, through activities on values, stakeholder mapping, and community engagement, you will practice concrete tools for ensuring AI projects are developed with communities, not just for them.

Syllabus

Introduction

In this module, you will reflect on which tokens in a prompt have the biggest impact on the prediction of the next token. You will also visualize the attention weights of the Gemma model to see which tokens the model relies on when making predictions. Finally, you will explore how community values and perspectives shape the meaning and impact of AI technologies.

The attention mechanism

In this module, you will implement the attention mechanism. You will learn how this mechanism is used to combine the information from individual tokens to create embeddings that represent the information of an entire prompt. You will also reflect on how everyday human interactions create shared meaning and reinforce values, such as community, belonging, and respect. Further, you will consider what may be lost when these practices are replaced by automated systems.
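
To make the idea of combining information across tokens concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the course's own implementation. As a simplifying assumption, the token vectors are used directly as queries, keys, and values, whereas a real transformer first applies learned projections. The function names `softmax` and `attention` and the `causal` flag are illustrative choices. The `causal` flag shows one common form of masked attention, in which a token may only attend to itself and earlier tokens.

```python
import math

def softmax(scores):
    """Turn a list of scores into non-negative weights that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights). weights[i][j] says how much token i
    draws on token j when building its new representation.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        scores = []
        for j, key in enumerate(keys):
            if causal and j > i:
                # Masked attention: later tokens are hidden from token i.
                scores.append(float("-inf"))
            else:
                dot = sum(q * k for q, k in zip(query, key))
                scores.append(dot / math.sqrt(dim))
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        output = [
            sum(w * value[c] for w, value in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

# Toy self-attention over three 2-dimensional "token embeddings".
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens, causal=True)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row is the set of attention weights for one token. With the causal mask switched on, the first token can only attend to itself, so its row is 1.0 followed by zeros.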

Assembling a transformer

In this module, you will learn about the other components that are required for building a transformer model. You will investigate the importance of adding positional information to tokens and see which components make up a transformer block. You will also explore the roles that multi-layer perceptrons and normalization play in the transformer block. Finally, you will walk through a complete implementation of a transformer language model and investigate the parameters that are part of each component.
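
As an illustration of why positional information matters, the sketch below adds sinusoidal position vectors to token embeddings. This is only one possible scheme, taken from the original transformer design. The course may use a different one, such as learned or rotary position embeddings. The function names `sinusoidal_positions` and `add_positions` are illustrative.

```python
import math

def sinusoidal_positions(seq_len, dim):
    """Build one position vector per position using sines and cosines.

    Even indices hold a sine and odd indices a cosine, with each pair
    of indices sharing a wavelength that grows with the index.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def add_positions(embeddings):
    """Add the matching position vector to each token embedding."""
    table = sinusoidal_positions(len(embeddings), len(embeddings[0]))
    return [
        [e + p for e, p in zip(embedding, position)]
        for embedding, position in zip(embeddings, table)
    ]

# The same token embedding placed at two different positions.
same_token_twice = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
for vector in add_positions(same_token_twice):
    print([round(x, 3) for x in vector])
```

Without the added position vectors the two inputs would be identical, and attention alone could not tell which occurrence came first. After the addition, the two vectors differ.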

Reflection and practice

In this module, you will learn about the advantages and disadvantages of using a transformer model and discover sophisticated methods for generating text with language models. Additionally, you will consider how technologies like chatbots are understood differently by different groups, revealing why meaningful engagement is essential to avoid reinforcing stereotypes, deepening inequalities, or overlooking social values. You will see how, by recognizing diverse perspectives, developers can design AI that is more inclusive, fair, and responsive to community needs.
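
The module description does not name the generation methods it covers. As one hedged example of what such methods can look like, the sketch below combines temperature scaling with top-k sampling, two widely used techniques. The function name `sample_next_token`, its parameters, and the toy scores are all illustrative assumptions, not taken from the course.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token from a dict mapping token -> raw score (logit).

    temperature below 1 sharpens the distribution, above 1 flattens it.
    top_k, if given, keeps only the k highest-scoring tokens.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    ranked = sorted(logits.items(), key=lambda item: item[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    scaled = [score / temperature for _, score in ranked]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    tokens = [token for token, _ in ranked]
    return rng.choices(tokens, weights=probabilities, k=1)[0]

# Hypothetical next-token scores for a prompt such as "I fed my".
toy_logits = {"cat": 2.0, "dog": 1.0, "car": -1.0}
rng = random.Random(0)  # fixed seed so repeated runs give the same samples
print([sample_next_token(toy_logits, temperature=0.7, top_k=2, rng=rng)
       for _ in range(5)])
```

With `top_k=2` the lowest-scoring token can never be chosen, and with `top_k=1` the function reduces to greedy decoding, always returning the highest-scoring token.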

Challenge

In this module, you will complete a stakeholder mapping and social values activity to identify who is affected by your project, what values matter to them, and how their influence shapes outcomes. You will then design a mini-engagement: a plan of simple, practical ways to involve these groups so that their perspectives meaningfully shape your AI project.

Continue your journey

In this module, you will have the opportunity to consult additional resources and further reading to investigate the topics you have covered in more detail. Finally, you will consider your next steps and how you can build on what you have learned in the course.