
Google

How to Use TPUs for Inference

Google via Google Skills

Overview

This course is for developers who want to learn how to use TPUs for inference, from architecture to deployment, including how to solve common implementation challenges.

Syllabus

  • An introduction to TPUs
    • Introduction: How to use TPUs for inference
    • What are Cloud TPUs?
    • Google Cloud TPUs architecture
    • Reviewing the core TPU concepts
    • Consumption options
    • Quiz
  • Big decisions, deployment, and performance monitoring
    • Addressing common challenges
    • Use vLLM to increase throughput and reduce latency when serving large AI models
    • vLLM for LLM inference
    • vLLM demo
    • GKE scaling
    • Using profiling to understand your model's inference performance
    • Join our community
    • Quiz
  • Appendix
    • Reading List
  • Your Next Steps
    • Claim credential

