High-Scale Networking for ML Workloads With Cilium

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

This conference talk explores how G-Research uses Cilium for networking in a machine learning environment spanning more than 10,000 nodes. Cilium serves as the core networking solution for their on-premises, bare-metal clusters, each of which scales up to 1,000 nodes. Learn about the Cilium features they rely on: network policies that enforce strict security controls to protect market-sensitive information, a host firewall that removes the need for external firewall appliances, and a high-performance eBPF dataplane that directly improves ML job performance. The presentation also covers advanced topics such as limiting the labels Cilium uses for identities to reduce policy map pressure, tuning conntrack garbage collection, and understanding the performance implications of different policies at scale. Gain practical knowledge of Cilium's built-in tools for observing and measuring large deployments, and learn what to watch for when managing large Kubernetes clusters.

Syllabus

High-Scale Networking for ML Workloads With Cilium - Luigi Zhou, G-Research

Taught by

CNCF [Cloud Native Computing Foundation]
