High-Scale Networking for ML Workloads With Cilium
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This conference talk explores how G-Research uses Cilium for networking in its machine learning environment, which spans over 10,000 nodes. Discover how Cilium serves as the core networking solution for on-premises, bare-metal clusters that scale up to 1,000 nodes each. Learn about critical Cilium features: network policies that enforce strict security controls to protect market-sensitive information, host firewall capabilities that eliminate the need for external firewall appliances, and the high-performance eBPF dataplane that directly improves ML job performance. The presentation also covers advanced topics such as limiting Cilium's identity labels to reduce policy map pressure, tuning conntrack garbage collection, and understanding the performance implications of different policies at scale. Gain practical knowledge about using Cilium's built-in tools to observe and measure large deployments, and learn what to watch for when managing large Kubernetes clusters.
Syllabus
High-Scale Networking for ML Workloads With Cilium - Luigi Zhou, G-Research
Taught by
CNCF [Cloud Native Computing Foundation]