AI System Validation - Meta Perspective: The Challenges of Debugging at Scale
Open Compute Project via YouTube
Overview
Learn about the challenges and approaches to debugging at scale in this 17-minute talk by Carlos Fernandez, an AI System Validation Engineer at Meta. Explore the complexities of debugging highly interconnected systems, managing the risks of constant updates in a fast-paced development environment, and addressing diverse use cases across AI, Storage, and Compute. Discover Meta's effective strategies including real-time fleet monitoring with alert-triggered repairs, automated testing frameworks like formal Firmware Qualification to catch issues early, and cross-functional collaborative debugging that leverages diverse expertise to solve complex problems.
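The "real-time fleet monitoring with alert-triggered repairs" strategy mentioned above can be sketched as a simple monitoring loop. Everything here (host model, threshold, function names) is an illustrative assumption for exposition, not a description of Meta's actual tooling:

```python
from dataclasses import dataclass

@dataclass
class Host:
    """Minimal model of one machine in the fleet (illustrative only)."""
    host_id: str
    error_count: int = 0

ERROR_THRESHOLD = 3  # assumed alert threshold, not a real Meta value

def ingest_telemetry(host: Host, errors: int) -> None:
    """Accumulate error telemetry reported by a host."""
    host.error_count += errors

def check_alerts(fleet: list[Host]) -> list[Host]:
    """Return hosts whose error counts crossed the alert threshold."""
    return [h for h in fleet if h.error_count >= ERROR_THRESHOLD]

def trigger_repair(host: Host) -> None:
    """Simulate an automated, alert-triggered repair by resetting state."""
    host.error_count = 0

# One pass of the monitor: ingest telemetry, raise alerts, repair.
fleet = [Host("ai-node-1"), Host("storage-node-2"), Host("compute-node-3")]
ingest_telemetry(fleet[0], 2)
ingest_telemetry(fleet[1], 4)  # crosses the threshold, so an alert fires

for host in check_alerts(fleet):
    trigger_repair(host)

print([h.error_count for h in fleet])  # -> [2, 0, 0]
```

In a real fleet the telemetry ingestion and repair steps would be continuous and asynchronous; the single pass here just shows the alert-then-repair flow the talk describes.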
Syllabus
AI System Validation - Meta Perspective
Taught by
Open Compute Project