From Reliable Models to Resilient ML Platforms

Conf42 via YouTube

Overview

Learn how to transition machine learning models from development environments to production-ready, resilient platforms in this 25-minute conference talk. Explore the fundamental challenges of production ML including model drift, scaling issues, latency requirements, and availability concerns. Discover the advantages of modern cloud-native platforms over legacy systems, with IBM Cloud/SoftLayer serving as a practical infrastructure example. Master the essential pillars of resilient ML infrastructure including high availability and disaster recovery strategies. Implement security-by-design principles incorporating zero trust architecture and protection against DDoS attacks and ransomware threats. Understand how to sustain ML workloads through proper rate limiting, traffic spike management, and DDoS readiness protocols. Examine critical operational aspects including environment segmentation, isolation techniques, and secure model serving practices. Align frameworks with operational controls covering identity and access management, audit logging, and container image scanning. Establish performance metrics and resiliency benchmarking using service level objectives and agreements. Navigate the people and process considerations for cross-functional ownership in production ML environments. Compare deployment patterns across cloud-native, hybrid, and multi-cloud architectures. Gain practical design principles and key takeaways for building robust ML platforms that can reliably serve models at scale.

Syllabus

Welcome & Speaker Introduction: Riva at Conf42 2026
Talk Overview: Moving ML from Lab to Production + Agenda
Why Production ML Is Hard: Drift, Scale, Latency & Availability
Modern Platforms vs Legacy: Cloud-Native Capabilities
IBM Cloud/SoftLayer as an Example Infrastructure Foundation
Pillars of Resilient ML Infrastructure: HA & Disaster Recovery
Security by Design: Zero Trust, DDoS/Ransomware Protection
Sustaining ML Workloads: Rate Limits, Traffic Spikes & DDoS Readiness
Segmentation, Environment Isolation & Secure Model Serving
Framework Alignment & Operational Controls: IAM, Audit Logs, Image Scanning
Performance Metrics & Resiliency Benchmarking: SLOs/SLAs
People & Process: Cross-Functional Ownership for Production ML
Deployment Patterns: Cloud-Native vs Hybrid vs Multi-Cloud
Design Principles & Key Takeaways + Closing/Q&A

Taught by

Conf42
