Overview
Explore the cutting-edge integration of multi-modal affective computing into AI agents in this 21-minute conference talk, which examines how next-generation AI systems can sense and respond to human emotions in real-time production environments. Learn how signals from voice, text, facial expressions, and interaction patterns combine to detect emotional states, drawing on the speaker's research at the MIT Media Lab and Harvard as well as startup experience building emotion-aware AI tutors. Discover the technical challenges of moving affective sensing from the laboratory into production, including architectures that merge ensemble LLMs with sensor inputs, strategies for resolving conflicting modalities, and the privacy guardrails essential for sensitive applications like education. Understand multi-agent orchestration patterns, such as critic-rewriter loops and role-based ensembles, that enable personalized instruction and equitable feedback across diverse learners; a minimal sketch of the critic-rewriter pattern follows below. Gain practical knowledge of the architectures, common pitfalls, and key metrics needed to deploy multi-modal, affect-sensing agents in durable production systems, and consider broader applications beyond education where AI agents must demonstrate emotional intelligence and empathy in human-centered interactions.
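To make the critic-rewriter pattern concrete, here is a minimal Python sketch of the loop structure. Every name in it (EmotionEstimate, draft_reply, critique, rewrite, critic_rewriter_loop) is a hypothetical stand-in for LLM calls and fused sensor output, not the speaker's actual implementation; it only illustrates the draft, critique, rewrite cycle the talk describes.

from dataclasses import dataclass

@dataclass
class EmotionEstimate:
    # Output of a (hypothetical) fused multi-modal classifier over voice,
    # text, facial, and interaction signals.
    label: str         # e.g. "frustrated", "engaged"
    confidence: float  # 0.0 to 1.0

def draft_reply(question: str) -> str:
    # Stand-in for the "writer" agent: a first-pass tutoring reply that
    # ignores the learner's affect entirely.
    return f"Direct answer to: {question}"

def critique(reply: str, emotion: EmotionEstimate) -> list[str]:
    # Stand-in for the "critic" agent: flag mismatches between the reply's
    # tone and the learner's sensed emotional state.
    issues = []
    if emotion.label == "frustrated" and not reply.startswith("[encouraging]"):
        issues.append("tone ignores learner frustration")
    return issues

def rewrite(reply: str, issues: list[str]) -> str:
    # Stand-in for the rewriter pass: revise the draft to address the
    # critic's feedback (here, just adjusting the tone marker).
    return "[encouraging] " + reply

def critic_rewriter_loop(question: str, emotion: EmotionEstimate,
                         max_rounds: int = 3) -> str:
    # Iterate draft -> critique -> rewrite until the critic approves or the
    # round budget runs out, so a disagreeable critic cannot loop forever.
    reply = draft_reply(question)
    for _ in range(max_rounds):
        issues = critique(reply, emotion)
        if not issues:
            break
        reply = rewrite(reply, issues)
    return reply

sensed = EmotionEstimate(label="frustrated", confidence=0.8)
print(critic_rewriter_loop("Why does my loop never terminate?", sensed))

In this toy run the critic flags the affect-blind first draft, the rewriter adjusts its tone, and the second critique passes; a production version would replace each stub with a model call and log the per-round critiques as an audit trail.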
Syllabus
When Agents Learn to Feel: Multi-Modal Affective Computing in Production // Chenyu Zhang
Taught by
MLOps.community