Overview
In this 25-minute conference talk, learn how to deploy generative AI models directly on resource-constrained devices to ensure autonomy, security, and real-time performance. Explore systematic approaches for implementing Large Language Models (LLMs) and Transformers on autonomous vehicles, drones, and IoT devices without relying on cloud infrastructure. Examine state-of-the-art systems-on-chip (SoCs) from leading manufacturers and understand their capabilities and limitations for AI workloads. Discover essential model compression techniques, including quantization, pruning, and knowledge distillation, with practical insights from a real-world case study showing how Small Language Models (SLMs) like Meta's Llama 3.2 can run efficiently on Qualcomm Snapdragon SoCs. Learn the engineering techniques needed to evaluate hardware accelerators, apply compression methods without sacrificing model capabilities, balance model size with efficacy, and leverage emerging SLM trends to future-proof applications. The talk is presented by Jonna Matthiesen, a deep learning researcher at Embedl specializing in AI optimization for defense, automotive, and IoT applications, and was recorded at the 2025 GAIA Conference in Gothenburg, Sweden.
Syllabus
Deploying GenAI: Overcoming Challenges in Performance, Security, and Efficiency by Jonna Matthiesen
Taught by
GAIA