LLM Efficient Inference in CPUs and Intel GPUs - Intel Neural Speed
The Machine Learning Engineer via YouTube
Overview
Explore efficient inference techniques for Large Language Models (LLMs) on CPUs and Intel GPUs using Intel Neural Speed in this 30-minute video. Examine the performance capabilities of Intel Extension for Transformers and gain practical insights through the provided Jupyter notebooks. Learn how to optimize LLM inference for data science and machine learning applications by leveraging Intel's hardware-specific solutions. Accompanying resources, including a Medium article and GitHub repositories, are available to deepen your understanding and help you implement the techniques discussed.
Syllabus
LLM Efficient Inference In CPUs and Intel GPUs. Intel Neural Speed #datascience #machinelearning
Taught by
The Machine Learning Engineer