Building LLM-Based Apps with Llama 3 at 1,000 Tokens per Second on the SambaNova AI Platform
AI Engineer via YouTube
Overview
Learn to build high-performance LLM applications using Llama 3 at 1,000 tokens per second on SambaNova's AI platform in this intermediate-level workshop. Discover SambaNova's full-stack generative AI platform, powered by the SN40L AI chip, and explore Samba-1, a trillion-parameter Composition of Experts (CoE) model designed for enterprise applications. Build and deploy an end-to-end question-answering system with retrieval-augmented generation (RAG) for enterprise search, using a technology stack that includes the LangChain framework, Unstructured for text preprocessing, E5-large-v2 embeddings, the ChromaDB vector store, and the Llama-3-8B-Instruct model. Gain hands-on experience through practical exercises with provided GitHub repositories, step-by-step Jupyter notebooks, and Streamlit applications, using SambaNova API keys for both the CoE and Llama 3 endpoints. Designed for tech professionals and engineers interested in enterprise generative AI, the workshop requires programming experience (preferably Python), a GitHub account, and a laptop for the practical components.
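To make the RAG flow concrete before the workshop, here is a minimal pure-Python sketch of the retrieve-then-generate pattern. It uses a toy bag-of-words "embedding" and cosine similarity purely for illustration; the workshop itself uses E5-large-v2 embeddings, the ChromaDB vector store, and Llama-3-8B-Instruct for generation, and all document texts below are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; the workshop uses E5-large-v2 instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k;
    # a vector store like ChromaDB performs this step at scale.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str], k: int = 1) -> str:
    # The retrieved context is prepended to the question before the
    # prompt is sent to the LLM (Llama-3-8B-Instruct in the workshop).
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "The SN40L is SambaNova's AI chip.",       # hypothetical snippets
    "Paris is the capital of France.",
]
print(build_prompt("What chip powers the SambaNova platform?", docs))
```

The same structure carries over to the workshop stack: swap `embed` for a real embedding model, `retrieve` for a ChromaDB similarity search, and send `build_prompt`'s output to the Llama 3 endpoint.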
Syllabus
[Full Workshop] Llama 3 at 1,000 tok/s on the SambaNova AI Platform
Taught by
AI Engineer