Explore a practical workshop that demonstrates the complete journey from Python machine learning models to custom silicon implementations for edge AI applications. Learn how to optimize inference performance through high-level synthesis (HLS) techniques while balancing power consumption, latency, and accuracy requirements.

Discover the compute landscape spanning CPUs, GPUs, TPUs/NPUs, and custom FPGA/ASIC designs, and understand when each architecture provides the best power-performance-area trade-off. Master quantization techniques to optimize bit-widths, apply loop pipelining and unrolling strategies to maximize throughput, and implement memory partitioning with streaming between layers to eliminate costly data transfers. Gain hands-on experience with Siemens EDA tools, including Catapult HLS, Questa, and PowerPro, to create feedback loops for latency, area, and power optimization.

Participate in the Efficient Inferencing Hackathon using a ready-to-run RISC-V Rocket Core baseline for MNIST classification, complete with virtual machine access, C kernels for convolution and dense layers, and a structured pathway from Keras models to synthesizable RTL. Work toward delivering the fastest MNIST implementation that meets strict accuracy, area, and energy targets while receiving guidance from the HLS Academy community of experts and peers.