Overview
Explore an innovative approach to storing and managing tensor data in Delta Lake through this 15-minute conference talk. Learn about Delta Tensor, a method that streamlines data loading processes and improves storage efficiency for machine learning workflows. Discover how chunking techniques reduce IO costs for tensor slicing and how sparse encoding methods enhance storage efficiency for sparse tensors. Gain insights into creating an efficient storage and management solution within a cloud-native Lakehouse environment. Presented by Zhiyu Wu, a student from Northeastern University, this talk offers valuable knowledge for data engineers and machine learning practitioners working with tensor data in cloud-based systems.
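The two ideas named in the description, chunking to cut IO for slicing and sparse encoding for sparse tensors, can be illustrated with a small NumPy sketch. This is a generic illustration and not the Delta Tensor implementation: the function names (chunk_tensor, chunks_for_slice, to_coo, from_coo), the dictionary-of-chunks layout, and the choice of coordinate (COO) encoding are all assumptions made for the example, and no Delta Lake API is used.

```python
from itertools import product

import numpy as np


def chunk_tensor(tensor, chunk_shape):
    """Split a tensor into fixed-size chunks keyed by chunk-grid index.

    Hypothetical layout: each dictionary entry stands in for one stored row.
    """
    grid = [range(0, dim, step) for dim, step in zip(tensor.shape, chunk_shape)]
    chunks = {}
    for offsets in product(*grid):
        key = tuple(o // c for o, c in zip(offsets, chunk_shape))
        window = tuple(slice(o, o + c) for o, c in zip(offsets, chunk_shape))
        chunks[key] = tensor[window]
    return chunks


def chunks_for_slice(bounds, chunk_shape):
    """Return the chunk keys overlapped by a slice request.

    bounds holds one (start, stop) pair per dimension, stop exclusive.
    Only these chunks would need to be read to serve the slice.
    """
    ranges = [
        range(lo // c, (hi - 1) // c + 1)
        for (lo, hi), c in zip(bounds, chunk_shape)
    ]
    return list(product(*ranges))


def to_coo(tensor):
    """Encode a tensor as coordinates plus values of its nonzero entries."""
    coords = np.argwhere(tensor != 0)      # shape: (nonzero count, ndim)
    values = tensor[tuple(coords.T)]
    return coords, values, tensor.shape


def from_coo(coords, values, shape):
    """Rebuild the dense tensor from its COO encoding."""
    dense = np.zeros(shape, dtype=values.dtype)
    dense[tuple(coords.T)] = values
    return dense


if __name__ == "__main__":
    t = np.arange(24).reshape(4, 6)
    chunks = chunk_tensor(t, (2, 3))
    # A slice covering rows 0-1 and columns 0-2 touches a single chunk.
    print(len(chunks), chunks_for_slice(((0, 2), (0, 3)), (2, 3)))

    s = np.zeros((3, 3))
    s[0, 1], s[2, 2] = 5.0, 7.0
    coords, values, shape = to_coo(s)
    print(len(values), np.array_equal(from_coo(coords, values, shape), s))
```

With chunking, a slice request reads only the chunks it overlaps instead of the whole tensor. With COO encoding, storage grows with the number of nonzero entries instead of the full tensor size.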
Syllabus
Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake
Taught by
Databricks