Overview
Syllabus
0:00 - Importance of Data in AI Applications
2:15 - Challenges in Handling Unstructured Data
3:47 - Data Cleaning and Sanitization
4:08 - Utilizing Delta Tables and Vector DBs
5:00 - Data Command Graph and Knowledge Graph
5:28 - Entitlement and Permission Management
6:02 - Integration with Unity Catalog
6:48 - Data Syncing & Integration Best Practices
13:25 - Live Demo: Secure Data Ingestion with Gen Core APIs
13:47 - Connecting On-Prem Data Systems (SMB)
14:14 - Configuring Data Loader and Sanitization Nodes
14:37 - Visualizing Data Flow and Provenance
15:02 - Previewing Original and Sanitized Files
15:26 - Viewing Files in Databricks
18:40 - Redaction and Anonymization Processes
19:25 - Encryption and Data Security
20:04 - Monitoring and Compliance Checks
21:50 - Utilizing MLflow for Model Performance Monitoring
22:38 - Syncing Data with Other AI Platforms (e.g., Vertex AI, Azure)
18:21 - Q&A Session
19:54 - Role of Unity Catalog in AI Governance
21:37 - Integration with External Rule Stores for Redaction
24:26 - Partnership with Databricks
Taught by
Data Science Dojo