Overview
This 33-minute conference talk from the Linux Foundation's Open Source Summit covers techniques for extracting insights from complex PDF documents that contain text, tables, and images. It presents two approaches to multimodal PDF content: building specialized pipelines that combine OCR and machine learning models to process each data type, and using Vision-Language Models such as ColPali to represent multimodal information in a unified format. Both approaches are implemented with OpenSearch's search and ingest pipelines to build conversational search applications on open-source technology. A live demonstration shows practical implementations and helps you decide which approach best fits your requirements for processing unstructured PDF documents.
Syllabus
Unlocking Insights From Multimodal PDFs Using OpenSearch and V... - Mingshi Liu & Praveen Mohan Prasad
Taught by
Linux Foundation