Segmentation Indexing and Retrieval of Multilingual Sources
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Learn techniques for processing multilingual text in this lecture on segmentation, indexing, and retrieval methods for sources in multiple languages. Discover how to break multilingual documents into meaningful segments, build indexes that handle diverse linguistic structures, and implement retrieval algorithms that work across language boundaries. Examine the challenges that different writing systems, grammatical structures, and linguistic features pose when building cross-lingual information systems. Explore practical approaches to handling code-switching, mixed-language documents, and varying text directionality in multilingual corpora. Gain insight into evaluation metrics and methodologies for assessing multilingual segmentation and retrieval systems. Understand the computational and linguistic considerations involved in scaling these techniques to large, diverse multilingual datasets. Learn about current research directions in cross-lingual natural language processing and information retrieval.
Syllabus
David Palmer: Segmentation Indexing and Retrieval of Multilingual Sources
Taught by
Center for Language & Speech Processing (CLSP), JHU