Overview
Learn about Jaccard distance, k-gram modeling, and minHashing in this data science lecture. Explore the mathematical foundations of Jaccard similarity and distance, which compare two sets by the size of their intersection relative to the size of their union. Understand how k-gram models turn documents into sets for text analysis and natural language processing. Discover minHashing as an efficient technique for approximating Jaccard similarity on large-scale data. See how these methods apply to document similarity, recommendation systems, and clustering, and how similarity computations can be made scalable for big data environments.
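The three ideas named above can be sketched in a few lines of Python. This is a minimal illustration, not material taken from the lecture: the function names, the choice of character-level k-grams with k = 3, the CRC32-plus-linear-hash family, and the convention that two empty sets have similarity 1 are all assumptions made for the example.

```python
import random
import zlib


def char_kgrams(text, k=3):
    """Return the set of character k-grams (length-k substrings) of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard_similarity(a, b):
    """Exact Jaccard similarity |A & B| / |A | B| of two sets."""
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Jaccard distance is one minus the Jaccard similarity."""
    return 1.0 - jaccard_similarity(a, b)


def minhash_signature(items, num_hashes=200, seed=0):
    """Return a list of num_hashes minimum hash values for a set of strings.

    Hypothetical hash family: h(x) = (a * crc32(x) + b) mod p, with (a, b)
    drawn from a seeded generator so every set is hashed the same way.
    """
    if not items:
        raise ValueError("minhash_signature needs a non-empty set")
    prime = (1 << 61) - 1
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, prime), rng.randrange(prime))
              for _ in range(num_hashes)]
    # crc32 gives a stable integer id per string (Python's hash() is salted).
    ids = [zlib.crc32(item.encode("utf-8")) for item in items]
    return [min((a * x + b) % prime for x in ids) for a, b in coeffs]


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


doc_a = char_kgrams("the quick brown fox jumps over the lazy dog")
doc_b = char_kgrams("the quick brown fox leaps over the lazy dog")
exact = jaccard_similarity(doc_a, doc_b)
approx = estimate_similarity(minhash_signature(doc_a), minhash_signature(doc_b))
print(f"exact Jaccard similarity: {exact:.3f}")
print(f"minHash estimate:         {approx:.3f}")
```

Both signatures must be built with the same seed and number of hash functions, otherwise their positions are not comparable. Using more hash functions tightens the estimate at the cost of a longer signature.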
Syllabus
L6-Jaccard
Taught by
UofU Data Science