Overview
Learn about the challenges and insights from conducting massive-scale code scanning operations in this conference talk by Philippe Ombredanne from AboutCode. Discover how AboutCode is performing comprehensive license scans across the entire Software Heritage archive, analyzing over 20 billion unique source code files from more than 300 million projects, plus indexing all major package registries and Linux distributions. Explore the critical importance of accurate software origin and license detection for compliance in today's landscape of code-generating LLMs and evolving governmental regulations like the European CRA. Understand how this massive scanning effort creates a commons reference database of open data about code, enabling faster scanning and matching processes with precise license information and extensive fingerprint collections for approximate code matching at scale. Gain insights into the technical caveats and practical lessons learned from code analysis at unprecedented scale, and learn how to leverage this open, accurate reference data to implement best practices for more efficient license and security compliance automation.
Syllabus
Lessons Learned From Code Scanning at Scale - Philippe Ombredanne, AboutCode
Taught by
OpenSSF