Overview
This conference talk by Philippe Ombredanne of AboutCode covers the challenges and lessons of running code scanning operations at massive scale. AboutCode is performing comprehensive license scans across the entire Software Heritage archive, analyzing over 20 billion unique source code files from more than 300 million projects, and is also indexing all major package registries and Linux distributions. The talk explains why accurate detection of software origin and license is critical for compliance, given code-generating LLMs and evolving governmental regulations such as the European Cyber Resilience Act (CRA). The scanning effort creates a commons reference database of open data about code, which enables faster scanning and matching with precise license information and extensive fingerprint collections for approximate code matching at scale. The talk also covers the technical caveats and practical lessons learned from code analysis at this scale, and shows how to use this open, accurate reference data to automate license and security compliance more efficiently.
Syllabus
Lessons Learned From Code Scanning at Scale - Philippe Ombredanne, AboutCode
Taught by
OpenSSF