Overview
This 31-minute conference talk from the Linux Foundation examines how AI-generated code intersects with open source license compliance. It builds on earlier studies of how large language models may generate code with significant similarity to their training data, which can create legal complications when that code is covered by an incompatible software license. The new findings draw on STF's osskb.org service, a reference dataset 35 times larger than the one used in the original studies, combined with the SCANOSS open source scanner and the Winnowing algorithm, and they show similarity rates significantly higher than previously reported. The talk covers how strongly the expanded reference base affects detection rates and how effective the Winnowing algorithm is as a preliminary indicator of code similarity. It closes with open questions about the implications of using AI coding assistants and the broader challenges facing developers and organizations that use LLM-generated code in open source environments.
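For readers unfamiliar with Winnowing, the following is a minimal Python sketch of the general idea: hash every k-character substring of the normalized text, slide a window over those hashes, and keep the minimum of each window as a fingerprint. It is an illustration only, not the SCANOSS implementation or the method used in the talk. The function names, the parameters `k` and `window`, the CRC32 hash, and the Jaccard-style score are all assumptions chosen for brevity, and the sketch keeps only hash values rather than the positions a full implementation would also record.

```python
import zlib


def kgram_hashes(text, k):
    """Hash every k-character substring of the normalized text."""
    # Normalization (an assumption): drop non-alphanumerics and lowercase,
    # so whitespace and case changes do not affect the fingerprints.
    normalized = "".join(ch.lower() for ch in text if ch.isalnum())
    return [
        zlib.crc32(normalized[i:i + k].encode("utf-8"))
        for i in range(len(normalized) - k + 1)
    ]


def winnow(text, k=5, window=4):
    """Return the set of fingerprints: the minimum hash of each window."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < window:
        # Too short for a full window: fall back to a single fingerprint.
        return {min(hashes)}
    fingerprints = set()
    for start in range(len(hashes) - window + 1):
        fingerprints.add(min(hashes[start:start + window]))
    return fingerprints


def similarity(a, b, k=5, window=4):
    """Jaccard overlap of two fingerprint sets (an illustrative score)."""
    fa, fb = winnow(a, k, window), winnow(b, k, window)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


if __name__ == "__main__":
    generated = "def add(a, b):\n    return a + b\n"
    reference = "def add(a,b): return a+b"
    print(similarity(generated, reference))
```

A score like this is best read the way the talk frames Winnowing: as a preliminary indicator that a snippet deserves a closer look, not as proof of copying or of a license violation.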
Syllabus
LLM-generated Code and Open Source License Compliance: How Big Is the Problem? - Oscar Enrique Goñi
Taught by
Linux Foundation