Overview
Explore the intersection of AI-generated code and open source license compliance in this 31-minute conference talk from the Linux Foundation. The talk presents research that extends previous studies of how large language models can generate code closely resembling their training data, which may create legal complications when that code carries an incompatible software license. Learn about new findings obtained with STF's osskb.org service, a reference dataset 35 times larger than the one used in the original studies, combined with the SCANOSS open source scanner and the Winnowing algorithm, which reveal similarity rates significantly higher than previously reported. See how the expanded reference base affects detection rates, and examine how well the Winnowing algorithm works as a preliminary indicator of code similarity. The talk closes with open questions about the implications of using AI coding assistants and invites discussion of the broader challenges facing developers and organizations that use LLM-generated code in open source environments.
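For readers unfamiliar with Winnowing, the general idea can be sketched in a few lines of Python. This is a minimal illustration of the technique (hash every overlapping k-gram, then keep the minimum hash from each sliding window as a fingerprint), not the SCANOSS implementation. The function names, the parameters `k` and `w`, the MD5-based hash, and the Jaccard score used for comparison are all assumptions chosen for illustration.

```python
import hashlib


def kgram_hashes(text, k=5):
    """Hash every overlapping k-gram of the whitespace-stripped, lowercased text."""
    s = "".join(text.split()).lower()
    return [
        int(hashlib.md5(s[i:i + k].encode("utf-8")).hexdigest()[:8], 16)
        for i in range(len(s) - k + 1)
    ]


def winnow(text, k=5, w=4):
    """Return a set of (hash, position) fingerprints selected by winnowing.

    For each window of w consecutive k-gram hashes, keep the minimum hash;
    on ties, keep the rightmost occurrence.
    """
    hashes = kgram_hashes(text, k)
    fingerprints = set()
    for i in range(len(hashes) - w + 1):
        window = hashes[i:i + w]
        smallest = min(window)
        # Index of the rightmost minimum inside the window.
        offset = w - 1 - window[::-1].index(smallest)
        fingerprints.add((smallest, i + offset))
    return fingerprints


def similarity(a, b, k=5, w=4):
    """Jaccard overlap of fingerprint hashes (an illustrative score only)."""
    fa = {h for h, _ in winnow(a, k, w)}
    fb = {h for h, _ in winnow(b, k, w)}
    if not fa or not fb:
        return 0.0  # input too short to produce any fingerprint
    return len(fa & fb) / len(fa | fb)


if __name__ == "__main__":
    original = "for i in range(10): total += i"
    reformatted = "for  i in range(10):\n    total += i"
    unrelated = "print('hello world, this is something else')"
    print(similarity(original, reformatted))
    print(similarity(original, unrelated))
```

Because whitespace is stripped before hashing, the reformatted snippet yields the same fingerprints as the original. A score like this is only a preliminary indicator: a high value suggests that a snippet deserves closer license review, not that it was copied.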
Syllabus
LLM-generated Code and Open Source License Compliance: How Big Is the Problem? - Oscar Enrique Goñi
Taught by
Linux Foundation