
Copyright Guide: Subscribe to News

A focus on copyright issues which may concern TCC faculty and staff -- including fair use, the TEACH Act, public domain and other copyright exceptions and issues. Nothing in this guide is to be construed as legal advice.

AI Training Data Dilemma: Legal Experts Argue For 'Fair Use' | Forbes

by Amanda Ross on 2024-10-24T12:21:39-05:00

Lemley posits further, "AI isn't competing with authors or artists. Instead, it is using their work in an entirely different manner. [...] ML systems generally copy works, not to get access to their creative expression (the part of the work the law protects), but to get access to the uncopyrightable parts of the work—the ideas, facts, and linguistic structure of the works." He proposes 'fair learning' as a principle: the use of copyrighted works to train ML systems should be fair even if fair use factors (the nature of the work and the amount taken) would otherwise weigh against fair use.

Take a language AI model trained on millions of books. It's not interested in the stories, characters, or themes; instead, it aims to learn linguistic patterns: things like grammar rules, sentence structures, and word relationships. Similarly, for an AI model to learn what a dog looks like, it needs to analyze millions of dog photos. The system isn't interested in the artistic composition or the specific dog in each photo, elements that might be protected by copyright. Instead, it's learning to recognize general features like fur, four legs, tails, and typical dog shapes. In fact, "verbatim copying" is the necessary intermediate step toward accessing the unprotectable "ideas and functional elements" of works that allow AI systems to learn generalizable patterns and concepts rather than simply memorizing specific content. AI models encode patterns from training data into parameters and generate responses using learned probabilities rather than by referencing stored content.
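As a rough illustration of "encoding patterns into parameters," here is a toy sketch in Python. It is far simpler than the neural networks the article has in mind, and everything in it (the three-sentence corpus and the function names `train_bigram_model` and `generate`) is made up for this example. It counts which word tends to follow which, converts those counts into probabilities, and then produces text by sampling from the probabilities.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(texts):
    """Learn, for each word, the probability of every word seen right after it."""
    counts = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    # The "parameters" of this toy model: a table of next-word probabilities.
    return {
        prev: {w: c / sum(followers.values()) for w, c in followers.items()}
        for prev, followers in counts.items()
    }


def generate(model, start, length=5, seed=0):
    """Produce up to `length` words by sampling from the learned probabilities."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = model.get(words[-1])
        if not followers:  # no word was ever seen after this one
            break
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)


# Hypothetical training text, invented for this example.
corpus = [
    "the dog chased the ball",
    "the dog ate the bone",
    "the cat chased the dog",
]
model = train_bigram_model(corpus)
print(model["the"])           # probabilities of the words seen after "the"
print(generate(model, "the"))
```

After training, `model` holds only the table of word-to-word probabilities. The generation step consults that table and never looks at `corpus` again, which is the distinction the paragraph above draws between learned parameters and stored content.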

...

As we navigate the complex landscape of AI and copyright law, a nuanced understanding is emerging. Legal scholars suggest two key points. First, the input data used for AI training may often be permitted under "fair use" or "fair learning." Second, purely machine-produced output is typically not copyrightable. This perspective recognizes that ML, at its core, is about extracting patterns and facts rather than copying creative expression. Ultimately, fair use is about more than transforming existing works. It's about preserving our collective ability to create, share, and build upon ideas. Put another way, it's about preserving the ability to learn, whether the entity doing the learning is a human or a machine.

Read the rest 


