Subscribe to News - Copyright Guide - LibGuides at Tulsa Community College

AI Training Data Dilemma: Legal Experts Argue For 'Fair Use' | Forbes

by Amanda Ross on 2024-10-24T12:21:39-05:00

Lemley further posits, "AI isn't competing with authors or artists. Instead, it is using their work in an entirely different manner. [...] ML systems generally copy works, not to get access to their creative expression (the part of the work the law protects), but to get access to the uncopyrightable parts of the work: the ideas, facts, and linguistic structure of the works." He proposes 'fair learning' as a principle: the use of copyrighted works to train ML systems should be considered fair even if some fair use factors, such as the nature of the work and the amount taken, would otherwise weigh against fair use.

Take an AI language model trained on millions of books. It's not interested in the stories, characters, or themes; instead, it aims to learn linguistic patterns such as grammar rules, sentence structures, and word relationships. Similarly, for an AI model to learn what a dog looks like, it needs to analyze millions of dog photos. The system isn't interested in the artistic composition or the specific dog in each photo, elements that might be protected by copyright. Instead, it's learning to recognize general features like fur, four legs, tails, and typical dog shapes. In fact, "verbatim copying" is the necessary intermediate step toward accessing the unprotectable "ideas and functional elements" of works, which allow AI systems to learn generalizable patterns and concepts rather than simply memorizing specific content. AI models encode patterns from training data into parameters and generate responses from learned probabilities, not by referencing stored content.

...

As we navigate the complex landscape of AI and copyright law, a nuanced understanding is emerging. Legal scholars suggest two key points. First, the input data used for AI training may often be permitted under "fair use" or "fair learning." Second, purely machine-produced output is typically not copyrightable. This perspective recognizes that ML, at its core, is about extracting patterns and facts rather than copying creative expression. Ultimately, fair use is about more than transforming existing works. It's about preserving our collective ability to create, share, and build upon ideas. It's also about preserving the ability to learn, whether the entity doing the learning is a human or a machine.

