this post was submitted on 25 Jun 2025
176 points (97.8% liked)
Technology
IMO the focus should have always been on the potential for AI to produce copyright-violating output, not on the method of training.
If you try to sell "the new adventures of Doctor Strange, Jonathan Strange and Magic Man," existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means it should be fine, but pulling data that is not freely accessible should be theft, as it is already.
I have a freely accessible document under a CC license that states it is not to be used commercially. This is commercial use. Your policy would allow that document to be used anyway, since it is accessible. This kind of policy discourages me from easily sharing my works, as others profit from my efforts and my works are more likely to be attributed to a corporate beast I want nothing to do with than to me.
I'm all for copyright reform and simpler copyright law, but these companies need to be held to standard copyright rules and not just made up modifications. I'm convinced a perfectly decent LLM could be built without violating copyrights.
I'd also be ok sharing works with a not for profit open source LLM and I think others might as well.
Copies of copyrighted works cannot be regarded as "stolen property" for the purposes of a prosecution under the National Stolen Property Act of 1934.
https://en.m.wikipedia.org/wiki/Dowling_v._United_States_(1985)
That "freely" there really does a lot of hard work.
It means what it means; "freely" pulls its own weight. I didn't say "readily" accessible. Torrents could be viewed as "readily" accessible, but they couldn't be viewed as "freely" accessible, because at the very least you bear the guilt of theft. Library books are "freely" accessible, and if somehow the training involved checking out books and returning them digitally, it should be fine. If it is free to read into neurons, it is free to read into neural systems. If payment for reading is expected, then it isn't free.
Civil cases of copyright infringement are not theft, no matter what the MPAA has trained you to believe.
But they are copyright infringement, which costs more than theft.
Plaintiffs made that argument and the judge shot it down pretty hard: that kind of competition isn't what copyright protects against. He makes an analogy with teachers teaching children to write fiction: they are using existing fantasy to create MANY more competitors on the fiction market. Could an author use copyright to challenge that use?
Would love to hear your thoughts on the ruling itself (it's linked by reuters).