https://startupwired.com/2025/09/20/anthropic-vs-authors-the-1-5-billion-copyright-showdown/
"In June 2025, Judge Alsup made a critical ruling. He drew a sharp line between legal and illegal data use. He declared that training AI models on books purchased or otherwise legally obtained could fall under fair use. However, downloading pirated books from sites like LibGen and PiLiMi and storing them in bulk did not qualify as fair use"
Why do artists think someone owes them compensation for saving an image from social media or another site on the open internet? An image saved from the open internet is legally obtained; a paywalled image saved after paying for access is legally obtained. It's "stealing" when someone makes money directly from your image, and AI training is not doing that.
Copyright protects the end result, not the process, concepts, or ideas.
To answer your specific question, most people (and artists in general) typically have a poor understanding of copyright, both in terms of its legal implications and how the current system benefits society.
Pro-AI people aren't immune to it either. Witness all the arguments about why wholly AI-generated work would qualify for copyright.
Wholly AI-generated work qualifies for copyright in the US as long as you inpaint it, demonstrating a personal decision that you wanted to make a change and did. The literal image is not copyrighted, but the choices you made to change it are.
However, this ends up being roughly equivalent to full protection, because anyone who might want to infringe upon your AI work doesn't know what parts of the image are raw gen and what parts represent a change you made, so it's unsafe for them to use any of it.
Or any person commenting on why a human user using AI can't have copyright protection. They all show up as ignorant on copyright, including the USCO (for the moment).
Because many people seem to think that things done deliberately, with intention, can't be copyrighted. Spoiler: they can be.
What can't be copyrighted is if you just run image gen with no purpose, have a bunch of AI churn stuff out, then upload it. Like the YouTube channels, for example, where someone prompts it and does nothing for 30 days while it throws up 30 videos on their channel. Those cannot, in fact, be copyrighted.
If I do things specifically for my IP, with specific intention, I own it all even if it's AI generated, because it's specific. If I ran an AI, walked away, then put the results up, those would not be copyrighted; I never did them for any specific reason.
Oh yeah it's just another in a long list of anti arguments that are based purely on their personal opinion and in no way on fact.
It isn't stealing; courts have said so. As citizens they have the right to pursue a lawsuit to claim otherwise, but the likely end result is that they, as plaintiffs, fail to provide new evidence showing it is stealing and that the precedent is incorrect. Then they pay your legal fees and a small charge for the harm the lawsuit caused you.
There is no such thing as "LOOKING-right", only copyright.
AI looks, and learns from looking; it then forgets and cannot copy.
Did you pay Picasso's family a royalty when you first LOOKED at his artwork in school and learned from it, but didn't copy it? No. You didn't.
Congrats, you and AI share the same culpability, or lack thereof.
It's still not been fully decided and there are plenty of ongoing cases. Just because it "could" fall under fair use doesn't mean it does. That case focused on the piracy aspect so probably didn't venture too far in that direction and Anthropic are trying to settle, likely to avoid venturing any further into what is fair use.
With your example, under current law, saving an image from the internet for your own use is fine because you have been granted access to see it. But to use that image on your own website or turn that image into a new product is a different use and you'd be expected to license it. AI training is just another way of turning a very large collection of content into a new product.
If you saved that image and used it as a reference for your own artwork, learned techniques of composition, and incorporated them into your own works, is that using it to create a new product under your definition?
What if after I learned the rules I delete the image from my memory so I don't even have the image to call back to, just the rules I learned from looking at it? Am I still using your work to create a new product?
You are a human (I assume.) The model is not, it's a commercial product. The source image is being used directly to create the model, so that is arguably commercial use.
But is the commercial product doing anything but learning rules of composition? What changes here?
How is demanding compensation for that learning different from wanting to own a style?
It's not owning a style, it's typical licensing for commercial use.
A style?
For using someone else's work. If they made their own source images in a similar style there would be no issue.
Using someone's work to do what? Learn styles, learn rules of composition? So the artists want compensation not for the works themselves but for the style?
It can cause the deepfake situation. Let's say a model (it can be a LoRA) is trained on a specific artist, like Picasso. It will generate pictures with Picasso's elements or, even worse, claim Picasso did it. If Picasso were still alive, he could sue for deceiving the buyer, or, with better wording, for making the similarity so great that it damages his own work and business. Collectors do want the genuine article (say, the Van Gogh painting confiscated by the Nazis that somehow appeared in the house of a Filipino president) versus a machine-generated image based on an artist's works.
That's just the art side. Then there are the deepfake catfishing scams, or the outright use of deepfakes to spread misinformation or cause harm; that's also a problem.
Are there not already legal remedies in place well before AI to protect the original artists work from people doing such things?
AI is a tool that makes something we could always do (copy someone's style to the point that selling the work could be considered deception of the buyer) easier to do; that doesn't mean the laws need to change. The laws are already there to allow someone to sue if the works produced too closely resemble theirs and hurt their brand.
So I don't see how AI as a tool is any different than a copy machine in the way content creators are impacted by its invention. There are already laws in place to protect content creators and their IP and AI didn't change those laws.
If somebody made new images specifically to train on in the same style there would be no problem. The issue is they aren't. They are using people's work directly to train the model, and the model is just another commercial product, so it should be treated as commercial use. If you use someone else's work in your own product it would usually need a commercial license.
But what they are using the work for is merely learning style.
So we are back to the problem.
The work isn't actually being used outside of being learned from, and that sort of learning has always been fair use.
>is the commercial product doing anything but learning rules of composition?
Yes, such as sometimes memorizing and replicating entire works.
No. Though it may be able to reproduce a work in certain contexts, the model itself does not store a copy of any image, just the rules that made the image what it is.
That it can memorize those rules so well that it may be able to produce an exact copy? Well, copy machines already exist, so I don't see the fuss.
"model itself does not store a copy of any image"
What a garbage reason to refuse to accept that it's plagiarizing.
It's just not plagiarism to memorize styles.
And again copy machines exist.
You are essentially saying that generative AI is just a complicated and often faulty copy machine, that can sometimes make copies of others works...
And I still don't see how that would be a big deal, seeing as how copy machines and Ctrl+C already exist, and there are already laws to protect people if someone uses those tools to reproduce someone's work.
Nor do I see how it's plagiarism to memorize rules of composition by analyzing images.
"You are essentially saying that generative AI is just a complicated and often faulty copy machine"
Not at all. That's made up bs from you. Predictable and boring, yawn.
You asked:
>is the commercial product doing anything but learning rules of composition?
I answered with facts straight from reality: it memorized and reproduced entire images. This is a real event that did happen, despite your feelings about how AI works. Copying an image is a thing we witnessed it do. That's different from learning rules of composition.
It's not really a big deal, no. But apparently it is to you, if you have some huge issue with accepting facts and need to pretend this didn't happen.
Though I disagree with you that what you described actually occurred, even if we say it did...
Is the complaint from trad artists really that we've invented a new type of copy machine?
Again there are already protections for using someone's IP and generative AI didn't change those laws.
So, if copy machines and ctrl +c exist already, and we are saying that sometimes Gen AI can act like a copy machine... what are traditional artists complaining about again?
Okay, but even as someone who agrees it's more complex than "saving images you can access is fine", since it depends on licensing (though I'm on the fence about certain issues, like what about Creative Commons or other open licenses? Should the artist still have a say then? Maybe, but I don't know), here's the thing:
It is very analogous to human memory. Time and volume of information (mostly the latter) make it harder and harder to isolate the "neuron" (the memory) storing the pattern data of a given image unless that image is repeatedly reinforced. The more other information is accumulated, the more specific you need to be to access that particular neuron. Hell, as someone who uses it to recolor things, I can tell you it really often doesn't like to use an input image to generate a modified output image. It can do it, but it pretty often generates a new bizarro image instead, one that looks like someone was told to look at an image for 30 seconds, watch a movie for 30 minutes, and then recreate the original image from memory. If that's what it does with a direct source image handed to it, then unless there's heavy representation in the training data, very little data overall, or extreme specificity, you're not really going to get anything more than its vague pattern data as an influence.
All that being said, I absolutely think that, at the very least, unless the work uses an open license or something (though I'm kind of on the fence even then), you should have to get permission from the artist. I feel less strongly the further the rights holders get from the artist themselves, though. Family or close friends are fine, but idgaf about corporations unless they directly represent the author. Obviously I don't care about speculators either. Even estates can get a bit tricky for me. Personally I'm not going to cry over the Doyle estate being upset that Sherlock Holmes is being used for training, just like I'm not going to be mad that Jack Kirby's run of Spider-Man is being used for training.
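(For anyone curious what the img2img behavior described above looks like in practice, here is a minimal sketch using the Hugging Face diffusers library. The checkpoint ID, file names, and strength value are illustrative placeholders, not a recommendation.)

```python
# Illustrative img2img sketch with Hugging Face `diffusers`, assuming a Stable
# Diffusion checkpoint is available. `strength` controls how far the output may
# drift from the input image, which is the behavior described above.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint ID
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("drawing.png").convert("RGB")  # hypothetical input file

# Low strength (~0.3) mostly preserves the input; high strength (~0.8) lets the
# model "reimagine" it, which is where the bizarro recolors come from.
result = pipe(
    prompt="the same character, recolored with a blue palette",
    image=init_image,
    strength=0.35,
    guidance_scale=7.5,
).images[0]
result.save("recolored.png")
```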
>unless there's enough representation in the data, low data, or extreme specificity then you're not really going to get anything more than its vague pattern data
There was enough and we did get copies. You sometimes not getting a good copy with img2img is another terrible reason to refuse to accept how AI models tend to memorize and replicate images.
I'd love to see if you have any actual examples from the human side if you feel there is an analogy.
The problem is that you're saying that any use of a copyrighted image in a commercial capacity is copyright infringement, and that's just straight up wrong.
You're basically just using the word 'use' as a catch-all for any possible thing that can be done with a copyrighted work and claiming that literally anything is infringing if used in a commercial manner.
That's just not how copyright works though. Copyright is largely there to prevent distribution of copyrighted works, not to prevent you from doing anything with them.
Commercial use is any possible use when used as part of a business or to make a profit. The more I look it up, the more I find only evidence to support my understanding, e.g. https://www.cobrief.app/resources/legal-glossary/commercial-use-overview-definition-and-example/
If you use someone's product for commercial purposes, it does not matter if you delete it afterwards. If I take an online course for PowerBI or something and then delete it after I'm done and just have "the rules I learned from looking at it", it doesn't mean I never needed to pay for it. If I use PowerBI for commercial purposes and then uninstall it, it doesn't mean I never needed a commercial license.
"use that image on your own website or turn that image into a new product is a different use and you'd be expected to license it" Its still direct usage of image. Generalyzed concepts and patterns are not protected.
The model does use the image directly in training.
That's not what "use" means in terms of fair use.
Your use of a work is contingent on its appearance in the final product you share publicly.
If you're making a video game and you use Final Fantasy sprites as stand-ins while you're still working on the game because you don't have a real artist yet, and then later you replace those stand-ins with your own original art and finally release your game, your game cannot be said to "use" the Final Fantasy sprites. They're not present in the final product. Square-Enix can't sue you for releasing a game that doesn't have their art in it anywhere.
(This actually happened with a popular game: during development, Terraria used Final Fantasy sprites temporarily. And the final game isn't considered copyright infringement.)
As long as whatever you're cooking up behind closed doors isn't public, then you haven't released an infringing product.
And finished AI models do not contain the works they were trained on, so the training process is not considered "use."
I would disagree; if the final product is dependent on something, that thing is being used. You can take the FF sprites out of the process without consequence when rebuilding the final release of Terraria, but if you took away all of the copyrighted content when rebuilding or retraining the models, they would not function.
Sorry, the only opinion which matters here is the court's.
No you can't, because the development process would've been slightly different if they hadn't used those sprites along the way. Maybe the final sprites would've been taller, or more/less animated, or ended up looking a lot different. They were absolutely used in the development of the game and had a minor influence on it.
Likewise, as an artist you are allowed to pin other works to your canvas to look at for reference or sample colors from while you draw something. That doesn't taint the final work as an infringing piece. And again, in this case you cannot take the use of those works out of the creation of the piece, if they were responsible for precise colors you used, or your ability to draw a specific character pose etc.
The courts are still deciding so it's still open to debate. The one mentioned above is not the only case and they have gone both ways.
The sprites might have been influential on development, but if you take the source code as it stands now without them and rebuild the game from source, it will not need those sprites to function. You can't say the same for the models.
No, it's really not on this point of copyright.
Because then you would have to say things like, well this person's book clearly does not infringe upon Lord of the Rings, but we know they watched the movies which means those concepts were floating around in there and contributed to their writing, which means their book DOES infringe, since they wouldn't have written it precisely that way without having seen the movie first.
And the case of the artist who used the eyedropper tool on someone else's art pinned to their canvas, or referenced a specific pose?
There are still ongoing court cases as well as some others that have already judged that AI training can be infringing. There is no definitive legal answer yet.
Copyright covers fixed works not concepts so your book is probably safe unless it uses characters or anything that would make it a derivative work.
But the source content for training AI is fixed works and the model is being built off them directly for commercial use without permission so I think they would fall under copyright protection.
Not really sure what point you're trying to make here. If you mean it's only taking certain elements of works, that might be true, but it is processing those works during the build/training process to extract those elements, and the final dataset is a summary of everything that went in. They aren't just describing things to the model in an abstract way; they are feeding it the original source material to process and assimilate.
And after training?
The conversation is about training.
So training is commercial use? The model could be used in a commercial way after training, but those are two different things.
Yes, that's how I see it. They sell access to the model, it's a commercial product.
Selling access to a thing and selling the thing are two different things, no?
Not really. You're still making money from it. It's still a commercial product. Netflix still has to license the movies it streams, it doesn't just buy one copy.
Netflix is reselling the movies it streams; that's why they need a license. How is AI reselling images?
Humans use direct images, texts, tunes etc. in their training; it doesn't render their work invalid unless there is blatant plagiarism.
You will have been licensed the stuff for your own consumption, even if it's free to access. Unless you're directly putting it into a new product, it's not the same. The source images are used directly in training the model ready to be sold on. It's commercial use.
So? Training is fine. The source images can't be plagiarized, of course. However, images inspired by them are legal absent plagiarism.
The model is the commercial product. I'm not talking about what it generates.
No legal violation.
That's undecided
It was decided in Judge Alsup's decision; though of course higher courts may ultimately decide differently.
It doesn't NEED to fall under fair use, because it just has nothing to do with copyright. It's not COPYING anything.
Does tying my shoe lace need to fall under a fair use bullet point? No, because it's not copying, so copyright rules are just completely irrelevant. Tying my shoelace fits none of them and yet is 100% legal, because it doesn't involve copying any artwork. Neither does AI.
Just simply off topic / frivolous lawsuits. It's like suing AI companies for breaking your front yard fence when they've never even visited your state.
AI is trained on a lot of content pulled from the internet without permission. Using copyrighted content in your own product would usually need a license and they use the original images and content in the training process. Copyright protects against work being used without permission. It is the main protection in law for intellectual property so it's entirely relevant here.
You don't NEED "permission" to look at anything on the internet. Only to copy it. So, so what if it didn't have "permission" that it never needed to begin with?
You can "use" copyrighted stuff in any way you want with zero restrictions whatsoever, for any "use" other than copying it.
Copyright only pertains to copying. No other use. Not looking at. Not thinking about. Not learning from. Not using it as toilet paper. Not in a box, not with a fox.
Only copying.
AI doesn't copy it. So copyright doesn't apply, the end.
No, it doesn't. It protects against copying and ONLY copying. not "using in any way" if that doesn't involve copying.
Not if you're a business and your intent is for commercial use. That usually needs a separate license and is not covered just by you having personal access to see something.
That's debatable. With the way stable diffusion works, it needs to be able to recreate an image. Being able to prompt for and successfully get images of known characters means it knows enough of the source content to reproduce it. The lyrics court case went against AI companies because the models knew lyrics to songs.
Yes, if you're a business you can do ANYTHING other than copying. Because there's simply no law saying you can't.
So can an individual.
So can a government.
So can a charity.
Anyone can do anything that is not-copying, with any artwork they can see, at any time, because nothing says they can't.
Cite what law you think restricts anything other than copying. Be specific, quote the exact phrase in the law that you think says anything about non-copying activities. There isn't any such law (in the US, where OpenAI is), so you won't.
No it's not. Stable Diffusion, the entire model, is 2 gigabytes. Its training data was millions of times larger than that. Unless you can tell me how to store an entire 512x512 image in the equivalent of about this much data: "0101001", it's physically impossible for it to have remembered its training data.
(And that's if it had nothing better to do with the memory it does have, which of course it is actually using for other things, general patterns and concepts)
Obviously you can't copy something if you can't remember it
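The back-of-the-envelope arithmetic behind that claim is easy to write out. The numbers below are rough approximations (about 2 GB of weights, on the order of two billion training images for the original Stable Diffusion), not exact figures:

```python
# Rough capacity arithmetic for the "the model can't store its training set" point.
# Both numbers are approximations, not exact counts.
model_size_bytes = 2 * 1024**3       # ~2 GB of weights
training_images = 2_000_000_000      # order of magnitude for LAION-scale training

bytes_per_image = model_size_bytes / training_images
print(f"{bytes_per_image:.2f} bytes of model capacity per training image")
# ~1 byte per image, versus hundreds of kilobytes for a 512x512 JPEG,
# so verbatim storage of the whole dataset is physically impossible.
```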
There is and it's called copyright. It also covers distribution, display, adaptation and derivative works so not just direct copies.
Just because Stable Diffusion doesn't contain every pixel doesn't mean it doesn't know how to recreate content. It's built around rebuilding images from static, so the ability to reconstruct images is a core part of the technology.
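For readers unfamiliar with what "rebuilding images from static" means in training terms, here is a schematic, DDPM-style training step. It is a simplified sketch, not any vendor's actual code; `unet`, `images`, and `alphas_cumprod` are placeholders:

```python
# Schematic DDPM-style training step: the model is taught to predict the noise
# that was added to an image, so the weights encode a denoising function rather
# than a per-image archive.
import torch

def training_step(unet, images, alphas_cumprod, optimizer):
    batch, steps = images.shape[0], alphas_cumprod.shape[0]
    t = torch.randint(0, steps, (batch,), device=images.device)
    noise = torch.randn_like(images)

    a_bar = alphas_cumprod[t].view(batch, 1, 1, 1)
    noisy = a_bar.sqrt() * images + (1 - a_bar).sqrt() * noise  # corrupt with noise

    pred = unet(noisy, t)                                # predict the added noise
    loss = torch.nn.functional.mse_loss(pred, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```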
That's copying. You'd literally have nothing to distribute, obviously, without having made copies, lol. Not relevant to AI which doesn't copy anything.
No copyright law does not cover display of non-copies. Where did you get that idea?
Copying. Not relevant to AI which doesn't copy anything.
Copying. Not relevant to AI which doesn't copy anything.
It doesn't contain ANY pixels, not just "not all pixels". It completely and utterly does not remember its training images so very obviously cannot physically copy them.
It only remembers things that showed up over millions of different images which are fundamentally not copyrightable. With the possible sole exception of watermarks that may have been in millions of images and thus actually remembered verbatim. If it makes one of those, in that one instance it might be a copyright violation. So just delete any from your generated images, and you're good.
Copyright and licensing for business use doesn't just include copying directly. It includes any use providing a benefit to the company using them. In this case the models are entirely dependent on the source data and the models are being used to generate profit. Copyright on the generated output is an entirely different issue.
Nope, it literally only involves copying.
It's just that there weren't really any other coherent possible ways to make money from an artistic IP other than making copies. So "non copy related profiteering" never really comes up or didn't prior to AI at least.
The only example I can think of is if you have an original instance of a famous artwork in your lobby or front plaza, and it drives foot traffic. Which is totally legal though whether the artist likes it or not.
Or directly selling an original piece for raw cash. Also totally legal regardless of the artist's wishes if you fully own the item.
In my opinion, use in genAI training should require getting a license to use it that way, not just purchasing something to own for personal use. Me buying an mp3 of a song doesn't give me the right to use it in a TV show or something. GenAI companies training off images/music/etc are objectively not personal use.
An aspiring musician buying a bunch of songs, listening intently, then making his own song (drawing inspiration/example) is perfectly within his rights to do so, absent plagiarism.
Same is true for GenAI companies, provided they did legally buy the images, music, etc.
I have to ask, do pros really not understand why one artist would not want their work fed into an LLM?
Doesn't matter; it's not up to you, and you don't have the right to demand control over everything about your art and how it's ever viewed or used by anyone, ever. You have only a right to it not being COPIED. That's it.
If I buy your art and decide to use it as toilet paper and you don't like it, tough noogies. You don't have the right to say people can't use your art as toilet paper.
If I walk past your art on a billboard and want to LOOK at it and learn general concepts from it but not copy it: tough noogies, not up to you if you don't like it. Don't put it out in public then.
You only have a right to not have it copied, which AI doesn't do.
We know you don't like it, we just don't care, because that holds no legal weight (nor moral weight; you don't own how I live my life). If I knew you well, I would probably prefer that you dress and act a specific way too, and dropped certain bad habits; that doesn't mean I have a right to demand any of that.
You said a lot and I respect your perspective but none of it answered my question.
I answered it between the lines: I DO understand why, but I just don't have a reason to care/it doesn't matter. I don't have to live my life doing everything you prefer at all times when you have no right to have your preferences dictate my actions on the topic.
You don't want it, I understand why, and I have no reason to necessarily choose to respect that desire. Not that I dislike you, I just have my own life to live and you're not the boss of 99% of it.
Well, I didn’t ask all that lmao. I appreciate you answering though, I just wasn’t sure.
I don't get the issue here. If you ask "Why does X person not understand Y?" and someone says "They do understand"
...then that is an answer. You'll never get the thought process answer you expected, because your question just made a false assumption to begin with. There is no "why" to them "not understanding", because they DO understand.
I only ask a question when I do not know the answer. If I did know the answer, I wouldn’t have asked 🤷🏿♂️ *plus your answer wasn’t “they do understand” it was “it doesn’t matter”
emphasis added ^
Regardless, we are on the same page about it now, at least
Because I don't think we're going to make any progress here, I'm going to finish by pointing out again that:
“We know you don’t like it, we just don’t care”
Does not answer the question
“Do you not understand why someone wouldn’t want this?”
You used quote marks for yourself, but then didn't actually quote yourself. Your question was, in actual quote:
^ "No, they don't 'really not understand', they do in fact understand just fine" is a 100% valid answer to that question. Logically, and grammatically. You asked a yes or no question.
"Do they not understand?"
"No, they DO understand"
As much as I think the "stealing" argument is more about ethics than legality, this is just wrong. Most subscription-based platforms just allow users to view the content and MAYBE use it for personal things. Just because someone subscribes to MeatCanyon's Patreon doesn't mean it gives them an unlimited royalty-free license over all the content on there to use however they want for the rest of their life.
So what are you doing with the trained model then? Just letting it sit on the shelf collecting dust?
Someone could very much argue that if you're using that AI model to create content for your art page, or are selling that AI model on Civitai or somewhere, you are using the paywalled content for commercial purposes. That would go against Patreon's TOS, for example:
"Subject to these terms and full payment of all applicable charges, to the extent a subscription or offering includes access to one or more of a creator’s creations, you receive a non-exclusive, non-transferable, non-sublicensable, revocable, limited license to access and view those creations for your own private, personal, non-promotional, non-commercial use"
Copyright in the images doesn't transfer to the model. Generalized concepts, ideas, and facts are not protected by copyright. Styles are not protected.
Training AI on a personal computer is private, personal use.
Sure, but copyright law doesn't protect every single use case. Again, if you're training off MeatCanyon's paywalled content and you're using it to make images that are extremely similar to his, or are selling the model using his name, that isn't protected.
And again, what are you doing with the resulting model?
AI is a very new industry and laws are changing specifically because of AI. As the idiom goes, "Don't count your chickens before they hatch."
Song lyrics are protected, style is not. Impersonation is illegal, using someone's style is not because style is not protected.
Again, laws are changing because of AI. Don't count your chickens before they hatch.
We will see how it goes.
You don't NEED a license to learn patterns from looking at something but never actually copying it. So whether anyone gave you a license or not is completely irrelevant.
Licenses always and only pertain to copying, not looking and learning from looking. That's why it's called "COPYright"
I can even legally look over my friend's shoulder and watch their netflix show they subscribed to but I didn't, learn things about tv and movies from it, and apply those general concepts to my own screenplay without ever copying. 100% legal. Same for AI
nice, another link into my "use in case of legal arguments" folder
Honestly, it’s mostly the fact that they aren’t being asked. Every use of a copyrighted work that will result in profit for someone besides the creator has to be cleared with the copyright holder. I can’t just make a Dark Tower cartoon for money without getting permission.
Pros make the argument that it’s just like how humans learn, but here’s the thing - everything you learn comes from a source where the people who made the thing teaching you have given permission for it to be used that way. You don’t always directly pay for it - public schools don’t cost money on a per person basis, but are still paid for with taxes - but there’s always payment and permission from the original creator.
So, just randomly scraping the Internet for training data ignores getting permission from the creators. Many of them would probably give permission, maybe even free of charge, but they deserve the right to make that decision instead of Sam Altman.
Also, since some of you are going to bring up TOS: the people complaining about that don't have a leg to stand on, because they entered into that agreement. That's not a valid complaint.
They don't randomly scrape the internet though; this is a persistent straw man. You think someone is going to spend millions of dollars of GPU time training a model on random data? No. Dataset curation is an art and a science; it's the secret sauce that makes the difference between the end result being useful or not. Actually, synthetic datasets (datasets generated by other AIs) are gaining popularity because they perform better and are more energy efficient (despite the "model collapse" copium), and it's easier to control their content. Ever since the scandal with the original Stable Diffusion and the LAION dataset, people are MUCH more careful and intentional about how they build a dataset. You can learn a lot about this on Hugging Face, where open source AI devs show their work of what goes into their datasets and why.
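As a rough illustration of what such curation looks like mechanically, here is a minimal sketch using the Hugging Face datasets library. The dataset name and metadata fields are hypothetical stand-ins, not a real pipeline:

```python
# Minimal dataset-curation sketch with the Hugging Face `datasets` library.
# The dataset name and metadata fields below are hypothetical placeholders.
from datasets import load_dataset

ds = load_dataset("example-org/web-image-captions", split="train", streaming=True)

def keep(example):
    # Curation means throwing most of the crawl away: drop low-quality,
    # duplicate, or unwanted samples and keep only what should be trained on.
    return (
        example["caption"] is not None
        and len(example["caption"]) > 10
        and example["aesthetic_score"] >= 5.0   # hypothetical quality field
        and not example["is_duplicate"]         # hypothetical dedup flag
    )

curated = ds.filter(keep)
```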
Okay, fine, but why do they not have to get express permission to use it as training data, like they would need if they were teaching humans?
A wannabe writer who devours novels from the library, then writes a novel himself, doesn't owe the authors of those novels a damn thing, absent plagiarism.
Libraries buy books. Therefore, they have the permission to loan them out.
You don’t really understand anything about how the entertainment industry works, do you?
Let's not beat around the bush:
even if an AI company purchased one single copy of all of its training data, y'all would not care… this is all a red herring.
There are models that were developed entirely with licensed data (Adobe) and you still get shit on and attacked for using them.
I can’t speak for everyone, but for me, this is one of the main problems with it.
And a person who checks out books from the library can do so with any intention they please. You have no concept of the English language, do you?
A person can check out books from the library with the intent to train himself to write. Moreover, the library, assuming it acquired those books legally, doesn't owe the creators anything more. The authors may protest that the guy checking books out is using them to train himself, but tough shit - the library and the patron are acting completely within the bounds of the law.
I understand you don't like the law, but that doesn't change anything.
Do you think I’m against libraries? Because if you do, you have a misunderstanding of what I’m saying.
Judging from your other responses, that’s honestly not surprising, because you don’t have the brain power to be having this conversation.
🤣
Well, fair use covers education, the news, satire, etc. Fair use is important to a free and open society. You don't necessarily need the permission of the copyright holder for educational purposes; check the TEACH Act, Section 110. It's not automatic, you have to balance several factors, but one of those factors is the amount used. In the case of a machine learning model, one artist's work is a minuscule drop in the bucket of the entire model, so small that if you removed it and retrained the model it would be difficult to measure the change. Several social justice organizations like Creative Commons argue passionately that defending fair use in this way for AI is crucial to make sure everyone gets the benefit of AI, not just big corporations.
https://www.regulations.gov/comment/COLC-2023-0006-8735
Also see the Library Copyright Alliance:
https://www.regulations.gov/comment/COLC-2023-0006-8452
And what some actual artists have to say about it
https://www.regulations.gov/comment/COLC-2023-0006-8426
Oh, textbooks don’t cost money?
You're not getting it. Fair use doctrine allows the authors of the textbooks to use copyrighted content without permission or fees under most circumstances.
Does teaching a human cost money?
That's not true. See the mountain of fair use exceptions.
Plus there is a whole lot of information in a copyrighted work that is “free for the taking” and doesn’t need even a fair use exception. Ideas, facts, etc are not protected by copyright
this
This subreddit mentions fair use incorrectly quite often. It never applied to training AI, because it doesn't need to.
Sure, sure, but that doesn’t change the point - AI companies don’t have the permission to use it to train AI. I’ve seen pros make the argument that it’s similar to human learning, and everything we learn comes with permission from the creators. So, why shouldn’t it be the same for AI?
So when people show 10 second clips from movies in their movie reviews on youtube, do you think they asked the studio for permission?
It’s actually five seconds, but that’s fair use.
AI has to use the entire work as training data, even if it's just pixels or whatever.
A young wannabe director buys all the Star Wars material he can and studies it. He then writes his own space opera fantasy. Lucasfilm can't sue him (barring plagiarism), claiming that they didn't give him permission to train on those materials. He bought them fair and square, and can draw inspiration from their content.
You’re making my point, dude.
That person who watched Star Wars usually paid in some way to watch it, and the creator of that work was compensated. If the person is not copying it, then yeah, that’s fair. That was literally how George Lucas started - paying for movies and books, being inspired by them, and then creating his own work. He went to school to learn filmmaking, which he paid to do.
See that magical word there: pay. That gives you permission to train off the work. Does Sam Altman get permission from the creators?
If Sam Altman paid for the work, he doesn't need permission. If he pirated the work and used it to train ChatGPT, then he is in violation of copyright law. But, if he just pirated it for his personal amusement, he'd be just as guilty. Using legally acquired material to train AI is fine. Illegal acquisition of material is not fine, even if not used to train AI.
He does because it’s a different use case.
So, OpenAI is trained on Star Wars without permission. Okay, so now it can make videos with those characters. It’s not just the idea of a space opera - it’s literally that. You could make a Luke Skywalker/Chewbacca porn, right? And it would look official. That is what Sam Altman’s system that he charges people could do. Why does he get to do that for free?
Anyone could do it; AI just makes it faster and easier. You could photoshop Mark Hamill and Chewie into a porno without AI. You might well run afoul of copyright law, but you could claim parody or fair use - whether or not AI was involved.
I don’t understand this attitude that just because something exists, you should be allowed it to make money with it anyway you want, even if the people who actually own the thing don’t want you to use it.
Sam Altman is making money off a thing that uses other people’s property. He doesn’t reimburse them. Artists have fought for probably the entire time that art has existed to be paid.
You’re literally supporting a billionaire getting a free ride to another billion dollars and it makes no sense to me whatsoever. Some people say it’s because you guys come from a different philosophy, and that the tech world works differently, but it still makes no sense to me.
I may or may not support it. You keep stating that people who buy things legally should be prevented from doing certain things with them, even when those things are perfectly legal.
I am a lawyer; I certainly don't agree with all the laws currently in effect. But my agreeing is irrelevant; what's legal is legal and what's illegal is illegal no matter how vehemently I might object.
If something exists, Sam Altman (or you, or I) may do whatever we please with it, including making money off it, as long as we acquire it legally and produce something that is legal to make. Whether or not Sam, you, or I disagree with the morals/ethics of that is totally irrelevant.
Generalized concepts, ideas, facts are not protected by copyright.
Pieces of art and writing aren’t generalized ideas, though, they’re specific works. Machine learning uses these works to specifically teach AI to do something, using specific examples in bulk.
It learns these generalized ideas from the works, and after training the model uses the generalized ideas, not the works.
I learn generalized ideas from numerous sources, and those sources are used with permission. Why do billion dollar AI companies not have to do so?
I don't think you can back up the claim that even 10% of anyone, businesses included, pays for and credits the original artists whose works they make use of or learn from.
It's so rare that I see anyone credit the use of reference images, and closer to never that I've heard of it being compensated.
I write for a website. Every time I use an image, I have to attribute it to the owner of that image. It is used with their permission.
If machine learning is just like educating humans - which, again, I see pros say all the time - then permission has to be granted, because every thing a human uses to learn has been with permission.
You, I think, are speaking to legality. I for sure wonder if ethically you think you should be compensating every image you use.
Humans do not have permission of the explicit authorized kind to take inspiration from anything and make it their own story. Fair use grants that. That definition is making use of without permission.
But, again, all instances of fair use come from the permission of the copyright holder. That’s just how it works. We’ve had court battles over this.
For yet another time, I’m constantly told that machine learning is like human learning. The vast majority of things you learn are from places that the creator gave permission to, whether that be from getting paid or the express written permission of the creator.
So, why does billionaire Sam Altman not have to get permission?
I am speaking to AI developers using copyright materials with their tech. How that tech works matters, but it is humans doing the training management. So I’m comparing humans to humans in copyright.
Fair use does not come from permission of copyright holder. Good luck substantiating that. Fair use is essentially part of what copyright holder is up against if they feel misuse is occurring. Either it is infringement (misuse) and they have a case or it’s fair use and defendant essentially wins the case.
Why does billionaire Sam Altman not have to pay to use materials that I personally have paid for (today, in fact; I bought some comics to improve my knowledge of comics, because I write about comics) to educate his AI?
Can't you tell the difference between a specific image and a general concept?
The general category of "anime" is made up of specific works, correct? Without those works, how would you describe anime? There is no generalized anything without specific works. There's no generic category of anime.
You are beyond hope. You think your fantasy view of what the law should be is what it actually is.
A person who legally acquires animes, feeds them into GenAI and creates a new anime did not need permission from the creators to do so. The new product can't be plagiarism, but can be derivative. And if the person puts his own stamp on it, modifies it a bit, he has a tenable claim that he has created a new work, which he can copyright. The AI was just a tool. A photographer who takes a lovely image, then tweaks it via Photoshop does not lose copyright ownership over it.
Perhaps Congress or the US Supreme Court will change things, but as of now, your whole schtick about "you didn't have permission from the creators to train AI" is utterly ludicrous, if the material was legally acquired.
But here’s the thing you’re ignoring.
Let's say I train my AI on the manga of the last 50 years. I could create an Akira sequel that looks exactly like Katsuhiro Otomo's work and try to sell it or publish it. I could make the end of Berserk what I wanted it to be, in Miura's style. I could make Toriyama Dragon Ball comics. Now, if you're doing that for yourself, nothing can be done; people do that in whatever way they can every day. If a service allows you to do that, though, that service should have to pay for the right to do it. I've paid to read decades of comics. So why doesn't billionaire Sam Altman have to pay to make a machine that can make comics?
Your "should" IS NOT THE LAW. Stop pretending that your idea of what "should" be is what actually exists. If Sam Altman legally buys Star Wars movies, feeds them into his Gen AI engine, gets a decent product, tweaks it to avoid plagiarism - he hasn't violated the law. Regardless of what you or I think the law "should" be.
But that’s not what’s happening.
He is using data that he got from Star Wars (as an example) to make money. People can pay his service to do that, instead of paying the people who own Star Wars. That's fundamentally wrong, and not at all how the entertainment industry is structured.
"Fundamentally wrong." Well, that's just like, your opinion, man.
Until and unless Congress or the Supreme Court says otherwise. No doubt, the present state of affairs could have severe negative effects on the entertainment industry as currently constituted. However, no amount of wishing, hoping, scolding, finger wagging, or complaining from either of us changes the cold hard fact that people can feed Star Wars content (that they legally acquired) into GenAI, use the results to make and sell their own space fantasy, and, as long as it's sufficiently distinct, both those people and Sam Altman will get the money, not Disney/Lucasfilm.
Fundamentally wrong and unfair? Arguably. But so are a lot of other things - all of which, like my example, are perfectly legal. Ethically debatable, but fully legal.
No. Only COPYING is even 1% relevant to COPY-right.
No other use of artwork other than copying has anything to do with copyright. If I take your artwork and use it as toilet paper, tough luck, you have zero rights to tell me otherwise, because that's not copying it. That's wiping my ass with it. There's copy-right, but there is no wiping-ass-rights.
There's also no looking-at-and-learning-concepts-from-right. Which is what AI does. It never copies, it looks and learns. So copyright is just entirely irrelevant.
I can also make a billion dollars from it, and copyright has zilch to do with it, if I didn't at any point COPY it.
I don’t know why I’m surprised y’all don’t understand this.
When you learn something, the work you learned off is used with the permission of the creator and they often get paid for that. Learning isn’t free; the act is, but what you use to learn is taken from a medium where the creator has given permission to learn from it. The payment is often directly from the learner or through things such as taxes.
So, why doesn’t billionaire Sam Altman have to follow the same societal rules we do when it comes to learning?
Wrong, there is no "learningright"
There is "copyright".
No copying happened? Then the maker's opinions are 100% irrelevant. I can and do learn from 100 things every week without permission of the creators and they have no say in it nor should they. Pay-gating learning would be insane, society would crumble almost immediately.
Yes it is, show me what law you think restricts learning in any way, I'll wait. Please quote the specific text of the law that specifically covers learning.
There aren't any rules about learning, so he IS following the same non-existing non-rules that we also don't have to follow.
Well, we better start all getting money back from all the schools we’ve paid for then, shouldn’t we?
And they should get refunds on the textbooks and teachers’ salaries.
Wow, you cracked the code. Great job.
You pay taxes for schools because the teachers need salaries to show up to work, and the building needs to be built and repaired.
It has absolutely nothing to do with intellectual property, nor anything in my last comment. What on earth are you talking about? This is off topic and totally out of left field, not a reply at all to what I wrote.
And the textbooks? Were those free? And the supplies and the machines. Learning is not free, by any metric.
Why doesn't billionaire Sam Altman have to buy the equivalent of textbooks for his machine?
The textbooks are not free because in part they had to COPY images to print them.
COPYING is restricted, hence the name COPY-right.
I, looking at them, as a student, did not have to COPY anything to merely look at, think about, and learn from them. And am under no obligation to pay anyone anything for looking at an artwork. Nor is AI for looking at--but not copying--an artwork.
If my sibling and I are in the same grade in school, buy one textbook, and share it, no law is broken and no problem exists, because we only needed one COPY to have two instances of looking. The copying is what is relevant, not the looking.
Why would he? He only needed to look, not copy, to do what he wanted to do, so he has no legal or moral reason to pay a dime.
I look at the textbook, not copy.
You’re really almost there, but you just can’t jump over that hurdle.
Someone paid for you to be taught. Why doesn’t billionaire Sam Altman have to pay for materials for his AI?
They paid 99% for salaries and facilities. They didn't have to spend anything on copyrights (the books are merely convenient for home study; you could use the same images in lectures without textbooks as fair use, and nobody pays anything for copyright, if you want).
Sam Altman paid for the facilities and teachers for the AI to be taught as well. And similarly, any copying was optional; he opted not to. My teachers also could have, and often did, opt not to pay for copying. (Many teachers frequently just showed stuff in lecture without any textbook accompaniment and paid nothing, under fair use.)
Does it though, does it really? The answer is no, no it doesn't. Do you know how many artists will just look random shit up on Google to use as references for their own works? Is that not the use of a copyrighted work?
This is why I fucking hate when people say 'use a copyrighted work', because they act like a copyright holder should have absolute control over every single possible use of their image. That's just not what copyright is for.
You can't, but you can make a cartoon with similar themes as Dark Tower.
What kind of absolute nonsense is this? Are you seriously trying to make the argument that the only time humans ever learn anything is when they are in a facility made for learning? The overwhelming majority of stuff we learn throughout our lives will be done not in a place like school, but just picked up randomly in our day to day lives.
If I go to twitter and look at someone's art, I'm learning from that image. Was there payment and permission from that original creator?
Just because you say this does not make it true. If they agreed to TOS allowing the host site to use the data as they wish then that is what they agreed to.
Yes, that was what I was trying to say: the people saying that their work shouldn't be used in that way (usually work without a copyright) don't have a leg to stand on.
Ah I gotcha, seemed counter to the rest of the comment so I wasn't sure if I was reading it correctly.
It happens! I guess I wasn’t as clear as I thought.
>it’s mostly the fact that they aren’t being asked
What's the point of asking people who are already set to say no because they hate the technology for their own biased reasons?
If they were willing to say yes then there wouldn't be a need for a court to decide whether it's allowed or not.
Copyright laws define what you CAN'T do with someone else's work. Not what you CAN do. Barring those exceptions, everything else is fair game. Otherwise human culture wouldn't work, if everyone had to ask everyone else for permission in order to do anything.
That's why we have laws to adjudicate those things.
The only reason antis "want to be asked for permission" for something that doesn't need permission, is for a chance to deny that permission. It won't happen, though, because no permission is necessary in the first place.
I was actually speaking about the copyright holders of data that is being used to train AI. Plus, I would say there are plenty of artists (who I’ve spoken to in this forum) that would allow their work to be used.
What you’re saying is that because someone says no, you can just not ask them and do it anyway. That is very much a shitty attitude. Rapists think that way.
I'll repeat part of my post because it answers your concerns and you conveniently pretended it wasn't there and did not address it:
Copyright laws define what you CAN'T do with someone else's work. Not what you CAN do. Barring those exceptions, everything else is fair game. Otherwise human culture wouldn't work, if everyone had to ask everyone else for permission in order to do anything.
That's why we have laws to adjudicate those things.
The only reason antis "want to be asked for permission" for something that doesn't need permission, is for a chance to deny that permission. It won't happen, though, because no permission is necessary in the first place.
>muh rapists
Sex requires mutual consent. AI training does not. It's not that hard to understand. It's the same principle: a rape victim might say that consent is required, but the rapist says it is not. That's why we have laws to determine that it IS, in fact, required. Because sometimes people will claim things that are not true because it benefits them. Such as antis claiming that AI training is copyright infringement. Because they WANT it to be. It isn't, though. So we need to have a consensus on those things as a society, via courts of law. And we have decided that AI training is not copyright infringement, no matter how loud antis scream otherwise.
AI training should need consent, though, since all other learning comes from permission and AI companies are making money off it.
>all other learning comes from permission
It literally doesn't. You're not making sense.
I don't think you've thought this through.
Are you saying that if a math teacher decides that a kid is not allowed to learn math, then that kid shouldn't learn math in spite of this?
Just stop, dude.
The dude really IS a child of chimps, or has the IQ of one ...
What learning comes from permission? A young pianist studies Beethoven's sonatas; he doesn't need permission from Beethoven!
If I try to write a space opera inspired by all the Star Trek I have watched, I don't need permission from Paramount. I can't plagiarize, of course. But they have absolutely no legal right to say that I didn't have permission to train myself when consuming that Trek product.
If I remember correctly from the lawsuit breakdown, the judge ruled fair use because the LLM wasn't directly competing with or plagiarising the author's work (that they could prove). That's not necessarily the case with visual works. It's still an automation tool targeted at replacing artists and devaluing their skills. And while it's getting better at not "directly" plagiarising, it is still a concern.
The problem people have (illegal or not) is that tech companies are enriching their products off other people's creative works, without their permission.
So an aspiring fantasy writer buys copies of LOTR and several other famous works in the genre. He reads and studies them, then writes his own fantasy story inspired by them.
Perfectly legal, unless there's outright plagiarism.
And it's just as legal when AI does it, barring plagiarism. A lot of people refuse to accept that, but it doesn't change the legality, as you appear to recognize.
This is a very common comparison; the issue is that AI isn't a person. It's not some guy who really likes LOTR and wants to tell a similar story. It's a technology owned by companies that indirectly uses other people's IP for commercial purposes. But, yes. Under the right context, it can be fair use.
If the guy in my hypo worked for a publishing company, and he studied LOTR etc. because his company wanted to kick off their own fantasy series, it would be fine - absent plagiarism of course. In this variant, GenAI is the employee. Feeding it legally acquired materials and generating a new fantasy story for profit is fine, absent plagiarism.
If it creates a novel about a "Hubbit" that lives in the "Shyre" and is named "Froyo Braggins", then yes, there are plagiarism problems - but that could happen with a human writer too (see most fanfics).
I hear you, but I think there is a fundamental difference between a person working to spec and Writing Bot 9000. That's okay, we're allowed to disagree. I'm glad we see eye to eye on plagiarism being bad.
The AI did plagiarize, though. Do you accept that?
If an AI truly commits plagiarism, then its owners need to be held accountable for that.
I would say the users who made the infringing material and went on to profit from it should be held accountable.
If I ask AI for a picture of Sonic and Sonic comes out, that's like "asking" Photoshop (via clicks) to represent a line here, a line there, a color fill here, a gradient fill there, and Sonic comes out. My use of that final resulting image is what infringes. It's not the tool's fault that I directed it to create that.
In the legal decision in the original post, the court ruled there was no plagiarism - therefore, I don't accept it in this particular case.
What about cases like Suno reproducing famous songs? Or the cases where it happened in all the major image-gen products I know of. Do you think they were fairly held accountable or got off the hook?
Don't know what happened in those cases and don't really care. I reaffirm my previous statement that AI plagiarism is just as intolerable as any other plagiarism; however, training AI on legally acquired materials violates no laws, no matter how many people are offended by the idea.
Uh, but training on legally acquired materials directly led to plagiarizing some of those materials. There are plenty more cases for us to look at.
Seems like something you should care about. Unless you refuse to accept things that contradict your view.
Humans trained on legally acquired materials can and do commit plagiarism; why is there such fear of AI?
The problem, for the original creator, is the plagiarism - does it matter whether the culprit was a fellow human or GenAI?
Focus on the plagiarism and on compensation to the original creator for resulting damages; don't stress over whether the offender was a human being or a GenAI.
Unlike humans, AI art is dominated by products that plagiarized, and they still get defended completely by AI bros.
Apparently it matters to people like you, who just outright say "I don't know or care" about cases of AI plagiarism. You are the one refusing to accept AI plagiarism. I have no problem accepting that humans plagiarized. I don't know how on earth you are getting that I have some weird focus on the culprit.
I accept that AI can absolutely be used for plagiarism; in that case, its owners must be punished. I reject the idea that GenAI is automatically plagiarism.
Offering a tool that can do a variety of things is not direct competition with a specific artist.
This legal question was covered in the famous Betamax case, Sony vs. Universal.
https://www.newyorker.com/magazine/1987/04/06/i-the-betamax-case
An artist making their own art and selling it is not in the same market as a company offering a tool that others can use to make art. The company is not personally generating art to sell and cut in on their specific market.
The person you would pursue for directly competing with the artist is a user of the gen AI who decides to generate art to compete with that artist. Few users will do this, there are many uses for AI that don't necessarily involve this.
Definitely gave me some things to think about, but I'm not 100% on this one. I get the Betamax / staple-article point about tool-makers not being automatically liable for how users behave, but that doesn't really address what artists are worried about here. Even if the company "just" offers a tool, that tool is explicitly marketed and used as a cheaper, faster substitute for hiring artists, so in practice it is competing for the same commissions and budgets, just via a different business model. And unlike a VCR, these systems were built by scraping and learning from artists' work at scale, without consent or compensation, which is a separate issue from whether a specific user later infringes. So the question isn't only "who do we sue for one infringing image," it's also "is it acceptable to build a commercial product's value on other people's creative labour without their permission, then sell that product into the same markets they depend on?"
But the work was used legally, if "used" is even the term for it, since their work is not contained in the final model. Consent and compensation are not required when nothing is taken.
Scraping is already legal, and the end result of a work being examined for a model is only a few bytes of non-infringing information, which the artist certainly has no claim over. Like if I look at someone's drawing of a man and write down "brown hair" in order to add that simple fact of "brown hair" to thousands of entries in a giant database, the artist cannot claim that they have exclusive ownership over the concept of "brown hair."
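Here is a toy version of that "brown hair" point, just to make the aggregation idea concrete. This is an analogy only, not how diffusion models actually store information; the attribute tags are invented for illustration:

```python
# Toy illustration of the "brown hair" analogy: each work contributes a few
# abstract attributes to an aggregate tally, and no individual image can be
# reconstructed from that tally.
from collections import Counter

# Hypothetical per-image attribute tags produced by some labeling step.
observations = [
    {"hair": "brown", "pose": "standing", "lighting": "soft"},
    {"hair": "brown", "pose": "sitting",  "lighting": "harsh"},
    {"hair": "black", "pose": "standing", "lighting": "soft"},
]

aggregate = Counter()
for obs in observations:
    aggregate.update(f"{key}={value}" for key, value in obs.items())

print(aggregate)  # e.g. hair=brown: 2, pose=standing: 2, ...
# The aggregate keeps statistical facts ("brown hair is common"), not the images.
```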
Even if it does fall under fair use, the impact on the market is only one of the four factors, and no one factor is dominant over the others. One of the factors is "the amount and substantiality of the work taken," and that amount is zero. No amount of the work makes it into the final model.
So the training data sets and the AI models are two different animals, and I treat them separately personally. I have to push back on the idea that “nothing is taken”. To train these models, companies do make full copies of artists’ works (at least temporarily), store them, and extract value from them via processing/using/training. All without giving anything back. If they weren’t such leeches, I wouldn’t care as much. For context, training a model on your own work is fine, but I take issue with companies having the auto-opt-in mindset.
While the end AI model doesn’t have a folder called “stolen work lol”, it is retaining data; if it didn’t, there would be no reason to train it. It just does it abstractly. (from what I understand, I'm not an AI engineer). Might add more when I get a spare minute. But I think the issue is nuanced.
Replacing artists isn't illegal or immoral. Replacing difficult, expensive labor is a good thing. Do you own a washing machine? If so, why did you invest in something that could have employed a human laborer instead?
As a human, even 300 years ago, or 50 years ago, or right now, I never needed your "permission" to enrich myself off of vague concepts I learned from merely looking at your artwork but not copying it.
That literally just isn't and has never been a thing that you get to dictate of other people. I never needed your "permission" for that. Not having "permission" that I never needed to begin with is not important.
It can be depending on the context.
Sometimes, that doesn't mean people can't disagree with it. Especially when the alternative only works off the hard work of the people they are trying to replace. A bit exploitative if you ask me.
AI isn't a person, and neither is an AI company. We hold companies and commercial products to different standards; it's a scale issue.
Even simpler then: Inanimate things never needed permission to do ANYTHING.