@julka
I'm proposing the same level of scrutiny!
No, you're not. Everything else places the liability on the user. You are placing the liability on the provider. You are, by analogy, arguing that libraries should not exist because a user could copy the books.
Operating an LLM that returns copyrighted data on request should be illegal!
Then operating a web search service (Google, not a browser) that returns copyrighted data on request should be illegal, no? Yet you're not campaigning against Google, just the new technology. And Google (not the browser) is far better at returning copyrighted data on request than any extant LLM.
Operating a torrent tracker is legal; operating a torrent tracker that serves copyrighted data is illegal!
This is incorrect. One can operate a torrent tracker freely, whether that tracker returns torrents that are copyrighted or not copyrighted. The liability is completely on the user of the torrent, not on the tracker operator.
a library's collection of copyrighted books is, as you've noted, governed by copyright law and one of the rights you get under that law is the right to lend or resell your physical copy of a book.
That's not the argument I'm making. I'm making the argument that the library provides a photocopier, which can be used to copy any of those books (without lending or reselling). By your argument, one should not be allowed to operate a library if, at any time, anyone infringes copyright while reading or borrowing a book. After all, the library 'returned copyrighted data on request,' did it not? And that's your standard above for what should not be allowed, is it not?
Why is it fine for the library to 'return copyrighted data on request', and for Google to 'return copyrighted data on request', but not for an LLM? Particularly because, of the three, the one with the lowest chance of returning copyrighted data is the LLM?
The reason nobody tries to sue a library out of existence is because they are doing something which is explicitly legal
Then what the AIs are doing is also explicitly legal. You can't have it both ways. The AI was legally trained with resources, and it is providing those resources. The library was legally stocked with resources, and it is providing those resources.
digital copies of works do not have the same rights associated with them
So, in theory, it's fine if the AI was trained by chopping up physical books (as some were) but not if it was trained by using electronic copies of the exact same book?
I agree - the current model for electronic books is a scam, and electronic books should have the same rights associated with them. But, regardless of that, your argument seems to have shifted to the exact sourcing of the material used for training, regardless of whether it is the same material or not.
So, let's try that theory. Under that theory, an AI that was trained by 'reading' (scanning, etc) physical books is fine. No legal issues. But an AI that was trained by 'reading' electronic copies of books is potentially infringing and problematic.
And a library which only provides physical books inherits some sort of protection from contributing to copyright infringement, notwithstanding that it is the library which knowingly and intentionally distributed the copyrighted material which was then infringed. But a library which provides ebooks (as many US libraries do) should be illegal and shut down if any of those ebooks is ever copied, in whole or part - even a tiny part! - by so much as a single user of that ebook.
Does that make sense to you? Because it makes no sense to me.
And I'll go back to a point you never responded to. The purpose of copyright, in the United States, is to 'promote the progress of science and useful arts.' However, the entire thesis of your argument is that we should use copyright as a weapon to halt the progress of science and useful arts, lest 'science and useful arts' produce a thing that can (inaccurately) reproduce some small subset of a copyrighted work upon demand.
So, I will repeat: why is copyright a justification for going after LLMs, and not for banning torrent send/receive software (not trackers), VCRs, and other things that are far more likely to be used to violate copyright than LLMs are? The average LLM is used for a far higher percentage of non-infringing uses than the average torrent software is. Why should torrent software not be subject to the same level of scrutiny as LLMs (e.g. liability for the torrent software maker, not just the user)?
Why are you so concerned with the creation and operation of software with a very low probability of meaningful copyright infringement (LLMs) and totally fine with the creation and operation of things associated with far higher rates of copyright infringement (torrent software, VCRs, photocopiers, etc)? Or libraries, for that matter? In none of those cases are you claiming the provider/manufacturer/etc should carry liability, only the user who uses the tool to infringe.