Generative AI Training Battles Lost And Won: Bartz V. Anthropic
If you haven’t caught this bit of news, Anthropic, the company behind the AI chatbot Claude, has landed in hot water over what is and isn’t okay to use to train its AI.

We thought we’d break it down a little for you, as it’s particularly important to the small independent book publishers and authors that we love!
To sum it up: in this most recent lawsuit over the use of books to train generative AI, U.S. District Judge William Alsup of San Francisco determined that it was lawful to use books to train AI models BUT ONLY IF the books were acquired lawfully. During the proceedings, it came to light that this massive AI company had bypassed legal channels and instead downloaded millions of books from pirate websites such as Library Genesis and PiLiMi. Because those books were NOT acquired lawfully, the court determined that their use does NOT qualify as fair use and Anthropic MUST make restitution. Anthropic is now attempting to settle this class-action copyright lawsuit to the tune of 1.5 billion U.S. dollars; the court will decide whether that amount is acceptable in the coming days and possibly months.
To add another important detail, as part of the settlement, Anthropic will also be required to destroy all of the books that they originally downloaded, meaning this settlement is not a payment to allow them to use the material, but a settlement for damages. For authors, this means that if they don’t wish to share their book with an AI company for the purpose of training their chatbots, they don’t have to (or they can set the price).
So, why is this such a big deal? We wanted to give our two cents: AI companies like Anthropic are billion-dollar enterprises that chew through data at a staggering pace. They need to ingest thousands upon thousands of books to acquire all that data. The thing is, the writers of those books are not billionaires. Most of our clients are small publishers or authors just making an honest living sharing their creative talents with the world. When that talent is used to feed a billion-dollar enterprise, it may not seem like a big deal to some people; after all, if AI is supposed to know everything, it needs to read everything, right? And if the chatbot is just “reading” the book, how is that different from a person reading the book?
Well, an individual person reading a book (one they bought or borrowed from a legitimate library) gains that information personally. Generative AI “reads” a book in order to remix and regurgitate its contents back out to potentially millions of people for a profit. From a business standpoint, of course Anthropic doesn’t want to absorb the time and expense of reaching out to each individual author for permission, but there are certainly better ways of acquiring the rights to those books legally. Publishers and authors are pushing for contract language that lets authors opt in to, or out of, the training of LLMs. And models already exist to legally license content for training generative AI.
Check out the new collective license for AI training from the Copyright Clearance Center (CCC) as one such example.
If you are an author and are concerned that your written works may have been used by Anthropic, you can check out more info about the settlement and submit your contact info here. The Authors Guild also has some great information about this copyright settlement, along with Frequently Asked Questions about the process, particularly focused on giving authors the info they need to decide whether they should be part of the class-action lawsuit.
We will do our best to keep you posted on the results of this settlement, as well as on the more than 40 other lawsuits winding through the U.S. court system that also revolve around the use of book content in AI training. Until then, Authors: 1, Generative AI: 0!