Anthropic wins key US ruling on AI training in authors’ copyright lawsuit, but should only have used legally bought books.

A federal judge in San Francisco ruled late on Monday that Anthropic’s use of books without permission to train its artificial intelligence system was legal under U.S. copyright law.
Siding with tech companies on a pivotal question for the AI industry, U.S. District Judge William Alsup said Anthropic made “fair use” of books by writers Andrea Bartz, Charles Graeber and Kirk Wallace Johnson to train its Claude large language model.
Alsup also said, however, that Anthropic’s copying and storage of more than 7 million pirated books in a “central library” infringed the authors’ copyrights and was not fair use. The judge has ordered a trial in December to determine how much Anthropic owes for the infringement.
U.S. copyright law says that willful copyright infringement can justify statutory damages of up to $150,000 per work.
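To get a feel for the scale of the December trial, here is a back-of-envelope sketch of the theoretical exposure under the statutory-damages ranges in 17 U.S.C. § 504(c). The 7 million works and the $150,000 willful cap come from the article; the $750 per-work minimum and the $30,000 ordinary cap are the statute's other bounds. This is an upper-bound illustration only: actual awards depend on how many works are registered, proven infringed, and found willful, and courts rarely award the maximum across the board.

```python
WORKS = 7_000_000  # pirated books cited in the ruling


def exposure(per_work: int, works: int = WORKS) -> int:
    """Total statutory damages if every work drew the same per-work award."""
    return per_work * works


print(f"statutory minimum ($750/work): ${exposure(750):,}")
print(f"ordinary cap ($30,000/work):   ${exposure(30_000):,}")
print(f"willful cap ($150,000/work):   ${exposure(150_000):,}")
# → even the statutory minimum across all 7M works exceeds $5 billion
```

The spread between the floor and the willful ceiling (roughly $5.25 billion to $1.05 trillion on these hypothetical numbers) is why the December damages trial matters as much as the fair-use ruling itself.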
An Anthropic spokesperson said the company was pleased that the court recognized its AI training was “transformative” and “consistent with copyright’s purpose in enabling creativity and fostering scientific progress.”
The writers filed the proposed class action against Anthropic last year, arguing that the company, which is backed by Amazon (AMZN.O) and Alphabet (GOOGL.O), used pirated versions of their books without permission or compensation to teach Claude to respond to human prompts.
The proposed class action is one of several lawsuits brought by authors, news outlets and other copyright owners against companies including OpenAI, Microsoft (MSFT.O) and Meta Platforms (META.O) over their AI training.
The doctrine of fair use allows the use of copyrighted works without the copyright owner’s permission in some circumstances.
Fair use is a key legal defense for the tech companies, and Alsup’s decision is the first to address it in the context of generative AI.
AI companies argue their systems make fair use of copyrighted material to create new, transformative content, and that being forced to pay copyright holders for their work could hamstring the burgeoning AI industry.
Anthropic told the court that it made fair use of the books and that U.S. copyright law “not only allows, but encourages” its AI training because it promotes human creativity. The company said its system copied the books to “study Plaintiffs’ writing, extract uncopyrightable information from it, and use what it learned to create revolutionary technology.”
Copyright owners say that AI companies are unlawfully copying their work to generate competing content that threatens their livelihoods.
Alsup agreed with Anthropic on Monday that its training was “exceedingly transformative.”
“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,” Alsup said.
Alsup also said, however, that Anthropic violated the authors’ rights by saving pirated copies of their books as part of a “central library of all the books in the world” that would not necessarily be used for AI training.
Anthropic and other prominent AI companies including OpenAI and Meta Platforms have been accused of downloading pirated digital copies of millions of books to train their systems.
Anthropic had told Alsup in a court filing that the source of its books was irrelevant to fair use.
“This order doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use,” Alsup said on Monday.

Source: Anthropic wins key US ruling on AI training in authors’ copyright lawsuit | Reuters

This makes sense to me. The training itself is much like a person reading a book and using it as inspiration; it does not copy the book. And any reader should have bought (or borrowed) the book. Why Anthropic apparently used pirated copies, and why it kept a separate library of the books, is beyond me.

Judge Denies Creating ‘Mass Surveillance Program’ Harming All ChatGPT Users, after ordering that all chats (including “deleted” ones) be kept indefinitely

An anonymous reader quotes a report from Ars Technica: After a court ordered OpenAI to “indefinitely” retain all ChatGPT logs, including deleted chats, of millions of users, two panicked users tried and failed to intervene. The order sought to preserve potential evidence in a copyright infringement lawsuit raised by news organizations. In May, Judge Ona Wang, who drafted the order, rejected the first user’s request (PDF) on behalf of his company simply because the company should have hired a lawyer to draft the filing. But more recently, Wang rejected (PDF) a second claim from another ChatGPT user, and that order went into greater detail, revealing how the judge is considering opposition to the order ahead of oral arguments this week, which were urgently requested by OpenAI.

The second request (PDF) to intervene came from a ChatGPT user named Aidan Hunt, who said that he uses ChatGPT “from time to time,” occasionally sending OpenAI “highly sensitive personal and commercial information in the course of using the service.” In his filing, Hunt alleged that Wang’s preservation order created a “nationwide mass surveillance program” affecting and potentially harming “all ChatGPT users,” who received no warning that their deleted and anonymous chats were suddenly being retained. He warned that the order limiting retention to just ChatGPT outputs carried the same risks as including user inputs, since outputs “inherently reveal, and often explicitly restate, the input questions or topics input.”

Hunt claimed that he only learned that ChatGPT was retaining this information — despite policies specifying they would not — by stumbling upon the news in an online forum. Feeling that his Fourth Amendment and due process rights were being infringed, Hunt sought to influence the court’s decision and proposed a motion to vacate the order that said Wang’s “order effectively requires Defendants to implement a mass surveillance program affecting all ChatGPT users.” […] OpenAI will have a chance to defend panicked users on June 26, when Wang hears oral arguments over the ChatGPT maker’s concerns about the preservation order. In his filing, Hunt explained that among his worst fears is that the order will not be blocked and that chat data will be disclosed to news plaintiffs who may be motivated to publicly disseminate the deleted chats. That could happen if news organizations find evidence of deleted chats they say are likely to contain user attempts to generate full news articles.

Wang suggested that there is no risk at this time since no chat data has yet been disclosed to the news organizations. That could mean that ChatGPT users may have better luck intervening after chat data is shared, should OpenAI’s fight to block the order this week fail. But that’s likely no comfort to users like Hunt, who worry that OpenAI merely retaining the data — even if it’s never shared with news organizations — could cause severe and irreparable harms. Some users appear to be questioning how hard OpenAI will fight. In particular, Hunt is worried that OpenAI may not prioritize defending users’ privacy if other concerns — like “financial costs of the case, desire for a quick resolution, and avoiding reputational damage” — are deemed more important, his filing said.

Source: Judge Denies Creating ‘Mass Surveillance Program’ Harming All ChatGPT Users

NB: you would be pretty dense to think that anything you put into an externally hosted GPT would not be kept and used by that company for AI training and other analysis. So it’s not surprising that this data could be (and will be) requisitioned by other corporations and, of course, governments.

Scientists use bacteria to turn plastic waste into paracetamol

Bacteria can be used to turn plastic waste into painkillers, researchers have found, opening up the possibility of a more sustainable process for producing the drugs.

Chemists have discovered E coli can be used to create paracetamol, also known as acetaminophen, from a material produced in the laboratory from plastic bottles.

“People don’t realise that paracetamol comes from oil currently,” said Prof Stephen Wallace, the lead author of the research from the University of Edinburgh. “What this technology shows is that by merging chemistry and biology in this way for the first time, we can make paracetamol more sustainably and clean up plastic waste from the environment at the same time.”

Writing in the journal Nature Chemistry, Wallace and colleagues report how they discovered that a type of chemical reaction called a Lossen rearrangement, a process that has never been seen in nature, was biocompatible. In other words, it could be carried out in the presence of living cells without harming them.

The team made their discovery when they took polyethylene terephthalate (PET) – a type of plastic often found in food packaging and bottles – and, using sustainable chemical methods, converted it into a new material.

When the researchers incubated this material with a harmless strain of E coli, they found it was converted into another substance, known as PABA (para-aminobenzoic acid), in a process that must have involved a Lossen rearrangement.

Crucially, while the Lossen rearrangement typically involves harsh laboratory conditions, it occurred spontaneously in the presence of the E coli, with the researchers discovering it was catalysed by phosphate within the cells themselves.

The team add that PABA is an essential substance that bacteria need for growth, in particular for the synthesis of DNA, and is usually made within the cell from other substances. However, the E coli used in the experiments was genetically modified to block these pathways, meaning the bacteria had to use the PET-based material.

The researchers say the results are exciting as they suggest plastic waste can be converted into biological material.

“It is a way to just completely hoover up plastic waste,” said Wallace.

The researchers then genetically modified the E coli further, inserting two genes – one from mushrooms and one from soil bacteria – that enabled the bacteria to convert PABA into paracetamol.

The team say that by using this form of E coli they were able to turn the PET-based starting material into paracetamol in under 24 hours, with low emissions and a yield of up to 92%.

While further work would be needed to produce paracetamol in this way at commercial levels, the results could have a practical application.

“It enables, for the first time, a pathway from plastic waste to paracetamol, which is not possible using biology alone, and it’s not possible using chemistry alone,” Wallace said.

Source: Scientists use bacteria to turn plastic waste into paracetamol | Drugs | The Guardian