Meta Sued for Pirating Porn to Train AI Models

Futurism

A new lawsuit filed in California federal court by adult film companies Strike 3 Holdings and Counterlife Media alleges that Meta pirated nearly 2,400 copyrighted adult films to train its artificial intelligence models, including Meta Movie Gen and its large language model, LLaMA. The lawsuit, first flagged by TorrentFreak, claims that Meta began downloading and seeding this content via BitTorrent as early as 2018.

The plaintiffs assert that their infringement-analysis and IP-tracking tools identified 47 IP addresses associated with Meta, including the residential IP address of a Meta employee, as being involved in downloading their copyrighted content. They also noted “non-human patterns” in the data movement, suggesting the content was acquired as AI training data. Strike 3 Holdings and Counterlife Media are seeking statutory damages of up to $150,000 per infringed work, which could amount to roughly $359 million across all 2,396 titles. The suit also asks the court to order the deletion of any pirated copies and to issue an injunction permanently barring Meta from torrenting the companies' works.
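For context on the headline figure, the $359 million number is simply the per-work statutory maximum multiplied across every title named in the complaint. A minimal sketch of that arithmetic, using the figures reported above (not an estimate of any actual award):

```python
# Back-of-the-envelope check of the damages ceiling cited in the complaint.
# Figures come from the article; courts may award far less (or nothing).
works = 2_396            # copyrighted films allegedly infringed
per_work_max = 150_000   # maximum statutory damages per work for willful infringement

ceiling = works * per_work_max
print(f"${ceiling:,}")   # $359,400,000 -- the roughly $359 million figure cited
```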

This lawsuit adds to a growing number of legal challenges Meta faces over its AI training data. In 2023, authors including Sarah Silverman filed a class-action lawsuit against Meta alleging that the company used pirated books from “shadow libraries” such as LibGen to train its LLaMA models. Internal Meta communications cited in court filings suggest that CEO Mark Zuckerberg approved the use of the LibGen dataset despite warnings within the company that it was known to be pirated. The same documents indicate that Meta employees discussed the risks and benefits of using copyrighted content, and even ways to conceal how the company acquired its training data.

Meta has generally defended its use of copyrighted material for AI training under the “fair use” doctrine, arguing that its models do not redistribute the original works in a way that harms copyright holders. However, this defense is being rigorously tested in courts, with some rulings, such as in the Thomson Reuters v. ROSS Intelligence case, indicating that depriving a copyright owner of the ability to license their work as AI training data could undermine a fair use defense.

The broader landscape of AI development is currently a battleground for intellectual property rights. Over thirty lawsuits have been filed against AI companies in U.S. federal courts by copyright owners, alleging unauthorized use of their works to develop AI models. These cases involve various forms of content, including text, images, and video, and raise fundamental questions about compensation for creators, the scope of fair use in the context of AI, and the transparency of AI training data sources. A ruling against Meta in these cases could significantly impact how all AI companies train their models, potentially leading to increased licensing requirements, higher costs, and stricter regulations for the AI industry.