Reddit Takes on Anthropic in AI Data Battle

Reddit has filed a lawsuit against Anthropic, an AI startup, for allegedly using the platform's data to train AI models without a proper licensing agreement. The complaint, filed in a Northern California court, accuses Anthropic of unlawfully exploiting Reddit's data for commercial gain in violation of the platform's user agreement. It is Reddit's first legal challenge against an AI model provider over training data practices, and it adds Reddit to a growing list of publishers and content creators who have taken similar legal action against tech companies.

The New York Times, for instance, has sued OpenAI and Microsoft for training their AI models on its news articles without compensation or permission. Similarly, authors like Sarah Silverman have sued Meta for training AI models on their books without approval. Music publishers and artists have also filed claims against AI startups in the audio, video, and image generation sectors, alleging misuse of their content.

Ben Lee, Reddit's chief legal officer, stated, "We will not tolerate profit-seeking entities like Anthropic commercially exploiting Reddit content for billions of dollars without any return for redditors or respect for their privacy." Reddit has entered into agreements with other AI model providers, such as OpenAI and Google, allowing them to train AI models on Reddit's data and display the platform's posts in their chatbots' responses. Those agreements include terms that protect Reddit users' interests and privacy.

Sam Altman, OpenAI's CEO, holds an 8.7% stake in Reddit, making him the third-largest shareholder, and has previously served on the company's board of directors. According to the filing, Reddit informed Anthropic that it did not have authorization to scrape or use Reddit's content. Despite this, Anthropic allegedly "refused to engage."

Anthropic spokesperson Danielle Ghighlieri said in an emailed statement, "We disagree with Reddit's claims and will defend ourselves vigorously." Reddit's complaint alleges that Anthropic's scraper bots disregarded the platform's robots.txt files, which tell automated crawlers which parts of a website they should not access. According to the complaint, even after Anthropic said in 2024 that it had blocked its bots from scraping Reddit, the bots continued to access the platform more than 100,000 times.
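For readers unfamiliar with robots.txt, the sketch below shows roughly how a well-behaved crawler might consult those rules before fetching a page, using Python's standard `urllib.robotparser` module. The rules, the `ExampleBot` and `OtherBot` names, and the `example.com` URLs are all hypothetical and are not taken from Reddit's actual robots.txt or from the complaint. Note that robots.txt is advisory: it only has an effect if the crawler chooses to perform a check like this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # ExampleBot is barred from the whole site; other bots only from /private/.
    print(is_allowed("ExampleBot", "https://example.com/r/python"))
    print(is_allowed("OtherBot", "https://example.com/r/python"))
    print(is_allowed("OtherBot", "https://example.com/private/page"))
```

A crawler that honors the file would skip any URL for which `is_allowed` returns `False`; one that ignores it can still issue the request, which is the behavior the complaint alleges.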

Reddit is seeking compensatory damages from Anthropic, as well as restitution for the enrichment gained by scraping Reddit's content. The platform also requests an injunction to prevent Anthropic from further using Reddit's content. This legal battle highlights the ongoing debate over data usage and privacy in the AI industry, with major tech companies increasingly facing scrutiny for their data practices.
