
In light of Anthropic's recent $1.5 billion copyright settlement, the artificial intelligence sector is grappling with how it uses training data. Around 40 additional lawsuits are pending over unlicensed data usage, including one targeting Midjourney for producing images of Superman, and the need for a robust licensing framework has never been more urgent. Without such a system, AI companies could face a wave of copyright lawsuits that hinders the industry's growth for years to come.

To address the problem, a coalition of technologists and web publishers has unveiled a new initiative named Really Simple Licensing (RSL). The project aims to enable large-scale data licensing, provided that AI firms choose to participate. Major platforms such as Reddit, Quora, and Yahoo have already expressed their support for RSL. The critical question remains: will this momentum persuade leading AI labs to negotiate?

Eckart Walther, co-founder of RSL and co-creator of the RSS standard, emphasized the need for machine-readable licensing agreements on the internet. "That’s really what RSL solves," he stated in an interview. The initiative marks a significant step toward a technical and legal infrastructure for data licensing, a goal that organizations like the Dataset Providers Alliance have pursued for years.

Technically, the RSL Protocol defines specific licensing options that content publishers can set, allowing AI companies to either negotiate custom licenses or adhere to Creative Commons terms. By adding these terms to their "robots.txt" files in a standardized format, participating websites can clearly communicate the licensing conditions attached to their data.

On the legal front, the RSL Collective has been formed to negotiate terms and collect royalties, much like ASCAP does for music or MPLC for film.
This centralized approach aims to simplify the process for licensors, letting them define terms with many potential licensees at once. A variety of well-known web publishers, including Yahoo, Reddit, Medium, and others, have already joined the collective, while some, like Fastly and Quora, have shown support without direct membership.

Interestingly, the RSL Collective includes publishers with existing licensing agreements, such as Reddit, which reportedly earns around $60 million annually from Google for the use of its data in training. Companies can still negotiate individual deals within the RSL framework, but the collective may be the only avenue for smaller publishers who lack the clout to secure their own agreements.

Determining when royalties are owed for a specific piece of training data, however, presents unique challenges. Some cases are tractable: Google's AI Search Abstracts, for instance, pulls data from the web in real time and keeps attribution for each fact. Conversely, without proper logging during the training process, verifying that a specific document was incorporated into a large language model (LLM) can be exceedingly difficult.

Despite these complexities, the creators of RSL are optimistic that AI companies can adapt. Doug Leeds, a co-founder of RSL and a former CEO of IAC Publishing, remarked, "Some of the licensing agreements they’ve already done have required them to be able to report on it, so it’s possible. It doesn’t have to be perfect. It just has to be good enough to get people paid."

The larger question looms: will AI companies be willing to adopt this new system? While firms like Scale AI and Mercor show that companies are already paying for quality data, the web has often been viewed as a repository of inexpensive, low-quality information. With freely available datasets like Common Crawl, convincing companies to pay royalties could prove challenging.
Recent disputes, such as the one between Cloudflare and Perplexity, highlight how hard it can be to distinguish web scraping from machine-enhanced browsing.

Leeds pointed to statements made by prominent AI figures, including Sundar Pichai, advocating for a licensing framework like RSL. Whether these calls are genuine remains to be seen, but the RSL team is determined to follow through on the momentum it has created. "They have said outwardly to everyone, something like this needs to exist," Leeds stated. "We need a protocol. We need a system." With RSL now in play, the industry may be on the cusp of significant change.
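
For readers curious what the machine-readable piece could look like in practice, here is a minimal Python sketch of how a crawler might look for licensing terms in a site's robots.txt. It is an illustration only, not the RSL reference implementation: the `License:` directive name, the example URL, and the helper `find_license_urls` are assumptions made for this sketch, and the published RSL specification should be treated as the authority on the actual format.

```python
def find_license_urls(robots_txt: str) -> list[str]:
    """Return the values of any `License:` lines found in robots.txt text.

    Assumption: licensing terms are advertised with a `License: <url>` line.
    The directive name is illustrative; check the RSL specification.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        # (Simplification: a URL containing "#" would be truncated here.)
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        # Split only on the first colon so "https://..." stays intact.
        field, value = line.split(":", 1)
        if field.strip().lower() == "license" and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content for a participating publisher.
SAMPLE = """\
User-agent: *
Allow: /
# pointer to machine-readable licensing terms (hypothetical URL)
License: https://example.com/license.xml
"""

if __name__ == "__main__":
    print(find_license_urls(SAMPLE))
```

A crawler following this approach would fetch the referenced file and read the terms before deciding whether, and on what conditions, to use the site's content. Nothing in robots.txt enforces that step, which is why participation by AI firms remains the open question.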