A leading news media association is accusing AI technology firms of using news content to train their chatbots without permission.
The News Media Alliance, which represents almost 2,000 media outlets in the U.S., released a study on Tuesday showing that creators of generative artificial intelligence systems, such as OpenAI and Google, “have copied and utilized news, magazine, and digital media content to train” their chatbots. The study also showed that AI firms have taught their chatbots to place greater trust in information from those reputable publishers than in content from elsewhere on the web.
“The research and analysis we’ve done reveals that AI firms and developers are not only copying our members’ content without authorization to train their products, but they are also using it extensively and more than other sources,” said Danielle Coffey, chief executive of the News Media Alliance, in a statement.
“This shows they acknowledge our distinct value, but most of these developers are not getting proper permissions through licensing deals or paying publishers for the use of this content,” Coffey added. “This reduction of high-quality, human-made content hurts not only publishers but also the durability of AI models themselves and the availability of reliable, trustworthy information.”
In the published white paper, the association also dismissed arguments that AI chatbots have simply “learned” facts by reading various sets of data, as a human being would. The association said “it is incorrect” to make such a claim “because models keep the expressions of facts that are in works in their copied training materials (and which copyright protects) without ever grasping any underlying concepts.”
Publishers, many of whom have been in a kind of Cold War with AI firms, have in recent months begun taking preventive measures to defend their content. In August, a Reliable Sources review found that a dozen major media firms had added code to their websites to block AI bots that scrape the web for information. Many more have done so since. But those preventive measures only shield news organizations from future scraping. They do nothing to address the previous scraping of their reporting, which the News Media Alliance — and others — says was used to train AI chatbots.
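The “code” referenced here is, in practice, usually a set of directives in a site’s robots.txt file, which tells crawlers which parts of a site they may access. Below is a minimal sketch of what such an opt-out might look like; GPTBot (OpenAI), Google-Extended (Google), and CCBot (Common Crawl) are publicly documented crawler tokens, but any given publisher’s actual file and list of blocked agents will differ:

```
# robots.txt — illustrative opt-out from known AI training crawlers.
# These user-agent tokens are publicly documented; a real site's
# policy may block more, fewer, or different crawlers.

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

As the article notes, this approach is forward-looking only: robots.txt is an advisory convention that compliant crawlers honor going forward, and it has no effect on content that was already collected.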
To address that problem, the News Media Alliance issued recommendations intended to keep news publishers from being squeezed out in this new landscape. The recommendations include policymakers acknowledging that the unauthorized use of copyrighted material to train AI bots “is infringing” and that publishers should be able to “license the use of their content effectively and on fair terms.”
“Our culture, our economy, and our democracy need a solution that allows the news and media industry to grow and thrive, and both to share in the profit from and participate in the development of the GAI revolution that is being built on the fruits of its labor,” the News Media Alliance said.