"It's Illegal": Canadian Media Sues OpenAI For 'Scraping Large Swaths Of Content' To Train Chatbot


Original link: https://www.zerohedge.com/political/its-illegal-canadian-media-sues-openai-scraping-large-swaths-content-train-chatbot

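Five Canadian media companies accuse OpenAI of copyright infringement, alleging that it scraped their articles without permission to train ChatGPT. They are seeking 20,000 Canadian dollars in damages for each article used, a share of OpenAI's profits, and a halt to the use of unauthorized content. The lawsuit highlights the ethical questions and legal gray areas surrounding AI training methods. OpenAI has faced similar copyright claims before, including in the suit brought by the New York Times. One expert says that proving copyright infringement could be challenging because of technical difficulties, and stresses the need for regulation that requires transparency about the training data of generative AI.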

Five Canadian media companies are suing OpenAI, alleging that the ChatGPT creator has breached copyright and online terms of use in order to train the popular chatbot.

The joint lawsuit, filed on Friday in the Ontario Superior Court of Justice, follows a similar suit brought against OpenAI and Microsoft in 2023 by the New York Times, which alleged copyright infringement over the use of its news content in AI systems.

The Canadian outlets, which include the Globe and Mail, the Toronto Star and the Canadian Broadcasting Corporation (CBC), are seeking what could amount to billions of dollars in damages: they have demanded 20,000 Canadian dollars (US$14,700) for each article they claim was illegally scraped and used to train ChatGPT.

"OpenAI is capitalizing and profiting from the use of this content, without getting permission or compensating content owners," the group said in a Friday statement, adding that they're responsible for the "bulk of Canada’s journalistic content."

The plaintiffs are also seeking a share of OpenAI's profits, as well as a halt to the use of future content.

"OpenAI regularly breaches copyright and online terms of use by scraping large swaths of content from Canadian media to help develop its products, such as ChatGPT," the group said in a statement.

"OpenAI’s public statements that it is somehow fair or in the public interest for them to use other companies’ intellectual property for their own commercial gain is wrong," they added. "Journalism is in the public interest. OpenAI using other companies’ journalism for their own commercial gain is not. It’s illegal."

The lawsuit claims that OpenAI circumvented specific technological and legal tools, such as the Robot Exclusion Protocol, copyright disclaimers and paywalls, which exist in part to prevent scraping and other unauthorized uses of the outlets' published content.
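
For readers unfamiliar with the Robot Exclusion Protocol, the sketch below shows how a well-behaved crawler would consult a site's robots.txt file before fetching a page. It is only an illustration: the robots.txt contents, the example.com URL and the `is_allowed` helper are hypothetical and are not taken from the lawsuit or from any plaintiff's site. `GPTBot` is the user-agent name OpenAI has published for its web crawler. The protocol is advisory, so the check only has an effect if the crawler chooses to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imagined news site (an assumption for
# illustration, not any plaintiff's actual file). It blocks the crawler
# named GPTBot and allows every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/news/story"  # hypothetical URL
    print("GPTBot allowed:", is_allowed("GPTBot", story))
    print("SomeOtherBot allowed:", is_allowed("SomeOtherBot", story))
```

Nothing in the protocol technically stops a crawler that skips this check, which is why publishers typically pair it with paywalls and terms of use.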

In the NYT vs. OpenAI and Microsoft case, which is currently in discovery, the Times claims the company similarly broke laws to train ChatGPT, as well as provide search results.

The Canadian case has a narrower focus: it concerns scraping data for training, not search results, and it does not name Microsoft.

"We believe we have a strong case related to the training of the models. The training of the models is the core of the problem," said Sana Halwani, a partner at the Canadian law firm Lenczner Slaght, which represents the media organizations in the lawsuit, in a statement to the NYT.

The Canadian publishers could find some of their claims easier to prove than others, with copyright infringement being the toughest, according to Lisa Macklem, a lecturer at King’s University College at Western University in Ontario, who is an expert in copyright and media law. -NYT

"While it seems obvious that OpenAI is infringing copyright, it is technically very difficult to prove, and this underscores the immediate and pressing need to have regulations put in place, demanding, at the very least, transparency on what is in the training data of generative AI," said Macklem.

OpenAI's problems don't end there. Over the summer, Elon Musk, who co-founded OpenAI in 2015 but left in 2018 under bad circumstances, sued OpenAI, claiming that two of its founders, Sam Altman and Greg Brockman, breached the company's founding contract by putting commercial interests ahead of the public good.
