Meta pirated at least 101 of my books, and others

Original link: https://garymarcus.substack.com/p/meta-pirated-at-least-101-of-my-books

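Gary Marcus expresses strong outrage over Alex Reisner's article in The Atlantic, which exposes Meta's alleged blatant copyright infringement. According to the report, Meta trained its Llama 3 AI model on the Libgen database, which is well known to contain pirated material. Marcus found that more than 100 of his books and articles had been used without permission. He is not alone; other authors have reported similar findings. The article suggests that Meta knew about the piracy but proceeded anyway, deeming licensing "unreasonably expensive" and "incredibly slow." Internal communications show that the company wanted to avoid licensing so that it could maintain a "fair use strategy" as a legal defense. Marcus suggests that Mark Zuckerberg may have approved this mass infringement. He is angry that his intellectual property was used without permission by AI systems prone to regurgitation. He also wonders whether other AI companies have done the same.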

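Gary Marcus claims that Meta infringed the copyright on at least 101 of his books, saying they were used to train AI models. dfedbeef points out the hefty fines that willful infringement can carry in the United States. CamperBob2, however, argues that if training AI constitutes fair use, authors like Marcus stand to lose more than they gain, which could harm the accessibility of future AI models. CamperBob2 advocates focusing on free access to the resulting models and suggests that Meta is not the right target, since Meta has already released its model weights. The discussion centers on the complex legal and ethical implications of training AI on copyrighted material, weighing the potential economic interests of copyright holders against the broader effects on AI accessibility and development.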

Original text

Alex Reisner’s shocking new article at The Atlantic on Meta’s unbelievably brazen piracy operations is must reading:

The article is mostly about a database called Libgen, known to contain pirated material.

Meta chose to use it, apparently knowing that it contained pirated material, to train the LLM Llama 3.

Thanks to a search tool provided by the Atlantic, I discovered tonight that in this one database alone Meta pirated over 100 of my books and scientific articles.

In every single case, they used them without my permission.

§

Many other authors are discovering the same thing; social media is full of reports like mine. (If you are an author, you can check here. If you ever published under different names, such as with and without a middle initial, you will want to check them all.)

§

The most damning thing? It appears that Meta knew exactly what they were doing, and chose to proceed anyway. One of the wealthiest companies in history decided to rip off authors in cold blood, out of pure greed, per reporting at The Atlantic:

Meta employees spoke with multiple companies about licensing books and research papers, but they weren’t thrilled with their options. This “seems unreasonably expensive,” wrote one research scientist on an internal company chat, in reference to one potential deal, according to court records. A Llama-team senior manager added that this would also be an “incredibly slow” process: “They take like 4+ weeks to deliver data.” In a message found in another legal filing, a director of engineering noted another downside to this approach: “The problem is that people don’t realize that if we license one single book, we won’t be able to lean into fair use strategy,” a reference to a possible legal defense for using copyrighted books to train AI.

§

It would appear that Mark Zuckerberg himself, the third richest person in the world, worth over 200 billion dollars, signed off on this mass theft of intellectual property.

No word yet, so far as I know, on whether other rapacious AI companies did the same.


Gary Marcus is pissed. He worked hard on every one of those books and articles and does not appreciate them being used without permission by AI systems with an established tendency towards regurgitation.
