A Clue As To Why AI Is So Dumb

Original link: https://www.zerohedge.com/technology/clue-why-ai-so-dumb

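The New York Times recently filed a lawsuit against tech giants OpenAI and Microsoft, alleging that they infringed its copyright by using its articles to train ChatGPT's language models. The author argues that this explains why AI has such a poor grasp of deeper literature: ChatGPT was trained in large part on publications such as The New York Times. The software looks impressive when it generates new material, but on closer inspection it lacks the complexity and nuance of a real human writer. The effect is most visible on political and sociological topics, where the chatbot's seemingly intelligent answers merely regurgitate well-known conventional opinion rather than offer an original view. ChatGPT does well at surface-level tasks such as writing lyrics, producing artwork, or drafting short articles, but the lack of any real depth makes its attempts to imitate genuine thought laughable. AI offers undeniable advantages in many fields, yet the present situation is a clear reminder of the limitations inherent in this cutting-edge technology. The debate over how copyright law applies to AI also poses serious challenges. Fair use exceptions exist, for example in academic research, but the term has no strict definition, so it is unclear how far such exceptions extend. Copyright also weighs heavily on libraries, archives, research institutions, and educational institutions around the world, often at real cost to readers. The current structure of intellectual property law originated as a form of censorship in England under Queen Anne, and its modern versions usually serve powerful publishing groups rather than creators themselves. Given these problems, it remains uncertain whether copyright is an acceptable means of protecting proprietary material, especially as its term keeps lengthening, which may restrict the free spread of knowledge and limit access to large bodies of archived material. Ultimately, one must ask whether the perceived benefits of longer copyright terms come at the expense of our collective capacity for the critical analysis and interpretation that define contemporary society.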

Original text

Authored by Jeffrey Tucker via The Epoch Times,

The New York Times has filed a major lawsuit against OpenAI and Microsoft for copyright infringement. The paper claims that these companies have been scraping NY Times content to train ChatGPT and other features of artificial intelligence software. They cite real injury here: People are using AI tools for information rather than subscribing to The NY Times, and therefore The NY Times is losing advertiser revenue.

My first reaction is: That explains so much!

In particular, it shows why on any topic regarding politics, news, public health, climate change, or anything even mildly controversial, ChatGPT comes across so stupidly conventional and ignorant of deeper literature. It is like reading The New York Times precisely because the AI engine is using The New York Times as its trainer! That truly does account for the core of the problem.

True, there are thousands of fun things you can do with AI. You can debug software. It can compose nice music and paint pretty pictures. It can slice and dice videos with nice results for TikTok. It can write an instant poem or lyrics of anything. It can instantly bang out an article on any topic. In every case, the results are delightful and very impressive.

And yet in every case, the results are obviously generated by a machine. Once you learn to recognize the telltale signs, it is unmistakable. And then the whole experience becomes boring and unimpressive.

People ask me whether I, as a writer, feel threatened by machine learning and this instant prose generator.

For me, it is quite the opposite. Good writing and good thought come from a spark that only the human mind can generate. No matter how sophisticated AI gets, it can never reproduce this. In fact, I find it hilariously amusing how bad this software really is.

For example, just now, I asked AI to compose an essay of 350 words on AI and copyright in the style of Jeffrey Tucker. It generated some of the most mind-numbing blather I’ve read in years, saying almost nothing of any significance but saying it in clean English prose that has the feel of authenticity while being barren of any of the reality.

The final paragraph of the result: “Ultimately, the intersection of AI and copyright necessitates thoughtful reflection, interdisciplinary collaboration, and an adaptive legal framework. As technological progress propels us into uncharted terrain, striking the right chord between attributing human agency and embracing the transformative power of AI holds the key to a harmonious coexistence in the realm of digital creativity.”

Eye roll! If I read that anywhere, my spidey sense would be immediately triggered that the author is just making stuff up. More precisely, it is not making stuff up but merely regurgitating known forms of conventional prose in a way that mimics thought but without the slightest spark of any creativity, much less depth of meaning. In other words, AI writes like a highly precocious 5-year-old, capable of astonishing feats of imitation but utterly incapable of actual intelligence. It’s like a sophisticated parrot: seeming to speak English but not really doing so. It’s great for parties but not much else.

Consider the copyright case alone. The New York Times claims to own its words and sentences and is furious that ChatGPT takes them verbatim, allowing people to gain access to ideas without having paid for them. If this is true, The NYT should have a major beef with the whole of corporate media and academia too, since it long ago set out to be the standard-bearer of approved thought and conventional wisdom. AI is merely amplifying.

I have no clue how the courts are going to come down on this question. Regardless, the implications of this case are rather broad. OpenAI and Microsoft admit that they have been using The NY Times for their services but say that this constitutes fair use under the law.

Truth is that the phrase “fair use” does not have a rigorously strict definition. It is what the courts say it is. It’s an exception to the rules concerning copyright that bows to the reality that information is not containable like real property. Without fair use, we would live in a preposterous world in which everyone would be required to forget what he learned by reading anything. So maybe it is fair use and maybe it is not.

A larger problem is the institution of copyright itself. Today it is based on the intuition that a creator should own his work. It did not start out that way, however. That was the whole point of the original Statute of Anne (1709). It amounted to a royal grant of monopoly privilege for publishers and authors, and it was deployed mostly for purposes of censoring dissident political and religious opinions. It also set off centuries of litigation in the commonwealth countries and in the United States.

The practical import of copyright today has very little to do with authors’ rights and mainly centers on the rights of publishers to retain exclusive printing and distribution rights to works. Over the years, the term has been extended, from 28 years to 70 years after the lifetime of the author.

That’s how long publishers retain rights. In the old days, publishers would let books go out of print and the rights would revert to the author. No more. Now publishers keep catalogs for the whole term, resulting in an odd situation in which the author loses all intellectual rights and only his grandchildren are in a position to reprint.

It’s nuts, but that’s how the law works. There are hundreds of thousands of books published after 1930 that are still in copyright and have not been digitized. They are inaccessible for all practical purposes in today’s world. And yet they pay no royalties and even the rights holders have forgotten about them. This is a giant tragedy.

The whole theory of copyright is wrong. It is based on the model of private property, as in real things. Ownership of real property is exclusive. If I have a fish, you cannot have the same fish. If I have a boat, you cannot have it too at the same time. That’s why the social norm of property came about in the first place: to allocate the rights of control over things that are scarce. It is designed to prevent conflict and bring peace.

But ideas once created are not scarce. You can take every idea in this article and it takes nothing from me. Ideas are infinitely reproducible and therefore not like property at all. The attempt to make them into property requires state action and ends up creating industrial monopolies that benefit not authors but publishers. When authors get paid, it is called getting “royalties,” as in a stream of money from a royal grant of privilege. There is nothing wrong with getting paid based on sales but that can and does happen without copyright.

For example, you cannot copyright recipes, but services that provide recipes for cooking are a highly lucrative business. You cannot copyright sports strategies and plays, but there is a huge demand for books on them. Same with chess moves. It was true with music until the 1880s in Germany: Bach, Beethoven, and Brahms composed without copyright by simply selling publishers access to their works. This did not diminish output but arguably made it better by ensuring a highly competitive marketplace.

In the early days, you could not copyright computer code either. That’s how it came to be that spreadsheet technology became so dominant so quickly and transformed business life. Only later did copyright come along. Now any developer will tell you that the entire industry is gummed up by intellectual property claims. That’s true of many industries today. Hardly anyone is truly happy with the regime as it exists, except perhaps Disney, which has long lobbied for longer terms.

In any case, ChatGPT is doing nothing morally wrong by scraping The New York Times for content. I happen to think this is a bad business idea because The NY Times is a known propaganda sheet and far from definitive on any topic. But that is the choice that OpenAI (wrongly named because they are taking recourse to intellectual property too) has decided to make. I hope the courts side with OpenAI, but that would be only a temporary fix to a much larger problem of the institution of copyright itself.

In conclusion, the intersection of AI and copyright necessitates thoughtful reflection, interdisciplinary collaboration, and an adaptive legal framework. Just kidding!
