It Still Can't Do My Job: Four Years of Moving Goalposts (2022–2026)

Original link: https://publicznyprofil.github.io/ai_cant_do_your_work/

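This account traces the rapid evolution of AI-assisted programming from late 2022 to mid-2026, then extends it with forecasts through roughly 2033. It begins with ChatGPT's early "hallucinations" and the skepticism of the developer community, and records how the technology went from failing at basic tasks to generating more than a quarter of Google's new code and powering "vibe coding" projects.

Throughout this period, critics kept "moving the goalposts": first dismissing AI tools as a party trick, then as staged demos, and finally as toys, despite evidence of AI's growing usefulness. Research has shown that AI sometimes slows developers down and introduces security holes, but these limitations have not halted its progress.

The author captures a dual reality: AI is both flawed and transformative. As these tools evolve from simple autocomplete into autonomous agents that, in the author's forecasts, manage legacy systems and entire companies, the human reaction remains fixed on downplaying the threat. The piece concludes that even after AI accomplishes tasks previously thought impossible, the last line of defense may stay the same: "it's still just fancy autocomplete."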

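This Hacker News thread discusses the gap between the current AI hype and the reality in professional fields. Users argue that although AI is becoming a powerful tool, claims by executives and investors about artificial general intelligence (AGI) and the imminent obsolescence of white-collar work have consistently outrun actual technical progress.

Much of the frustration centers on the "moving goalposts" phenomenon: whenever AI excels at a new task, the definition of AGI shifts. Commenters note that these inflated claims serve corporate agendas more than practical utility. Beyond the technical debate, participants also voice deep concern about the economic impact of automation. Many argue that the focus on AI capabilities overlooks its negative effects on workers and that there is no social or economic roadmap for a future in which labor may be displaced. Ultimately, the discussion reflects a shared sentiment: the technology keeps advancing, but the promises surrounding it are overblown, and its long-term social consequences for ordinary people remain troublingly neglected.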

Original text

I started keeping notes in December 2022, mostly to document why the panic was overblown. The notes turned into this. The quotes in orange boxes are real. You can look them up. The gray comments are paraphrased from a few thousand comment sections. You know the ones. You may have written some. I did.

November 2022

The party trick

ChatGPT launches on a Wednesday. By the weekend it has a million users and my whole feed is screenshots of it apologizing for code that doesn't compile. It invents functions. It hallucinates whole APIs. I asked it for Snake, the game you write in an afternoon as a teenager. It gave me a snake that ate itself on move one. Five days in, Stack Overflow bans it:

"Because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site."

Stack Overflow temporary policy, December 5, 2022

The verdict was easy, and it was also mine: a stochastic parrot that learned to sound like a senior dev without ever meeting a compiler.

The goalpost

Call me when it stops making things up. It can't even do Snake.

March 2023

The exam season

GPT-4 ships. One prompt now gets you a working Snake. The same game it face-planted on four months earlier. The comment sections adjust instantly and never slow down.

Meanwhile the party trick starts passing exams. OpenAI claims the bar exam at the 90th percentile. Microsoft researchers publish a paper called "Sparks of Artificial General Intelligence". A real paper, with that real title. To be fair, the skeptics landed punches here. A later re-evaluation put the bar exam closer to the 60th percentile, and around the 48th among people who actually passed. Both sides were flinging numbers. Only one side was flinging them at a thing that kept improving.

The goalpost

Toy scripts and exams aren't engineering. Call me when it builds something real. A proper game, say. In 3D.

March 2024

The staged demo

A startup called Cognition announces Devin, "the first AI software engineer". The demo video is everywhere for a week. A month later a veteran developer named Carl Brown (YouTube channel: Internet of Bugs) goes through it almost frame by frame. The impressive parts were curated. Devin didn't do the Upwork task from the demo. It generated its own errors, then heroically fixed them. The skeptics take a well-earned victory lap. I watched the takedown twice. It felt great.

That same spring, the CEO of Nvidia stands on a stage in Dubai:

"It is our job to create computing technologies that nobody has to program, and that the programming language is human. Everybody in the world is now a programmer."

Jensen Huang, World Governments Summit, February 2024

Nobody I know quit programming that year. But everybody I know quietly installed Copilot.

The goalpost

Demos are staged. Call me when real developers use this for real work, daily.

October 2024

The earnings call

"More than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers."

Sundar Pichai, Alphabet earnings call, October 2024

The comment sections don't blink. That's just autocomplete acceptance metrics. Boilerplate doesn't count. Half of it is import statements. And fine, some of it probably is. But "a quarter of Google" is a strange thing to keep calling a party trick.

The goalpost

Generating lines isn't the job. Call me when it takes a ticket and ships the feature.

February 2025

The vibes

"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists."

Andrej Karpathy, February 2, 2025

Three weeks later Pieter Levels prompts a multiplayer 3D flight simulator into existence. It takes him about three hours. He has zero gamedev experience. He puts it online at fly.pieter.com. Remember the 2023 goalpost? A proper game, in 3D? Here it is. It sells $29.99 fighter jets and blimp ads to real customers, and he claims a $1M annual run rate within seventeen days. The comment sections know exactly what to do.

Same season: Zuckerberg tells Joe Rogan that Meta expects AI that codes like a "midlevel engineer" within the year. Dario Amodei says AI may be writing 90 percent of code within six months. And vibe coding grows its own disaster genre. Leaked API keys. Wide-open databases. "My app got hacked and I don't know where to look" postmortems. The seniors are unimpressed, and they have receipts. The slop is real. The security holes are very real.

The goalpost

Toys and prototypes, sure. Call me when it touches production and survives.

July 2025

The month the skeptics were right

A research group called METR takes sixteen experienced open-source developers, gives them AI tools on their own mature repos, and measures. The developers are 19 percent slower with AI. They believed they'd been 20 percent faster. Even after seeing the clock. The comment sections feast, and they've earned it. Best day the skeptics had since Devin.

Same month: OpenAI and Google DeepMind both hit gold at the International Math Olympiad. Five problems out of six, solved in plain language, inside the human time limit. Both things are true at once. That's the part nobody wants to sit with.

The goalpost

For one month, nobody had to move anything.

July 2026

Now

Agents run for hours unattended. They open pull requests. The pull requests get merged. Some of you reviewed one this week without noticing. Stack Overflow's question volume is back to where it was when I learned to code. Not because the questions got answered. Because nobody asks a forum anymore.

Maybe the current goalposts hold. I'd just point out that every entry above held too. For about eighteen months each.

The goalpost

Call me when it handles our legacy codebase. When it can be held accountable. When it knows what to build, not just how.

~2027 (forecast)

The one-shot game, for real this time

One prompt returns a polished, playable open-world game. Coherent art direction. Tuned physics. Working multiplayer. A soundtrack. Not a floaty tech demo. Something your kid plays for a month.

The goalpost

Remixing isn't creating. Call me when it makes something genuinely new.

~2028 (forecast)

The legacy codebase

An agent digests a fifteen-year-old monolith. The one with the cron job held together by a comment that says "do not remove". It maps the undocumented business rules and refactors the whole thing over a quarter, tests green the whole way. The big goalpost falls quietly on a Tuesday.

The goalpost

Call me when it owns a system end to end. Pager and all.

~2030 (forecast)

The pager

The on-call rotation is a model. Incidents open, get diagnosed, get fixed, and get post-mortemed before any human wakes up. Uptime improves. The people this replaced point out, correctly, that keeping systems alive was never the hard part. By now that's a lot of us.

The goalpost

Call me when it comes up with the idea.

~2033 (forecast)

The founder

An AI notices an unmet need, builds the product, finds the customers, and runs the company to a billion-dollar valuation with zero employees. The final think-piece comes out that same week. The argument is airtight: it's still just fancy autocomplete.
