I Know We're in an AI Bubble Because Nobody Wants Me

Original link: https://petewarden.com/2025/11/29/i-know-were-in-an-ai-bubble-because-nobody-wants-me-%f0%9f%98%ad/

Inspired by the potential of AlexNet in 2012, a former CTO set out to make deep learning scalable and efficient. His initial model training succeeded, but a key challenge followed: deploying the models across massive datasets at an affordable cost. Existing infrastructure was too expensive and the software lacked efficient inference support, which prompted the creation of the Jetpac framework, capable of large-scale processing on cheap hardware and even phones. That experience fueled his passion for optimization, rooted in a lifelong love of programming and of squeezing maximum performance out of systems. The author argues that current AI investment is heavily skewed toward expensive hardware (GPUs, data centers) while neglecting critical ML infrastructure and software optimization. Although he knows GPU utilization is typically low, and that significant gains are available through CPU inference and better software, efficiency-focused startups like his own Moonshine struggle to raise funding. He attributes this imbalance to "signaling": companies prioritize visible hardware spending to project strength, while investors favor large, easy-to-manage bets like OpenAI, much as dot-com-era startups relied on expensive Sun servers until Google proved the power of cheap, open-source alternatives. He predicts a future shift toward CPU-based, open-source AI solutions running on commodity hardware.

A recent Hacker News thread on this petewarden.com post sparked discussion of a potential "AI bubble." The original author argues that companies are prioritizing *buying* more AI hardware over *optimizing* the hardware they already have. Commenters largely agreed, noting that the focus is on total spending rather than efficient output. One user observed that companies have little incentive to optimize, since the current goal is simply to spend more on hardware. While acknowledging the need for continued hardware development, commenters argued that parallel efforts to improve hardware utilization are essential yet currently neglected. The discussion highlights a possible imbalance in AI investment: expansion over efficiency.

Original article

I first got into deep learning in 2012, when AlexNet came out. I was CTO of Jetpac, a startup that aimed to provide information about bars, hotels, and restaurants by analyzing public photos, for example finding hipster (and dog) friendly cafes. The results from the paper were so astonishing I knew AlexNet would be incredibly helpful, so I spent my Christmas holidays heating our house using a gaming rig with two GPUs and the CudaConvNet software, since that was the only way to train my own version of the model.

The results were even better than I’d hoped, but then I faced the problem of how to apply the model across the billions of photos we’d collected. The only GPU instances on Amazon were designed for video streaming and were prohibitively expensive. The CPU support in the Caffe framework was promising, but it was focused on training models, not running them after they’d been trained (aka inference). What I needed was software that would let me run the model at a massive scale on low-cost hardware. That was the original reason I wrote the Jetpac framework, so I could spin up hundreds of cheap EC2 instances to process our huge backlog of images for tens of thousands of dollars instead of millions.
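The fan-out pattern described above, sharding a huge photo backlog across many cheap workers, can be sketched roughly as follows. This is an illustrative sketch, not the actual Jetpac framework: `classify_photo` is a hypothetical stand-in for a CPU inference call, and the thread pool stands in for a fleet of inexpensive EC2 instances.

```python
from concurrent.futures import ThreadPoolExecutor

def classify_photo(photo_id):
    # Hypothetical stand-in for running CPU inference on one image.
    # A real worker would load the model once, then run a forward
    # pass per photo; here we fake a boolean "hipster cafe" label.
    return (photo_id, photo_id % 2 == 0)

def process_backlog(photo_ids, workers=8):
    # Fan the backlog out across workers. In production each worker
    # would be a separate cheap machine pulling from a shared queue,
    # rather than a thread in one process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(classify_photo, photo_ids))
```

The design point is that embarrassingly parallel inference needs no GPU coordination at all: each worker is independent, so throughput scales linearly with however many cheap machines you can afford.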

It turned out that the code was small and fast enough to even run on phones, and after Jetpac was acquired by Google I continued in that direction by leading the mobile support for TensorFlow. While I love edge devices, and that’s what I’m known for these days, my real passion is for efficiency. I learned to code in the 80’s demo scene, went on to write PC game engines professionally in the 90’s, and I got addicted to the dopamine rush of optimizing inner loops. There’s nothing quite like having hard constraints, clear requirements, and days to spend solving the puzzle of how to squeeze just a little bit more speed out of a system.

If you’re not a programmer, it might be difficult to imagine what an emotional process optimizing can be. There’s no guarantee that a good answer even exists, so the process itself can be endlessly frustrating. The first thrill comes when you see an opening, a possibility that nobody else has spotted. There’s the satisfaction of working hard to chase down the opportunity, and then, too often, the despair when it turns out not to work. Even then, it means I’ve learned something: being good at optimization means learning everything you can about the hardware, the operating system, and the requirements themselves, and studying others’ code in depth. I can never guarantee that I’ll find a solution, but my consolation is always that I understand the world better than when I started. The deepest satisfaction comes when I do finally find an approach that runs faster or uses fewer resources. It’s even a social joy: it almost always contributes to a wider solution the team is working on, making a product better, or even possible in a way it wasn’t before. The best optimizations come from a full-stack team that can make tradeoffs all the way from the product manager to the model architects, from hardware to operating system to software.

Anyway, enough rhapsodizing about the joy of coding, what does this have to do with the AI bubble? When I look around, I see hundreds of billions of dollars being spent on hardware – GPUs, data centers, and power stations. What I don’t see are people waving large checks at ML infrastructure engineers like me and my team. It’s been an uphill battle to raise the investment we’ve needed for Moonshine, and I don’t think it’s just because I’m a better coder than I am a salesman. Thankfully we have found investors who believe in our vision, and we’re on track to be cashflow-positive in Q1 2026, but in general I don’t see many startups able to raise money on the promise of improving AI efficiency.

This makes no sense to me from any rational economic point of view. If you’re a tech company spending billions of dollars a month on GPUs, wouldn’t spending a few hundred million dollars a year on software optimization be a good bet? We know that GPU utilization is usually below 50%, and in my experience it is often much lower for interactive applications, where batches are small and memory-bound decoding dominates. We know that motivated engineers like Scott Gray can do better than Nvidia’s libraries on their own GPUs, and from my experience at Jetpac and Google I’m certain there are a lot of opportunities to run inference on much lower-cost CPU machines. Even if you don’t care about the cost, the impact AI power usage has on us and the planet should make this a priority.
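The utilization argument can be made concrete with back-of-envelope arithmetic. The numbers below are purely hypothetical, chosen only to show how serving cost scales inversely with utilization:

```python
def monthly_cost(tokens_per_month, peak_tokens_per_sec, hourly_rate, utilization):
    # At low utilization you deliver only a fraction of peak throughput,
    # so the same workload needs proportionally more GPU-hours.
    effective_tps = peak_tokens_per_sec * utilization
    gpu_hours = tokens_per_month / effective_tps / 3600
    return gpu_hours * hourly_rate

# Hypothetical workload: 1e12 tokens/month, 10k tokens/s peak, $2/GPU-hour.
low = monthly_cost(1e12, 10_000, 2.0, 0.25)   # fleet running at 25% utilization
high = monthly_cost(1e12, 10_000, 2.0, 0.50)  # same fleet at 50% utilization
```

Doubling utilization halves the bill, which is why software that lifts utilization competes directly with buying more GPUs.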

So, why is this money being spent? As far as I can tell, it’s because of the signaling benefits to the people making the decisions. Startups like OpenAI are motivated to point to the number of GPUs they’re buying as a moat, suggesting that they’ll be the top AI company for years to come because nobody else will be able to catch up with their head start on compute capacity. Hardware projects are also a lot easier to manage than software, they don’t take up so much scarce management attention. Investors are on board because they’ve seen early success turn into long-term dominance before, it’s clear that AI is a world-changing technology so they need to be part of it, and OpenAI and others are happy to absorb billions of dollars of investment, making VCs’ jobs much easier than it would be if they had to allocate across hundreds of smaller companies. Nobody ever got fired for buying IBM, and nobody’s going to get fired for investing in OpenAI.

I’m picking on OpenAI here, but across the industry you can see everyone from Oracle to Microsoft boasting of the amounts of money they’re spending on hardware, and for the same reasons. They get a lot more positive coverage, and a much larger share price boost, from this than they would announcing they’re hiring a thousand engineers to get more value from their existing hardware.

If I’m right, this spending is unsustainable. I was in the tech industry during the dot com boom, and I saw a similar dynamic with Sun workstations. For a couple of years every startup needed to raise millions of dollars just to launch a website, because the only real option was buying expensive Sun servers and closed software. Then Google came along, and proved that using a lot of cheap PCs running open-source software was cheaper and much more scalable. Nvidia these days feels like Sun did then, and so I bet over the next few years there will be a lot of chatbot startups based on cheap PCs with open source models running on CPUs. Of course I made a similar prediction in 2023, and Nvidia’s valuation has quadrupled since then, so don’t look to me for stock tips!
