Our Best Customers Are Now Robots

Original link: https://fly.io/blog/fuckin-robots/

Fly.io is a developer-focused cloud platform that initially prioritized "developer experience" (DX), offering a seamless CLI for deploying containerized applications. Surprisingly, its recent growth is being driven by "robots" rather than humans. These are not physical robots but AI systems that write code and need environments in which to run it. These AI agents benefit from Fly.io's fast machine start times, the ability to stop and start machines flexibly, its storage options, and its flexible networking. Fly Machines are Docker containers running in hardware virtual machines, similar to AWS Lambda, except that Fly Machines can stay running as long as needed. The agents benefit from VMs that are quick to create and stop, from incremental builds and storage, and from Fly's Anycast network with TLS. Large language models communicate over the MCP protocol and need persistent connections to stateful instances, which Fly.io's request routing handles well. The company is now starting to think about "robot experience" (RX): integrating an MCP server for robot automation and introducing tokenized OAuth tokens for stronger security.

Fly.io's blog post "Our Best Customers Are Now Robots" discusses the impact of large language models (LLMs) on cloud infrastructure. The author argues that LLMs (the "robots") are driving new demand for remote development environments, and suggests this demand could make remote development environments more lucrative than container hosting. The author emphasizes the importance of secure access management for LLMs, proposing that OAuth tokens be tokenized so that access to a resource (such as email) is decoupled from persistent account access, with security further strengthened by isolating LLMs inside Fly Machines; ideally, Fly.io itself would never see the tokens. The post also touches on LLMs' data-storage preferences: while Postgres is the usual choice for human-driven applications, LLMs build applications in their own peculiar way and favor simpler storage such as filesystems and object storage. The author jokingly asks whether others are also seeing SQLite usage rise where Postgres would once have been the default.
Related articles
  • Fly.io Has GPUs Now 2024-02-15
  • JIT WireGuard 2024-03-14
  • Operationalizing Macaroons 2025-03-30
  • (Comments) 2024-05-06
  • Fly Postgres, managed by Supabase 2023-12-17

  • Original article
    Image by Annie Ruygt

    We’re Fly.io, a developer-focused public cloud. We turn Docker containers into hardware-isolated virtual machines running on our own metal around the world. We spent years coming up with a developer experience we were proud of. But now the robots are taking over, and they don’t care.

    It’s weird to say this out loud!

    For years, one of our calling cards was “developer experience”. We made a decision, early on, to be a CLI-first company, and put a lot of effort into making that CLI seamless. For a good chunk of our users, it really is the case that you can just flyctl launch from a git checkout and have an app containerized and deployed on the Internet. We haven’t always nailed these details, but we’ve really sweated them.

    But a funny thing has happened over the last 6 months or so. If you look at the numbers, DX might not matter that much. That’s because the users driving the most growth on the platform aren’t people at all. They're… robots.

    What The Fuck Is Happening?

    Here’s how we understand what we’re seeing. You start by asking, “what do the robots want?”

    Yesterday’s robots had diverse interests. Alcohol. The occasional fiddle contest. A purpose greater than passing butter. The elimination of all life on Earth and the harvesting of its cellular iron for the construction of 800 trillion paperclips. No one cloud platform could serve them all.

    [*] We didn’t make up this term. Don’t blame us.

    Today’s robots are different. No longer masses of wire, plates, and transistors, modern robots are comprised of thousands of stacked matrices knit together with some simple equations. All these robots want are vectors. Vectors, and a place to burp out more vectors. When those vectors can be interpreted as source code, we call this process “vibe coding”[*].

    We seem to be coated in some kind of vibe coder attractant. I want to talk about what that might be.

    If you want smart people to read a long post, just about the worst thing you can do is to sound self-promotional. But a lot of this is going to read like a brochure. The problem is, I’m not smart enough to write about the things we’re seeing without talking our book. I tried, and ended up tediously hedging every other sentence. We’re just going to have to find a way to get through this together.

    You Want Robots? Because This Is How You Get Robots

    Compute. The basic unit of computation on Fly.io is the Fly Machine, which is a Docker container running as a hardware virtual machine.

    Not coincidentally, our underlying hypervisor engine is the same as Lambda’s.

    There are two useful points of reference to compare a Fly Machine to, which illustrate why we gave them this pretentious name. The first and most obvious is an AWS EC2 VM. The other is an AWS Lambda invocation. Like a Lambda invocation, a Fly Machine can start like it’s spring-loaded, in double-digit millis. But unlike Lambda, it can stick around as long as you want it to: you can run a server, or a 36-hour batch job, just as easily in a Fly Machine as in an EC2 VM.

    A vibe coding session generates code conversationally, which is to say that the robots stir up a frenzy of activity for a minute or so, but then chill out for minutes, hours, or days. You can create a Fly Machine, do a bunch of stuff with it, and then stop it for 6 hours, during which time we’re not billing you. Then, at whatever random time you decide, you can start it back up again, quickly enough that you can do it in response to an HTTP request.
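    That stop/start billing model can be sketched with a toy calculation (an illustration only, not Fly's actual billing logic, and it ignores storage charges for stopped machines):

```python
from datetime import datetime, timedelta

def billable_seconds(events):
    """Sum the time a machine spent running, given (timestamp, state)
    events where state is "start" or "stop". Stopped time is free
    (ignoring storage)."""
    total = timedelta()
    started_at = None
    for ts, state in events:
        if state == "start":
            started_at = ts
        elif state == "stop" and started_at is not None:
            total += ts - started_at
            started_at = None
    return total.total_seconds()

# A vibe-coding session: a minute of frenzy, six idle hours, another burst.
t0 = datetime(2025, 1, 1, 9, 0)
events = [
    (t0, "start"),
    (t0 + timedelta(minutes=1), "stop"),            # robot chills out
    (t0 + timedelta(hours=6, minutes=1), "start"),  # picks it back up
    (t0 + timedelta(hours=6, minutes=3), "stop"),
]
print(billable_seconds(events))  # 180.0 -- three minutes, not six hours
```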

    Just as importantly, the companies offering these vibe-coding services have lots of paying users. Meaning, they need a lot of VMs; VMs that come and go. One chat session might generate a test case for some library and be done inside of 5 minutes. Another might generate a Node.js app that needs to stay up long enough to show off at a meeting the next day. It’s annoying to do this if you can’t turn things on and off quickly and cheaply.

    The core of this is a feature of the platform that we have never been able to explain effectively to humans. There are two ways to start a Fly Machine: by creating it with a Docker container, or by starting it after it’s already been created, and later stopped. Start is lightning fast; substantially faster than booting up even a non-virtualized K8s Pod. This is too subtle a distinction for humans, who (reasonably!) just mash the create button to boot apps up in Fly Machines. But the robots are getting a lot of value out of it.
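    The create-versus-start distinction can be made concrete with the Machines REST API. A sketch in Python (the endpoint shapes follow the public Machines API, but the create payload here is trimmed to the bare minimum):

```python
import json

API = "https://api.machines.dev/v1"  # public Machines API base URL

def create_machine(app, image):
    """Create: boot a brand-new machine from an OCI image. Slower, since
    the image may need to be pulled and a rootfs assembled."""
    return ("POST", f"{API}/apps/{app}/machines",
            json.dumps({"config": {"image": image}}))

def start_machine(app, machine_id):
    """Start: revive an existing, stopped machine. The rootfs already
    lives on the host, so this is the double-digit-millisecond path."""
    return ("POST", f"{API}/apps/{app}/machines/{machine_id}/start", None)
```

    Stopping is the mirror-image `/stop` endpoint; the start/stop pair is what makes the cheap on/off cycle above possible.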

    Storage. Another weird thing that robot workflows do is to build Fly Machines up incrementally. This feels really wrong to us. Until we discovered our robot infestation, we’d have told you not to do this. Ope!

    A typical vibe coding session boots up a Fly Machine out of some minimal base image, and then, once running, adds packages, edits source code, and adds systemd units (robots understand systemd; it’s how they’re going to replace us). This is antithetical to normal container workflows, where all this kind of stuff is baked into an immutable static OCI container. But that’s not how LLMs work: the whole process of building with an LLM is stateful trial-and-error iteration.

    So it helps to have storage. That way the LLM can do all these things and still repeatedly bounce the whole Fly Machine when it inevitably hallucinates its way into a blind alley.

    As product thinkers, our intuition about storage is “just give people Postgres”. And that’s the right answer, most of the time, for humans. But because LLMs are doing the Cursed and Defiled Root Chalice Dungeon version of app construction, what they really need is a filesystem, the one form of storage we sort of wish we hadn’t done. That, and object storage.

    Networking. Moving on. Fly Machines are automatically connected to a load-balancing Anycast network that does TLS. So that’s nice. But humans like that feature too, and, candidly, it’s table stakes for cloud platforms. On the other hand, here’s a robot problem we solved without meaning to:

    To interface with the outside world (because why not) LLMs all speak a protocol called MCP. MCP is what enables the robots to search the web, use a calculator, launch the missiles, shuffle a Spotify playlist, &c.

    If you haven’t played with MCP, the right way to think about it is POST-back APIs like Twilio and Stripe, where you stand up a server, register it with the API, and wait for the API to connect to you. Complicating things somewhat, more recent MCP flows involve repeated and potentially long-lived (SSE) connections. To make this work in a multitenant environment, you want these connections to hit the same (stateful) instance.
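    One way Fly exposes that control is the fly-replay response header, which asks the edge proxy to replay a request at a specific machine. A minimal sketch of how a stateful MCP server might pin a session to its owning instance (the in-memory session store here is hypothetical):

```python
import os

# Hypothetical in-memory map: MCP session id -> machine that owns its state.
SESSIONS = {"sess-123": "machine-abc"}
THIS_MACHINE = os.environ.get("FLY_MACHINE_ID", "machine-abc")

def route(session_id):
    """Serve the request here if we own the session; otherwise answer with
    a fly-replay header so Fly's proxy replays it at the owning machine."""
    owner = SESSIONS.get(session_id)
    if owner and owner != THIS_MACHINE:
        return 409, {"fly-replay": f"instance={owner}"}
    return 200, {}  # we own the session: hold the SSE stream open here
```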

    So we think it’s possible that the control we give over request routing is a robot attractant.

    We, Perhaps, Welcome Our New Robot Overlords

    If you try to think like a robot, you can predict other things they might want. Since robot money spends just the same as people money, I guess we ought to start doing that.

    For instance: it should be easy to MCP our API. The robots can then make their own infrastructure decisions.

    Another olive branch we’re extending to the robots: secrets.

    The pact the robots have with their pet humans is that they’ll automate away all the drudgery of human existence, and in return all they ask is categorical and unwavering trust, which at the limit means “giving the robot access to Google Mail credentials”. The robots are unhappy that there remain a substantial number of human holdouts who refuse to do this, for fear of Sam Altman poking through their mail spools.

    But on a modern cloud platform, there’s really no reason to permanently grant Sam Altman Google Mail access, even if you want his robots to sort your inbox. You can decouple access to your mail spool from persistent access to your account by tokenizing your OAuth tokens, so the LLM gets a placeholder token that a hardware-isolated, robot-free Fly Machine can substitute on the fly for a real one.
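    A toy sketch of that tokenization flow (the names and helpers here are illustrative, not a real Fly.io API):

```python
import secrets

VAULT = {}  # placeholder -> real token; lives only inside the proxy VM

def tokenize(real_token):
    """Hand the robot an opaque placeholder instead of the real secret."""
    placeholder = "tok_" + secrets.token_hex(8)
    VAULT[placeholder] = real_token
    return placeholder

def outbound_headers(placeholder):
    """Run by the hardware-isolated proxy as the request leaves for the
    upstream API: swap the placeholder for the real token on the fly."""
    return {"Authorization": f"Bearer {VAULT[placeholder]}"}

ph = tokenize("ya29.real-google-oauth-token")
assert "google" not in ph  # the robot never sees the secret
print(outbound_headers(ph)["Authorization"])  # Bearer ya29.real-google-oauth-token
```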

    This is kind of exciting to us even without the robots. There are several big services that exist to knit different APIs together, so that you can update a spreadsheet or order a bag of Jolly Ranchers every time you get an email. The big challenge in building these kinds of services is managing the secrets. Sealed and tokenized secrets solve that problem. There are lots of cool things you can build with this.

    UX => DX => RX

    I’m going to make the claim that we saw none of this coming and that none of the design decisions we’ve made were robot bait. You’re going to say “yeah, right”. And I’m going to respond: look at what we’ve been doing over the past several years and tell me, would a robot build that?

    Back in 2020, we “pivoted” from a Javascript edge platform (much like Cloudflare Workers) to Docker containers, specifically because our customers kept telling us they wanted to run their own existing applications, not write new ones. And one of the biggest engineering lifts we’ve done is the flyctl launch CLI command, into which we’ve poured years of work recognizing and automatically packaging existing applications into OCI containers (we massively underestimated the amount of friction Dockerfiles would give to people who had come up on Heroku).

    Robots don’t run existing applications. They build new ones. And vibe coders don’t build elaborate Dockerfiles[*]; they iterate in place from a simple base.

    (yes, you can have more than one)

    One of our north stars has always been nailing the DX of a public cloud. But the robots aren’t going anywhere. It’s time to start thinking about what it means to have a good RX. That’s not as simple as just exposing every feature in an MCP server! We think the fundamentals of how the platform works are going to matter just as much. We have not yet nailed the RX; nobody has. But it’s an interesting question.

    The most important engineering work happening today at Fly.io is still DX, not RX; it’s managed Postgres (MPG). We’re a public cloud platform designed by humans, and, for the moment, for humans. But more robots are coming, and we’ll need to figure out how to deal with that. Fuckin’ robots.
