GraphQL: The enterprise honeymoon is over

Original link: https://johnjames.blog/posts/graphql-the-enterprise-honeymoon-is-over

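## GraphQL: A Reality Check

The author used GraphQL (with Apollo Client/Server) for a couple of years in a large enterprise environment and concludes that its advantages are often overstated. GraphQL aims to solve overfetching by letting clients request only the data they need, but that problem is usually already solved by the Backend for Frontend (BFF) layer common in enterprise architectures. The core issue is that most downstream services are still REST-based, so the GraphQL layer still has to overfetch from them and then reshape the data, which moves the problem rather than eliminating it. Compared with REST, GraphQL is considerably more complex and slower to implement: it needs schema definitions, resolvers, and ongoing synchronization. GraphQL also has worse default observability (a 200 status can hide a partial failure), fragile caching, and friction around file handling and ID requirements. The features are powerful, but they usually carry significant overhead. Ultimately, the author argues that GraphQL optimizes data *consumption* at the expense of the speed and simplicity of data *production*. For most enterprises that already run a BFF, this tradeoff usually makes GraphQL a net negative: a niche solution for specific scenarios, not a universal replacement for REST.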

Original text

By John James

Published on December 14, 2025

Read time ~3 min

I’ve used GraphQL, specifically Apollo Client and Server, for a couple of years in a real enterprise-grade application.

Not a toy app. Not a greenfield startup. A proper production setup with multiple teams, BFFs, downstream services, observability requirements, and real users.

And after all that time, I’ve come to a pretty boring conclusion:

GraphQL solves a real problem, but that problem is far more niche than people admit. In most enterprise setups, it’s already solved elsewhere, and when you add up the tradeoffs, GraphQL often ends up being a net negative.

This isn’t a “GraphQL bad” post. It’s a “GraphQL after the honeymoon” post.

what GraphQL is supposed to solve

The main problem GraphQL tries to solve is overfetching.

The idea is simple and appealing:

the client asks for exactly the fields it needs
no more, no less
no wasted bytes
no backend changes for every new UI requirement

On paper, that’s great.

In practice, things are messier.

overfetching is already solved by BFFs

Most enterprise frontend architectures already have a BFF (Backend for Frontend).

That BFF exists specifically to:

shape data for the UI
aggregate multiple downstream calls
hide backend complexity
return exactly what the UI needs

If you’re using REST behind a BFF, overfetching is already solvable. The BFF can scope down responses and return only what the UI cares about.
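A rough sketch of what "scoping down" looks like in a REST BFF. The `DownstreamCustomer` shape, the `CustomerSummary` view, and `toCustomerSummary` are hypothetical names invented for this example, not taken from any real service.

```typescript
// Hypothetical payload from a downstream REST service (assumed shape).
interface DownstreamCustomer {
  customerId: string;
  firstName: string;
  lastName: string;
  email: string;
  marketingPreferences: Record<string, boolean>;
  auditTrail: string[];
}

// Hypothetical view model: only what one UI screen renders.
interface CustomerSummary {
  id: string;
  displayName: string;
  email: string;
}

// The BFF adapter: take the full downstream payload, return only what the UI needs.
function toCustomerSummary(raw: DownstreamCustomer): CustomerSummary {
  return {
    id: raw.customerId,
    displayName: `${raw.firstName} ${raw.lastName}`,
    email: raw.email,
  };
}

const downstreamPayload: DownstreamCustomer = {
  customerId: "c-1",
  firstName: "Ada",
  lastName: "Lovelace",
  email: "ada@example.com",
  marketingPreferences: { newsletter: true },
  auditTrail: ["created", "verified"],
};

// The browser only ever sees the three summary fields.
const customerSummary = toCustomerSummary(downstreamPayload);
console.log(JSON.stringify(customerSummary));
```

The overfetch still happens between the BFF and the downstream service, but it never reaches the client.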

Yes, GraphQL can also do this.

But here’s the part people gloss over.

Most downstream services are still REST.

So now your GraphQL layer still has to overfetch from downstream REST APIs, then reshape the response. You didn’t eliminate overfetching. You just moved it down a layer.

That alone significantly diminishes GraphQL’s main selling point.

There is a case where GraphQL wins here. If multiple pages hit the same endpoint but need slightly different fields, GraphQL lets you scope those differences per query.

But let’s be honest about the trade.

You’re usually talking about saving a handful of fields per request, in exchange for:

more setup
more abstraction
more indirection
more code to maintain

That’s a very expensive trade for a few extra kilobytes.

implementation time is much higher than REST

GraphQL takes significantly longer to implement than a REST BFF.

With REST, you typically:

call downstream services
adapt the response
return what the UI needs

With GraphQL, you now have to:

define a schema
define types
define resolvers
define data sources
write adapter functions anyway
keep schema, resolvers, and clients in sync
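To make the list above concrete, here is a minimal sketch of the moving parts for a single field. Everything in it is an assumption for illustration: the `Order` type, the `DownstreamOrder` shape, and the in-memory `OrderDataSource` stub. The schema is kept as a plain string and the resolver is called directly, where a real setup would hand both to a GraphQL server library.

```typescript
// 1. Schema and types (SDL as a plain string; a GraphQL server library would parse it).
const typeDefs = `
  type Order {
    id: ID!
    total: Float!
  }
  type Query {
    order(id: ID!): Order
  }
`;

// Hypothetical downstream REST payload (assumed shape).
interface DownstreamOrder {
  order_id: string;
  total_cents: number;
  warehouse_code: string;
}

// 2. Data source wrapping the downstream call. Stubbed in memory so the sketch runs offline.
class OrderDataSource {
  async getOrder(id: string): Promise<DownstreamOrder> {
    return { order_id: id, total_cents: 2000, warehouse_code: "W7" };
  }
}

// 3. Adapter function: still needed, exactly as in a REST BFF.
function toOrder(raw: DownstreamOrder): { id: string; total: number } {
  return { id: raw.order_id, total: raw.total_cents / 100 };
}

// 4. Resolver map: has to stay in sync with typeDefs, by hand or via code generation.
const resolvers = {
  Query: {
    order: async (
      _parent: unknown,
      args: { id: string },
      context: { orders: OrderDataSource },
    ) => toOrder(await context.orders.getOrder(args.id)),
  },
};

// The REST BFF version of the same feature: one handler calling the same adapter.
async function getOrderHandler(id: string, orders: OrderDataSource) {
  return toOrder(await orders.getOrder(id));
}
```

Both paths end up calling `toOrder`. The GraphQL path adds the schema, the resolver map, and the client query on top of it.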

GraphQL optimizes how data is consumed at the cost of how fast the API itself gets built.

In an enterprise environment, that delivery speed matters more than theoretical elegance.

observability is worse by default

This one doesn’t get talked about enough.

GraphQL has this weird status code convention:

400 if the query can’t be parsed
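200 with an errors array if something failed during execution, fully or partially
200 without an errors array if it succeeded
5XX if the server itself fails (an unreachable server usually shows up as a gateway error or no response at all)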

From an observability standpoint, this is painful.

With REST:

2XX means success
4XX means client error
5XX means server error

If you filter dashboards by 2XX, you know those requests succeeded.

With GraphQL, a 200 can still mean partial or full failure.

Yes, Apollo lets you customize this behavior. But that’s kind of the point. You’re constantly paying a tax in extra configuration, extra conventions, and extra mental overhead just to get back to something REST gives you out of the box.
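As an illustration of that tax, this is the kind of helper you end up writing so dashboards can tell real successes from 200s that failed. `classifyOutcome` and the `Outcome` labels are hypothetical names. The body shape (`data` plus an optional `errors` array) follows the standard GraphQL response format.

```typescript
// Minimal shape of a GraphQL-over-HTTP response body.
interface GraphQLBody {
  data?: Record<string, unknown> | null;
  errors?: { message: string }[];
}

type Outcome = "success" | "partial_failure" | "failure";

// Hypothetical helper: derive the outcome a dashboard should record.
function classifyOutcome(httpStatus: number, body: GraphQLBody): Outcome {
  if (httpStatus < 200 || httpStatus >= 300) return "failure";
  const hasErrors = (body.errors?.length ?? 0) > 0;
  if (!hasErrors) return "success";
  // Errors plus at least one non-null top-level field: part of the query resolved.
  const hasData =
    body.data != null && Object.values(body.data).some((v) => v !== null);
  return hasData ? "partial_failure" : "failure";
}

// Three responses that all arrive as HTTP 200:
console.log(classifyOutcome(200, { data: { user: { name: "Ada" } } })); // "success"
console.log(
  classifyOutcome(200, {
    data: { user: null, orders: [] },
    errors: [{ message: "user service timeout" }],
  }),
); // "partial_failure"
console.log(classifyOutcome(200, { data: null, errors: [{ message: "boom" }] })); // "failure"
```

With REST, the status code alone carries this information.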

This matters when you’re on call, not when you’re reading blog posts.

caching sounds amazing until you live with it

Apollo’s normalized caching is genuinely impressive.

In theory.

In practice, it’s fragile.

If you have two queries where only one field differs, Apollo treats them as separate queries. You then have to manually wire things so:

existing fields come from cache
only the differing field is fetched

At that point:

you still have a roundtrip
you’ve added more code
debugging cache issues becomes its own problem

Meanwhile, REST happily overfetches a few extra fields, caches the whole response, and moves on.
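A toy model of the cache miss described above. This is not Apollo's implementation, just a sketch of the cache-first idea, assuming a read only counts as a hit when every requested field is already stored. `readFromCache` and `entityCache` are made-up names.

```typescript
type CachedEntity = Record<string, unknown>;

// Toy cache-first read: a hit only if every requested field is already cached.
function readFromCache(
  entity: CachedEntity | undefined,
  fields: string[],
): CachedEntity | "miss" {
  if (!entity) return "miss";
  const picked: CachedEntity = {};
  for (const field of fields) {
    // One missing field is enough to force a network round trip.
    if (!(field in entity)) return "miss";
    picked[field] = entity[field];
  }
  return picked;
}

const entityCache = new Map<string, CachedEntity>();
// Populated earlier by a list query that asked for id, name, and email.
entityCache.set("User:1", { id: "1", name: "Ada", email: "ada@example.com" });

// Same fields again: served from cache.
const listRead = readFromCache(entityCache.get("User:1"), ["id", "name", "email"]);

// One extra field: miss, even though three of the four fields are cached.
const detailRead = readFromCache(entityCache.get("User:1"), [
  "id",
  "name",
  "email",
  "phone",
]);

console.log(listRead !== "miss", detailRead === "miss");
```

Closing that gap, so only `phone` goes over the wire, is the manual wiring the list above refers to.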

Extra kilobytes are cheap. Complexity isn’t.

the ID requirement is a leaky abstraction

Apollo expects every object to have an id or _id field by default, or you need to configure a custom identifier.

That assumption does not hold in many enterprise APIs.

Plenty of APIs:

don’t return IDs
don’t have natural unique keys
aren’t modeled as globally identifiable entities

So now the BFF has to generate IDs locally just to satisfy the GraphQL client.

That means:

more logic
more fields
you’re always fetching one extra field anyway

Which is ironic, considering the original goal was to reduce overfetching.

REST clients don’t impose this kind of constraint.
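For illustration, a sketch of the ID workaround in the BFF. The `RateRow` shape and the choice of `currency` plus `tenorMonths` as the identifying fields are assumptions made up for this example. Deciding which fields are stable enough is exactly the extra logic the BFF team has to own.

```typescript
// Hypothetical downstream record: no ID, no single unique key.
interface RateRow {
  currency: string;
  tenorMonths: number;
  rate: number;
}

// Build a deterministic ID from the fields that together identify a row.
// Assumption: currency + tenorMonths is unique within one response.
function withSyntheticId(row: RateRow): RateRow & { id: string } {
  return { ...row, id: `RateRow:${row.currency}:${row.tenorMonths}` };
}

const downstreamRows: RateRow[] = [
  { currency: "EUR", tenorMonths: 12, rate: 3.1 },
  { currency: "EUR", tenorMonths: 24, rate: 3.4 },
];

// Every row now carries an extra field that exists only for the client cache.
const rowsForClient = downstreamRows.map(withSyntheticId);
console.log(rowsForClient.map((r) => r.id).join(", "));
```

The alternative is client-side configuration, such as `keyFields` in Apollo Client's `typePolicies`. That moves the same decision into the frontend instead of removing it.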

file uploads and downloads are awkward

GraphQL is simply not a good fit for binary data.

In practice, you end up:

returning a download URL
then using REST to fetch the file anyway
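The two-step flow, sketched with injected transports so it runs without a network. The `document` query, the `downloadUrl` field, and both function types are hypothetical and stand in for whatever client and schema a real app uses.

```typescript
// Hypothetical transport signatures, injected so the sketch needs no real server.
type GraphQLRequest = (
  query: string,
  variables: Record<string, unknown>,
) => Promise<{ data: { document: { downloadUrl: string } } }>;
type HttpGet = (url: string) => Promise<Uint8Array>;

async function downloadDocument(
  id: string,
  gql: GraphQLRequest,
  httpGet: HttpGet,
): Promise<Uint8Array> {
  // Step 1: GraphQL returns metadata only, including where the file lives.
  const result = await gql(
    "query ($id: ID!) { document(id: $id) { downloadUrl } }",
    { id },
  );
  // Step 2: the bytes travel over a plain HTTP GET, outside GraphQL.
  return httpGet(result.data.document.downloadUrl);
}
```

The client ends up maintaining two transports for one feature.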

Embedding large payloads like PDFs directly in GraphQL responses leads to bloated responses and worse performance.

This alone breaks the “single API” story.

onboarding is slower

Most frontend and full-stack developers are far more experienced with REST than GraphQL.

Introducing GraphQL means:

teaching schemas
teaching resolvers
teaching query composition
teaching caching rules
teaching error semantics

That learning curve creates friction, especially when teams need to move fast.

REST is boring, but boring scales extremely well.

error handling is harder than it needs to be

GraphQL error responses are… weird.

You have:

nullable vs non-nullable fields
partial data
errors arrays
extensions with custom status codes
the need to trace which resolver failed and why

All of this adds indirection.

Compare that to a simple REST setup where:

input validation fails, return a 400
backend fails, return a 500
zod error, done
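A minimal sketch of that REST flow. A hand-written validator stands in for a zod schema here so the example has no dependencies. `parseQuantity`, `handleCreateOrder`, and the `placeOrder` callback are hypothetical names.

```typescript
// Stand-in for schema validation; with zod this would be a schema's parse result.
type ParseResult =
  | { kind: "ok"; value: number }
  | { kind: "invalid"; message: string };

function parseQuantity(input: unknown): ParseResult {
  if (typeof input !== "number" || !Number.isInteger(input) || input <= 0) {
    return { kind: "invalid", message: "quantity must be a positive integer" };
  }
  return { kind: "ok", value: input };
}

interface HttpResult {
  statusCode: number;
  body: unknown;
}

// Hypothetical handler: placeOrder represents the downstream call and may throw.
function handleCreateOrder(
  input: unknown,
  placeOrder: (quantity: number) => string,
): HttpResult {
  const parsed = parseQuantity(input);
  if (parsed.kind === "invalid") {
    // Input validation fails: 400.
    return { statusCode: 400, body: { error: parsed.message } };
  }
  try {
    return { statusCode: 201, body: { orderId: placeOrder(parsed.value) } };
  } catch {
    // Backend fails: 500.
    return { statusCode: 500, body: { error: "internal error" } };
  }
}
```

Each failure mode maps to one status code, so a dashboard filter on 4XX or 5XX needs no knowledge of the response body.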

Simple errors are easier to reason about than elegant ones.

the net result

GraphQL absolutely has valid use cases.

But in most enterprise environments:

you already have BFFs
downstream services are REST
overfetching is not your biggest problem
observability, reliability, and speed matter more

When you add everything up, GraphQL often ends up solving a narrow problem while introducing a broader set of new ones.

That’s why, after using it in production for years, I’d say this:

GraphQL isn’t bad.
It’s just niche.
And you probably don’t need it.

Especially if your architecture already solved the problem it was designed for.