The purpose of Continuous Integration is to fail

Original link: https://blog.nix-ci.com/post/2026-02-05_the-purpose-of-ci-is-to-fail

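## Continuous Integration: The Value Is in Failing

Continuous integration (CI) is a process that automatically runs checks after code is committed and *before* it is deployed. Simple as that seems, its real value lies not in "passing", which is only overhead, but in **"failing"** when a mistake is present.

Without CI, mistakes only surface *after* deployment, leading to rollbacks and potential damage. CI shortens the feedback loop, catching problems early and keeping them away from users. CI cannot catch *all* mistakes, only some of them, but catching even some of them early is already valuable.

Interestingly, when no mistake is made, CI only adds friction: it slows deployment down without providing any benefit. "Flaky CI", where a failure disappears on a re-run, undermines confidence in the system.

Ultimately, CI delivers value only when it *blocks* a bad deployment. Reframing "failure" as a positive outcome, a signal that a problem was stopped, helps developers understand its role in maintaining code quality and stability. The next step is to optimise CI so that it fails even earlier, directly on the developer's machine.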

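## Hacker News discussion: The purpose of Continuous Integration is to fail

A recent Hacker News discussion revolved around a blog post arguing that the *point* of continuous integration (CI) is to **fail**, and that teams often undermine that purpose by simply re-running failed builds instead of investigating the root cause.

Many commenters agreed that "just retry" is a harmful practice, especially with "flaky" tests (tests that pass and fail intermittently). Such flakiness often masks underlying problems, such as race conditions or intermittent service timeouts, and so delays the detection of real issues.

The conversation stressed that CI's value goes beyond catching mistakes. It enables faster integration of code, better documentation, and lets developers build on each other's work more effectively.

Several users pointed out that most CI failures are not unfixable; they usually stem from database inconsistencies, timing dependencies, or infrastructure problems, which can be diagnosed with proper logging and reproducibility (for example, recording the test execution order).

Ultimately, the discussion stressed that tests, and CI as a whole, should be designed to *reveal* problems rather than to provide a false sense of security.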
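The reproducibility point from the discussion (recording the test execution order) could look roughly like the sketch below. It is only an illustration: `shuffled_order` is a hypothetical helper, not part of any test runner, and it assumes tests are identified by name and that their order is the only thing being randomised.

```python
import random
from typing import List, Optional, Tuple

def shuffled_order(test_names: List[str], seed: Optional[int] = None) -> Tuple[int, List[str]]:
    """Shuffle tests with a known seed so a failing order can be replayed."""
    if seed is None:
        seed = random.randrange(2**32)  # pick a fresh seed for this run
    order = sorted(test_names)          # start from a stable baseline
    random.Random(seed).shuffle(order)  # deterministic for a given seed
    return seed, order

seed, order = shuffled_order(["test_a", "test_b", "test_c"])
# Logging the seed in the CI output is what makes the run reproducible.
print(f"test order seed={seed}: {order}")
```

Passing the logged seed back in (`shuffled_order(names, seed=seed)`) yields the same order again, which helps when a failure only appears for one particular ordering.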
Original article

February 5, 2026 - 11 min read

CI is only valuable when it fails. When it passes, it's just overhead: the same outcome you'd get without CI.

What is Continuous Integration?

Software development follows a cyclical iterative pattern. Developers make changes, commit them to version control, deploy them to users, and repeat. Continuous integration (CI) sits between committing and deploying, running automated checks for every commit. If the checks pass, we say "CI passed", and the change can be deployed. If the checks fail, we say "CI failed", and the change is blocked from deployment.

Work flows to Commit, which branches to either CI Passes then Deploy, or CI Fails
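The gate described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular CI system's API: `Check`, `run_ci`, and `gate` are hypothetical names, and each check is assumed to be a function that returns `True` when it passes.

```python
from typing import Callable, List, NamedTuple

class Check(NamedTuple):
    name: str
    run: Callable[[], bool]  # returns True when the check passes

def run_ci(checks: List[Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failed = []
    for check in checks:
        try:
            ok = check.run()
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failed.append(check.name)
    return failed

def gate(checks: List[Check]) -> str:
    """Return 'deploy' when CI passed, 'blocked' when CI failed."""
    return "blocked" if run_ci(checks) else "deploy"
```

A single failing check is enough to block the change: `gate([Check("format", lambda: True), Check("tests", lambda: False)])` returns `"blocked"`.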

If you're an experienced developer, you're probably thinking "Duh!". To really understand the purpose of CI, we have to look at what happens with and without CI.

The Feedback Loop

Even though I hear "you could also just not be stupid" a lot, realistically we developers will make mistakes, and the more productive we are, the more of them we make.

What happens when we make mistakes? Consequences can range from "the code is now misformatted" to "payments don't work and we are losing millions per hour".

Without CI, our only chance to catch mistakes is after deployment, when users or teammates encounter them. At that point we roll back to a previous version, fix the problem, and try again.

Figure: Without CI
No mistakes: Work → Commit → Deploy
A mistake: Work → Commit → Deploy → Error occurs ✗ → Error is noticed → Rollback

Note that the mistake only becomes apparent after deployment, and could be noticed an arbitrary amount of time (if at all!) after it caused damage. That means that this feedback loop is long, manual, and dangerous.

Catching Problems Early

No checks can notice all mistakes, but they can certainly catch some of them, and as it turns out, that's already valuable.

"Program testing can be used to show the presence of bugs, but never to show their absence!" ― Edsger W. Dijkstra

Indeed, any mistake caught by CI is one less mistake that reaches production.

Let's see what happens when CI fails because of a mistake:

Figure: With CI
No mistakes: Work → Commit → CI runs → CI passes → Deploy
A mistake: Work → Commit → CI runs → CI fails ✓

In this case, the process is interrupted (and restarted) before deployment. This made the feedback loop shorter, more automated, and less dangerous.

Remember: this only helps when CI does in fact catch the mistake. That limitation cannot be removed: whenever CI does not catch the mistake, the process falls back to the no-CI scenario above.

In practice you'll probably want more rigorous checks than you think, but there is certainly such a thing as "too much CI" as well.

CI as a Safety Net

If we compare the "mistake" cases with and without CI side-by-side, we can see how CI changes the outcome:

Figure: With a mistake
Without CI: Work → Commit → Deploy → Error occurs ✗ → Error is noticed → Rollback
With CI: Work → Commit → CI runs → CI fails ✓

Here we clearly see the value of CI: it prevents a bad outcome (the error occurring) by catching the mistake early.

Too much CI

If CI is good, then more CI is better, right? No, not quite. To understand why, we have to look at what happens when no mistakes are made:

Figure: Without a mistake
With CI: Work → Commit → CI runs → CI passes → Deploy
Without CI: Work → Commit → Deploy

Note that the end result is the same in both cases: the change is deployed successfully. The only difference is that in the "with CI" case, we had to wait for CI to run and pass before we could deploy.

This means that in the "no mistake" case, CI is just an extra step that adds friction and slows us down, without providing any value.

Faulty CI

The whole reason we use CI in the first place is that we expect developers to make mistakes, so we can't then assume that they won't make mistakes in setting up CI. Nor can we assume that the developers who built the CI system itself are infallible.

One dreaded and very common situation is when a failing CI run can be made to pass by simply re-running it. We call this flaky CI.

Flaky CI is nasty because it means that a CI failure no longer reliably indicates that a mistake was caught. And it is doubly nasty because it is unfixable (in theory); sometimes machines just explode.

Luckily flakiness can be detected: Whenever a CI run fails, we can re-run it. If it passes the second time, we are sure it was flaky. If it fails the second time, it may have caught a real mistake (but it could also just have been flaky again).
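The re-run rule above can be written down as a small classifier. This is a sketch under assumptions: `classify` and `run_once` are hypothetical names, and `run_once` is assumed to execute the same CI job on the same commit each time it is called.

```python
from typing import Callable

def classify(run_once: Callable[[], bool], max_reruns: int = 1) -> str:
    """Classify a CI job on a fixed commit as 'passed', 'flaky', or 'failed'."""
    if run_once():
        return "passed"
    for _ in range(max_reruns):
        if run_once():
            # Same commit, different result: the first failure was flaky.
            return "flaky"
    # Every attempt failed: probably a real mistake, though it could
    # still have been flaky on each attempt.
    return "failed"
```

Note the asymmetry: a pass after a failure proves flakiness, but neither a first-try pass nor a repeated failure proves its absence.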

Faulty CI is a Real and Important problem that I enjoy solving, but it is outside the scope of this article.

The value of CI

Here are the four scenarios again:

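Figure: With and without CI, with and without a mistake
With CI, no mistakes: Work → Commit → CI runs → CI passes → Deploy
With CI, a mistake: Work → Commit → CI runs → CI fails ✓
Without CI, no mistakes: Work → Commit → Deploy
Without CI, a mistake: Work → Commit → Deploy → Error occurs ✗ → Error is noticed → Rollback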

Note that in the "no mistake" cases, CI passing or not existing makes no difference to the outcome. The difference is only in the "mistake" cases, where CI failing prevents a bad outcome. This means that the only valuable outcome of CI is when it fails.
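The four scenarios can also be expressed as a tiny model. This is a sketch for illustration only: the `outcome` function and its string labels are made up here, and the `ci_catches` flag covers the earlier caveat that CI only helps when it actually catches the mistake.

```python
def outcome(has_ci: bool, mistake: bool, ci_catches: bool = True) -> str:
    """Final outcome of one change in the article's four scenarios."""
    if has_ci and mistake and ci_catches:
        return "blocked before deploy"  # CI fails: the valuable case
    if mistake:
        return "error in production"    # no CI, or CI missed the mistake
    return "deployed"                   # no mistake: same result either way

for mistake in (False, True):
    with_ci = outcome(True, mistake)
    without_ci = outcome(False, mistake)
    print(f"mistake={mistake}: with CI -> {with_ci}, without CI -> {without_ci}")
```

The two columns differ only when `mistake` is `True`, and setting `ci_catches=False` makes the "with CI" column collapse back to the no-CI outcome.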

What "Failure" Means

It's unfortunate that we use the word "failure" to describe the valuable outcome of CI, because it makes it sound like a bad thing. The colours that are being used to represent CI outcomes are also a bit backwards. This is what it usually looks like:

"Success" with a '✓' icon on a green background, "Failure" with an 'x' on a red background, and "Flaky" with an 'x' on a red background

Even worse: The valuable "Failure" outcome is represented using the same icon and colour as the worst outcome: "Flaky".

Instead, I propose we use icons like this:

"Success" with a '-' icon on a grey background, "Failure" with an exclamation mark on a green background, and "Flaky" with an 'x' on a red background

Or maybe even with a bit more emoji so we definitely know how to feel about each outcome:

"Success" with a yawning emoji on a grey background, "Failure" with a celebration emoji on a green background, and "Flaky" with a skull and crossbones emoji

It's probably too late to make this change, and red meaning "action required" is well established, but I hope this reframing helps you see CI failures in a new light.

Conclusion

CI's value comes from failing, not from passing. Flakiness undermines that value.

In all the diagrams so far, "Work" and "Commit" have come before "CI runs". In the next blog post we'll discuss how to optimise that further: CI should fail on your machine first.
