Zero-Downtime Deployments with Docker Compose – No Kubernetes Required

Original link: https://statusdude.com/blog/zero-downtime-docker-compose

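Many developers mistakenly believe that Kubernetes is a requirement for production services. At StatusDude, we achieve zero-downtime deployments with a simpler stack: **Docker Compose and HAProxy**. The team first tried Traefik but ran into fatal flaws: Traefik struggled with rolling deploys, its routing labels were messy to manage, and it could not intelligently resend failed requests. While a service was shutting down, Traefik kept sending traffic to containers that were about to stop, which dropped requests, and its retry middleware could not solve the problem. HAProxy, by contrast, provides three robust layers of health detection:

1. **Per-request retry**: if a request fails, HAProxy automatically redispatches it to a healthy backend.
2. **Passive observation**: it monitors live traffic and removes a backend from rotation after three errors.
3. **Active health checks**: it probes an endpoint to detect stopped containers right away.

By using Docker DNS for service discovery and a simple script that replaces containers one at a time, StatusDude keeps a multi-region, high-traffic service running without a complex service mesh or Kubernetes. This "boring" but battle-tested approach saved hours of debugging and ensures zero dropped requests during updates with only 60 lines of configuration.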

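This Hacker News discussion centers on a contentious blog post about using Docker Compose instead of Kubernetes to achieve zero-downtime deployments. The community is sharply divided. Critics argue that this kind of "homegrown" orchestration is amateur-hour engineering: it rebuilds a complex, fragile, and undocumented system that the team will struggle to maintain. They contend that for most commercial applications, the risk and maintenance burden of a custom Docker setup eventually outweigh the initial cost of learning a managed Kubernetes offering (such as EKS or K3s). Supporters counter that Kubernetes is often overkill for small projects. They maintain that for single-node or low-traffic services, Docker Compose offers enough simplicity and reliability without the "complexity tax" that K8s brings. The discussion ultimately highlights a shift in industry attitudes: although Kubernetes was once seen as overly complex, many former skeptics now prefer it as a standardized foundation so that they do not "reinvent the wheel" as a project grows. Those in the middle generally agree that simple alternatives work for specific use cases, but that teams should be wary of building a "Not Invented Here (NIH)" behemoth that becomes unmanageable and saddles them with technical debt.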

Original text

There's a mass delusion in the industry that you need Kubernetes to run a serious production service. You don't. At StatusDude, we serve thousands of monitoring checks per minute, run multi-region workers, and deploy multiple times a day — all with Docker Compose and HAProxy. Zero dropped requests. Zero downtime. No etcd to babysit at 3 AM.

But we didn't start with HAProxy. We started with Traefik. That lasted about four hours.

We Tried Traefik First

Traefik is the popular choice for Docker-based setups. It auto-discovers services via Docker labels, has a slick dashboard, and the docs make it look effortless. We set up two backend replicas with Traefik labels, ran a rolling deploy, and watched everything fall apart.

"Service defined multiple times"

Our first deploy strategy was to run a backend_new service alongside the existing backend during the transition. Both had the same Traefik routing labels — same Host rule, same service definition. Makes sense, right? You want both old and new to serve traffic during the cutover.

Traefik disagreed. Its Docker provider treats each Compose service as a separate configuration source. Two services with the same labels? "Service defined multiple times." 404 on every request. No fallback, no merge, just a flat refusal to route anything.

We reworked the approach to use docker compose up --scale backend=4 instead of a separate service. That avoided the label conflict. But it uncovered the next problem.

The Scale-Down Race

The rolling deploy strategy: scale up to 4 replicas (2 old + 2 new), then scale back down to 2 (keeping only the new ones). Simple enough.

Except Traefik's internal routing table didn't update fast enough. We'd scale down from 4 to 2, and Traefik would keep routing to containers that were in the process of shutting down. 502s on every other request. The routing state lagged behind Docker's reality by several seconds — long enough to drop a significant chunk of traffic.

We tried adding delays. We tried disconnecting containers from the network before stopping them (so the health check would fail cleanly before removal). We tried passive health checks — added them, then immediately rolled them back because they were too aggressive and caused false positives.

None of it was clean. But the real killer was something else entirely.

The Killer: No Retry on a Different Backend

This is a known issue that the Traefik developers seem to have ignored for a while now: https://github.com/traefik/traefik/issues/2723

Here's the scenario: during a rolling deploy, you stop an old container. docker stop sends SIGTERM. Uvicorn starts its graceful shutdown, but there's a window — requests that are already in-flight, or requests that arrive between the stop signal and Traefik updating its routing table.

When that request hits the dying backend, the connection drops mid-stream. The client gets a raw error — empty response, connection reset, partial body.

We can't have that. When your service or heartbeat monitor reports that it is up, we need to acknowledge it!

Now here's what Traefik does with that failed request: nothing.

Traefik's retry middleware exists, but it retries on the same backend. The one that's dying. The one that will fail again. It doesn't redispatch to a healthy backend. The request is just... lost.

We tried every combination: passive health checks, disconnect-before-stop, retry middleware with different attempt counts. The fundamental problem remained: Traefik couldn't send a failed request to a different server.

That afternoon, we ripped out Traefik and reached for HAProxy.

What You Actually Need

Let's strip it down. What does zero-downtime deployment actually require?

1. Multiple backend instances, so you can replace one while the other serves traffic
2. A load balancer that retries on a different backend, so dying containers don't drop requests
3. A deploy script that replaces instances one at a time (a rolling update)

That's it. Three things. Let me show you how we do each one.

Step 1: Multiple Replicas with Docker Compose

Docker Compose has a built-in deploy.replicas setting:

# docker-compose.yml

services:
  backend:
    build: ./backend
    deploy:
      replicas: 2
    image: myapp-backend
    expose:
      - "8000"
    env_file: .env
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 5s
      timeout: 5s
      retries: 3
      start_period: 5s
    restart: unless-stopped

That's 2 backend containers running behind a shared Docker DNS name backend. When you resolve backend inside the Docker network, you get both container IPs.

One Dockerfile, one image, two containers. No pod specs, no deployments, no replica sets.
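
The healthcheck in the Compose file assumes the backend answers GET /health on port 8000. The article does not show that endpoint, so what follows is only a minimal sketch of what it might look like as a bare ASGI app that Uvicorn could serve. The names `app` and `call`, and the "ok" response body, are assumptions made for illustration.

```python
import asyncio


async def app(scope, receive, send):
    """Bare ASGI app: 200 on /health, 404 elsewhere (hypothetical sketch)."""
    if scope["type"] != "http":
        return  # ignore lifespan and websocket scopes in this sketch
    ok = scope["path"] == "/health"
    status = 200 if ok else 404
    body = b"ok" if ok else b"not found"
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


def call(path):
    """Drive the app once without a server; returns (status, body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], sent[1]["body"]
```

Calling `call("/health")` exercises the handler in-process. In a container, the same `app` object would presumably be served with something like `uvicorn app:app --host 0.0.0.0 --port 8000`, assuming the file is named app.py. A real endpoint would likely also check its dependencies before answering 200.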

Step 2: HAProxy as the Load Balancer

HAProxy is battle-tested, fast, and the configuration is readable. But the real reason we chose it: option redispatch.

global
    log stdout format raw local0 info
    maxconn 4096

defaults
    mode http
    timeout connect 3s
    timeout client  30s
    timeout server  30s

    # THE key feature: retry failed requests on a DIFFERENT backend
    retries 3
    option redispatch 1
    retry-on conn-failure empty-response response-timeout 502 503 504

resolvers docker_dns
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry   1s
    # Re-resolve DNS every 2 seconds
    hold valid 2s

frontend http_in
    bind *:80
    default_backend backends

backend backends
    balance roundrobin
    option httpchk
    http-check send meth GET uri /health
    http-check expect status 200

    default-server inter 1s fall 1 rise 1 check resolvers docker_dns \
        resolve-prefer ipv4 init-addr none \
        observe layer7 error-limit 3 on-error mark-down

    server-template backend 1-10 backend:8000 check

Let's talk about the three things that make this work.

Retry on a Different Backend

This is the feature that Traefik couldn't deliver:

retries 3
option redispatch 1
retry-on conn-failure empty-response response-timeout 502 503 504

When a request fails — connection refused, empty response, 502, 503, 504 — HAProxy retries it. And option redispatch 1 means every retry goes to a different backend. Not the same dying server. A different, healthy one.

Makes sense, right?!

During a rolling deploy, if a request hits a container that's shutting down and gets an empty response, HAProxy silently retries on the other replica. The client never sees the error. No dropped requests. This single feature eliminated every problem we had with Traefik.
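
To make the difference concrete, here is a toy Python model of the two retry behaviours described above. It is not how HAProxy or Traefik are implemented. `Backend`, `send_with_redispatch`, and `send_same_backend` are hypothetical names, the "next server" choice is a simple round-robin step, and `retries` is read as the number of attempts after the first one, which is an assumption.

```python
class Backend:
    """A fake backend that either answers or refuses connections."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is shutting down")
        return f"{self.name} served {request}"


def send_with_redispatch(backends, request, retries=3, first=0):
    """Try backends[first]; on failure move on to a different backend.

    Models the idea behind `retries 3` plus `option redispatch 1`:
    every retry is sent to another server instead of the failing one.
    """
    last_error = None
    for attempt in range(retries + 1):
        backend = backends[(first + attempt) % len(backends)]
        try:
            return backend.handle(request)
        except ConnectionError as exc:
            last_error = exc
    raise last_error


def send_same_backend(backends, request, retries=3, first=0):
    """Contrast case: every retry goes back to the same server."""
    backend = backends[first]
    last_error = None
    for _ in range(retries + 1):
        try:
            return backend.handle(request)
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

With a pool of one dying and one healthy backend, `send_with_redispatch` returns the healthy backend's response on the second attempt, while `send_same_backend` exhausts its retries and raises `ConnectionError`.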

Three Layers of Health Detection

We don't rely on a single health check mechanism. There are three independent layers, each catching different failure modes:

Layer 1 — Per-request retry (milliseconds): If a single request fails, retry immediately on a different backend. Catches transient failures during deploys.

Layer 2 — Passive observation (observe layer7): HAProxy watches actual HTTP responses from real traffic. If a backend returns 3 consecutive 5xx errors (error-limit 3), it's pulled from rotation instantly (on-error mark-down). No waiting for any probe cycle.

Layer 3 — Active health checks (inter 1s fall 1 rise 1): Probes the /health endpoint every second. Catches completely dead backends that receive no traffic. One failure = instant DOWN. One success = back in rotation.

Each layer covers a blind spot of the others. Per-request retry handles the single request that hits a dying backend. Passive checks handle backends that start returning errors under load. Active checks handle backends that crash silently with no traffic flowing to them.
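
Layers 2 and 3 can be pictured as two counters that feed one UP/DOWN flag. The sketch below is a simplified Python model under that assumption, not HAProxy's actual state machine. `ServerState` and its method names are hypothetical, and the defaults mirror the config values (error-limit 3, fall 1, rise 1).

```python
class ServerState:
    """Toy model of one server's UP/DOWN state (not HAProxy's real code)."""

    def __init__(self, error_limit=3, fall=1, rise=1):
        self.error_limit = error_limit
        self.fall = fall
        self.rise = rise
        self.up = True
        self.consecutive_errors = 0  # passive: errors seen on live traffic
        self.failed_checks = 0       # active: consecutive failed probes
        self.passed_checks = 0       # active: consecutive passed probes

    def observe_response(self, status):
        """Layer 2: count consecutive 5xx responses from real requests."""
        if 500 <= status <= 599:
            self.consecutive_errors += 1
            if self.consecutive_errors >= self.error_limit:
                self.up = False
        else:
            self.consecutive_errors = 0

    def record_check(self, passed):
        """Layer 3: apply the result of one active /health probe."""
        if passed:
            self.failed_checks = 0
            self.passed_checks += 1
            if not self.up and self.passed_checks >= self.rise:
                self.up = True
                self.consecutive_errors = 0
        else:
            self.passed_checks = 0
            self.failed_checks += 1
            if self.failed_checks >= self.fall:
                self.up = False
```

In this model a successful response resets the passive counter, so only three 5xx responses in a row take the server down, whereas a single failed probe does so immediately because fall is 1.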

DNS-Based Discovery (No Docker Socket)

The server-template backend 1-10 backend:8000 check line is how HAProxy discovers backends. It resolves the Docker DNS name backend using Docker's embedded DNS resolver (127.0.0.11:53) and creates server entries for each IP it finds.

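With hold valid 2s, HAProxy keeps a resolved answer for at most 2 seconds, and it re-queries Docker DNS on its own timer in between (timeout resolve 1s in the config above). Container dies? Its IP disappears from DNS. New container starts? Its IP appears. HAProxy picks it up automatically.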

No Docker socket mount. No label parsing. No dynamic config generation. A static config file that just works. No service mesh. No sidecar. No operator. Srsly.
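
For intuition, the discovery loop boils down to "resolve the name, compare with the previous answer". The Python sketch below shows that idea with the standard library only. It is not what HAProxy runs internally, and `resolve_backends` and `diff_backends` are hypothetical helper names.

```python
import socket


def resolve_backends(name, port):
    """Return the sorted IPv4 addresses that `name` currently resolves to."""
    infos = socket.getaddrinfo(name, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry's last element is the (ip, port) socket address.
    return sorted({info[4][0] for info in infos})


def diff_backends(previous, current):
    """Compare two resolutions; returns (added, removed) as sorted lists."""
    prev, cur = set(previous), set(current)
    return sorted(cur - prev), sorted(prev - cur)
```

Run inside a container on the same Compose network, `resolve_backends("backend", 8000)` would be expected to return one address per replica, going by the article's description of Docker DNS. That part is not verified here. `diff_backends` then tells you which replicas appeared or disappeared between two lookups.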

Step 3: The Rolling Deploy

This is the entire deploy script:

prod-deploy:
	@echo "=== Zero-downtime rolling deploy ==="
	@for cid in $$(docker compose -f docker-compose.prod.yml ps -q backend); do \
		echo "Replacing $$cid..."; \
		docker stop $$cid && docker rm -f $$cid; \
		docker compose -f docker-compose.prod.yml up -d --no-deps --no-recreate --wait backend; \
	done
	@echo "=== Deploy complete ==="

That's it. Let me walk through what happens:

1. Get the container IDs of all running backend replicas.
2. For each replica, one at a time:
   - Stop and remove the container
   - HAProxy detects the missing backend within 2 seconds (DNS re-resolution)
   - Traffic shifts to the remaining healthy replica
   - Start a new container with the updated image
   - --wait blocks until Docker's healthcheck passes
   - HAProxy discovers the new backend via DNS
   - Traffic starts flowing to the new container
3. Move to the next replica.

At every point during the deploy, at least one healthy backend is serving traffic. The --no-recreate flag prevents Docker from touching the replica we haven't replaced yet.

Any requests that hit the dying container during that 2-second DNS window? Retried on the healthy replica automatically. The client never knows.
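
If you would rather drive the same sequence from Python than from Make, a rough equivalent might look like the sketch below. It only reuses the docker commands that appear in the Makefile target. `rolling_deploy_commands` and `run` are hypothetical names, and the dry-run default is an assumption added for safety.

```python
import subprocess

COMPOSE_FILE = "docker-compose.prod.yml"  # same file name as in the Makefile


def rolling_deploy_commands(container_ids, service="backend",
                            compose_file=COMPOSE_FILE):
    """Return the rolling-deploy commands, in order, as argv lists."""
    commands = []
    for cid in container_ids:
        # Stop and remove one old replica...
        commands.append(["docker", "stop", cid])
        commands.append(["docker", "rm", "-f", cid])
        # ...then let Compose start a replacement and wait until healthy.
        commands.append(["docker", "compose", "-f", compose_file, "up", "-d",
                         "--no-deps", "--no-recreate", "--wait", service])
    return commands


def run(commands, dry_run=True):
    """Print each command; also execute it unless dry_run is set."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # abort on the first failure
```

The container IDs would come from `docker compose -f docker-compose.prod.yml ps -q backend`, as in the Makefile. One deliberate difference: with `check=True` the sketch stops at the first failing command, whereas the shell loop keeps going after a failed `docker stop`.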

Our Setup in Numbers

At StatusDude, this setup handles:

Thousands of monitoring checks per minute across 3 regions
Multiple deploys per day with zero dropped requests
Sub-2-second failover when a backend goes down
~60 lines of HAProxy config and a 10-line deploy script

We went from Traefik (404s, 502s, dropped requests, four hours of debugging) to HAProxy (zero dropped requests, first deploy) in one afternoon. Sometimes the boring, battle-tested tool is the right choice. Well, OK, not "sometimes" - quite often ;-)

P.S. Nginx would do too; I just felt like getting HAProxy up this time :)