Show HN: Optio – Orchestrate AI coding agents in K8s to go from ticket to PR

Original link: https://github.com/jonwiggins/optio

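## Optio: Autonomous CI/CD for AI coding agents

Optio uses AI coding agents to automate the entire software development lifecycle, turning tasks into merged pull requests without human intervention. Users submit tasks through the web UI, GitHub Issues, or Linear; Optio then provisions a dedicated Kubernetes pod, runs an AI agent (Claude Code or OpenAI Codex), and manages the full PR lifecycle.

**Key features include:** automated PR creation, CI monitoring, intelligent error handling (the agent is automatically resumed on failures, merge conflicts, or review feedback), and automatic merging after approval. The dashboard provides real-time visibility into agent activity, costs, and pipeline progress.

Optio's core differentiator is its **feedback loop**: it does not just *run* agents, it *drives* PRs to completion. It offers a pod-per-repo architecture for isolation and scalability, robust cost tracking, and integrations with common tools such as GitHub and Linear.

Optio is built on a modern stack that includes Kubernetes, Fastify, Next.js, PostgreSQL, and Redis, and it can be deployed with Helm charts. It supports multi-provider OAuth authentication and customizable agent configuration.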

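## Optio: AI-driven code orchestration

Optio is a new open-source tool designed to streamline AI-assisted coding workflows. Created by jawiggins, it removes bottlenecks by orchestrating AI agents (such as Claude Code or Codex) in a Kubernetes (K8s) environment, fully automating the path from issue ticket to merged pull request.

Optio manages the whole lifecycle: it takes input from issue trackers (GitHub Issues, Linear), executes in isolated K8s pods using git worktrees, continuously monitors the PR, self-heals by automatically retrying on failure, and finally squash-merges on completion. A key feature is its feedback loop: failures and review comments are fed *back* to the AI agent for iterative improvement. Optio is built with Fastify, Next.js, and Postgres, and includes a Helm chart for easy deployment. It currently lacks strong tenant isolation, and the developer is exploring features such as disabling external connections to address security concerns.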

Original text

Autonomous CI/CD for AI coding agents.

Optio turns coding tasks into merged pull requests — without human babysitting. Submit a task (manually, from a GitHub Issue, or from Linear), and Optio handles the rest: provisions an isolated environment, runs an AI agent, opens a PR, monitors CI, triggers code review, auto-fixes failures, and merges when everything passes.

The feedback loop is what makes it different. When CI fails, the agent is automatically resumed with the failure context. When a reviewer requests changes, the agent picks up the review comments and pushes a fix. When everything passes, the PR is squash-merged and the issue is closed. You describe the work; Optio drives it to completion.

Dashboard — real-time overview of running agents, pod status, costs, and recent activity

Task detail — live-streamed agent output with pipeline progress, PR tracking, and cost breakdown

You create a task          Optio runs the agent           Optio closes the loop
─────────────────          ──────────────────────         ──────────────────────

  GitHub Issue              Provision repo pod             CI fails?
  Manual task       ──→     Create git worktree    ──→       → Resume agent with failure context
  Linear ticket             Run Claude Code / Codex        Review requests changes?
                            Open a PR                        → Resume agent with feedback
                                                           CI passes + approved?
                                                             → Squash-merge + close issue

Intake — tasks come from the web UI, GitHub Issues (one-click assign), or Linear tickets
Provisioning — Optio finds or creates a Kubernetes pod for the repo, creates a git worktree for isolation
Execution — the AI agent (Claude Code or OpenAI Codex) runs with your configured prompt, model, and settings
PR lifecycle — Optio polls the PR every 30s for CI status, review state, and merge readiness
Feedback loop — CI failures, merge conflicts, and review feedback automatically resume the agent with context
Completion — PR is squash-merged, linked issues are closed, costs are recorded

The core differentiator. Optio doesn't just run an agent and walk away — it monitors the PR and drives it to completion:

Auto-resume on CI failure — agent is re-queued with the names of failed checks
Auto-resume on merge conflicts — agent is told to rebase and force-push
Auto-resume on review feedback — reviewer comments are passed as the resume prompt
Auto-merge — when CI passes and reviews are approved, the PR is squash-merged
Auto-close issues — linked GitHub Issues are closed with a comment linking the merged PR
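
The rules above can be read as a single decision over the current PR state. The following is a minimal sketch under stated assumptions: the type names, field names, prompt wording, and the order in which conditions are checked are all hypothetical and are not taken from Optio's codebase.

```typescript
// Hypothetical types for illustration; not Optio's actual API.
type CiStatus = "pending" | "passing" | "failing";
type ReviewState = "none" | "approved" | "changes_requested";

interface PrSnapshot {
  ciStatus: CiStatus;
  failedChecks: string[]; // names of the failed CI checks
  hasMergeConflicts: boolean;
  reviewState: ReviewState;
  reviewComments: string[];
}

type NextAction =
  | { kind: "resume"; prompt: string }
  | { kind: "merge"; method: "squash" }
  | { kind: "wait" };

// Decide what to do after one poll of the PR.
// Assumption: conflicts are handled first, then CI, then review feedback.
function decideNextAction(pr: PrSnapshot): NextAction {
  if (pr.hasMergeConflicts) {
    return { kind: "resume", prompt: "Rebase onto the base branch and force-push." };
  }
  if (pr.ciStatus === "failing") {
    return { kind: "resume", prompt: "CI failed. Failed checks: " + pr.failedChecks.join(", ") };
  }
  if (pr.reviewState === "changes_requested") {
    // Reviewer comments are passed through as the resume prompt.
    return { kind: "resume", prompt: pr.reviewComments.join("\n\n") };
  }
  if (pr.ciStatus === "passing" && pr.reviewState === "approved") {
    return { kind: "merge", method: "squash" };
  }
  // CI still pending or review not yet submitted: poll again later.
  return { kind: "wait" };
}
```

Keeping the decision a pure function of a snapshot makes it easy to test separately from the polling and GitHub API calls around it.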

Pod-per-repo architecture

One long-lived Kubernetes pod per repository. The pod clones the repo once, then stays alive. Each task gets its own git worktree, so multiple tasks run concurrently without interference.

Multi-pod scaling — repos can scale beyond a single pod (maxPodInstances up to 20)
Persistent volumes — installed tools and caches survive pod restarts
Idle cleanup — pods are reclaimed after 10 minutes of inactivity (configurable)
Health monitoring — crashed/OOM-killed pods are auto-detected and restarted
Worktree lifecycle — automatic cleanup with grace periods for retries
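
To illustrate how a task might be placed under these limits, here is a rough sketch. `maxPodInstances` comes from the text above; `maxAgentsPerPod`, the `RepoPod` shape, and the least-loaded placement policy are assumptions made for this example, not a description of Optio's scheduler.

```typescript
// Hypothetical model of a repo's pods; field names are illustrative.
interface RepoPod {
  name: string;
  activeAgents: number; // worktrees currently running an agent
}

interface RepoLimits {
  maxPodInstances: number; // per-repo pod cap (the text mentions up to 20)
  maxAgentsPerPod: number; // assumed name for the "max agents per pod" setting
}

type Placement =
  | { kind: "use_existing"; pod: string }
  | { kind: "create_pod" }
  | { kind: "queue" };

// Pick where a new task's worktree should run.
function placeTask(pods: RepoPod[], limits: RepoLimits): Placement {
  // Prefer the least-loaded pod that still has a free agent slot.
  let best: RepoPod | undefined = undefined;
  for (const pod of pods) {
    const hasSlot = pod.activeAgents < limits.maxAgentsPerPod;
    if (hasSlot && (best === undefined || pod.activeAgents < best.activeAgents)) {
      best = pod;
    }
  }
  if (best !== undefined) {
    return { kind: "use_existing", pod: best.name };
  }
  // Every pod is full: scale out if the repo is below its pod cap.
  if (pods.length < limits.maxPodInstances) {
    return { kind: "create_pod" };
  }
  // At capacity: leave the task queued until a slot frees up.
  return { kind: "queue" };
}
```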

Optio can automatically launch a review agent as a subtask of the original coding task:

Triggered on CI pass, on PR open, or manually
Runs with a separate review-specific prompt and model (use a cheaper model for reviews)
Blocking — the parent task waits for the review to complete before merging
Configurable per-repo with custom review prompt templates

Each repository can be individually configured:

Claude model — Opus or Sonnet, with context window (200k or 1M), thinking on/off, effort level
Container image — auto-detected from repo contents (Node, Python, Go, Rust, Full) or custom Dockerfile
Prompt templates — Handlebars-style templates with {{variables}} and {{#if}} conditionals
Concurrency — max concurrent tasks per repo, max pods per repo, max agents per pod
Extra packages and setup commands — apt packages and shell commands run at pod startup
.optio/setup.sh — repo-level setup script run after clone
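
As a rough illustration of what Handlebars-style prompt templates do, the sketch below renders `{{variables}}` and non-nested `{{#if}}` blocks. It is a deliberately tiny stand-in written for this example: it is neither the Handlebars library nor Optio's template engine, and the variable names `title` and `issueUrl` are hypothetical.

```typescript
// Variables available to a template; a missing or empty value is falsy.
type TemplateVars = Record<string, string | undefined>;

// Minimal renderer: supports {{name}} and non-nested {{#if name}}...{{/if}}.
function renderTemplate(template: string, vars: TemplateVars): string {
  // Step 1: keep or drop each conditional block based on its variable.
  const withoutBlocks = template.replace(
    /\{\{#if\s+(\w+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
    (_match: string, name: string, body: string) => (vars[name] ? body : "")
  );
  // Step 2: substitute plain variables; unknown names render as "".
  return withoutBlocks.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (_match: string, name: string) => vars[name] ?? ""
  );
}
```

For example, the template `Fix: {{title}}.{{#if issueUrl}} See {{issueUrl}}.{{/if}}` renders to `Fix: bug.` when only `title` is set to `bug`, and includes the trailing sentence when `issueUrl` is also provided.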

Priority queue — integer priorities with drag-to-reorder
Bulk operations — retry all failed, cancel all active
Subtask system — child tasks, sequential pipeline steps, and review subtasks
Cost tracking — per-task cost in USD, with analytics dashboard (daily trends, per-repo breakdown, top tasks)
Error classification — failures are categorized (auth, network, timeout, resource, etc.) with human-readable descriptions and suggested remedies
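
Error classification of this kind is often implemented as an ordered list of pattern rules. The sketch below shows the idea; the category names come from the list above, while the regular expressions, descriptions, and remedies are assumptions invented for illustration and do not reflect Optio's real classifier.

```typescript
type ErrorCategory = "auth" | "network" | "timeout" | "resource" | "unknown";

interface ClassifiedError {
  category: ErrorCategory;
  description: string; // human-readable summary
  remedy: string;      // suggested next step
}

interface Rule {
  pattern: RegExp;
  result: ClassifiedError;
}

// Rules are checked in order; the first matching pattern wins.
// All patterns and messages here are illustrative assumptions.
const RULES: Rule[] = [
  {
    pattern: /unauthorized|bad credentials|authentication failed/i,
    result: { category: "auth", description: "Credentials were rejected.", remedy: "Check the configured token or API key." },
  },
  {
    pattern: /timed out|timeout|ETIMEDOUT/i,
    result: { category: "timeout", description: "An operation exceeded its time limit.", remedy: "Retry the task or raise the timeout." },
  },
  {
    pattern: /ECONNREFUSED|ECONNRESET|ENOTFOUND/i,
    result: { category: "network", description: "A network connection failed.", remedy: "Check connectivity from the pod." },
  },
  {
    pattern: /OOMKilled|out of memory|no space left on device/i,
    result: { category: "resource", description: "The pod ran out of memory or disk.", remedy: "Raise the pod's resource limits." },
  },
];

// Map a raw failure message to a category with a description and remedy.
function classifyError(message: string): ClassifiedError {
  for (const rule of RULES) {
    if (rule.pattern.test(message)) {
      return rule.result;
    }
  }
  return { category: "unknown", description: "Unrecognized failure.", remedy: "Inspect the task logs." };
}
```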

GitHub Issues — browse issues in the UI, one-click assign to Optio, auto-label and comment back with PR links
Linear — fetch actionable tickets, sync status, add comments
Multi-provider OAuth — GitHub, Google, and GitLab authentication for the web UI
Claude Code + OpenAI Codex — API key or Max Subscription (OAuth token) auth

Live log streaming — structured agent output streamed via WebSocket
Pipeline progress — visual stage tracker (queued → setup → running → PR → CI → review → merge → done)
Event timeline — full audit trail of state transitions
Cluster view — pod status, resource usage, health events
Cost analytics — daily cost trends, per-repo and per-type breakdowns, period comparisons

┌──────────────┐     ┌────────────────────┐     ┌──────────────────────────┐
│   Web UI     │────→│    API Server      │────→│      Kubernetes          │
│   Next.js    │     │    Fastify         │     │                          │
│   :3100      │     │                    │     │  ┌── Repo Pod A ──────┐  │
│              │←ws──│  Workers:          │     │  │ clone + sleep      │  │
│  Dashboard   │     │  ├─ Task Queue     │     │  │ ├─ worktree 1  ⚡  │  │
│  Tasks       │     │  ├─ PR Watcher     │     │  │ ├─ worktree 2  ⚡  │  │
│  Repos       │     │  ├─ Health Mon     │     │  │ └─ worktree N  ⚡  │  │
│  Cluster     │     │  └─ Ticket Sync    │     │  └────────────────────┘  │
│  Costs       │     │                    │     │  ┌── Repo Pod B ──────┐  │
│  Issues      │     │  Services:         │     │  │ clone + sleep      │  │
│              │     │  ├─ Repo Pool      │     │  │ └─ worktree 1  ⚡  │  │
│              │     │  ├─ Review Agent   │     │  └────────────────────┘  │
│              │     │  └─ Auth/Secrets   │     │                          │
└──────────────┘     └─────────┬──────────┘     └──────────────────────────┘
                               │                  ⚡ = Claude Code / Codex
                        ┌──────┴──────┐
                        │  Postgres   │  Tasks, logs, events, secrets, repos
                        │  Redis      │  Job queue, pub/sub, live streaming
                        └─────────────┘

  ┌─────────────────────────────────────────────────┐
  │                    INTAKE                        │
  │                                                  │
  │   GitHub Issue ───→ ┌──────────┐                 │
  │   Manual Task ───→  │  QUEUED  │                 │
  │   Ticket Sync ───→  └────┬─────┘                 │
  └───────────────────────────┼──────────────────────┘
                              │
  ┌───────────────────────────┼──────────────────────┐
  │                 EXECUTION ▼                       │
  │                                                   │
  │   ┌──────────────┐    ┌─────────────────┐         │
  │   │ PROVISIONING │───→│     RUNNING     │         │
  │   │ get/create   │    │  agent writes   │         │
  │   │ repo pod     │    │  code in        │         │
  │   └──────────────┘    │  worktree       │         │
  │                        └───────┬─────────┘         │
  └────────────────────────────────┼──────────────────┘
                                   │
                 ┌─────────────┐   │   ┌─────────────────┐
                 │   FAILED    │←──┴──→│   PR OPENED     │
                 │             │       │                  │
                 │ (auto-retry │       │  PR watcher      │
                 │  if stale)  │       │  polls every 30s │
                 └─────────────┘       └────────┬─────────┘
                                                │
  ┌─────────────────────────────────────────────┼────────┐
  │                 FEEDBACK LOOP                │        │
  │                                              │        │
  │   CI fails?  ────────→  Resume agent  ←──────┤        │
  │                          to fix build         │        │
  │                                              │        │
  │   Merge conflicts? ──→  Resume agent  ←──────┤        │
  │                          to rebase            │        │
  │                                              │        │
  │   Review requests ───→  Resume agent  ←──────┤        │
  │   changes?               with feedback        │        │
  │                                              │        │
  │   CI passes + ───────→  Auto-merge    ───────┘        │
  │   review done?           & close issue                │
  │                                                       │
  │                          ┌─────────────┐              │
  │                          │  COMPLETED  │              │
  │                          │  PR merged  │              │
  │                          │  Issue closed│              │
  │                          └─────────────┘              │
  └───────────────────────────────────────────────────────┘
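
The diagram above can be approximated as a small state machine. This is a sketch under stated assumptions: the state names mirror the boxes in the diagram, but the exact transition table is inferred from the arrows and is not copied from the task state machine in `packages/shared`.

```typescript
type TaskState =
  | "queued"
  | "provisioning"
  | "running"
  | "pr_opened"
  | "failed"
  | "completed";

// Allowed transitions, inferred from the diagram (an assumption).
const TRANSITIONS: Record<TaskState, TaskState[]> = {
  queued: ["provisioning"],
  provisioning: ["running"],
  running: ["pr_opened", "failed"],
  // The feedback loop resumes the agent, so a task with an open PR
  // can go back to "running" or finish once the PR is merged.
  pr_opened: ["running", "completed"],
  // Auto-retry puts a failed task back in the queue.
  failed: ["queued"],
  completed: [],
};

function canTransition(from: TaskState, to: TaskState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Return the new state, or throw if the move is not allowed.
function transition(from: TaskState, to: TaskState): TaskState {
  if (!canTransition(from, to)) {
    throw new Error("Invalid transition: " + from + " -> " + to);
  }
  return to;
}
```

Centralizing the table in one place means an event timeline or audit trail can reject impossible jumps, such as `queued` directly to `completed`, instead of recording them.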

Docker Desktop with Kubernetes enabled (Settings → Kubernetes → Enable)
Node.js 22+ and pnpm 10+

# Clone and install
git clone https://github.com/jonwiggins/optio.git && cd optio
pnpm install

# Bootstrap infrastructure (Postgres + Redis in K8s, migrations, .env)
./scripts/setup-local.sh

# Build the agent image
docker build -t optio-agent:latest -f Dockerfile.agent .

# Start dev servers
pnpm dev
# API → http://localhost:4000
# Web → http://localhost:3100

The setup wizard walks you through configuring GitHub access, agent credentials (API key or Max Subscription), and adding your first repository.

apps/
  api/          Fastify API server, BullMQ workers, WebSocket endpoints,
                review service, subtask system, OAuth providers
  web/          Next.js dashboard with real-time streaming, cost analytics

packages/
  shared/             Types, task state machine, prompt templates, error classifier
  container-runtime/  Kubernetes pod lifecycle, exec, log streaming
  agent-adapters/     Claude Code + Codex prompt/auth adapters
  ticket-providers/   GitHub Issues, Linear

images/               Container Dockerfiles: base, node, python, go, rust, full
helm/optio/           Helm chart for production Kubernetes deployment
scripts/              Setup, init, and entrypoint scripts

Optio ships with a Helm chart for production Kubernetes clusters:

# Quote the values so the shell does not split the key or glob-expand hosts[0]
helm install optio helm/optio \
  --set encryption.key="$(openssl rand -hex 32)" \
  --set postgresql.enabled=false \
  --set externalDatabase.url="postgres://..." \
  --set redis.enabled=false \
  --set externalRedis.url="redis://..." \
  --set ingress.enabled=true \
  --set "ingress.hosts[0].host=optio.example.com"

See the Helm chart values for full configuration options including OAuth providers, resource limits, and agent image settings.

| Layer | Technology |
| --- | --- |
| Monorepo | Turborepo + pnpm |
| API | Fastify 5, Drizzle ORM, BullMQ |
| Web | Next.js 15, Tailwind CSS 4, Zustand |
| Database | PostgreSQL 16 |
| Queue | Redis 7 + BullMQ |
| Runtime | Kubernetes (Docker Desktop for local dev) |
| Deploy | Helm chart |
| Auth | Multi-provider OAuth (GitHub, Google, GitLab) |
| CI | GitHub Actions (format, typecheck, test, build-web, build-image) |
| Agents | Claude Code, OpenAI Codex |

See CONTRIBUTING.md for development setup, workflow, and conventions.

MIT
