Named after the Prague Orloj, an astronomical clock that has coordinated complex mechanisms for over 600 years.
An orchestration runtime for multi-agent AI systems.
Declare your agents, tools, and policies as YAML. Orloj schedules, executes, routes, and governs them so you can run multi-agent systems in production with the same operational rigor you expect from infrastructure.
Status: Orloj is under active development. APIs and resource schemas may change between minor versions before 1.0.
Running AI agents in production today looks a lot like running containers before container orchestration: ad-hoc scripts, no governance, no observability, and no standard way to manage an agent fleet. Orloj provides:
- Agents-as-Code -- declare agents, their models, tools, and constraints in version-controlled YAML manifests.
- DAG-based orchestration -- pipeline, hierarchical, and swarm-loop topologies with fan-out/fan-in support.
- Model routing -- bind agents to OpenAI, Anthropic, Azure OpenAI, Ollama, and other endpoints. Switch providers without changing agent definitions.
- Tool isolation -- execute tools in containers, WASM sandboxes, or process isolation with configurable timeout and retry.
- Governance built in -- policies, roles, and tool permissions enforced at the execution layer. Unauthorized tool calls fail closed.
- Production reliability -- lease-based task ownership, idempotent replay, capped exponential retry with jitter, and dead-letter handling.
- Web console -- built-in UI with topology views, task inspection, and live event streaming.
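The retry behavior named in the list above (capped exponential retry with jitter, then dead-letter handling) can be illustrated with a short Python sketch. This is not Orloj's implementation: the function names `backoff_delay` and `run_with_retry`, the default base and cap values, and the "full jitter" variant are all assumptions chosen for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based).

    Full-jitter variant (an assumption): pick a uniform value between 0 and
    min(cap, base * 2**attempt), so the ceiling doubles until it hits the cap.
    """
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def run_with_retry(operation, max_attempts=5, sleep=time.sleep, dead_letters=None):
    """Call `operation` until it succeeds or `max_attempts` is exhausted.

    On final failure the last error is appended to `dead_letters` (a plain
    list standing in for a dead-letter queue) and then re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:  # a real system would retry only transient errors
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    if dead_letters is not None:
        dead_letters.append(last_error)
    raise last_error
```

Jitter spreads retries from many workers over time so they do not all hit a recovering dependency at the same instant; the cap keeps the worst-case wait bounded.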
Download orlojd (server) and orlojctl (CLI) for your platform from GitHub Releases, extract them, and run:
# Start the server with an embedded worker
./orlojd --storage-backend=memory --task-execution-mode=sequential --embedded-worker
Open http://127.0.0.1:8080/ to explore the web console, then apply a starter blueprint. The example manifests live in this repo -- clone it or browse them on GitHub:
# Apply a starter blueprint (pipeline: planner -> research -> writer)
./orlojctl apply -f examples/blueprints/pipeline/
# Check the result
./orlojctl get task bp-pipeline-task
Or build from source (requires Go 1.25+):
go build -o orlojd ./cmd/orlojd
go build -o orlojctl ./cmd/orlojctl
When you are ready to scale, switch to message-driven mode with distributed workers and Postgres persistence. See the Quickstart guide for details.
┌─────────────────────────────────────────────────────┐
│                   Server (orlojd)                   │
│                                                     │
│  ┌──────────────┐   ┌────────────────┐              │
│  │  API Server  │──►│ Resource Store │              │
│  │    (REST)    │   │ mem / postgres │              │
│  └──────┬───────┘   └────────────────┘              │
│         │                                           │
│         ▼                                           │
│  ┌──────────────┐   ┌────────────────┐              │
│  │   Services   │──►│ Task Scheduler │              │
│  └──────────────┘   └───────┬────────┘              │
└─────────────────────────────┼───────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────┐
│                Workers (orlojworker)                │
│                                                     │
│  ┌──────────────┐   ┌───────────────┐               │
│  │ Task Worker  │──►│ Model Gateway │               │
│  │              │   └───────────────┘               │
│  │              │──►┌───────────────┐               │
│  │              │   │ Tool Runtime  │               │
│  │              │   └───────────────┘               │
│  │       ◄──────┼───┌───────────────┐               │
│  │              │──►│ Message Bus   │               │
│  └──────────────┘   └───────────────┘               │
└─────────────────────────────────────────────────────┘
Server (orlojd) -- API server, resource store (in-memory or Postgres), background services, and task scheduler.
Workers (orlojworker) -- claim tasks, execute agent graphs, route model requests, run tools, and handle inter-agent messaging.
Governance -- AgentPolicy, AgentRole, and ToolPermission resources enforced inline during every tool call and model interaction.
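To make "fail closed" concrete, here is a minimal Python sketch of a permission check that denies a tool call unless it is explicitly allowed. It is an illustration under stated assumptions, not Orloj's enforcement code: `Role`, `ToolCallDenied`, `authorize_tool_call`, and the shape of the `required_permissions` mapping are hypothetical names, loosely modeled on the AgentRole and ToolPermission resources.

```python
from dataclasses import dataclass, field


class ToolCallDenied(Exception):
    """Raised when a tool call is not explicitly authorized."""


@dataclass(frozen=True)
class Role:
    # Hypothetical stand-in for an AgentRole: a named set of permission strings.
    name: str
    permissions: frozenset = field(default_factory=frozenset)


def authorize_tool_call(tool, agent_roles, required_permissions):
    """Return None if the call is allowed; raise ToolCallDenied otherwise.

    `required_permissions` maps a tool name to the permissions it needs
    (a stand-in for ToolPermission). A tool with no entry is denied: the
    absence of a rule never grants access, which is the fail-closed part.
    """
    if tool not in required_permissions:
        raise ToolCallDenied(f"no permission rule registered for tool {tool!r}")
    granted = set()
    for role in agent_roles:
        granted |= role.permissions
    missing = set(required_permissions[tool]) - granted
    if missing:
        raise ToolCallDenied(f"tool {tool!r} requires missing permissions: {sorted(missing)}")
```

The design choice worth noting is the first branch: an unknown tool is rejected rather than waved through, so a misconfigured or forgotten rule results in a denied call instead of an unguarded one.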
Persistence is backed by Postgres (or in-memory for local dev). Message-driven mode uses NATS JetStream for durable agent-to-agent messaging.
Orloj manages 15 resource types, all defined as declarative YAML with apiVersion, kind, metadata, spec, and status fields:
Core
| Resource | Purpose |
|---|---|
| Agent | Unit of work backed by a language model |
| AgentSystem | Directed graph composing multiple agents |
| ModelEndpoint | Connection to a model provider |
| Tool | External capability with isolation and retry |
| Secret | Credential storage |
| Memory | Vector-backed retrieval for agents |
| McpServer | MCP server connection that discovers/syncs MCP tools |
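Since an AgentSystem is described as a directed graph of agents, a small Python sketch can show how an acyclic graph might be turned into execution "waves", where every agent in a wave has all of its upstream agents finished (fan-out within a wave, fan-in between waves). The function name `execution_waves` and the edge-list input are assumptions for illustration; Orloj's scheduler and its handling of swarm-loop topologies are not shown here.

```python
from graphlib import CycleError, TopologicalSorter


def execution_waves(edges):
    """Group agents into waves that can run in parallel.

    `edges` is an iterable of (upstream, downstream) agent-name pairs.
    Raises ValueError if the graph contains a cycle, since this sketch
    covers only acyclic (pipeline or hierarchical) shapes.
    """
    predecessors = {}
    for upstream, downstream in edges:
        predecessors.setdefault(upstream, set())
        predecessors.setdefault(downstream, set()).add(upstream)

    sorter = TopologicalSorter(predecessors)
    try:
        sorter.prepare()
    except CycleError as exc:
        raise ValueError("agent graph contains a cycle") from exc

    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the starter pipeline mentioned earlier (planner -> research -> writer), `execution_waves([("planner", "research"), ("research", "writer")])` yields three single-agent waves in that order.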
Governance
| Resource | Purpose |
|---|---|
| AgentPolicy | Token, model, and tool constraints |
| AgentRole | Named permission set bound to agents |
| ToolPermission | Required permissions for tool invocation |
| ToolApproval | Approval record for gated tool invocations |
Scheduling & Triggers
| Resource | Purpose |
|---|---|
| Task | Request to execute an AgentSystem |
| TaskSchedule | Cron-based task creation |
| TaskWebhook | Event-triggered task creation |
| Worker | Execution unit with capability declaration |
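Workers are described as claiming tasks, and the feature list mentions lease-based task ownership. The Python sketch below shows the general idea with an in-memory store: a worker holds a task only until its lease expires, so a crashed worker's task becomes claimable again. `LeaseStore` and its methods are hypothetical; a real deployment would keep leases in shared storage such as Postgres and make each claim atomic.

```python
import time


class LeaseStore:
    """In-memory, single-process illustration of lease-based ownership."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._leases = {}  # task_id -> (worker_id, expires_at)

    def claim(self, task_id, worker_id, ttl):
        """Take the lease if it is free, expired, or already ours."""
        now = self._clock()
        holder = self._leases.get(task_id)
        if holder is not None:
            owner, expires_at = holder
            if owner != worker_id and expires_at > now:
                return False  # another worker still holds a live lease
        self._leases[task_id] = (worker_id, now + ttl)
        return True

    def renew(self, task_id, worker_id, ttl):
        """Extend the lease, but only while we still hold a live one."""
        now = self._clock()
        holder = self._leases.get(task_id)
        if holder is None or holder[0] != worker_id or holder[1] <= now:
            return False
        self._leases[task_id] = (worker_id, now + ttl)
        return True
```

A worker would call `renew` periodically while it executes a task; if it stops renewing, the lease lapses and another worker's `claim` succeeds. Because the task may then run twice, this pattern is usually paired with idempotent replay.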
Browse docs.orloj.dev.
- Getting Started -- install, quickstart
- Concepts -- architecture, agents, tasks, tools, model routing, governance
- Guides -- deploy a pipeline, configure routing, build tools, set up governance
- Deploy & Operate -- local, VPS, Kubernetes, remote CLI access
- Reference -- CLI, API, resource schemas
- Security -- control plane API tokens, secrets, tool isolation
- Examples -- per-kind YAML under examples/resources/, starter blueprints/, and use-cases/ (in this repo)
Run the full stack (Postgres + server + 2 workers) with Docker Compose:
docker compose up --build -d
docker compose ps
The Compose images include the server and workers only. To drive the API from your machine, install orlojctl from GitHub Releases (CLI-only tarball) or build from this repo; see Deploy & Operate.
See CONTRIBUTING.md for development setup and guidelines.
