Reconciling Kubernetes cost estimates with CUR / FOCUS billing data

Original link: https://github.com/tanrikuluozlem/burn

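**Burn** is a zero-configuration, agentless command-line tool designed to identify and eliminate resource waste in Kubernetes clusters. Unlike traditional monitoring approaches, it needs no cluster agent, persistent storage, or complex configuration: install it and run.

**Core features:**

* **Full visibility:** Automatically tracks compute, storage, GPU, and load balancer costs across AWS, Azure, GCP, and on-premise environments.
* **AI-driven insights:** Uses natural-language processing to analyze costs and generates ready-to-run `kubectl` commands for optimization (for example, rightsizing resources or switching to spot instances).
* **Slack integration:** Deploys as a Slack-native bot; slash commands return real-time reports, run cost analysis, and deliver AI-driven optimization recommendations.
* **Actionable intelligence:** Identifies "ghost" costs such as over-provisioned CPU requests and inefficient Ingress load balancers.
* **Flexible deployment:** Runs as a standalone binary, in a Docker container, or via Helm for ongoing scheduled reports.

Whether you want to optimize a cloud bill or manage on-premise resource pricing, Burn offers a low-barrier, high-return way to regain control of your Kubernetes budget.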

Hacker News: Reconciling Kubernetes cost estimates with CUR / FOCUS billing data (github.com/tanrikuluozlem), 9 points, posted by OzlemT 1 hour ago

Original text

Your Kubernetes cluster is burning money. Find out where.

No agent to deploy. No dashboard to maintain. No YAML to configure. Just install and run.

  • Zero setup: brew install, run one command, get answers. No cluster agent, no persistent storage, no config files.
  • Full cost coverage: Compute, storage, load balancers, and GPU costs with real-time cloud pricing.
  • AI-powered: Ask questions in plain English, get kubectl commands you can copy-paste.
  • Slack-native: /burn for instant cost reports. /burn ask "..." for AI analysis.
  • Cloud + on-prem: Works with AWS EKS, Azure AKS, GCP GKE, and on-premise clusters.
  • Spot readiness: Identifies which workloads can safely move to spot instances with real-time discount and interruption rate.
  • Ingress LB detection: Detects load balancers from both Services and Ingress resources, with hostname deduplication.
  • Time-aware: --period 7d for weekly averages instead of point-in-time snapshots.
# Homebrew
brew install tanrikuluozlem/burn/burn

# Upgrade
brew upgrade tanrikuluozlem/burn/burn

# Binary
VERSION=$(curl -s https://api.github.com/repos/tanrikuluozlem/burn/releases/latest | grep tag_name | cut -d'"' -f4 | tr -d 'v') && \
curl -L "https://github.com/tanrikuluozlem/burn/releases/latest/download/burn_${VERSION}_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz" | tar xz

# Docker
docker pull ghcr.io/tanrikuluozlem/burn:latest

# Helm
git clone https://github.com/tanrikuluozlem/burn.git
helm install burn ./burn/charts/burn

# Go
go install github.com/tanrikuluozlem/burn/cmd/burn@latest

macOS: If you see a Gatekeeper warning, run: sudo xattr -d com.apple.quarantine $(which burn)

# Cost breakdown (without Prometheus)
burn analyze

# With Prometheus (pass your Prometheus URL)
burn analyze --prometheus http://prometheus:9090

# 7-day average
burn analyze --prometheus http://prometheus:9090 --period 7d

# Drill into a namespace
burn analyze --prometheus http://prometheus:9090 --namespace argocd

# Spot readiness
burn analyze --prometheus http://prometheus:9090 --spot

Real-time spot discount and interruption rate per instance type.

Get cluster-wide or namespace-specific recommendations:

burn analyze --prometheus http://prometheus:9090 --period 7d --ai
burn analyze --prometheus http://prometheus:9090 --namespace app-backend --ai
burn ask --prometheus http://prometheus:9090 "why is argocd so expensive?"

Example: burn analyze --namespace app-backend --period 7d --ai

NAMESPACE: app-backend (3 pods, $17.19/mo)
──────────────────────────────────
POD                      CPU REQ→USED  MEM REQ→USED   COST/MO
app-backend-deploy-0001  200m → <1m    256Mi → 9Mi    $5.73
app-backend-deploy-0002  200m → <1m    256Mi → 9Mi    $5.73
app-backend-deploy-0003  200m → <1m    256Mi → 128Mi  $5.73

RECOMMENDATIONS
───────────────
The app-backend namespace costs $17.19/mo across 3 pods, but CPU efficiency
is critically low at ~0.1% — pods request 200m CPU each while p95 usage
is under 0.31m.

[!!] 1. Rightsize CPU Requests using p95 data
   app-backend-deploy-0001: p95 CPU is 0.22m → recommend 1m (1.5x p95)
   app-backend-deploy-0002: p95 CPU is 0.30m → recommend 1m (1.5x p95)
   app-backend-deploy-0003: p95 MEM is 128Mi (50% eff) — leave as-is
   $ kubectl set resources deployment app-backend -n app-backend \
     --requests=cpu=1m,memory=14Mi --limits=cpu=200m,memory=256Mi

[!!] 2. app-backend-ingress LB ($19.71/mo) costs more than the namespace
   The load balancer alone exceeds the $17.19/mo compute cost.
   If internal-only, switch to ClusterIP to eliminate the LB cost.
   $ kubectl patch svc app-backend-ingress -n app-backend \
     -p '{"spec": {"type": "ClusterIP"}}'

[!] 3. Enable VPA in Recommend Mode
   Prevent over-provisioning from recurring with continuous p95 tracking.
   $ kubectl apply -f vpa-app-backend.yaml
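
The figures in the sample output above are consistent with a simple sizing rule: take the p95 usage, multiply by 1.5, round up, and never go below one unit. The Python sketch below reproduces those numbers. It is an illustration inferred from the sample only; the function names, the 1.5 headroom default, and the floor of 1 are assumptions, not Burn's confirmed implementation.

```python
import math


def recommend_request(p95_usage: float, headroom: float = 1.5, floor: int = 1) -> int:
    """Size a request at headroom x p95, rounded up, never below `floor`.

    Units are whatever the caller uses (millicores for CPU, MiB for memory).
    Hypothetical helper: the rule is inferred from the sample output.
    """
    return max(floor, math.ceil(p95_usage * headroom))


def efficiency_pct(used: float, requested: float) -> float:
    """Share of the requested resource that is actually used, in percent."""
    return 100.0 * used / requested if requested else 0.0


# Values taken from the sample output above.
cpu_rec_0001 = recommend_request(0.22)   # p95 CPU in millicores
cpu_rec_0002 = recommend_request(0.30)   # p95 CPU in millicores
mem_rec = recommend_request(9)           # observed memory in MiB
mem_eff_0003 = efficiency_pct(128, 256)  # pod using 128Mi of a 256Mi request

print(cpu_rec_0001, cpu_rec_0002, mem_rec, mem_eff_0003)
```

With these inputs the sketch yields 1m CPU and 14Mi memory, matching the `--requests=cpu=1m,memory=14Mi` values in the generated kubectl command, and 50% memory efficiency for the third pod.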

Ask questions in plain English

Requires the ANTHROPIC_API_KEY environment variable.

Run burn as a Slack bot:

burn serve --port 8080 --prometheus http://prometheus:9090 --period 7d

| Command | What you get |
| --- | --- |
| /burn | Full cost report: nodes, namespaces, idle cost, LB, storage |
| /burn ns argocd | Pod-level breakdown for a namespace |
| /burn ask "what is the single biggest waste?" | AI analysis with kubectl commands |

  1. Create a Slack App at https://api.slack.com/apps
  2. Add Slash Command: /burn → point to your server URL + /slack
  3. Set SLACK_SIGNING_SECRET and ANTHROPIC_API_KEY environment variables
  4. Expose the server (e.g., ngrok for testing, load balancer for production)

Burn works with on-premise and GPU clusters. Set your own resource rates:

burn analyze \
  --cpu-price 0.05 \
  --ram-price 0.008 \
  --gpu-price 3.00 \
  --storage-price 0.10

Without custom pricing, cloud-equivalent rates are used as defaults.
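
To see what these rates imply, the Python sketch below turns them into a monthly figure. The units follow the flag descriptions (CPU per core per hour, RAM per GiB per hour, GPU per unit per hour, storage per GiB per month). The 730-hour month and the idea of pricing allocated capacity directly are assumptions made for illustration; Burn's actual formula is not stated in this README.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_cost(cpu_cores: float, ram_gib: float, gpus: int = 0,
                 storage_gib: float = 0.0, cpu_price: float = 0.05,
                 ram_price: float = 0.008, gpu_price: float = 3.00,
                 storage_price: float = 0.10) -> float:
    """Estimate monthly cost from hourly compute rates and a monthly storage rate.

    Hypothetical helper; defaults mirror the example flag values above.
    """
    hourly = cpu_cores * cpu_price + ram_gib * ram_price + gpus * gpu_price
    return hourly * HOURS_PER_MONTH + storage_gib * storage_price


# A 4-core, 16 GiB node with 100 GiB of storage and no GPU.
node = monthly_cost(cpu_cores=4, ram_gib=16, storage_gib=100)
print(f"${node:.2f}/mo")
```

At the example rates, compute comes to 0.328 per hour (0.20 for CPU plus 0.128 for RAM), and storage adds a flat 10.00 per month.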

Kubernetes API → nodes, pods, PVCs, services, ingresses
Prometheus     → actual CPU & memory usage (optional)
Cloud Pricing  → real VM, storage, and GPU prices (AWS, Azure, GCP)
         ↓
    Cost Engine → compute, storage, load balancers, GPU, idle detection
         ↓
    CLI / Slack / AI Recommendations

| Priority | Source | When |
| --- | --- | --- |
| 1 | AWS/Azure pricing API | AWS credentials available; real-time, region-aware |
| 2 | Embedded pricing DB | No credentials; 600+ AWS, 300+ Azure instances, updated weekly |
| 3 | Static fallback | Unknown instance type; estimates based on instance family |

Storage and load balancer costs are fetched from cloud APIs when available, with static fallbacks. Usage-based charges (data processing, LCU) depend on traffic volume and are not included. GPU nodes are detected automatically and priced via ratio-based cost splitting.
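
The three pricing sources form a fallback chain: try the live pricing API, then the embedded database, then a per-family estimate. A minimal Python sketch of that lookup order follows. Everything in it is hypothetical: the function name, the data shapes, the AWS-style `family.size` naming used to derive the family, and the placeholder prices. Falling through to the embedded database when the live lookup returns nothing is also an assumption.

```python
from typing import Callable, Optional


def resolve_hourly_price(
    instance_type: str,
    live_lookup: Optional[Callable[[str], Optional[float]]],
    embedded_db: dict[str, float],
    family_estimates: dict[str, float],
) -> tuple[Optional[float], str]:
    """Return (hourly price, source) using the first source that has an answer."""
    # 1. Live pricing API, only when credentials made a lookup available.
    if live_lookup is not None:
        price = live_lookup(instance_type)
        if price is not None:
            return price, "pricing-api"
    # 2. Embedded pricing database shipped with the tool.
    if instance_type in embedded_db:
        return embedded_db[instance_type], "embedded-db"
    # 3. Static estimate keyed by instance family (AWS-style "m5.large" -> "m5").
    family = instance_type.split(".")[0]
    if family in family_estimates:
        return family_estimates[family], "static-fallback"
    return None, "unknown"


# Placeholder prices for illustration only, not real quotes.
db = {"m5.large": 0.096}
families = {"m5": 0.10}
print(resolve_hourly_price("m5.large", None, db, families))
print(resolve_hourly_price("m5.24xlarge", None, db, families))
```

Returning the source alongside the price lets a report flag which figures are estimates rather than live quotes.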

git clone https://github.com/tanrikuluozlem/burn.git
helm install burn ./burn/charts/burn \
  --set prometheus.url=http://prometheus:9090 \
  --set schedule="0 9 * * 1-5"

CronJob (weekday Slack reports)

apiVersion: batch/v1
kind: CronJob
metadata:
  name: burn-report
spec:
  schedule: "0 9 * * 1-5"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: burn
            image: ghcr.io/tanrikuluozlem/burn:latest
            args:
            - analyze
            - --prometheus
            - http://prometheus-server.monitoring:80
            - --period
            - 7d
            - --ai
            - --slack
            env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: burn-secrets
                  key: anthropic-api-key
            - name: SLACK_WEBHOOK_URL
              valueFrom:
                secretKeyRef:
                  name: burn-secrets
                  key: slack-webhook-url
          restartPolicy: OnFailure
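
The schedule `0 9 * * 1-5` in the manifest above uses the standard five-field cron format (minute, hour, day of month, month, day of week), so it fires at 09:00 on Monday through Friday. The Python sketch below checks a timestamp against such an expression. `cron_matches` is a hypothetical helper for illustration: it supports only `*`, single numbers, ranges, and comma lists, and it ANDs all five fields, whereas real cron ORs day-of-month and day-of-week when both are restricted.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field made of '*', numbers, 'a-b' ranges, or comma lists."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by the cron expression."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = when.isoweekday() % 7  # cron numbering: 0 = Sunday ... 6 = Saturday
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(dom, when.day)
        and _field_matches(month, when.month)
        and _field_matches(dow, cron_dow)
    )


print(cron_matches("0 9 * * 1-5", datetime(2024, 1, 1, 9, 0)))  # a Monday
print(cron_matches("0 9 * * 1-5", datetime(2024, 1, 6, 9, 0)))  # a Saturday
```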

| Variable | Description | Required for |
| --- | --- | --- |
| ANTHROPIC_API_KEY | Claude API key | --ai, ask, serve |
| SLACK_WEBHOOK_URL | Slack webhook URL | --slack |
| SLACK_SIGNING_SECRET | Slack app signing secret | serve |

| Flag | Description |
| --- | --- |
| --cpu-price | CPU cost per core per hour (on-prem) |
| --ram-price | RAM cost per GiB per hour (on-prem) |
| --gpu-price | GPU cost per unit per hour (on-prem) |
| --storage-price | Storage cost per GiB per month (on-prem) |
| --spot | Show spot instance readiness details |

Cloud clusters use real pricing automatically. These flags are for on-premise clusters where pricing is not available from a cloud provider.

make build    # Build binary
make test     # Run tests
make lint     # Run linter

Apache 2.0 — See LICENSE for details.
