AI OSS tool repo archived overnight after raising $7.3M seed round

Original link: https://github.com/tensorzero/tensorzero

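TensorZero is an open-source, production-grade LLMOps platform that aims to unify the entire large language model (LLM) lifecycle. It is built in Rust for performance (under 1 ms of p99 latency overhead) and exposes a gateway to every major LLM provider through a single OpenAI-compatible API. The platform brings together five core pillars:

* **Gateway:** high-performance, highly available access to any LLM provider, with built-in routing, retries, and fallbacks.
* **Observability:** real-time monitoring of inferences, cost, and feedback.
* **Evaluation:** benchmarking of workflows using heuristics and LLM judges.
* **Optimization:** feedback loops built on production data to improve prompts, models, and inference strategies.
* **Experimentation:** native support for A/B testing and model versioning.

In addition, **TensorZero Autopilot** acts as an automated AI engineer that uses observability data to optimize prompts and models autonomously. TensorZero is designed for scalability and is used by organizations ranging from AI startups to the Fortune 10. It supports self-hosted deployment and incremental adoption, and works with existing tools such as OpenTelemetry. Whether you need simple API routing or a complex agentic RAG system, TensorZero provides the infrastructure to ship robust LLM applications.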

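The open-source AI project **TensorZero** has been abruptly archived, and its website states that the software is no longer maintained. Although the original headline implied a new funding round, Hacker News users clarified that the company raised its $7.3M seed round in August 2025. The sudden shutdown without warning has prompted speculation that the company burned through its funding too quickly and failed to secure follow-on financing. Community members are discussing the broader implications for the AI sector, in particular venture investors' general reluctance to fund application-layer startups seen as "GPT wrappers" in favor of infrastructure projects. Some commenters criticized the shutdown and questioned whether the project had been set up mainly to raise money rather than to build a sustainable long-term product.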

Original text

GitHub Trending - #1 Repository Of The Day

TensorZero is an open-source LLMOps platform that unifies:

Gateway: access every LLM provider through a unified API, built for performance (<1ms p99 latency)
Observability: store inferences and feedback in your database, available programmatically or in the UI
Evaluation: benchmark individual inferences or end-to-end workflows using heuristics, LLM judges, etc.
Optimization: collect metrics and human feedback to optimize prompts, models, and inference strategies
Experimentation: ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.

You can take what you need, adopt incrementally, and complement with other tools. It plays nicely with the OpenAI SDK, OpenTelemetry, and every major LLM provider.

TensorZero is used by companies ranging from frontier AI startups to the Fortune 10 and fuels ~1% of global LLM API spend today.

Note

🆕 TensorZero Autopilot

TensorZero Autopilot is an automated AI engineer powered by TensorZero that analyzes LLM observability data, sets up evals, optimizes prompts and models, and runs A/B tests.

It dramatically improves the performance of LLM agents across diverse tasks:

Bar chart showing baseline vs. optimized scores across diverse LLM tasks

Integrate with TensorZero once and access every major LLM provider.

Call any LLM (API or self-hosted) through a single unified API
Infer with tool use, structured outputs (JSON), batch, embeddings, multimodal (images, files), caching, etc.
Create prompt templates and schemas to enforce a structured interface between your application and the LLMs
Satisfy extreme throughput and latency needs, thanks to 🦀 Rust: <1ms p99 latency overhead at 10k+ QPS
Ensure high availability with routing, retries, fallbacks, load balancing, granular timeouts, etc.
Track usage and cost and enforce custom rate limits with granular scopes (e.g. tags)
Set up auth for TensorZero to allow clients to access models without sharing provider API keys
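
The retry and fallback behavior listed above can be illustrated with a minimal sketch. This is a generic pattern, not TensorZero's implementation: `call_with_fallback`, `AllProvidersFailed`, and the two stand-in providers are hypothetical names introduced only for this example, and a real gateway would also handle timeouts, load balancing, and error classification.

```python
import time
from typing import Callable, List, Sequence

# A "provider" here is any callable that takes a prompt and returns text
# (hypothetical stand-in for a real LLM provider client).
Provider = Callable[[str], str]


class AllProvidersFailed(Exception):
    """Raised when every provider has exhausted its retries."""


def call_with_fallback(
    providers: Sequence[Provider],
    prompt: str,
    max_retries: int = 2,
    base_delay_s: float = 0.0,
) -> str:
    """Try each provider in order, retrying each one before falling back."""
    errors: List[Exception] = []
    for provider in providers:
        for attempt in range(max_retries + 1):
            try:
                return provider(prompt)
            except Exception as exc:
                errors.append(exc)
                # Exponential backoff between retries of the same provider.
                if attempt < max_retries:
                    time.sleep(base_delay_s * (2 ** attempt))
    raise AllProvidersFailed(errors)


# Stand-in providers: one always fails, one always succeeds.
calls = {"flaky": 0}


def flaky_provider(prompt: str) -> str:
    calls["flaky"] += 1
    raise TimeoutError("simulated timeout")


def stable_provider(prompt: str) -> str:
    return f"echo: {prompt}"


result = call_with_fallback([flaky_provider, stable_provider], "hello")
print(result)
```

The first provider is attempted `max_retries + 1` times before the call falls through to the second one, so the caller sees a single successful response.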

Supported Model Providers

Anthropic, AWS Bedrock, AWS SageMaker, Azure, DeepSeek, Fireworks, GCP Vertex AI Anthropic, GCP Vertex AI Gemini, Google AI Studio (Gemini API), Groq, Hyperbolic, Mistral, OpenAI, OpenRouter, SGLang, TGI, Together AI, vLLM, and xAI (Grok).

Need something else? TensorZero also supports any OpenAI-compatible API (e.g. Ollama).

You can use TensorZero with any OpenAI SDK (Python, Node, Go, etc.) or OpenAI-compatible client.

1. Deploy the TensorZero Gateway (one Docker container).
2. Update the base_url and model in your OpenAI-compatible client.
3. Run inference:

from openai import OpenAI

# Point the client to the TensorZero Gateway
client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    # Call any model provider (or TensorZero function)
    model="tensorzero::model_name::anthropic::claude-sonnet-4-6",
    messages=[
        {
            "role": "user",
            "content": "Share a fun fact about TensorZero.",
        }
    ],
)

See Quick Start for more information.

Zoom in to debug individual API calls, or zoom out to monitor metrics across models and prompts over time — all using the open-source TensorZero UI.

Send production metrics and human feedback to easily optimize your prompts, models, and inference strategies — using the UI or programmatically.

Compare prompts, models, and inference strategies using evaluations powered by heuristics and LLM judges.
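
As a rough illustration of a heuristic evaluation, the sketch below scores two variants against a small labeled dataset using exact match. Everything here is hypothetical and independent of TensorZero's evaluation API: `exact_match`, `evaluate`, the toy dataset, and the two variants are stand-ins, and an LLM judge would take the place of `exact_match` as the metric.

```python
from typing import Callable, List, Tuple

Dataset = List[Tuple[str, str]]  # (input, reference output) pairs


def exact_match(prediction: str, reference: str) -> float:
    """Heuristic metric: 1.0 if the normalized strings match, else 0.0."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def evaluate(
    variant: Callable[[str], str],
    dataset: Dataset,
    metric: Callable[[str, str], float] = exact_match,
) -> float:
    """Return the mean metric score of a variant over the dataset."""
    if not dataset:
        return 0.0
    scores = [metric(variant(question), reference) for question, reference in dataset]
    return sum(scores) / len(scores)


# Toy dataset and two stand-in variants (no LLM calls involved).
DATASET: Dataset = [("2+2", "4"), ("capital of France", "Paris")]
ANSWERS = {"2+2": "4", "capital of France": "paris"}


def variant_a(question: str) -> str:
    return ANSWERS[question]


def variant_b(question: str) -> str:
    return "4"


score_a = evaluate(variant_a, DATASET)
score_b = evaluate(variant_b, DATASET)
print(f"variant_a={score_a:.2f} variant_b={score_b:.2f}")
```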

Ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.
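
One common way to implement A/B assignment is deterministic weighted bucketing, sketched below. This is an assumption about the general technique, not a description of how TensorZero samples variants: `assign_variant`, the variant names, and the weights are all hypothetical.

```python
import hashlib
from typing import Mapping


def assign_variant(unit_id: str, weights: Mapping[str, float]) -> str:
    """Map a unit (e.g. a user or episode ID) to a variant, sticky across calls."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the ID and map the first 8 bytes to a point in [0, 1].
    digest = hashlib.sha256(unit_id.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    names = sorted(weights)  # fixed order so assignment is reproducible
    for name in names:
        cumulative += weights[name] / total
        if point < cumulative:
            return name
    return names[-1]  # guard against floating-point rounding


WEIGHTS = {"baseline": 0.9, "candidate": 0.1}
variant = assign_variant("episode-123", WEIGHTS)
print(variant)
```

Hashing the ID instead of drawing a fresh random number means the same unit always sees the same variant, which keeps multi-step interactions consistent within one experiment arm.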

Build with an open-source stack well-suited for prototypes but designed from the ground up to support the most complex LLM applications and deployments.

Frequently Asked Questions

How is TensorZero different from other LLM frameworks?

TensorZero enables you to optimize complex LLM applications based on production metrics and human feedback.
TensorZero supports the needs of industrial-grade LLM applications: low latency, high throughput, type safety, self-hosted, GitOps, customizability, etc.
TensorZero unifies the entire LLMOps stack, creating compounding benefits. For example, LLM evaluations can be used for fine-tuning models alongside AI judges.

Can I use TensorZero with ___?

Yes. Every major programming language is supported. It plays nicely with the OpenAI SDK, OpenTelemetry, and every major LLM provider.

Is TensorZero production-ready?

Yes. TensorZero is used by companies ranging from frontier AI startups to the Fortune 10 and powers ~1% of the global LLM API spend today.

Here's a case study: Automating Code Changelogs at a Large Bank with LLMs

How much does TensorZero cost?

TensorZero (LLMOps platform) is 100% self-hosted and open-source.

TensorZero Autopilot (automated AI engineer) is a complementary paid product powered by TensorZero.

Who is building TensorZero?

Our technical team includes a former Rust compiler maintainer, machine learning researchers (Stanford, CMU, Oxford, Columbia) with thousands of citations, and the chief product officer of a decacorn startup. We're backed by the same investors as leading open-source projects (e.g. ClickHouse, CockroachDB) and AI labs (e.g. OpenAI, Anthropic). See our $7.3M seed round announcement and coverage from VentureBeat. We're hiring in NYC.

How do I get started?

You can adopt TensorZero incrementally. Our Quick Start goes from a vanilla OpenAI wrapper to a production-ready LLM application with observability and fine-tuning in just 5 minutes.

Start building today. The Quick Start shows it's easy to set up an LLM application with TensorZero.

Questions? Ask us on Slack or Discord.

Using TensorZero at work? Email us at [email protected] to set up a Slack or Teams channel with your team (free).

We are working on a series of complete runnable examples illustrating TensorZero's data & learning flywheel.

Optimizing Data Extraction (NER) with TensorZero

This example shows how to use TensorZero to optimize a data extraction pipeline. We demonstrate techniques like fine-tuning and dynamic in-context learning (DICL). In the end, an optimized GPT-4o Mini model outperforms GPT-4o on this task — at a fraction of the cost and latency — using a small amount of training data.
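
The idea behind dynamic in-context learning can be sketched as follows: for each new input, retrieve the most similar past examples and prepend them to the prompt as demonstrations. This is a toy illustration under stated assumptions, not the example's actual code: a bag-of-words vector stands in for a real embedding model, and `embed`, `cosine`, `select_examples`, `build_messages`, and the sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (input text, known-good output)


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_examples(query: str, examples: List[Example], k: int = 2) -> List[Example]:
    """Return the k examples whose inputs are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        examples, key=lambda ex: cosine(query_vec, embed(ex[0])), reverse=True
    )
    return ranked[:k]


def build_messages(query: str, examples: List[Example], k: int = 2) -> List[Dict[str, str]]:
    """Build a chat prompt with the retrieved examples as demonstrations."""
    messages: List[Dict[str, str]] = []
    for example_input, example_output in select_examples(query, examples, k):
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages


EXAMPLES: List[Example] = [
    ("Alice flew to Paris", "person=Alice; location=Paris"),
    ("The invoice total is 40 dollars", "amount=40 dollars"),
    ("Bob flew to Berlin", "person=Bob; location=Berlin"),
]

messages = build_messages("Carol flew to Madrid", EXAMPLES, k=2)
for message in messages:
    print(message["role"], "|", message["content"])
```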

Agentic RAG — Multi-Hop Question Answering with LLMs

This example shows how to build a multi-hop retrieval agent using TensorZero. The agent iteratively searches Wikipedia to gather information, and decides when it has enough context to answer a complex question.

Writing Haikus to Satisfy a Judge with Hidden Preferences

This example fine-tunes GPT-4o Mini to generate haikus tailored to a specific taste. You'll see TensorZero's "data flywheel in a box" in action: better variants lead to better data, and better data leads to better variants. You'll see progress by fine-tuning the LLM multiple times.

Image Data Extraction — Multimodal (Vision) Fine-tuning

This example shows how to fine-tune multimodal models (VLMs) like GPT-4o to improve their performance on vision-language tasks. Specifically, we'll build a system that categorizes document images (screenshots of computer science research papers).

Improving LLM Chess Ability with Best-of-N Sampling

This example showcases how best-of-N sampling can significantly enhance an LLM's chess-playing abilities by selecting the most promising moves from multiple generated options.
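
In outline, best-of-N sampling generates several candidates and keeps the one an evaluator prefers. The sketch below assumes a numeric scoring function as the evaluator; `best_of_n`, the move list, and the hand-assigned scores are hypothetical stand-ins for LLM sampling and judging, not the example's actual code.

```python
import random
from typing import Callable


def best_of_n(generate: Callable[[], str], score: Callable[[str], float], n: int = 5) -> str:
    """Sample n candidates and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)


# Toy stand-ins: candidate opening moves with made-up quality scores.
MOVE_SCORES = {"a3": 0.1, "e4": 0.9, "d4": 0.8, "h4": 0.2}
rng = random.Random(0)  # seeded so the demo is reproducible


def sample_move() -> str:
    return rng.choice(sorted(MOVE_SCORES))


def score_move(move: str) -> float:
    return MOVE_SCORES[move]


best = best_of_n(sample_move, score_move, n=5)
print(best)
```

Larger values of `n` raise the chance that a strong candidate is among the samples, at the cost of proportionally more generation calls.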

We write about LLM engineering on the TensorZero Blog. Here are some of our favorite posts: