Optimizing Content for Agents

Original link: https://cra.mr/optimizing-content-for-agents/

This post pushes back on the "LLMs.txt" idea, treating it as a flawed attempt at optimizing content for AI, while rejecting the claim that AI can simply rely on existing tools such as APIs. The core message: **optimize content for agents the same way you optimize it for humans.** The key levers are content order, size, and depth, recognizing that agents often read only part of a file and benefit from information presented up front. One practical mechanism is **content negotiation** — identifying agents via the `Accept: text/markdown` header. Sentry puts this into practice by serving agents lean markdown documentation, stripping browser-specific elements and prioritizing link hierarchy. They also steer agents toward programmatic access (MCP, CLI, API) instead of the HTML UI. For projects like Warden, they serve the full markdown content so agents can bootstrap themselves. The author stresses that this is an evolving area requiring continual adaptation as agent behavior changes; ultimately, serving machine-readable content makes agents more capable and efficient.


> Just as useless of an idea as LLMs.txt was
>
> It’s all dumb abstractions that AI doesn’t need because AIs are as smart as humans so they can just use what was already there, which is APIs

LLMs.txt is indeed useless, but that’s the only correct thing in this statement. I’m here once again being rage-baited into addressing more brainless takes on social media. This one is about content optimization.

Short and to the point: you should be optimizing content for agents, just as you optimize things for people. How you do that is an ever-evolving subject, but there are some common things we see:

  • order of content
  • content size
  • depth of nodes

Frontier models and the agents built on top of them all behave similarly, with similar constraints and optimizations. For example, one thing they’re known to do, to avoid context bloat, is to only read parts of files. The first N lines, or bytes, or characters. They’re also known to behave very differently when they’re told information exists somewhere vs. having to discover it on their own. Both of those concerns are actually why LLMs.txt was a valuable idea, but it was the wrong implementation.
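That prefix-reading behavior is easy to picture with a small sketch (illustrative only; the function name and cutoffs are made up, not taken from any agent's actual implementation). If your key information sits below the cutoff, the agent never sees it:

```python
# Mimic an agent reading only a prefix of a document to avoid context bloat.
# Anything past the line/byte cutoff is invisible to the model.

def read_prefix(text: str, max_lines: int = 100, max_bytes: int = 4096) -> str:
    """Return the first max_lines lines, further truncated to max_bytes."""
    head = "\n".join(text.splitlines()[:max_lines])
    # Truncate on bytes too, dropping any partially-cut character at the end.
    return head.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
```

This is why content order matters: a page that leads with its sitemap or summary survives truncation; a page that buries it behind navigation chrome does not.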

The implementation today is simple: content negotiation. When a request comes in with Accept: text/markdown, you can confidently assume you have an agent. That’s your hook, and now it’s just up to you how you optimize it. I’m going to be brief and to the point and just give you a few examples of how we do that at Sentry.
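A minimal sketch of that detection, honoring `Accept` quality values (this is an illustration of the idea, not Sentry's implementation; the function name is made up):

```python
# Decide whether a request prefers markdown over HTML, based on the
# Accept header's media ranges and q-values (RFC 9110 section 12.5.1).

def prefers_markdown(accept: str) -> bool:
    """True when text/markdown outranks text/html (and */*) in Accept."""
    weights: dict[str, float] = {}
    for part in accept.split(","):
        fields = part.strip().split(";")
        media = fields[0].strip().lower()
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        weights[media] = max(q, weights.get(media, 0.0))
    md = weights.get("text/markdown", 0.0)
    # Treat */* as an HTML-ish fallback: a plain browser request should lose.
    html = max(weights.get("text/html", 0.0), weights.get("*/*", 0.0))
    return md > 0 and md >= html
```

So `Accept: text/markdown` selects the agent path, while a typical browser header like `text/html,application/xhtml+xml;q=0.9` does not.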

## Docs

We’ve put a bunch of time into optimizing our docs for agents, for obvious reasons. The primary optimizations are mostly simple:

  1. Serve true markdown content - massive tokenization savings as well as improved accuracy
  2. Strip out things that only make sense in the context of the browser, especially navigation and JavaScript bits
  3. Optimize various pages to focus more on link hierarchy - our index, for example, is mostly a sitemap, completely different from the non-markdown version
$ curl -H "Accept: text/markdown" https://docs.sentry.io/

---
title: "Sentry Documentation"
url: https://docs.sentry.io/
---

# Sentry Documentation

Sentry is a developer-first application monitoring platform that helps you identify and fix issues in real-time. It provides error tracking, performance monitoring, session replay, and more across all major platforms and frameworks.

## Key Features

* **Error Monitoring**: Capture and diagnose errors with full stack traces, breadcrumbs, and context
* **Tracing**: Track requests across services to identify performance bottlenecks
* **Session Replay**: Watch real user sessions to understand what led to errors
* **Profiling**: Identify slow functions and optimize application performance
* **Crons**: Monitor scheduled jobs and detect failures
* **Logs**: Collect and analyze application logs in context

...

In our case we actually use MDX to render these, so it involved a handful of parsing changes and overrides to allow certain key pages to render differently. The result: agents fetch pages that are much more actionable.

## Sentry

If a headless bot is fetching the website, the least useful thing you can do is serve it an authentication-required page. In our case we use the opportunity to inform the agent that there are a few programmatic ways it can access the application information (MCP, CLI, API, etc):

$ curl -H "Accept: text/markdown" https://sentry.io

# Sentry

You've hit the web UI. It's HTML meant for humans, not machines.
Here's what you actually want:

## MCP Server (recommended)

The fastest way to give your agent structured access to Sentry.
OAuth-authenticated, HTTP streaming, no HTML parsing required.

```json
{
  "mcpServers": {
    "sentry": {
      "url": "https://mcp.sentry.dev/mcp"
    }
  }
}
```

Docs: https://mcp.sentry.dev

## CLI

Query issues and analyze errors from the terminal.

https://cli.sentry.dev

...
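Server-side, the branching described above amounts to a single check on the request's Accept header. A hypothetical sketch (the `AGENT_LANDING` constant and `handle_root` function are illustrative names, not Sentry's actual code):

```python
# Instead of returning an authentication-required page to a headless agent,
# return markdown that points at the programmatic entry points.

AGENT_LANDING = """\
# Sentry

You've hit the web UI. It's HTML meant for humans, not machines.
See https://mcp.sentry.dev for structured access.
"""

def handle_root(accept_header: str) -> tuple[str, str]:
    """Return (content_type, body) for a request to the application root."""
    if "text/markdown" in accept_header:
        return ("text/markdown", AGENT_LANDING)
    # Humans (browsers) still get the normal HTML login flow.
    return ("text/html", "<html><!-- login page for humans --></html>")
```

The agent never has to parse HTML or navigate a login wall; it gets pointed straight at MCP, the CLI, or the API.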

## Warden

For projects like Warden, we actually set it up so the agent can hit the entire content to bootstrap itself:

Help me set up warden.sentry.dev

curl -H "Accept: text/markdown" https://warden.sentry.dev

# Warden

> Agents that review your code. Locally or on every PR.

Warden watches over your code by running **skills** against your changes. Skills are prompts that define what to look for: security vulnerabilities, API design issues, performance problems, or anything else you want consistent coverage on.

Skills follow the [agentskills.io](https://agentskills.io) specification. They're markdown files with a prompt that tells the AI what to look for. You can use community skills, write your own, or combine both.

- Docs: https://warden.sentry.dev
- GitHub: https://github.com/getsentry/warden
- npm: https://www.npmjs.com/package/@sentry/warden

## How It Works

Every time you run Warden, it:

1. Identifies what changed (files, hunks, or entire directories)
2. Matches changes against configured triggers
3. Runs the appropriate skills against matching code
4. Reports findings with severity, location, and optional fixes

Warden works in two contexts:

- **Locally** - Review changes before you push, get instant feedback
- **In CI** - Automatically review pull requests, post findings as comments

## Quick Start

...

## That’s It

It’s simple and it works. You should do it. You should also pay attention to how patterns are changing with agents and update your optimizations as behavior changes.
