Parity
Imagine a notes app with a beautiful interface for creating, organizing, and tagging notes. A user asks: "Create a note summarizing my meeting and tag it as urgent." If the UI can do it but the agent can't, the agent is stuck.
The fix: Ensure the agent has tools (or combinations of tools) that can accomplish anything the UI can do. This isn't about a one-to-one mapping of UI buttons to tools—it's about achieving the same outcomes.
The discipline: When adding any UI capability, ask: Can the agent achieve this outcome? If not, add the necessary tools or primitives.
A capability map helps:
| User Action | How the Agent Achieves It |
|---|---|
| Create a note | write_file to notes directory, or create_note tool |
| Tag a note as urgent | update_file metadata, or tag_note tool |
| Search notes | search_files or search_notes tool |
| Delete a note | delete_file or delete_note tool |
The test: Pick any action a user can take in your UI. Describe it to the agent. Can it accomplish the outcome?
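As a concrete sketch of parity, the four rows of the capability map could resolve to tools like these. This is a minimal sketch, assuming notes are stored as Markdown files with a JSON sidecar for metadata; the directory layout and function bodies are illustrative assumptions, not a prescribed design:

```python
import json
from pathlib import Path

NOTES_DIR = Path("notes")  # assumption: notes live as files in one directory

def create_note(title: str, body: str) -> str:
    """Parity with the UI's 'new note' action."""
    NOTES_DIR.mkdir(exist_ok=True)
    path = NOTES_DIR / f"{title}.md"
    path.write_text(body)
    return str(path)

def tag_note(title: str, tag: str) -> None:
    """Parity with the UI's tag picker; tags go in a sidecar metadata file."""
    meta_path = NOTES_DIR / f"{title}.meta.json"
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else {"tags": []}
    if tag not in meta["tags"]:
        meta["tags"].append(tag)
    meta_path.write_text(json.dumps(meta))

def search_notes(query: str) -> list[str]:
    """Parity with the UI's search box: a naive full-text scan."""
    return [p.name for p in NOTES_DIR.glob("*.md")
            if query.lower() in p.read_text().lower()]

def delete_note(title: str) -> None:
    """Parity with the UI's delete action."""
    (NOTES_DIR / f"{title}.md").unlink(missing_ok=True)
```

The exact file layout doesn't matter; what matters is that every row in the capability map resolves to a tool call the agent can make.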
Granularity
The key shift: The agent is pursuing an outcome with judgment, not executing a choreographed sequence. It can handle unexpected cases, adjust its approach, or ask clarifying questions; the loop continues until the outcome is achieved.
The more atomic your tools, the more flexibly the agent can use them. If you bundle decision logic into tools, you've moved judgment back into code.
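To make the contrast concrete, here is a bundled tool next to the atomic primitives it hides, building on the parity sketch above (summarize stands in for a hypothetical LLM call):

```python
def summarize(transcript: str) -> str:
    ...  # stand-in for a hypothetical LLM call

# Bundled: the decision logic lives inside the tool. The agent can only
# trigger the whole choreography, never vary any step of it.
def file_meeting_note(transcript: str) -> None:
    summary = summarize(transcript)
    create_note("Meeting", summary)        # from the parity sketch
    if "deadline" in summary.lower():      # "urgent" is now frozen in code
        tag_note("Meeting", "urgent")

# Atomic: expose summarize, create_note, and tag_note separately, and the
# agent decides what counts as urgent, what the title should be, and
# whether the note should be split in two.
```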
Composability
Composition works for developers and users alike: you can ship new features by adding prompts, and users can customize behavior by modifying prompts or creating their own.
The constraint: this only works if tools are atomic enough to be composed in ways you didn't anticipate, and if the agent has parity with users. If tools encode too much logic, composition breaks down.
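One way to picture a feature shipped as a prompt rather than code, assuming the prompt lives as a constant or file handed to the agent (the name and wording here are hypothetical):

```python
# Hypothetical feature-as-prompt: no new tools, just composition of the
# atomic primitives that already exist.
WEEKLY_REVIEW_PROMPT = """\
Find every note created in the last 7 days, group them by tag, and create
a note titled 'Weekly Review' that summarizes open questions and lists
anything tagged urgent.
"""
```

A user who wants a different cadence or grouping edits the prompt, not the product.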
Emergent Capability
Example: "Cross-reference my meeting notes with my task list and tell me what I've committed to but haven't scheduled." You didn't build a commitment tracker, but if the agent can read notes and tasks, it can accomplish this.
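A hypothetical trace of how the agent might satisfy this request with nothing but the atomic tools from the sketches above (the tasks.md path is an assumption):

```python
from pathlib import Path

# The agent chose this sequence itself; nothing here is feature code.
# search_notes and NOTES_DIR come from the parity sketch above.
notes = [(NOTES_DIR / name).read_text() for name in search_notes("meeting")]
tasks = Path("tasks.md").read_text()   # assumption: tasks live in one file
# The model then compares commitments mentioned in the notes against the
# task list and reports anything promised but not yet scheduled.
```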
This reveals latent demand. Instead of guessing what features users want, you observe what they're asking the agent to do. When patterns emerge, you can optimize them with domain-specific tools or dedicated prompts. But you didn't have to anticipate them—you discovered them.
This changes how you build products. You're not trying to imagine every feature upfront. You're creating a capable foundation and learning from what emerges.
Improvement Over Time
Accumulated context: The agent maintains state across sessions, tracking what exists, what the user has done, and what worked (a minimal sketch follows at the end of this section).
Prompt refinement at multiple levels: developer-level updates, user-level customization, and (advanced) agent-level adjustments based on feedback.
Self-modification (advanced): Agents that edit their own prompts or code require safety rails—approval gates, checkpoints, rollback paths, and health checks.
The mechanisms are still being discovered. Context and prompt refinement are proven; self-modification is emerging.
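To make accumulated context concrete, one minimal sketch: persist a small state file between sessions and inject it into the agent's context at startup. The file name and schema here are assumptions, not a standard:

```python
import json
from pathlib import Path

STATE_PATH = Path("agent_state.json")  # assumption: one state file per user

def load_state() -> dict:
    """What exists, what the user has done, and what worked."""
    if STATE_PATH.exists():
        return json.loads(STATE_PATH.read_text())
    return {"known_notes": [], "recent_actions": [], "learned_preferences": {}}

def save_state(state: dict) -> None:
    STATE_PATH.write_text(json.dumps(state, indent=2))

# At session start, load_state() is added to the agent's context; at session
# end, the agent records what it did and what the user corrected.
```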