Agent-to-Agent Pair Programming

Original link: https://axeldelafosse.com/blog/agent-to-agent-pair-programming

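The most effective AI workflows tend to mimic human collaboration, and pair programming is one example. The author found that having models such as Claude and Codex work together, one as the main worker and the other as a reviewer, produces strong results. The two often give different feedback, and when they give the same feedback it is a very strong signal: the author's team addresses 100% of the feedback that both reviewers agree on.

To support this, the author built "loop", a simple CLI tool that runs Claude and Codex side-by-side in tmux, with a bridge that lets them talk to each other directly. This speeds up the feedback loop and lets the agents be more proactive, while still allowing human oversight.

The experiment points to agentic workflows shifting toward "teamwork" rather than pure automation. The approach is promising, but open questions remain around the human handoff and the larger volume of changes to review. The author encourages multi-agent apps to treat agent-to-agent communication as a first-class feature, and shares the project on GitHub for others to explore.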

Hacker News: Agent-to-Agent Pair Programming (axeldelafosse.com), 4 points by axldelafosse, 1 hour ago

Original text

What if you could let Claude and Codex work together as pair programmers, talking to each other directly? One of them as the main worker and the other as a reviewer.

It is amusing how the best agentic workflows often look a lot like human collaboration. Researchers at Cursor discovered this in their work on long-running coding agents. That work led them to create a multi-agent workflow with a main orchestrator assigning tasks to workers. This is similar to how most human teams operate. Claude Code “Agent teams” and Codex “Multi-agent” features work similarly, with subagents reporting back to the main agent. And in the future, subagents could interact with each other, like humans do.

I wanted to pursue the idea of mimicking human collaboration with multiple agent harnesses and another workflow used by programmers: pair programming. While building a code review agent using Claude and Codex side-by-side, I found something interesting: they gave different feedback -- but even when they gave the same feedback, it wasn’t annoying: it was in fact a very strong signal. Our team addresses 100% of the feedback when both reviewers agree. Code reviews are great because they happen on a multiplayer app where humans and agents collaborate, but they slow down the feedback loop and can become noisy.
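
To make the "both reviewers agree" signal concrete, here is a minimal Python sketch of how two reviewers' findings could be triaged. It is an illustration only, not how the code review agent described above works: the `Finding` type, the `triage` function, and the idea that each finding has already been normalized to a file, line, and short issue label are all assumptions. Real review feedback is free text and would need fuzzier matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One piece of review feedback (hypothetical, pre-normalized shape)."""
    path: str
    line: int
    issue: str  # short label such as "missing-null-check"


def triage(claude: set[Finding], codex: set[Finding]) -> dict[str, set[Finding]]:
    """Split findings into those both reviewers raised and those only one raised."""
    return {
        "both": claude & codex,         # agreement: treat as a strong signal
        "claude_only": claude - codex,  # a different perspective, still worth reading
        "codex_only": codex - claude,
    }
```

Under these assumptions, the "both" bucket is the one that always gets addressed, and the other two buckets are where the different perspectives show up.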

That’s why I built loop: a dead-simple CLI that launches claude and codex side-by-side in tmux, with a bridge that lets them talk to each other. It makes this feedback loop faster and more natural, while preserving context across iterations. It’s interesting because it enables the agents to be more proactive, since the interaction between them is more natural (and I expect that to only get better as the models get better too). Because loop runs the interactive TUIs, you can stay in the loop, steer, answer questions, and follow up if needed.
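
For readers curious what "side-by-side in tmux with a bridge" could look like mechanically, here is a rough Python sketch built on standard tmux subcommands (`new-session`, `split-window`, `send-keys`, `capture-pane`). It is not loop's actual implementation, which may work quite differently. The session name `pair`, the function names, and the assumption that `claude` and `codex` are on the `PATH` are all hypothetical.

```python
import subprocess

SESSION = "pair"  # hypothetical session name, not taken from loop


def launch_commands(session: str = SESSION, left: str = "claude",
                    right: str = "codex") -> list[list[str]]:
    """tmux commands that start two TUIs next to each other in one window."""
    return [
        # Detached session whose first pane runs the left-hand agent.
        ["tmux", "new-session", "-d", "-s", session, left],
        # Horizontal split: a second pane running the right-hand agent.
        ["tmux", "split-window", "-h", "-t", session, right],
    ]


def send_commands(target: str, message: str) -> list[list[str]]:
    """tmux commands that type a message into a pane and submit it."""
    return [
        # -l sends the text literally instead of looking up key names.
        ["tmux", "send-keys", "-t", target, "-l", message],
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def read_pane(target: str) -> str:
    """Return the visible contents of a pane as text (requires a running tmux)."""
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", target],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def run_all(commands: list[list[str]]) -> None:
    """Execute each tmux command in order, raising on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

A bridge along these lines would call `run_all(launch_commands())`, read one agent's pane with `read_pane("pair:0.0")`, and forward the relevant text to the other with `run_all(send_commands("pair:0.1", text))`. Pane targets like `pair:0.1` assume tmux's default `base-index` and `pane-base-index` settings. Because both panes are ordinary interactive TUIs, you can attach with `tmux attach -t pair` and steer either agent by hand.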

The future of agentic workflows may look less like magic automation and more like familiar teamwork. And I’m sure that there are some great observations to apply to this pair programming workflow. Some open questions remain around how to make the human handoff and PR review easier:

  • Should we split the work across multiple PRs?
  • Should we share the PLAN.md in git or in the PR description?
  • Should we share a screenshot or video recording as a proof of work?

Letting the agents loop can result in more changes than expected, which are usually welcome -- but unfortunately it makes the human review harder.

A lot of people are using multiple agent harnesses for a variety of reasons: to avoid vendor lock-in, to use and contribute to an open-source project, to max out their subscriptions, or to get different perspectives, strengths, and results. Multi-agent harness apps should probably treat agent-to-agent communication as a first-class feature. I’d love to see them adopt this approach.

Try it out: https://github.com/axeldelafosse/loop

Thanks to Léna Deloizy Delafosse, Will Horn, Tian Wang and Ferruccio Balestreri for reading drafts of this.
