展示 HN:浏览器 Harness – 让 LLM 能够完成任何浏览器任务
Show HN: Browser Harness – Gives LLM freedom to complete any browser task

原始链接: https://github.com/browser-use/browser-harness

## 浏览器 Harness:一个自我提升的网页代理 浏览器 Harness 是一个非常简单(约 592 行 Python 代码)且强大的工具,它使大型语言模型 (LLM) 能够完全自由地自动化浏览器任务。它直接构建在 Chrome DevTools Protocol (CDP) 之上,无需任何框架或预定义的“框架”。 至关重要的是,该代理*自我改进*——如果缺少必要的函数,它会在任务进行中直接将其写入 `helpers.py` 文件。安装涉及将仓库连接到真实的浏览器(说明在 `install.md` 中)。 用户然后可以简单地定义一个任务,代理处理浏览器交互。免费版本提供并发浏览器、代理和验证码解决。该项目通过在任务执行期间自动生成“技能”(特定于站点的自动化知识)来鼓励社区贡献,用户可以将其作为拉取请求提交。这培养了一个持续学习的网页代理。

## 浏览器 Harness:具有无限制浏览器访问权限的 LLM 一个名为 Browser Harness (github.com/browser-use) 的新项目旨在为大型语言模型 (LLM) 提供与浏览器交互的最大自由度。与限制 LLM 的现有框架不同,该 Harness 允许它们自我纠正,甚至根据需要添加新工具,利用其对 Chrome DevTools Protocol (CDP) 的预训练知识。 核心思想是尽量减少预定义函数,而是依赖 LLM 导航浏览器复杂性的能力,例如跨域 iframe 和警告框。这种“基于技能”的方法——依赖 `SKILL.md` 文档和一些基本的 Python 辅助工具——已经使 LLM 能够取得令人印象深刻的成就,例如下 Stockfish 和创造四格消除世界纪录。 虽然存在对安全性和机器人检测的担忧,但开发者强调本地部署选项。该项目通过提供更无限制和上下文感知 LLM 环境,与 Playwright 和 agent-browser 等替代方案区分开来,从而可能解锁更强大和更具适应性的浏览器自动化。安装通过 Claude 代码设置简化。
相关文章

原文

Browser Harness

The simplest, thinnest, self-healing harness that gives LLM complete freedom to complete any browser task. Built directly on CDP.

The agent writes what's missing, mid-task. No framework, no recipes, no rails. One websocket to Chrome, nothing between.

  ● agent: wants to upload a file
  │
  ● helpers.py → upload_file() missing
  │
  ● agent edits the harness and writes it    helpers.py   192 → 199 lines
  │                                                       + upload_file()
  ✓ file uploaded

You will never use the browser again.

Paste into Claude Code or Codex:

Set up https://github.com/browser-use/browser-harness for me.

Read `install.md` first to install and connect this repo to my real browser. Then read `SKILL.md` for normal usage. Always read `helpers.py` because that is where the functions are. When you open a setup or verification tab, activate it so I can see the active browser tab. After it is installed, open this repository in my browser and, if I am logged in to GitHub, ask me whether you should star it for me as a quick demo that the interaction works — only click the star if I say yes. If I am not logged in, just go to browser-use.com.

When this page appears, tick the checkbox so the agent can connect to your browser:

Remote debugging setup

See domain-skills/ for example tasks.

Useful for stealth, sub-agents, or deployment.
Free tier: 3 concurrent browsers, proxies, captcha solving, and more. No card required.

How simple is it? (~592 lines of Python)

  • install.md — first-time install and browser bootstrap
  • SKILL.md — day-to-day usage
  • run.py (~36 lines) — runs plain Python with helpers preloaded
  • helpers.py (~195 lines) — starting tool calls; the agent edits these
  • admin.py + daemon.py (~361 lines) — daemon bootstrap plus the CDP websocket and socket bridge

PRs and improvements welcome. The best way to help: contribute a new domain skill under domain-skills/ for a site or task you use often (LinkedIn outreach, ordering on Amazon, filing expenses, etc.). Each skill teaches the agent the selectors, flows, and edge cases it would otherwise have to rediscover.

  • Skills are written by the harness, not by you. Just run your task with the agent — when it figures something non-obvious out, it files the skill itself (see SKILL.md). Please don't hand-author skill files; agent-generated ones reflect what actually works in the browser.
  • Open a PR with the generated domain-skills/<site>/ folder — small and focused is great.
  • Bug fixes, docs tweaks, and helper improvements are equally welcome.
  • Browse existing skills (github/, linkedin/, amazon/, ...) to see the shape.

If you're not sure where to start, open an issue and we'll point you somewhere useful.


The Bitter Lesson of Agent Harnesses · Web Agents That Actually Learn

联系我们 contact @ memedata.com