# Show HN: On-device browser agent (Qwen) running locally in Chrome

Original link: https://github.com/RunanywhereAI/on-device-browser-agent

## Local Browser: On-Device Web Automation

Local Browser is a Chrome extension that performs AI-powered web automation **entirely on your device**, prioritizing privacy and offline capability. It uses WebLLM with WebGPU for local LLM inference, with no cloud APIs or API keys required.

The extension uses a multi-agent system (a **Planner** for strategic task decomposition and a **Navigator** for tactical action execution) to browse, click, type, and extract data from web pages. Users enter tasks through a React popup, e.g. "Search for 'WebGPU' on Wikipedia…".

**Key features:** full privacy, offline operation after a ~1GB initial model download (Qwen2.5-1.5B-Instruct by default, with Phi-3.5-mini and Llama-3.2 options), and automatic rebuilds during development.

**Requirements:** Chrome 124+, Node.js 18+, a WebGPU-capable GPU, and sufficient disk space. This is a proof of concept focused on text-based DOM analysis within a single tab; it may struggle with complex tasks or pages that block content scripts.

A Chrome extension by RunanywhereAI (github.com/runanywhereai) that lets users run a browser agent locally on-device. The agent is powered by Alibaba's Qwen model (with Liquid LFM also mentioned) running via WebGPU, and can perform tasks inside the browser, such as opening the All-In Podcast on YouTube. The project currently ships a mobile SDK, with Web SDK support in development. Commenters were impressed by the performance of the small 3B Qwen model and curious about the potential of hosting larger local models (such as gpt-oss-20b) through tools like Ollama. The discussion also touched on security concerns, in particular malicious use resembling browser-based cryptocurrency mining, which could lead to distributed botnets running large language models.

Original README

A Chrome extension that uses WebLLM to run AI-powered web automation entirely on-device. No cloud APIs, no API keys, fully private.

demo.mp4
## Features
  • On-Device AI: Uses WebLLM with WebGPU acceleration for local LLM inference
  • Multi-Agent System: Planner + Navigator agents for intelligent task execution
  • Browser Automation: Navigate, click, type, extract data from web pages
  • Privacy-First: All AI runs locally, no data leaves your device
  • Offline Support: Works offline after initial model download
## Requirements
  • Chrome 124+ (required for WebGPU in service workers)
  • Node.js 18+ and npm
  • GPU with WebGPU support (most modern GPUs work)
## Installation
  1. Clone and install dependencies:

    cd local-browser
    npm install
  2. Build the extension:

  3. Load in Chrome:

    • Open chrome://extensions
    • Enable "Developer mode" (top right)
    • Click "Load unpacked"
    • Select the dist folder from this project
  4. First run:

    • Click the extension icon in your toolbar
    • The first run will download the AI model (~1GB)
    • This is cached for future use
## Usage
  1. Navigate to any webpage
  2. Click the Local Browser extension icon
  3. Type a task like:
    • "Search for 'WebGPU' on Wikipedia and extract the first paragraph"
    • "Go to example.com and tell me what's there"
    • "Find the search box and search for 'AI news'"
  4. Watch the AI execute the task step by step

## Development

The development build watches for changes and rebuilds automatically.

## Project Structure

local-browser/
├── manifest.json           # Chrome extension manifest (MV3)
├── src/
│   ├── background/         # Service worker
│   │   ├── index.ts        # Entry point & message handling
│   │   ├── llm-engine.ts   # WebLLM wrapper
│   │   └── agents/         # AI agent system
│   │       ├── base-agent.ts
│   │       ├── planner-agent.ts
│   │       ├── navigator-agent.ts
│   │       └── executor.ts
│   ├── content/            # Content scripts
│   │   ├── dom-observer.ts # Page state extraction
│   │   └── action-executor.ts
│   ├── popup/              # React popup UI
│   │   ├── App.tsx
│   │   └── components/
│   └── shared/             # Shared types & constants
└── dist/                   # Build output
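The content-script split above (`dom-observer.ts` extracting page state for the LLM) can be sketched roughly as follows. This is illustrative only: the `PageElement` shape and `summarizePage` helper are hypothetical names, not the repo's actual API.

```typescript
// Hypothetical sketch of what dom-observer.ts might do: flatten the page's
// interactive elements into an indexed text list the Navigator can reason over.
interface PageElement {
  tag: string;          // e.g. "button", "input", "a"
  text: string;         // visible text or placeholder
  interactive: boolean; // clickable / typeable
}

function summarizePage(elements: PageElement[], maxItems = 50): string {
  return elements
    .filter((el) => el.interactive)   // ignore non-actionable nodes
    .slice(0, maxItems)               // keep the prompt small for a 1.5B model
    .map((el, i) => `[${i}] <${el.tag}> ${el.text.trim()}`)
    .join("\n");
}
```

Indexing each element lets the model refer to targets by number ("click [3]") instead of emitting brittle CSS selectors.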
## How It Works
  1. User enters a task in the popup UI
  2. Planner Agent analyzes the task and creates a high-level strategy
  3. Navigator Agent examines the current page DOM and decides on the next action
  4. Content Script executes the action (click, type, extract, etc.)
  5. Loop continues until task is complete or fails
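The plan/act loop in the steps above can be sketched like this. The interface and function names are assumptions for illustration; the real `executor.ts` may be structured differently.

```typescript
// Minimal sketch of the plan -> act -> observe loop described above.
interface Action { type: string; done?: boolean }

interface NavigatorAgent {
  // Decide the next action from the plan and the current page state.
  nextAction(plan: string, pageState: string): Action;
}

function runTask(
  plan: string,
  navigator: NavigatorAgent,
  execute: (a: Action) => string, // content script applies the action, returns new page state
  maxSteps = 10                   // hard cap so a confused model cannot loop forever
): Action[] {
  const history: Action[] = [];
  let pageState = "";
  for (let step = 0; step < maxSteps; step++) {
    const action = navigator.nextAction(plan, pageState);
    history.push(action);
    if (action.done) break;       // task complete (or declared failed)
    pageState = execute(action);  // observe the result before the next decision
  }
  return history;
}
```

The step cap is the safety valve: small local models occasionally repeat an action indefinitely, so bounding the loop is what turns "continues until complete or fails" into guaranteed termination.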

The extension uses a two-agent architecture inspired by Nanobrowser:

  • PlannerAgent: Strategic planning, creates step-by-step approach
  • NavigatorAgent: Tactical execution, chooses specific actions based on page state

Both agents output structured JSON that is parsed and executed.
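Parsing structured JSON from a small local model usually has to tolerate markdown fences and stray prose around the object. A minimal sketch of such a parser (the helper name is hypothetical, not taken from the repo):

```typescript
// Hypothetical helper: extract and validate the JSON object an agent returns.
// Small instruct models often wrap JSON in ```json fences or add commentary.
function parseAgentJson<T>(raw: string): T {
  // Strip a markdown code fence if the model added one.
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const body = fenced ? fenced[1] : raw;
  // Fall back to the first {...} span if extra prose surrounds the object.
  const start = body.indexOf("{");
  const end = body.lastIndexOf("}");
  if (start === -1 || end === -1) throw new Error("no JSON object found");
  return JSON.parse(body.slice(start, end + 1)) as T;
}
```

A failed parse here is a natural place to re-prompt the agent rather than abort the whole task.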

## Models

Default model: Qwen2.5-1.5B-Instruct-q4f16_1-MLC (~1GB)

Alternative models (configured in src/shared/constants.ts):

  • Phi-3.5-mini-instruct-q4f16_1-MLC (~2GB, better reasoning)
  • Llama-3.2-1B-Instruct-q4f16_1-MLC (~0.7GB, smaller)
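The model options above could be registered in `src/shared/constants.ts` along these lines. This is a sketch: the field names and registry shape are assumptions, while the model IDs and approximate sizes come from the list above.

```typescript
// Sketch of a model registry such as src/shared/constants.ts might contain.
// IDs are the MLC model identifiers listed above; sizes are approximate downloads.
interface ModelOption {
  id: string;
  approxSizeGB: number;
  notes: string;
}

const MODELS: ModelOption[] = [
  { id: "Qwen2.5-1.5B-Instruct-q4f16_1-MLC", approxSizeGB: 1.0, notes: "default" },
  { id: "Phi-3.5-mini-instruct-q4f16_1-MLC", approxSizeGB: 2.0, notes: "better reasoning" },
  { id: "Llama-3.2-1B-Instruct-q4f16_1-MLC", approxSizeGB: 0.7, notes: "smaller" },
];

const DEFAULT_MODEL = MODELS[0].id;
```

Keeping the list in one constant makes swapping the default a one-line change and gives the popup UI a single source of truth for a model picker.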
## Troubleshooting

WebGPU not available
  • Update Chrome to version 124 or later
  • Check chrome://gpu to verify WebGPU status
  • Some GPUs may not support WebGPU

Model fails to download or load
  • Ensure you have enough disk space (~2GB free)
  • Check browser console for errors
  • Try clearing the extension's storage and reloading

Extension doesn't respond on a page
  • Some pages block content scripts (chrome://, extension pages)
  • Try on a regular webpage like wikipedia.org

Extension not working after Chrome update

  • Go to chrome://extensions
  • Click the reload button on the extension
## Limitations
  • POC Scope: This is a proof-of-concept, not production software
  • No Vision: Uses text-only DOM analysis (no screenshot understanding)
  • Single Tab: Only works with the currently active tab
  • Basic Actions: Supports navigate, click, type, extract, scroll, wait
  • Model Size: Smaller models may struggle with complex tasks
## Tech Stack
  • WebLLM: On-device LLM inference with WebGPU
  • React: Popup UI
  • TypeScript: Type-safe development
  • Vite + CRXJS: Chrome extension bundling
  • Chrome Extension Manifest V3: Modern extension architecture

## Acknowledgments

This project is inspired by:

  • Nanobrowser - Multi-agent web automation (MIT License)
  • WebLLM - In-browser LLM inference (Apache-2.0 License)
## Third-Party Licenses

| Package | License |
|---------|---------|
| @mlc-ai/web-llm | Apache-2.0 |
| React | MIT |
| Vite | MIT |
| @crxjs/vite-plugin | MIT |
| TypeScript | Apache-2.0 |

## License

MIT License - See LICENSE file for details.
