Show HN: Vibium – Browser automation for AI and humans, by Selenium's creator

Original link: https://github.com/VibiumDev/vibium

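## Vibium: Effortless Browser Automation Built for AI

Vibium is lean browser automation infrastructure designed for AI agents, test automation, and similar uses. It packs browser lifecycle management, the WebDriver BiDi protocol, and an MCP server into a single lightweight (~10MB) Go binary.

This gives MCP clients such as Claude Code **zero-setup** browser control with one command, `claude mcp add vibium -- npx -y vibium`. Vibium downloads and configures Chrome automatically.

Developers can also use the JavaScript/TypeScript client (`npm install vibium`), which offers both sync and async APIs for tasks such as navigating to a URL, finding and clicking elements, taking screenshots, and typing text.

Key features include auto-waiting for elements, cross-platform support (Linux, macOS, Windows), and an "invisible" design that minimizes configuration. The roadmap includes Python and Java clients, a memory/navigation layer, and video recording.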

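## Vibium: Browser Automation for the AI Era

Hugs, the creator of Selenium, has released Vibium, a new browser automation tool designed for AI agents. Vibium is built in Go under the hood but is available through a simple `npm install`. It aims to bridge the gap between existing browser automation frameworks (such as Selenium and Playwright) and the growing demand for AI-driven browser interaction.

Vibium prioritizes a developer experience optimized for AI workflows, with features such as a Model Context Protocol (MCP) server for easy integration with models like Claude. While Playwright is currently a popular standard, Vibium focuses on a future in which AI agents control the browser.

The first V1 release is available now, and a V2 roadmap has been outlined, including plans to expose more browser capabilities and to address context-management challenges. Users are encouraged to experiment and share use cases, and a new Discord server has been launched for community discussion. The project emphasizes safety and recommends developing inside a virtual machine.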

Original text

Browser automation without the drama.

Vibium is browser automation infrastructure built for AI agents. A single binary handles browser lifecycle, WebDriver BiDi protocol, and exposes an MCP server — so Claude Code (or any MCP client) can drive a browser with zero setup. Works great for AI agents, test automation, and anything else that needs a browser.

New here? Getting Started Tutorial — zero to hello world in 5 minutes.


| Component | Purpose | Interface |
|-----------|---------|-----------|
| Clicker | Browser automation, BiDi proxy, MCP server | CLI / stdio / WebSocket :9515 |
| JS Client | Developer-facing API | npm package |

┌─────────────────────────────────────────────────────────────┐
│                         LLM / Agent                         │
│          (Claude Code, Codex, Gemini, Local Models)         │
└─────────────────────────────────────────────────────────────┘
                      ▲
                      │ MCP Protocol (stdio)
                      ▼
           ┌─────────────────────┐         
           │   Vibium Clicker    │
           │                     │
           │  ┌───────────────┐  │
           │  │  MCP Server   │  │
           │  └───────▲───────┘  │         ┌──────────────────┐
           │          │          │         │                  │
           │  ┌───────▼───────┐  │WebSocket│                  │
           │  │  BiDi Proxy   │  │◄───────►│  Chrome Browser  │
           │  └───────────────┘  │  BiDi   │                  │
           │                     │         │                  │
           └─────────────────────┘         └──────────────────┘
                      ▲
                      │ WebSocket BiDi :9515
                      ▼
┌─────────────────────────────────────────────────────────────┐
│                        JS/TS Client                         │
│                     npm install vibium                      │
│                                                             │
│    ┌─────────────────┐               ┌─────────────────┐    │
│    │ Async API       │               │    Sync API     │    │
│    │ await vibe.go() │               │    vibe.go()    │    │
│    │                 │               │                 │    │
│    └─────────────────┘               └─────────────────┘    │
└─────────────────────────────────────────────────────────────┘
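
The diagram shows clients reaching Clicker over WebSocket BiDi on port 9515. As a rough illustration of what travels over that socket, the sketch below builds two WebDriver BiDi commands as plain JSON. The `id`/`method`/`params` framing and the `browsingContext.*` method names come from the W3C WebDriver BiDi specification. The `bidiCommand` helper and the placeholder context id are hypothetical, and whether Clicker's proxy accepts raw frames exactly like these is an assumption rather than something the README states.

```javascript
// Sketch of WebDriver BiDi command framing (not Vibium's internal code).
// Each command is a JSON object with a numeric id, a method, and params;
// the matching response carries the same id.
let nextId = 0

function bidiCommand(method, params = {}) {
  nextId += 1
  return { id: nextId, method, params }
}

// Placeholder only: real context ids are reported by the browser,
// for example in the result of browsingContext.getTree.
const contextId = 'example-context-id'

const navigate = bidiCommand('browsingContext.navigate', {
  context: contextId,
  url: 'https://example.com',
  wait: 'complete', // respond only after the page has finished loading
})

const screenshot = bidiCommand('browsingContext.captureScreenshot', {
  context: contextId,
})

// Serialized form, as a client would send it over the WebSocket.
console.log(JSON.stringify(navigate))
console.log(JSON.stringify(screenshot))
```

The JS client hides this layer entirely, so the sketch is mainly useful for reading proxy logs or debugging traffic.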

A single Go binary (~10MB) that does everything:

  • Browser Management: Detects/launches Chrome with BiDi enabled
  • BiDi Proxy: WebSocket server that routes commands to browser
  • MCP Server: stdio interface for LLM agents
  • Auto-Wait: Polls for elements before interacting
  • Screenshots: Viewport capture as PNG

Design goal: The binary is invisible. JS developers just npm install vibium and it works.

// Option 1: require (REPL-friendly)
const { browserSync } = require('vibium')

// Option 2: dynamic import (REPL with --experimental-repl-await)
const { browser } = await import('vibium')

// Option 3: static import (in .mjs or .ts files)
import { browser, browserSync } from 'vibium'

Sync API:

const fs = require('fs')
const { browserSync } = require('vibium')

const vibe = browserSync.launch()
vibe.go('https://example.com')

const png = vibe.screenshot()
fs.writeFileSync('screenshot.png', png)

const link = vibe.find('a')
link.click()
vibe.quit()
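
The example above writes the screenshot straight to disk. A script that wants a cheap sanity check first can compare the leading bytes against the standard eight-byte PNG signature. In this sketch `isPng` is a hypothetical helper, and it assumes the screenshot arrives either as raw bytes or as a base64 string (the form the MCP tool table mentions).

```javascript
// The eight-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])

// Returns true when `data` (bytes or a base64 string) begins with the signature.
function isPng(data) {
  const bytes = typeof data === 'string' ? Buffer.from(data, 'base64') : Buffer.from(data)
  return bytes.length >= 8 && bytes.subarray(0, 8).equals(PNG_SIGNATURE)
}

// Synthetic sample: signature plus filler bytes, not a complete image.
const sample = Buffer.concat([PNG_SIGNATURE, Buffer.from('rest-of-file')])

console.log(isPng(sample))
console.log(isPng(sample.toString('base64')))
console.log(isPng(Buffer.from('not a png')))
```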

Async API:

const fs = await import('fs/promises')
const { browser } = await import('vibium')

const vibe = await browser.launch()
await vibe.go('https://example.com')

const png = await vibe.screenshot()
await fs.writeFile('screenshot.png', png)

const link = await vibe.find('a')
await link.click()
await vibe.quit()

One command to add browser control to Claude Code:

claude mcp add vibium -- npx -y vibium

That's it. No manual steps needed. Chrome downloads automatically during setup.
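
In the command above, everything after `--` is the command the MCP client runs to start the server over stdio. Many MCP clients also accept a JSON configuration with an `mcpServers` map. The sketch below derives such an entry from the same command line. `toMcpConfig` is a hypothetical helper that splits on whitespace only, and the exact file name and location depend on the client, so treat the output as a starting point rather than a confirmed Vibium configuration.

```javascript
// Builds an `mcpServers` entry from a server name and its launch command.
// Naive whitespace split: does not handle quoted arguments.
function toMcpConfig(name, commandLine) {
  const [command, ...args] = commandLine.trim().split(/\s+/)
  return { mcpServers: { [name]: { command, args } } }
}

const config = toMcpConfig('vibium', 'npx -y vibium')
console.log(JSON.stringify(config, null, 2))
```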

| Tool | Description |
|------|-------------|
| browser_launch | Start browser (visible by default) |
| browser_navigate | Go to URL |
| browser_find | Find element by CSS selector |
| browser_click | Click an element |
| browser_type | Type text into an element |
| browser_screenshot | Capture viewport (base64 or save to file with --screenshot-dir) |
| browser_quit | Close browser |
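
An MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests, which the stdio transport sends as one JSON message per line. The sketch below frames a short sequence of such requests. The tool names come from the table, but the argument names (`url`, `selector`) are assumptions: a real client reads each tool's input schema from `tools/list` after the `initialize` handshake. `toolCall` is a hypothetical helper.

```javascript
// Frames MCP tool invocations as JSON-RPC 2.0 requests.
let rpcId = 0

function toolCall(name, args = {}) {
  rpcId += 1
  return {
    jsonrpc: '2.0',
    id: rpcId,
    method: 'tools/call',
    params: { name, arguments: args },
  }
}

// Argument names below are guesses for illustration only.
const frames = [
  toolCall('browser_launch'),
  toolCall('browser_navigate', { url: 'https://example.com' }),
  toolCall('browser_find', { selector: 'a' }),
  toolCall('browser_quit'),
].map((msg) => JSON.stringify(msg) + '\n') // newline-delimited for stdio

process.stdout.write(frames.join(''))
```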

Running `npm install vibium` automatically:

  1. Installs the Clicker binary for your platform
  2. Downloads Chrome for Testing + chromedriver to platform cache:
    • Linux: ~/.cache/vibium/
    • macOS: ~/Library/Caches/vibium/
    • Windows: %LOCALAPPDATA%\vibium\

No manual browser setup required.
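
For scripts that need to locate or clear the downloaded browser, the cache paths listed above can be resolved per platform. This is a minimal sketch with a hypothetical `vibiumCacheDir` helper. It mirrors only the three listed locations, and assumes no overrides (such as `XDG_CACHE_HOME`) apply, which the README does not address.

```javascript
const path = require('path')

// Maps a Node platform name to the cache directory listed above.
// `home` is the user's home directory; `env` supplies LOCALAPPDATA on Windows.
function vibiumCacheDir(platform, home, env = {}) {
  switch (platform) {
    case 'linux':
      return path.posix.join(home, '.cache', 'vibium')
    case 'darwin':
      return path.posix.join(home, 'Library', 'Caches', 'vibium')
    case 'win32':
      if (!env.LOCALAPPDATA) throw new Error('LOCALAPPDATA is not set')
      return path.win32.join(env.LOCALAPPDATA, 'vibium')
    default:
      throw new Error(`unsupported platform: ${platform}`)
  }
}

console.log(vibiumCacheDir('linux', '/home/ada'))
console.log(vibiumCacheDir('darwin', '/Users/ada'))
console.log(vibiumCacheDir('win32', '', { LOCALAPPDATA: 'C:\\Users\\ada\\AppData\\Local' }))
```

In a real script the arguments would typically be `process.platform`, `os.homedir()`, and `process.env`.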

Skip browser download (if you manage browsers separately):

VIBIUM_SKIP_BROWSER_DOWNLOAD=1 npm install vibium

| Platform | Architecture | Status |
|----------|--------------|--------|
| Linux | x64 | ✅ Supported |
| macOS | x64 (Intel) | ✅ Supported |
| macOS | arm64 (Apple Silicon) | ✅ Supported |
| Windows | x64 | ✅ Supported |
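
A setup script can check the current machine against this table before installing. The sketch below encodes the four supported rows using Node's `process.platform` and `process.arch` naming (`darwin` for macOS, `win32` for Windows). `isSupported` is a hypothetical helper, and combinations missing from the table are simply treated as unsupported.

```javascript
// Supported platform/architecture pairs, taken from the table above.
const SUPPORTED = new Map([
  ['linux', ['x64']],
  ['darwin', ['x64', 'arm64']],
  ['win32', ['x64']],
])

function isSupported(platform, arch) {
  return (SUPPORTED.get(platform) || []).includes(arch)
}

const label = `${process.platform}/${process.arch}`
console.log(`${label}: ${isSupported(process.platform, process.arch) ? 'supported' : 'not listed'}`)
```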

As a library:

import { browser } from "vibium";

const vibe = await browser.launch();
await vibe.go("https://example.com");
const el = await vibe.find("a");
await el.click();
await vibe.quit();

With Claude Code:

Once installed via claude mcp add, just ask Claude to browse:

"Go to example.com and click the first link"


See CONTRIBUTING.md for development setup and guidelines.


V1 focuses on the core loop: browser control via MCP and JS client.

See V2-ROADMAP.md for planned features:

  • Python and Java clients
  • Cortex (memory/navigation layer)
  • Retina (recording extension)
  • Video recording
  • AI-powered locators


License: Apache 2.0
