Make the eyes go away

Original link: https://hexeditreality.com/posts/make-the-eyes-go-away/

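## ithreemcp: A Learning Journey into the Model Context Protocol (MCP)

To understand MCP servers, the author built "ithreemcp", a Go-based server that connects AI/LLM clients to the i3 window manager. The project began as a learning tool for observing agent-server interactions and improving the author's own agents.

ithreemcp currently provides tools for retrieving the i3 window layout and workspace details. Initial attempts to use a local model (qwen3:30b) for the complex work proved difficult, which led the author to rely on Claude Code for efficiency. Testing with an Ollama client (qwen3-vl:8b) was surprisingly successful: even without explicit instructions, the agent understood the tool descriptions and executed commands such as closing a window ("make the eyes go away").

The experience highlighted how MCP's Prompts feature can guide agents on tool syntax. The author plans to explore further MCP features such as Sampling and Elicitation, and to keep experimenting with both frontier and local models. The sweet spot is using powerful models to advance local experiments. The project is published on GitHub and marks a first step into the MCP space.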


Original text

I’ve written before about building agents and calling tools directly, but until now I’ve largely avoided MCPs. Now that I’ve started a new job, I need to better understand MCP servers. I’ve read the spec, but I felt there was a disconnect between what it describes and what has actually become popular.

So, as I often approach these types of problems, I decided to build my own MCP server. That way I can point various off-the-shelf agents with MCP support at my server and observe the interactions. By learning from those observations, I can make better use of the same capabilities in my own agents.

Even though I’m building this as a learning tool, I wanted it to be genuinely useful. I’ve used the i3 window manager for some time now, and I’ve always known that it’s easily programmable, and that it has a first-class Go library. So my initial concept was to create an MCP server, which exposes the functionality of my running i3 window manager, through natural language.

I started where I usually start — with local models. I threw qwen3:30b at the problem with various larger context configurations, but the results were only rough sketches that wouldn’t compile. I did some minimal manual work to get it compiling and then on-boarded it to Claude Code. I’m back on the Claude Pro plan. I have a lot I need to get done, and right now it’s the best no bullshit toolset with affiliated models.

👤 User

Please familiarize yourself with this project, I’d like to discuss a new feature.


🤖 Claude

I’ll explore this project to understand its structure and purpose.


🤖 Claude

This is ithreemcp, an MCP (Model Context Protocol) server that bridges AI/LLM clients with the i3 tiling window manager.

Current functionality:

  • GetTree - Returns the complete i3 window layout hierarchy
  • GetWorkspaces - Returns details about all workspaces

Tech stack: Go 1.25.3, using the MCP Go SDK and go-i3 bindings

Architecture: Simple bridge pattern - MCP clients communicate via stdin/stdout, and the server translates requests to i3 IPC calls.

The codebase is compact (~112 lines in main.go) and follows standard Go conventions.

What new feature would you like to discuss?


👤 User

I’d like an end-user, with an agent configured to use this MCP server to be able to say something like, “move the firefox window to workspace 7”. Can you propose the tools you think we should add to this MCP server to best allow an agent to accomplish the task?


🤖 Claude

Great use case! Let me think through what an agent would need to accomplish “move the firefox window to workspace 7”.

And away we go…

Twenty-five commits later, split over two evenings, a GitHub repo was born: ithreemcp.
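As a minimal sketch of where the request in the transcript above might end up, here is how “move the firefox window to workspace 7” could be turned into an i3 command string. It is written in Go because ithreemcp is. The function names are hypothetical and not taken from the ithreemcp repo. The command uses i3's criteria syntax, where criteria values are regular expressions and `(?i)` makes the class match case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteCriterion wraps a value in double quotes for use inside an i3
// criteria block, escaping backslashes and double quotes first.
// Hypothetical helper, not part of ithreemcp.
func quoteCriterion(v string) string {
	v = strings.ReplaceAll(v, `\`, `\\`)
	v = strings.ReplaceAll(v, `"`, `\"`)
	return `"` + v + `"`
}

// moveToWorkspace builds an i3 command that moves every window whose
// class matches classRegex to the given numbered workspace.
func moveToWorkspace(classRegex string, workspace int) string {
	return fmt.Sprintf("[class=%s] move container to workspace number %d",
		quoteCriterion(classRegex), workspace)
}

func main() {
	// The agent still has to work out that "the firefox window" means a
	// class match; this only shows the final command it would send.
	fmt.Println(moveToWorkspace("(?i)firefox", 7))
}
```

A class-based match moves every Firefox window, which may or may not be what the user meant. Targeting a single container by its ID is stricter, at the cost of one extra lookup.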

As I originally stated, my intent was to learn how clients interact with MCP servers. I wanted to start by seeing if any of the local models could make it work, but I wasn’t sure which agent to use. Rather than try to force it onto one of the coding agents, I found MCP Client for Ollama. It automatically discovers your Ollama models (today I’m using qwen3-vl:8b) and can be pointed at ithreemcp by adjusting the sample configuration we provide:

uvx ollmcp --servers-json ithreemcp.json


I like to start by giving it something I don’t think it can handle. So, I launched xeyes:

xeyes

And then I told the model:

make the eyes go away

I hadn’t configured any sort of system prompt. Maybe it got a default one from somewhere about being a helpful agent, but it hasn’t been told the context surrounding its tools. From the tool descriptions it could infer that it is interrogating and controlling an i3 window manager, but it is never told that explicitly. And the RunCommand tool takes a syntax that is not provided, so the model must either remember it from its training or stumble onto something that works.


It figures out that it should search the windows for “eyes”. The search even returns three other matches: Sublime Text drafting this text, Firefox previewing the page, and GIMP editing the screenshots. But it correctly figures out those aren’t what I’m talking about.


It then proceeds to use the correct syntax to kill the right window, and explains that the eyes should be gone.
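The search-then-kill flow can be sketched in a few lines of Go. This is an illustration under assumptions, not ithreemcp's actual code: the `node` struct mirrors only a few fields of the i3 tree JSON, the sample tree is invented for the example, and the helper names are hypothetical. It also shows why the model had to disambiguate, since a title match on “eyes” catches more than xeyes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// node mirrors the handful of i3 tree fields this sketch needs.
type node struct {
	ID     int64  `json:"id"`
	Name   string `json:"name"`
	Window *int64 `json:"window"` // nil for containers that are not X11 windows
	Props  struct {
		Class string `json:"class"`
	} `json:"window_properties"`
	Nodes    []node `json:"nodes"`
	Floating []node `json:"floating_nodes"`
}

// sampleTree is made-up data in the shape of an i3 layout tree.
const sampleTree = `{
  "id": 1, "name": "root",
  "nodes": [
    {"id": 10, "name": "make-the-eyes-go-away.md - Sublime Text", "window": 100,
     "window_properties": {"class": "Sublime_text"}},
    {"id": 11, "name": "xeyes", "window": 101,
     "window_properties": {"class": "XEyes"}}
  ]
}`

// parseTree decodes a JSON layout tree into nodes.
func parseTree(data string) (node, error) {
	var root node
	err := json.Unmarshal([]byte(data), &root)
	return root, err
}

// findWindows walks the tree depth-first and returns every real window
// whose title or class contains term, ignoring case.
func findWindows(n node, term string) []node {
	var out []node
	term = strings.ToLower(term)
	if n.Window != nil &&
		(strings.Contains(strings.ToLower(n.Name), term) ||
			strings.Contains(strings.ToLower(n.Props.Class), term)) {
		out = append(out, n)
	}
	for _, c := range n.Nodes {
		out = append(out, findWindows(c, term)...)
	}
	for _, c := range n.Floating {
		out = append(out, findWindows(c, term)...)
	}
	return out
}

// killCommand targets exactly one container by its i3 container ID.
func killCommand(n node) string {
	return fmt.Sprintf(`[con_id="%d"] kill`, n.ID)
}

func main() {
	root, err := parseTree(sampleTree)
	if err != nil {
		panic(err)
	}
	for _, m := range findWindows(root, "eyes") {
		fmt.Printf("%s (%s): %s\n", m.Name, m.Props.Class, killCommand(m))
	}
}
```

Using `con_id` rather than a title or class match means a wrong guess closes at most one window, which matters given the xterm incident described below.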

In fairness, sometimes it gets the i3 run command syntax wrong, and at least once it killed the xterm from which I launched xeyes (not like I had any other important tabs open or anything). But this was just some initial research. I think that with a proper system prompt and some i3 syntax examples, it would perform well.

As much as I love the local models, if you need to be productive writing code, I strongly suggest you find a frontier model where you’re comfortable with the privacy options and cost. I concede that for me it is a compromise for the moment. But don’t misread what I wrote: local models are still a good fit for many tasks, particularly tasks that don’t require a large context window. As you saw in this post, for me the sweet spot is using frontier models to hack on the models I can run at home.

Building ithreemcp was just the first step in my learning process. MCP has features I glossed over — Prompts, Resources, Sampling, Elicitation. I never really understood why Prompts existed in MCP. But watching qwen3-vl:8b stumble on i3 syntax occasionally, I get it now: an MCP server can ship prompts that teach clients how to use its tools effectively. I look forward to exploring these topics further.
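To make the Prompts idea concrete, here is a rough sketch of what a server-shipped prompt could look like once serialized. The struct layout follows the shape of a `prompts/get` result in the MCP specification (a description plus a list of messages with a role and text content). The prompt wording, the function name, and the choice of example commands are my own assumptions, not something ithreemcp ships today. It uses only the standard library rather than the MCP Go SDK so it stays self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// textContent, promptMessage, and getPromptResult follow the shape of a
// prompts/get result as described in the MCP specification.
type textContent struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

type promptMessage struct {
	Role    string      `json:"role"`
	Content textContent `json:"content"`
}

type getPromptResult struct {
	Description string          `json:"description,omitempty"`
	Messages    []promptMessage `json:"messages"`
}

// i3SyntaxPrompt returns a hypothetical prompt that teaches a client the
// command syntax the RunCommand tool expects.
func i3SyntaxPrompt() getPromptResult {
	lines := []string{
		"You control an i3 window manager through the RunCommand tool.",
		"RunCommand expects i3 command syntax, for example:",
		`  [con_id="CON_ID"] kill`,
		`  [con_id="CON_ID"] move container to workspace number 7`,
		`  workspace number 3`,
		"Replace CON_ID with a container id you looked up; never guess it.",
	}
	return getPromptResult{
		Description: "How to drive i3 through RunCommand (illustrative)",
		Messages: []promptMessage{{
			Role:    "user",
			Content: textContent{Type: "text", Text: strings.Join(lines, "\n")},
		}},
	}
}

func main() {
	out, err := json.MarshalIndent(i3SyntaxPrompt(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Whether a given client actually surfaces server prompts to the model is a separate question, and part of what I still need to observe.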

Here is an assorted list of projects and technologies I’m currently exploring, but haven’t had time to write about.

  • Beads - Steve Yegge’s issue tracker for your agent
  • Claude Code Usage Monitor - How I track my Claude usage
  • Langfuse - OTEL observability for your LLM
  • Devstral Small 2 - Size matters, q4_K_M quantization is 15GB, which might leave just enough room for context on my 24GB card