Show HN: A Markdown file that turns your AI agent into an autonomous researcher

Original link: https://github.com/krzysztofdudek/ResearcherSkill

This tool turns an AI coding agent into an autonomous researcher that can run dozens of experiments to optimize code or systems. Given a `researcher.md` file and a codebase, the agent designs, executes, and analyzes experiments, automatically committing successful changes and reverting failed ones. The example shows a latency reduction: after 30+ experiments, including replacing a slow neighbor search with a KD-tree, p99 latency dropped from 142 ms to 89 ms. This "autoresearch" is not limited to machine learning; it applies to areas such as API performance, test speed, bundle size, and algorithm tuning. The agent maintains a dedicated `.lab/` directory, separate from the main git repository, to track experiment history, and uses "Yggdrasil" to persist project-context memory. It is designed to be a self-improving, tireless researcher for any measurable goal.

Hacker News: Show HN: A Markdown file that turns your AI agent into an autonomous researcher (github.com/krzysztofdudek) · 10 points by chrisdudek, 1 hour ago | 1 comment

petcat, 4 minutes ago: Do we really need every AI bot that generates a Markdown skill posted here? Probably not.

Original article

One file. Your AI coding agent becomes a scientist.

Drop researcher.md into Claude Code, Cursor, or any agent that reads markdown. It will design experiments, test hypotheses, discard what fails, keep what works — 30+ experiments overnight while you sleep.

What it looks like running

Experiment 7 — Replace O(n^2) neighbor search with KD-tree

Branch: research/reduce-latency · Parent: #5 · Type: real

Hypothesis: KD-tree lookup should reduce p99 from 142ms to under 100ms
Changes: swapped brute-force loop in spatial_index.py with scipy.spatial.KDTree
Result: p99 = 89ms (was 142ms baseline, 118ms best) — new best
Status: keep

Insight: The bottleneck was always the neighbor search, not the scoring. Experiment #3 (caching) was treating the symptom.
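The KD-tree swap could look like the following minimal sketch. The point layout and function names are illustrative assumptions; the README only names `spatial_index.py` and `scipy.spatial.KDTree`:

```python
import numpy as np
from scipy.spatial import KDTree

# Sample 2-D point cloud standing in for the real spatial index data
rng = np.random.default_rng(0)
points = rng.random((10_000, 2))

# Brute force: O(n) per query, O(n^2) for all-pairs neighbor search
def nearest_brute(query):
    dists = np.linalg.norm(points - query, axis=1)
    return int(np.argmin(dists))

# KD-tree: O(n log n) build once, then O(log n) per query
tree = KDTree(points)

def nearest_kdtree(query):
    _, idx = tree.query(query)
    return int(idx)

q = np.array([0.42, 0.58])
assert nearest_brute(q) == nearest_kdtree(q)
```

The build cost is paid once, so the win grows with query volume, which is why the earlier caching experiments only treated the symptom.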

| # | branch | metric | status | description |
|---|--------|--------|--------|-------------|
| 0 | research/reduce-latency | 142.00 | keep | Baseline measurement |
| 1 | research/reduce-latency | 139.20 | discard | Add response caching |
| 2 | research/reduce-latency | 141.50 | discard | Connection pooling tuning |
| 3 | research/reduce-latency | 135.80 | discard | LRU cache on hot path |
| 5 | research/reduce-latency | 118.00 | keep | Batch DB queries |
| 7 | research/reduce-latency | 89.00 | keep | KD-tree neighbor search |

Example is simulated. The skill works on any codebase — run it and share your real results.

Anything where you can measure or evaluate a result:

  • API latency — p50, p99, throughput
  • Test speed — suite runtime, parallelization strategies
  • Bundle size — tree-shaking, code splitting, dependency swaps
  • Prompt engineering — accuracy, cost, token usage
  • Algorithm tuning — runtime complexity, memory usage
  • Configuration optimization — DB settings, cache sizes, thread pools
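Each target above needs a reproducible metric for the agent to optimize. A minimal p99 latency harness, as one sketch of such a metric (the function name and run count are illustrative, not part of the skill), could be:

```python
import time

def p99_latency_ms(fn, runs=100):
    """Run fn repeatedly and return the 99th-percentile latency in ms."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    # Nearest-rank 99th percentile
    return samples[max(0, int(round(0.99 * runs)) - 1)]

# Example: measure a trivial workload
latency = p99_latency_ms(lambda: sum(range(1000)))
assert latency >= 0.0
```

Any function that returns a single comparable number works the same way: bundle bytes, suite runtime, token cost.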

The agent interviews you about what to optimize, sets up a lab on a git branch, then works autonomously — thinking, testing, reflecting — committing before every experiment, reverting on failure, logging everything. It forks branches to explore divergent approaches, detects when it's stuck, and keeps going until you stop it or it hits a target.
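The keep-or-revert loop described above can be sketched in a few lines. The 10% improvement threshold, the names, and the hard-coded metrics are illustrative assumptions, not the skill's actual policy:

```python
# Hypothetical sketch: each change is kept only if it beats the current
# best metric by a margin; otherwise the branch state is reverted.
def research_loop(baseline, experiments, min_gain=0.10):
    best = baseline
    log = []
    for name, measure in experiments:
        metric = measure()                    # benchmark on the branch
        if metric < best * (1.0 - min_gain):  # meaningful improvement
            best, status = metric, "keep"     # commit the change
        else:
            status = "discard"                # revert the change
        log.append((name, round(metric, 1), status))
    return best, log

experiments = [
    ("Add response caching",  lambda: 139.2),
    ("LRU cache on hot path", lambda: 135.8),
    ("Batch DB queries",      lambda: 118.0),
    ("KD-tree search",        lambda: 89.0),
]
best, log = research_loop(142.0, experiments)
assert best == 89.0  # small wins discarded, the two big wins kept
```

In the real skill, "keep" and "revert" are git commits and resets on the research branch, and the log lands in `.lab/` rather than a Python list.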

Generalizes autoresearch beyond ML: supports thought experiments, non-linear branching, qualitative metrics, convergence signals, and session resume.

All experiment history lives in an untracked .lab/ directory that survives all git operations — git manages code, .lab/ manages knowledge.

MIT


If this is useful, star the repo and share what you discovered.

Yggdrasil – your agent keeps forgetting your architecture, constraints, and past decisions. Yggdrasil gives your repository persistent semantic memory, so each task starts with the right context instead of another giant prompt.
