# It's time to move your docs in the repo

Original link: https://www.dein.fr/posts/2026-03-13-its-time-to-move-your-docs-in-the-repo

## Documentation in the AI era

Core idea: **treat documentation as a first-class citizen in the code repository.** Just as code needs version control, so does documentation, especially with the rise of AI. The proverb "the palest ink is more reliable than the most powerful memory" is more relevant than ever.

AI agents are *dramatically increasing* the volume of documentation (often as markdown, via rules files), which highlights the need for clear, human-readable specifications. These "rules" often capture previously undocumented best practices, blurring the line between AI-generated and human-written content. Going forward, we may shift our focus from reviewing code to reviewing well-defined specs and guidelines *more*.

Moving docs into the repo addresses the stale-documentation problem, since AI can help keep code and docs consistent. It also gives AI agents critical context, saving time and tokens by placing knowledge (e.g., infrastructure lessons) directly in the codebase. While tools like Google Docs remain valuable for *collaborative drafting*, final, stable documentation should live alongside the code, where it benefits from version control, easy updates, and standard editing tools. Ultimately, documentation should be written for human review, to preserve clarity and maintainability in an increasingly AI-driven development environment.

## Docs in the repo: a Hacker News discussion

A recent Hacker News thread discussed the growing practice of storing documentation *inside* the project repository ("docs in the repo"). While this has been considered a best practice since tools like GitHub Pages appeared, the discussion highlighted renewed interest driven by advances in AI.

Users reported benefits such as streamlined updates: AI tools can now update docs automatically as code changes and quickly spot discrepancies. A monorepo structure (all code and docs in one repository) is particularly advantageous for AI "agents" that need a centralized source of knowledge.

However, challenges were raised around versioning and publishing components from a monorepo, with some advocating multiple repos with proper versioning and packaging. Security concerns about AI access to documentation also came up, along with skepticism about adopting changes *merely* because of AI trends. Ultimately, the discussion reinforced the core principle: documentation, like code, should be easy for humans to read and easy to maintain alongside the project itself.

## Original article

> The palest ink is more reliable than the most powerful memory. – Chinese proverb

AI changes the game when it comes to having all your docs in your repository: it's never been that easy to keep them up to date!

I've always been a fan of having documentation living alongside the code:

  • Version control: just like code, documentation evolves. Why use a different version control system when you're already using git? Especially when multiple people are changing docs at the same time, with potentially conflicting changes.
  • Proximity to code: e.g., rg or grep will return both code and documentation results, making it much easier to keep the two in sync.
  • Formal approval: in the spirit of documentation-driven development, starting with a review of documentation updates helps everyone understand the final product/API. (For active collaboration, other tools, e.g. Google Docs, still provide a superior UX.)
  • Automatic generation: when using a different system for hosting the docs (Google Docs, Confluence, Notion, etc.), it's quite laborious to copy-paste APIs and example code. There are many tools (e.g., Sphinx's autodoc, jsdoc, javadoc, docusaurus) that can generate API docs directly from the code.
  • Testing: static code examples in documentation are a good start, but it's even better when they're tested, which you can do when running code examples in docs is part of your continuous integration process. See Python's doctest for example. In a way, the documentation is the spec.
  • Efficient editing: you benefit from all your text editor tools, and can script mass-changes.
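To illustrate the testing point above: Python's doctest executes the examples embedded in docstrings, so documentation examples double as tests. A minimal sketch (`slugify` is a hypothetical helper, not from the article):

```python
import doctest


def slugify(title):
    """Convert an article title into a URL slug.

    The example below is executable documentation: doctest runs it
    in CI and fails the build if the output ever drifts from reality.

    >>> slugify("Move Your Docs")
    'move-your-docs'
    """
    return title.lower().replace(" ", "-")


if __name__ == "__main__":
    # Exits nonzero if any docstring example fails, which makes it
    # trivial to wire into a continuous integration step.
    doctest.testmod()
```

In a setup like this, the documentation really is the spec: a stale example is a failing test.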

## We will be spending more time writing docs

First observation: AI agents have considerably increased the proportion of markdown files in commits. That's usually because folks check out the agent's implementation, which is a very good idea. It's also because you save a lot of agentic iteration time when you write rules files (.mdc files) to guide agents' execution. So, whether or not you agree with the thesis, it is happening.
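For illustration, such a rules file often reads like a style-guide fragment. A hypothetical `.mdc` file (the frontmatter fields follow the common rules-file convention; the rules themselves are invented for this example):

```markdown
---
description: Error-handling conventions for the service layer (hypothetical)
globs: ["src/**/*.py"]
---

- Raise domain-specific exceptions; never return error codes.
- Log at the call site that handles the error, not where it is raised.
- Every public function documents the exceptions it can raise.
```

Notice that nothing here is agent-specific: it is exactly the kind of content a team style guide would contain.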

I would submit that 80% of rules-file content could have been documentation instead, or is potentially already documented elsewhere. Just like code should be primarily written for humans to read, all files in a repository are written primarily for humans to review. This also applies to rules files created to guide agentic execution. Rules files read more and more like the style guides and best practices that we never bothered writing but probably should have solidified.

The frontier between AI-only markdown and human-only markdown is so blurry that I could see rules files disappearing completely, replaced by documentation.

This is also consistent with engineers shifting their focus left. Engineering tooling has trended towards higher and higher abstractions: from machine code to C, to dynamic languages, to SDKs, and now to not writing code at all and focusing only on the spec and guidelines. Just like we don't review the machine code produced by the compiler, there might come a day when we don't review the code generated by an LLM, provided it respects the harness, the specs, and the guidelines (security will be a key concern there). In that world, we'll spend most of our energy reviewing the specs, the harness, and the guidelines. Conclusion: those docs need to be written first and foremost for human review.

## Why AI makes it even more meaningful to move docs into your repo

AI agents solve stale docs. A common objection to writing docs is: "why bother? read the code - the code is always up to date". (The same line of argument would apply to brushing your teeth.) AI agents solve this problem. They take away the laborious work of ensuring code & documentation alignment (either in PR, or with specialized review agents that look for documentation inconsistencies). Quite game-changing.
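One concrete way to wire such an alignment check into CI, as a minimal GitHub Actions sketch (the job simply runs the interactive examples embedded in the Markdown docs through Python's doctest, so stale examples fail the pull request; file paths are assumptions):

```yaml
name: docs-consistency
on: [pull_request]

jobs:
  test-doc-examples:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      # `python -m doctest` can execute the ">>>" examples
      # stored in plain-text/markdown files.
      - run: python -m doctest docs/*.md
```

A specialized review agent could be slotted into the same workflow to flag prose that contradicts the diff.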

AI agents benefit from higher level context. Moving your docs into the repo (including, perhaps, your architecture proposals - RFCs - and your product specs - PRDs) will provide that extra context.

Materialized plans with findings will save tokens & iteration time. Imagine researching "the best way to do X" in a massive codebase. You will spend a lot of tokens finding the answer to that question. Documenting the answer and materializing it in the repository will enable your colleagues to skip that research step later (and to keep it up to date with extra learnings, best practices, etc.!). This is especially true for things that agents can't infer from the code, or can infer only with difficulty: typically, infra-related things you learned by deploying your code to production. For instance, I spent about two weeks researching and iterating on structured logging best practices - I materialized that into a "metaplan" that other teams can use directly, saving everyone (including agents!) a ton of time!

## Answers to objections

You could use MCP and other approaches (skills) to give agents access to your documentation. But the same arguments I laid out in the beginning still apply, especially the version control piece. Most documentation systems are not designed for fast iteration with strong concurrency control.

Waiting for a code review for docs will deter people from updating docs. (1) What if you weren't the one writing those docs? (2) Who says all repo content changes need to go through review? (3) As we shift more and more left, won't the documentation change or implementation plan be the most important thing to review?

AI agents write long, convoluted docs. First response: well, most humans do as well :). Just like code, you should (1) review the agent's work, (2) fix the agent's work, and (3) write your own docs (like this article: none of this content has been generated by AI!). Putting it into version control makes it MUCH easier and safer (reviews! history!) to iterate.

Do I really need to move all my documentation? I'd say yes, at this point. Not the fleeting docs, but everything that provides useful context about the codebase, including RFCs.

[your_preferred_tool] is better at [tables/schemas/links]. AI is getting incredibly good at generating mermaid diagrams (supported by GitHub), tables, etc.
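For what it's worth, GitHub renders mermaid fences natively, so a diagram lives in the doc as reviewable plain text. A minimal sketch of the workflow this article advocates (labels are illustrative):

```mermaid
flowchart LR
    Draft[Google Docs draft] --> Review[PR review]
    Review --> Repo[(docs/ in repo)]
    Repo --> CI[CI runs doc examples]
```

Because it is text, the diagram diffs, merges, and version-controls exactly like the surrounding prose.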

[your_preferred_tool] is better for human collaboration. Yes, Google Docs is still much better for active collaboration, so it's fair to continue using it for that use case. But once the documentation is in a good place, I would move it into the repo (Google Docs has a useful "Copy as Markdown" feature that I use all the time).

Non-engineers usually don't have repo access. (1) You can deploy your docs on an internal-only website. (2) There is a clear trend toward non-engineers getting code access (which poses some interesting security challenges).

## References and articles

As always, there are more resources in my repo charlax/professional-programming.
