We should revisit literate programming in the agent era

Original link: https://silly.business/blog/we-should-revisit-literate-programming-in-the-agent-era/

## Literate Programming Reimagined with AI Agents

Literate programming, the practice of interleaving code with explanatory prose, aims to make a codebase as readable as a narrative. Although conceptually appealing, it has historically faltered under the burden of maintaining code and documentation in parallel. Common implementations such as Jupyter notebooks and Emacs Org Mode, while useful, tend to become cumbersome for larger projects, requiring constant "tangling" (code extraction) and risking overwritten changes.

The rise of large language model (LLM) agents, however, is reviving the idea. Agents such as Claude and Kimi are adept at understanding and generating formats like Org Mode, effectively absorbing the complexity of keeping prose and code in sync. An agent can now manage tangling *automatically*, treat the prose-integrated file as the source of truth, and re-explain code changes in natural language after every edit. This eliminates the core chore that previously blocked adoption. The author has found the approach especially effective for testing and for documenting processes, and envisions a future in which codebases read as narratives, with contextual statements of intent perhaps even improving code quality. Although tested so far only at a small scale, the potential of large, agent-maintained literate codebases is compelling, despite format limitations (such as Org Mode's tight coupling to Emacs).


Original text

Literate programming is the idea that code should be intermingled with prose such that an uninformed reader could read a code base as a narrative, and come away with an understanding of how it works and what it does.

Although I have long been intrigued by this idea, and have found uses for it in a couple of different cases, I have found that in practice literate programming turns into a chore of maintaining two parallel narratives: the code itself, and the prose. This has obviously limited its adoption.

In practice, literate programming has historically been most common in the form of Jupyter notebooks in the data science community, where explanations live alongside calculations and their results in a web browser.

Frequent readers of this blog will be aware that Emacs Org Mode supports polyglot literate programming through its org-babel package, allowing execution of arbitrary languages with results captured back into the document, but this has remained a niche pattern for nerds like me.
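For readers unfamiliar with org-babel, here is a minimal sketch of what such a document looks like (the block contents are my own illustration; `#+BEGIN_SRC`, the `:results output` header argument, and the `#+RESULTS:` drawer are standard org-babel syntax):

```org
* A self-documenting step
  Pressing C-c C-c inside the block executes it and captures
  stdout into the #+RESULTS: drawer directly below.

  #+BEGIN_SRC shell :results output
  echo "hello from org-babel"
  #+END_SRC

  #+RESULTS:
  : hello from org-babel
```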

Even for someone as enthusiastic about this pattern as I am, it becomes cumbersome to use Org as the source of truth for larger software projects, as the source code essentially becomes a compiled output, and after every edit in the Org file, the code must be re-extracted and placed into its destination ("tangled", in Org Mode parlance). Obviously this can be automated, but it's easy to get into annoying situations where you or your agent has edited the real source and it gets overwritten on the next tangle.
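For concreteness, tangling is driven by the `:tangle` header argument; a minimal sketch, with a hypothetical file name:

```org
#+BEGIN_SRC python :tangle src/greet.py
def greet(name):
    # Tangled out to src/greet.py by M-x org-babel-tangle (C-c C-v t).
    # Edits made to src/greet.py directly are clobbered on the next
    # tangle unless they are "detangled" back into this document.
    return f"hello, {name}"
#+END_SRC
```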

That said, I have had enough success with using literate programming for bookkeeping personal configuration that I have not been able to fully give up on the idea, even before the advent of LLMs.

For example: before coding agents, I had been adapting a pattern for using Org Mode for manual testing and note-taking. Instead of working on the command line, I would write the commands into my editor and execute them there, editing each step in place until it was correct, so that when I was done I had a document recording exactly the steps that were taken, with no extra note-taking. Combining the act of writing the note with running the test gives you the notes for free once the test is complete.
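A minimal sketch of that runbook pattern, assuming a hypothetical local service and endpoint:

```org
* Verify the health endpoint after deploy
  Each block was edited in place until the step was correct,
  then executed with C-c C-c; the captured result documents
  what actually happened.

  #+BEGIN_SRC shell :results output
  curl -s http://localhost:8080/healthz
  #+END_SRC
```

When the test is finished, the document itself is the test report.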

This is even more exciting now that we have coding agents. Claude and Kimi and friends all have a great grasp of Org Mode syntax; it's a forgiving markup language, and they are quite good at those. All the documentation is available online and was probably in the training data, and while a big downside of Org Mode is just how much syntax there is, that's no problem at all for a language model.

Now when I want to test a feature, I ask the clanker to write me a runbook in Org. Then I can review it: the prose explains the model's understanding of the intent behind each step, and once I am done reviewing, the code blocks are interactively executable, either one at a time or the whole file like a script. The results are stored in the document, under the code, like a Jupyter notebook.

I can edit the prose and ask the model to update the code, or edit the code and have the model reflect the change back into the prose. Or I can ask the agent to change both simultaneously. The problem of maintaining the parallel systems disappears.

The agent is told to handle tangling, and the problem of extraction goes away. The agent can be instructed with an AGENTS.md file to treat the Org Mode file as the source of truth, to always explain in prose what is going on, and to tangle before execution. The agent is very good at all of these things, and it never gets tired of re-explaining something in prose after a tweak to the code.
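A sketch of what those instructions might look like (the wording and file names here are hypothetical; AGENTS.md is plain Markdown read by coding agents):

```markdown
# AGENTS.md (excerpt)

- `runbook.org` is the source of truth. Never edit the tangled
  files under `src/` directly.
- Before executing anything, tangle `runbook.org` so that `src/`
  matches the document.
- After changing any code block, update the surrounding prose so
  it still explains what the code does and why.
```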

The fundamental extra labor of literate programming, which I believe is why it is not widely practiced, is eliminated by the agent, and doing so leans on the capabilities the large language model is best at: translation and summarization.

As a benefit, the code base can now be exported into many formats for comfortable reading. This is especially important if the primary role of engineers is shifting from writing to reading.

I don't have data to support this, but I also suspect that literate programming will improve the quality of generated code, because the prose explaining the intent of each code block will appear in context alongside the code itself.

I have not personally had the opportunity to try this pattern yet on a larger, more serious codebase. So far, I have only been using this workflow for testing and for documenting manual processes, but I am thrilled by its application there.

I also recognize that the Org format is a limiting factor, due to its tight integration with Emacs. However, I have long believed that Org should escape Emacs. I would prefer something like Markdown instead, but Markdown lacks the ability to embed the metadata that this workflow needs. As usual in my posts about Emacs, though, it is not Emacs's specific implementation of the idea that excites me, nor, in this case, Org's implementation of literate programming.

It is the idea itself that is exciting to me, not the tool.

With agents, does it become practical to have large codebases that can be read like a narrative, whose prose is kept in sync with changes to the code by tireless machines?

I think that's a compelling question.
