FORTH? Really!?

Original link: https://rescrv.net/w/2026/02/06/associative

This piece asks whether large language models (LLMs) could benefit from FORTH and associative programming-language architectures, moving away from recursive, top-down problem solving. The core idea is to **prioritize concatenation over integration**, building solutions from the ground up: generating components *before* their surrounding context is settled, which mirrors how we predict the next word in a sequence.

The author tests this hypothesis with a "parity tree" benchmark. The models (Opus and Haiku) are tasked with building binary trees that represent the parity (even/odd) of a sequence of numbers, using both prefix (top-down) and postfix (bottom-up) notation.

The results show that **postfix notation consistently outperforms prefix notation** and that **Opus performs significantly better than Haiku**. This suggests the models "think" more effectively when they generate sub-solutions first, which matches the associative approach. The author also suggests this principle could inform database-layer optimization via finite-automaton transformations. Ultimately, the piece argues for steering LLMs toward building solutions step by step rather than decomposing problems recursively.

## FORTH and LLMs: The Hacker News Discussion

A recent Hacker News thread sparked discussion of the programming language FORTH and its potential relevance to large language models (LLMs). The original poster (rescrv) asked whether LLMs might perform better when using a language with FORTH-like postfix notation.

The conversation surfaced a range of views. Some argued that FORTH's simplicity and efficient, general-purpose learnability could be beneficial, while others felt that LLMs are sophisticated enough to handle more complex syntax and that FORTH's implicit stack manipulation might actually *hinder* performance because it requires global reasoning.

Several commenters shared their experience with FORTH, noting that it is easy to implement and that programmers tend to build their own interpreters. One user pointed to a recent front-page post showcasing a creative FORTH application.

rescrv shared a benchmark ([https://github.com/rescrv/stack-bench](https://github.com/rescrv/stack-bench)) so others could test the hypothesis, and emphasized that they are not advocating FORTH itself but exploring the potential of more compact languages for data queries.

Original Text

Imagine you have to generate the word that succeeds this colon: ___

What would you put in that blank space?

It’s easier when the question comes first.

But what if we structured things such that the blank had to be generated before its constituent parts? LLMs are wonderful, but I see too many people try to break problems down recursively, the way top-down humans do. Instead, I posit that FORTH and associative/applicative languages may be better for transformer architectures. Concatenate, not integrate. Agree on the stack state.

I set out to question whether this could be true.

Sideways Passing Join.

Imagine you had this program:

A SCAN [foo > 5] FILTER
B SCAN [foo < 5] FILTER
BUILD
PROBE

that performs a natural join on A and B’s shared identifiers.
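To make the stack discipline concrete, here is a minimal Python sketch of an evaluator for that program. It is not the author's implementation; the sample relations, the join key `id`, and the exact word semantics (SCAN pushes a relation, FILTER applies the bracketed predicate to the top of the stack, BUILD hashes it, PROBE streams the relation beneath against that hash table) are assumptions for illustration.

```python
# Minimal sketch of a stack evaluator for the program above.
# The relations and the join key "id" are made up for illustration.

A = [{"id": 1, "foo": 7}, {"id": 2, "foo": 3}, {"id": 3, "foo": 9}]
B = [{"id": 1, "foo": 2}, {"id": 3, "foo": 4}, {"id": 4, "foo": 1}]

def scan(stack, relation):
    # SCAN: push a copy of the named relation.
    stack.append(list(relation))

def filter_(stack, predicate):
    # [..] FILTER: apply the bracketed predicate to the relation on top.
    stack.append([row for row in stack.pop() if predicate(row)])

def build(stack):
    # BUILD: hash the relation on top of the stack by its join key.
    table = {}
    for row in stack.pop():
        table.setdefault(row["id"], []).append(row)
    stack.append(table)

def probe(stack):
    # PROBE: stream the relation beneath the hash table against it.
    table = stack.pop()
    rows = stack.pop()
    stack.append([(left, right)
                  for left in rows
                  for right in table.get(left["id"], [])])

stack = []
scan(stack, A); filter_(stack, lambda r: r["foo"] > 5)   # A SCAN [foo > 5] FILTER
scan(stack, B); filter_(stack, lambda r: r["foo"] < 5)   # B SCAN [foo < 5] FILTER
build(stack)                                             # BUILD
probe(stack)                                             # PROBE
print(stack.pop())   # pairs of A/B rows that share an id
```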

Because of the properties of associative languages, you can always make local edits. For example, a sed-like transformation could replace any BUILD PROBE sequence, wherever it appears, with the following to perform a sideways-information-passing join:

DUP STATS SWAP BUILD
[PUSHDOWN] DIP PROBE
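Here is a sketch of how local that edit is. The word names come from the snippets above and their semantics are left undefined; the point is only that the rewrite is a plain subsequence substitution over the token stream, with no parse tree required.

```python
# Sketch: the "sed-like" rewrite as a purely local substitution over tokens,
# in the spirit of: sed 's/BUILD PROBE/DUP STATS SWAP BUILD [PUSHDOWN] DIP PROBE/g'

SIP_JOIN = ["DUP", "STATS", "SWAP", "BUILD", "[PUSHDOWN]", "DIP", "PROBE"]

def rewrite(tokens):
    out, i = [], 0
    while i < len(tokens):
        if tokens[i:i + 2] == ["BUILD", "PROBE"]:
            out.extend(SIP_JOIN)    # splice in the sideways-information-passing join
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

program = ["A", "SCAN", "[foo > 5]", "FILTER",
           "B", "SCAN", "[foo < 5]", "FILTER",
           "BUILD", "PROBE"]
print(" ".join(rewrite(program)))
```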

This same associative property allows us to divide a program into "what's been generated in-context" and "what remains to be generated." We shuffle one token at a time across that boundary, extending the context and consuming what remains to be generated.
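A toy sketch of that split, using trivial arithmetic words rather than the join program above: because every word touches only the stack, the stack left by the generated prefix is the only thing the remaining tokens need to agree on.

```python
# Toy illustration: the stack after the generated prefix is the shared state
# that the remaining tokens extend, one token at a time.

def run(tokens, stack=None):
    stack = [] if stack is None else stack
    for t in tokens:
        if t == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif t == "DUP":
            stack.append(stack[-1])
        else:
            stack.append(int(t))        # number literals push themselves
    return stack

generated = ["2", "3", "+"]             # already in context
remaining = ["DUP", "+"]                # still to be generated
state = run(generated)                  # [5]: the agreed-upon stack state
print(run(remaining, state))            # [10]: the suffix just keeps going
```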

I have a hunch that transformations of finite automata over subsequences of the text can be used to write optimization passes for the database layer.

A phrase from Manfred von Thun goes, “syntactic concatenation is semantic composition.”

A Benchmark

I set out to benchmark what models can do in this regard. Would the order of terms matter to an attention transformer? The experiment is simple: I want to construct a tree over numbers and measure whether the tree conforms to the instructions. In my experiment I used parity to assess whether the sum of a sub-tree's children was even or odd. Thus, prefix notation needs to know the overall answer before it generates the sub-answers. Postfix notation works bottom-up, generating sub-answers before answering further.

If you think about how you answer, “What is the next token,” you’ll see where I’m going.

Setup

Given: A sequence of numbers. Construct: A prefix or postfix parity tree.

What is a parity tree? An unbalanced, left- or right-skewed binary tree whose leaves are numbers and whose interior nodes represent the parity of their transitive children.
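Here is a small Python sketch of the task as I read it; the exact serialization and the "E"/"O" labels for even/odd interior nodes are assumptions, not the author's harness. It builds a left-skewed parity tree and prints the prefix and postfix orders, showing that the prefix form must commit to the root's parity before any leaf appears, while the postfix form emits every sub-answer first.

```python
# Sketch of the parity-tree task: left-skewed binary tree over a number
# sequence, interior nodes labeled "E"/"O" for the parity of the sum of the
# leaves beneath them (labels and serialization are assumed for illustration).

def parity(total):
    return "E" if total % 2 == 0 else "O"

def build_left_skewed(nums):
    # Returns nested tuples (label, left, right); leaves are plain ints.
    tree, total = nums[0], nums[0]
    for n in nums[1:]:
        total += n
        tree = (parity(total), tree, n)
    return tree

def prefix(t):
    if isinstance(t, int):
        return [str(t)]
    label, left, right = t
    return [label] + prefix(left) + prefix(right)    # answer before sub-answers

def postfix(t):
    if isinstance(t, int):
        return [str(t)]
    label, left, right = t
    return postfix(left) + postfix(right) + [label]  # sub-answers before answer

tree = build_left_skewed([3, 4, 1, 6])
print(" ".join(prefix(tree)))    # E E O 3 4 1 6  (root parity committed first)
print(" ".join(postfix(tree)))   # 3 4 O 1 E 6 E  (parities emitted bottom-up)
```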

Results

I ran four trials across Opus and Haiku (Sonnet gave results I need to understand better before I publish them). Thinking consistently outperforms non-thinking. Opus consistently outperforms Haiku. And postfix consistently outperforms prefix.

| Model | Thinking | Postfix Accuracy | Prefix Accuracy | Both Correct | Postfix Only | Prefix Only | Both Wrong |
|-------|----------|------------------|-----------------|--------------|--------------|-------------|------------|
| Haiku | Yes | 88.3% | 36.7% | 110 | 155 | 0 | 35 |
| Haiku | No | 6.7% | 4.3% | 9 | 11 | 4 | 276 |
| Opus | Yes | 98.3% | 81.3% | 243 | 52 | 1 | 4 |
| Opus | No | 50.0% | 9.7% | 28 | 122 | 1 | 149 |

All makes sense in the world.
