Port React Compiler to Rust

Original link: https://github.com/react/react/pull/36173

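To handle unmodeled Babel statement kinds that previously made deserialization fail, the AST gains an `Unknown(UnknownStatement)` variant. Unmodeled syntax is now preserved as-is instead of failing the whole file, matching the TypeScript reference, which compiles the file and leaves the statement alone.

The main technical changes are:

* **Robust deserialization:** A hand-written `serde` implementation dispatches modeled tags through the `known_statements!` macro, so a malformed modeled node still produces a precise error and only genuinely unknown tags fall back to the `Unknown` variant.
* **Integrity and safety:** A scoped mutator that rejects mutations stripping `type` keeps the raw node and the position helpers from desyncing. The "no-catch-all" policy is amended to document `Statement` as the deliberate exception.
* **Codegen discrimination:** Explicit discrimination ensures expression and pattern nodes are handled correctly, preventing a "silent orphan" regression in which expression nodes would be emitted as raw statements.
* **Performance:** Deserialization materializes a `serde_json::Value` per statement before typed parsing, which improves error granularity while keeping the existing asymptotic cost.

The changes are verified with unit and integration tests, confirming parity with the reference behavior for unmodeled syntax.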

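Hacker News discussion: Port React Compiler to Rust (github.com/react), 16 points, posted by boudra 39 minutes ago, 4 comments

void (0 minutes ago): What language is the React Compiler currently written in?

willsmith72 (2 minutes ago): Is anyone actually using the React Compiler? I haven't heard anything about it since hearing, a long time ago, that it was extremely slow.

Trung0246 (13 minutes ago): Out of curiosity, could we use Lean4 as the porting target instead of Rust?

jon-wood (4 minutes ago, in reply to Trung0246): I believe it is technically possible. But if you want someone to change direction and port it to a programming language almost nobody has heard of instead of Rust, you probably need to provide more context. Why do you think this is a good idea?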

Original text
Babel can emit statement kinds the typed AST does not model (the
todo-ts-* fixtures pin three TS module-interop forms). Deserialization
previously failed the whole file on the first such node, while the TS
reference compiles the file and leaves the statement alone.

Statement gains a final #[serde(untagged)] Unknown(UnknownStatement)
variant carrying the complete raw node. Deserialization is hand-written
and dispatches modeled `type` tags through a KnownStatement helper so a
malformed modeled node still errors with its precise field-level
message instead of degrading to Unknown; only genuinely unmodeled tags
take the catch-all. The TS reference reaches its equivalent default
case only via assertExhaustive (Babel's closed types), so it crashes;
here unmodeled syntax is reachable by construction and degrades
instead: top-level statements are preserved verbatim through
re-serialization, and function-body occurrences record the standard
UnsupportedSyntax bailout with an UnsupportedNode instruction carrying
the raw node. A known_statements! macro is the single source for the
dispatch enum, its From mapping, and the tag list, so those three
cannot drift; a variant added to Statement but not the macro is the one
remaining silent gap, documented on the variant.
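
The dispatch described above can be sketched as follows. This is a
minimal, std-only illustration and not the code from this change: the
toy RawNode stands in for serde_json::Value, the parse_* helpers and
the two modeled tags are hypothetical, and the macro here generates
only the tag list and the dispatch function, where the real one also
generates the dispatch enum and its From mapping. TSExportAssignment is
used only as an example of an unmodeled tag.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a raw Babel node (assumption: the real code holds JSON).
#[derive(Debug, Clone, PartialEq)]
pub struct RawNode {
    pub fields: BTreeMap<String, String>,
}

impl RawNode {
    pub fn new(pairs: &[(&str, &str)]) -> Self {
        let fields = pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        RawNode { fields }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    EmptyStatement,
    ExpressionStatement { expression: String },
    /// Catch-all for unmodeled tags; carries the complete raw node.
    Unknown(RawNode),
}

/// One macro invocation produces both the tag list and the dispatch,
/// so the two cannot drift apart.
macro_rules! known_statements {
    ($($tag:ident => $parse:expr),+ $(,)?) => {
        pub const KNOWN_STATEMENT_TAGS: &[&str] = &[$(stringify!($tag)),+];

        pub fn is_known_statement_type(tag: &str) -> bool {
            KNOWN_STATEMENT_TAGS.contains(&tag)
        }

        /// Returns None for unmodeled tags, Some(parse result) otherwise.
        fn parse_known(tag: &str, raw: &RawNode) -> Option<Result<Statement, String>> {
            $(if tag == stringify!($tag) {
                return Some(($parse)(raw));
            })+
            None
        }
    };
}

known_statements! {
    EmptyStatement => parse_empty,
    ExpressionStatement => parse_expression_statement,
}

fn parse_empty(_raw: &RawNode) -> Result<Statement, String> {
    Ok(Statement::EmptyStatement)
}

fn parse_expression_statement(raw: &RawNode) -> Result<Statement, String> {
    let expression = raw
        .fields
        .get("expression")
        .cloned()
        .ok_or_else(|| "ExpressionStatement: missing field `expression`".to_string())?;
    Ok(Statement::ExpressionStatement { expression })
}

/// Modeled tags parse strictly and keep their field-level error;
/// only unmodeled tags take the catch-all.
pub fn deserialize_statement(raw: RawNode) -> Result<Statement, String> {
    let tag = raw
        .fields
        .get("type")
        .cloned()
        .ok_or_else(|| "missing field `type`".to_string())?;
    match parse_known(&tag, &raw) {
        Some(result) => result,
        None => Ok(Statement::Unknown(raw)),
    }
}
```

In this sketch an ExpressionStatement without its `expression` field
returns the field-level error rather than becoming Unknown, while a
node tagged TSExportAssignment is kept whole inside Statement::Unknown.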

UnknownStatement caches BaseNode for position helpers; the scoped
with_raw_mut mutator refreshes the cache and rejects mutations that
strip `type`, so the two views cannot desync. Program-level analyses
treat Unknown explicitly: the gating reference-before-declaration scan
walks the raw node for identifier references (an `export = X` does
reference X), and the prefilter and return-analysis arms are
deliberately inert. SWC/OXC reverse converters emit a deliberate
runtime tripwire (a throw in generated code) for the arms that are
unreachable until the SWC forward conversion stops rewriting these
statements to EmptyStatement in the next slice.
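
A rough sketch of the scoped-mutator idea from the paragraph above,
under stated assumptions: the raw node is modeled as a string map
rather than JSON, BaseNode is reduced to start/end offsets, and the
mutator rejects by returning Err and discarding the change. The source
only says that with_raw_mut refreshes the cache and rejects mutations
that strip `type`; how the rejection is signalled is a guess here.

```rust
/// Toy raw node (assumption: the real type holds the full JSON node).
pub type Raw = std::collections::BTreeMap<String, String>;

/// Reduced position data (assumption: the real BaseNode has more fields).
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct BaseNode {
    pub start: Option<u32>,
    pub end: Option<u32>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct UnknownStatement {
    // Both fields are private, so the only way to change `raw` is
    // through with_raw_mut, which keeps `base` in step with it.
    raw: Raw,
    base: BaseNode,
}

fn base_from_raw(raw: &Raw) -> BaseNode {
    let read = |key: &str| raw.get(key).and_then(|v| v.parse::<u32>().ok());
    BaseNode { start: read("start"), end: read("end") }
}

impl UnknownStatement {
    pub fn new(raw: Raw) -> Result<Self, String> {
        if !raw.contains_key("type") {
            return Err("raw node has no `type`".to_string());
        }
        let base = base_from_raw(&raw);
        Ok(UnknownStatement { raw, base })
    }

    /// Cached positions, cheap to read from position helpers.
    pub fn base(&self) -> BaseNode {
        self.base
    }

    pub fn raw(&self) -> &Raw {
        &self.raw
    }

    /// Apply `mutate` to a copy of the raw node, then commit it only if
    /// `type` survived. On commit the cached BaseNode is recomputed.
    pub fn with_raw_mut<F>(&mut self, mutate: F) -> Result<(), String>
    where
        F: FnOnce(&mut Raw),
    {
        let mut candidate = self.raw.clone();
        mutate(&mut candidate);
        if !candidate.contains_key("type") {
            return Err("mutation removed `type`; raw node left unchanged".to_string());
        }
        self.base = base_from_raw(&candidate);
        self.raw = candidate;
        Ok(())
    }
}
```

Working on a copy keeps the sketch simple at the cost of a clone per
mutation; an in-place variant would need to restore `type` or validate
before handing out the mutable reference.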

Deserialization now materializes a serde_json::Value per statement
before typed parsing. The cost is one move-based tree rebuild per
nesting level at a one-time boundary; the previous derive also buffered
every node through serde's internal Content to read the tag, so the
delta is allocation shape, not asymptotics.

Verified: ast unit tests including malformed/edge cases, a lowering
integration test pinning the function-body bailout, round_trip green on
the three fixtures, scoped and full Babel e2e green on all three with
events parity, cargo test --workspace green. The scope-resolution half
of test-babel-ast.sh is green on this stack's base and remains red
corpus-wide on the pr-36173 tip, whose node-ID migration removed
position-based keying while babel-ast-to-json.mjs still emits
offset-based scope JSON; that generator gap needs its own fix before
this stack rebases onto the tip. rust-port-0001-babel-ast.md's
no-catch-all policy is amended to document Statement as the deliberate
exception.

Port adaptation for this branch's UnsupportedNode codegen fix
(0957b55), which discriminated statement-vs-expression
original_node by attempting a Statement deserialization. With the
tolerant deserializer that attempt succeeds for every tagged object,
which would silently emit expression nodes as raw statements and
orphan their lvalue temporaries — regressing the ~10 fixtures that
commit fixed. The codegen site now discriminates explicitly
(codegen_unsupported_original_node): modeled statement tags parse
typed and a parse failure is an invariant, not a degrade; tags that
parse as Expression or PatternLike (both strict enums, no catch-all)
flow through expression codegen unchanged, preserving the lvalue
binding and the pattern placeholder fallback; only genuinely unmodeled
tags — producible solely by the unknown-statement lowering bailout,
i.e. from statement position — degrade to Statement::Unknown and are
emitted verbatim, matching TS codegen's 'return node'.
is_known_statement_type is now exposed (pub) from the
known_statements! macro for this, and unit tests pin the
dispatch (modeled statement tag, malformed modeled tag, expression
tag, pattern tag, unknown tag).
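
The explicit discrimination can be pictured with the sketch below. It
is illustrative only: the three tag lists are tiny hypothetical samples,
the typed_parse_ok flag stands in for actually running the typed parse,
and classify_original_node and OriginalNode are names invented for this
example, not the signature of codegen_unsupported_original_node.

```rust
/// Which codegen path an UnsupportedNode's original node takes
/// (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum OriginalNode {
    /// Modeled statement tag that parsed typed.
    TypedStatement(String),
    /// Flows through expression codegen, keeping its lvalue binding.
    Expression(String),
    /// Flows through expression codegen with the pattern placeholder fallback.
    Pattern(String),
    /// Unmodeled tag: emitted verbatim as Statement::Unknown.
    UnknownStatement(String),
}

// Small sample tag lists (assumption: the real lists come from the typed AST).
const STATEMENT_TAGS: &[&str] = &["ExpressionStatement", "ReturnStatement"];
const EXPRESSION_TAGS: &[&str] = &["Identifier", "CallExpression"];
const PATTERN_TAGS: &[&str] = &["ObjectPattern", "ArrayPattern"];

/// `typed_parse_ok` simulates whether the strict typed parse succeeded.
pub fn classify_original_node(tag: &str, typed_parse_ok: bool) -> Result<OriginalNode, String> {
    if STATEMENT_TAGS.contains(&tag) {
        // A modeled statement that fails to parse is an invariant
        // violation, never a degrade to Unknown.
        return if typed_parse_ok {
            Ok(OriginalNode::TypedStatement(tag.to_string()))
        } else {
            Err(format!("invariant: modeled statement `{tag}` failed to parse"))
        };
    }
    if EXPRESSION_TAGS.contains(&tag) {
        return Ok(OriginalNode::Expression(tag.to_string()));
    }
    if PATTERN_TAGS.contains(&tag) {
        return Ok(OriginalNode::Pattern(tag.to_string()));
    }
    // Anything else can only have come from statement position.
    Ok(OriginalNode::UnknownStatement(tag.to_string()))
}
```

The five cases the unit tests pin map onto this sketch directly:
ReturnStatement with a good parse, ReturnStatement with a failed parse,
CallExpression, ObjectPattern, and an unmodeled tag such as
TSExportAssignment.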