How Ruby Executes JIT Code: The Hidden Mechanics Behind the Magic

Original link: https://railsatscale.com/2025-09-08-how-ruby-executes-jit-code-the-hidden-mechanics-behind-the-magic/

## Understanding Ruby's JIT Compilers (YJIT & ZJIT)

Ruby's JIT (just-in-time) compilers, such as YJIT and the newer ZJIT, significantly improve performance by translating bytecode into native machine code. The process is more involved than a simple replacement, however. Ruby keeps both the bytecode *and* the compiled code in each method's ISEQ (instruction sequence) data structure. The `jit_entry` field points to the compiled code when it is available; otherwise, Ruby interprets the bytecode.

Compilation is not automatic. Methods are profiled (interpreted at first, then monitored) and only compiled after reaching specific call thresholds: currently 25 calls for profiling and 30 for compilation in ZJIT. This "warm-up" period is essential for reaching peak performance.

JIT-compiled code is not infallible, though. It relies on assumptions about the code's behaviour (data types, for example). If those assumptions are violated, say when an addition method that expects integers is called with a float argument, the code "de-optimizes" and falls back to interpretation to guarantee correct results. Events such as activating `TracePoint` debugging or redefining core methods also trigger de-optimization.

This dynamic approach balances speed, safety, and efficiency: it avoids wasting compilation on rarely used code while guaranteeing correct execution even when assumptions change. For a deeper dive, see "ZJIT has been merged into Ruby" and Kevin Newton's "Advent of YARV" series.

## Ruby JIT Compilation Discussed on Hacker News

A Hacker News discussion explored the potential for improving Ruby performance through just-in-time (JIT) compilation. Ruby's dynamic nature challenges JIT compilers, since assumptions must be constantly re-validated because code can be modified at runtime, but some commenters noted that solutions already exist in other virtual machines such as .NET and JavaScript engines, which use background compilation ("tiered compilation").

The central debate was whether Ruby can reach JVM-level speed. Many considered it unlikely, since Ruby's design prioritizes flexibility over static optimization. However, TruffleRuby and JRuby (Ruby implementations running on the JVM) show that Ruby *can* tap into the JVM's performance, albeit with trade-offs.

Other ideas raised included a restricted "sub-language" of Ruby that disables dynamic features to ease compilation, or a mechanism for signalling to the JIT compiler that parts of the code have limited dynamism. The discussion also touched on VM snapshots to speed up startup, and acknowledged the vast investment gap between Ruby VM development and the JVM.

Original Article

Ever since YJIT’s introduction, I’ve felt simultaneously close to and distant from Ruby’s JIT compiler. I know how to enable it in my Ruby programs. I know it makes my Ruby programs run faster by compiling some of them into machine code. But my understanding around YJIT, or JIT compilers in Ruby in general, seems to end here.

A few months ago, my colleague Max Bernstein wrote ZJIT has been merged into Ruby to explain how ZJIT compiles Ruby’s bytecode to HIR, LIR, and then to native code. It sheds some light on how JIT compilers can compile our program, which is why I started to contribute to ZJIT in July. But I still had many questions unanswered before digging into the source code and asking the JIT experts around me (Max, Kokubun, and Alan).

So I want to use this post to answer some questions/mental gaps you might also have about JIT compilers for Ruby:

  1. Where does JIT-compiled code actually live?
  2. How does Ruby actually execute JIT code?
  3. How does Ruby decide what to compile?
  4. Why does JIT-compiled code fall back to the interpreter?

While we use ZJIT (Ruby's experimental next-generation JIT) as our reference, these concepts apply equally to YJIT.

Where JIT-Compiled Code Actually Lives

Ruby ISEQs and YARV Bytecode

When Ruby loads your code, it compiles each method into an Instruction Sequence (ISEQ) - a data structure containing YARV (CRuby virtual machine) bytecode instructions.

(If you’re not familiar with YARV instructions or want to learn more, Kevin Newton wrote a great blog series to introduce them)

Let’s start with a simple example:

def foo
  bar
end

def bar
  42
end

Running ruby --dump=insn example.rb shows us the bytecode:

== disasm: #<ISeq:foo@example.rb:1 (1,0)-(3,3)>
0000 putself                                                          (   2)[LiCa]
0001 opt_send_without_block                 <calldata!mid:bar, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 leave                                  [Re]

== disasm: #<ISeq:bar@example.rb:5 (5,0)-(7,3)>
0000 putobject                              42                        (   6)[LiCa]
0002 leave                                  [Re]
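If you'd rather inspect bytecode from inside a running program, CRuby also exposes it through the RubyVM::InstructionSequence API (standard CRuby API, independent of any JIT):

```ruby
def bar
  42
end

# Fetch the ISEQ behind an existing method and print its YARV disassembly.
iseq = RubyVM::InstructionSequence.of(method(:bar))
puts iseq.disasm  # includes "putobject 42" and "leave"
```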

JIT-Compiled Code Lives on ISEQ Too

I assumed JIT-compiled code would replace bytecode—after all, native code is faster. But Ruby keeps both, for good reason.

Here’s what an ISEQ looks like initially:

ISEQ (foo method)
├── body
│   ├── bytecode: [putself, opt_send_without_block, leave]
│   ├── jit_entry: NULL  // No JIT code yet
│   ├── jit_entry_calls: 0  // Call counter

After the method is called repeatedly and gets JIT-compiled:

ISEQ (foo method)
├── body
│   ├── bytecode: [putself, opt_send_without_block, leave]  // Still here!
│   ├── jit_entry: 0x7f8b2c001000  // Pointer to native machine code
│   ├── jit_entry_calls: 35  // Reached compilation threshold

The jit_entry field is the gateway to native code. When it’s NULL, Ruby interprets bytecode. When it points to compiled code, Ruby can jump directly to machine instructions. But the bytecode never goes away - Ruby needs it for de-optimization, which we will explore a bit later.

The Execution Switch: From Bytecode to Native Code

This is simpler than I expected. Since each ISEQ points to its JIT-compiled code when available, Ruby simply checks the jit_entry field of every ISEQ it is about to execute:

(Figure: JIT-compiled code execution)

When there’s no JIT code (jit_entry is NULL), it continues interpreting. Otherwise, it runs the compiled native code.
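As a rough sketch of that decision, here is plain Ruby standing in for CRuby's C internals (Iseq, execute, and interpret are illustrative names, not real VM API):

```ruby
# Hypothetical model of CRuby's dispatch check; names are made up for illustration.
Iseq = Struct.new(:bytecode, :jit_entry)

def interpret(bytecode)
  "interpreted #{bytecode.length} instructions"
end

def execute(iseq)
  if iseq.jit_entry           # compiled code available?
    iseq.jit_entry.call       # jump straight to native code
  else
    interpret(iseq.bytecode)  # fall back to the bytecode interpreter
  end
end

cold = Iseq.new([:putself, :opt_send_without_block, :leave], nil)
hot  = Iseq.new([:putself, :opt_send_without_block, :leave], -> { "ran native code" })

puts execute(cold)  # => interpreted 3 instructions
puts execute(hot)   # => ran native code
```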

How Ruby Decides What to Compile

Ruby doesn’t compile methods randomly or all at once. Instead, methods earn compilation through repeated use. In ZJIT, this happens in two phases:

if (body->jit_entry == NULL && rb_zjit_enabled_p) {
    body->jit_entry_calls++;

    // Phase 1: Profile the method
    if (body->jit_entry_calls == rb_zjit_profile_threshold) {
        rb_zjit_profile_enable(iseq);
    }

    // Phase 2: Compile to native code
    if (body->jit_entry_calls == rb_zjit_call_threshold) {
        rb_zjit_compile_iseq(iseq, false);
        // After this, jit_entry points to machine code
    }
}

As of now, ZJIT’s default profile threshold is 25 and compile threshold is 30 (both may change in the future). So a method’s lifecycle may look like this:

Calls:     0 ─────────── 25 ────────── 30 ─────────────────►
           │              │             │
Mode:      └─ Interpret ──┴── Profile ──┴─ Native Code (JIT compiled)

This is why we need to “warm up” the program before we get the peak performance with JIT.
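The lifecycle above can be sketched as a small Ruby model (the thresholds mirror ZJIT's current defaults; mode_after is an illustrative helper, not ZJIT API):

```ruby
PROFILE_THRESHOLD = 25  # ZJIT's current default profile threshold
CALL_THRESHOLD    = 30  # ZJIT's current default compile threshold

# Returns the execution mode a method would be in after `calls` invocations.
def mode_after(calls)
  if calls >= CALL_THRESHOLD
    :native        # jit_entry now points to machine code
  elsif calls >= PROFILE_THRESHOLD
    :profiling     # still interpreted, but recording type information
  else
    :interpreted
  end
end

p mode_after(10)  # => :interpreted
p mode_after(27)  # => :profiling
p mode_after(30)  # => :native
```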

When JIT Code Gives Up: Understanding De-optimization

JIT code makes assumptions to run fast. When those assumptions break, Ruby must “de-optimize” - return control to the interpreter. It’s a safety mechanism that ensures your code always produces correct results.

Consider this method:

def add(a, b)
  a + b
end

which would generate these instructions:

== disasm: #<ISeq:add@example.rb:1 (1,0)-(3,3)>
0000 getlocal_WC_0                          a@0                       (   2)[LiCa]
0002 getlocal_WC_0                          b@1
0004 opt_plus                               <calldata!mid:+, argc:1, ARGS_SIMPLE>[CcCr]
0006 leave                                                            (   3)[Re]

Because Ruby doesn't know beforehand what operands opt_plus will be called with, the underlying C function vm_opt_plus needs to handle the various classes (like String, Array, Float, Integer, etc.) that can respond to +.

If profiling shows that add is always called with integers (Fixnums), JIT compilers can generate optimized code that only handles integer addition, along with “guards” that check this assumption:

(Figure: JIT type guard)

When the assumption is broken, like when add(1.5, 2) is called:

  1. The guard check fails
  2. JIT code jumps to a “side exit”
  3. The side exit restores interpreter state (stack, instruction pointer, etc.)
  4. Control returns to the interpreter
  5. The interpreter executes opt_plus and calls the vm_opt_plus function
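The guard-and-side-exit flow can be sketched in plain Ruby (a hypothetical model: the real guard is a couple of machine instructions, and :side_exit here merely stands in for the jump back to the interpreter):

```ruby
# Hypothetical model of JIT-compiled `add`, specialized for Integer operands.
def jit_add(a, b)
  # Guard: profiling said both operands are always Integers.
  unless a.is_a?(Integer) && b.is_a?(Integer)
    return :side_exit  # assumption broken: hand control back to the interpreter
  end
  a + b  # fast path: plain integer addition, no dynamic dispatch needed
end

p jit_add(1, 2)    # => 3
p jit_add(1.5, 2)  # => :side_exit (the interpreter would then compute 3.5)
```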

Other triggers for falling back include:

  • TracePoint activation - TracePoint needs bytecode execution for properly emitting events (more details below)
  • Redefined core methods - Someone changed what + means on Integer
  • Ractor usage - Running multiple Ractors changes the behaviour of some YARV instructions, so the compiled code could perform differently from the interpreter in that situation

These assumption checks, or patch points as we call them in ZJIT, make sure your program performs correctly when any of the assumptions change.

Answering Some Additional Questions

Why does enabling TracePoint slow everything down?

(TracePoint is a Ruby class that can be used to register callbacks on specific Ruby execution events. It’s commonly used in debugging/development tools.)

Most of TracePoint’s events are triggered by corresponding YARV bytecode. When TracePoint is activated, instructions in ISEQs are replaced with their trace_* counterparts; for example, opt_plus is replaced with trace_opt_plus.

If Ruby only executes the compiled machine code, then those events wouldn’t be triggered correctly. Therefore, when ZJIT and YJIT compilers detect TracePoint’s activation, they immediately throw away the optimized code to force Ruby to interpret YARV instructions instead.
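For reference, this is what a minimal TracePoint looks like (standard Ruby API); enabling one like it is exactly what forces ZJIT and YJIT to discard their compiled code:

```ruby
events = []

# Record every Ruby-level method call while the trace is enabled.
trace = TracePoint.new(:call) do |tp|
  events << "#{tp.defined_class}##{tp.method_id}"
end

def greet
  "hello"
end

trace.enable { greet }  # trace is active only inside the block
p events  # => ["Object#greet"]
```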

Why doesn’t Ruby just compile everything?

Many methods are called rarely. Compiling them would waste memory and compilation time for no performance benefit. Also, compiling methods without profiling would mean that JIT compilers either make wrong assumptions that get invalidated pretty quickly, or don’t make specific enough assumptions that miss further optimization opportunities.

Final Notes

I hope this post helped you understand JIT compilers, a now essential part of Ruby, a little bit more.

If you want to learn more about Ruby’s new JIT compiler: ZJIT, I highly recommend giving ZJIT has been merged into Ruby a read. And if you want to learn more about Ruby’s YARV instructions, Kevin Newton’s Advent of YARV series is the best resource.
