The GDB JIT Interface

Original link: https://bernsteinbear.com/blog/gdb-jit/

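## Debugging JIT-compiled code with GDB

GDB is good at debugging by single-stepping through machine code, and it relies on the debug information (usually DWARF) that compilers such as Clang and GCC embed in the binary. GDB struggles with just-in-time (JIT) compiled code, however, because that code has no pre-existing debug information, so the backtrace fills up with unhelpful placeholders such as "??". Fortunately, GDB provides a JIT interface. It requires the JIT compiler to register newly compiled functions with GDB and supply the necessary debug data. There are two main approaches:

**1. The old interface:** Create an in-memory object file (ELF, Mach-O) containing DWARF data for each JIT-compiled function, and maintain a linked list of these files. This approach is complex, and the linked-list structure can cause performance problems (O(n²) behavior).

**2. The new interface:** Use a custom debug info format, which requires loading a shared object into GDB as a reader. This is more flexible, but the reader must implement a specific interface and parse the data itself.

Other approaches, such as reusing the Linux `perf` interface, are being explored to simplify the process. Garbage collection is a key challenge: code pointers must stay stable, which V8 handles by disabling its moving GC, and collected code must be unregistered, which ART handles by treating the list of registered code as weak references.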

Hacker News: The GDB JIT Interface (bernsteinbear.com), 6 points, submitted by surprisetalk, 1 comment

rurban: This always takes too much effort for too little payoff. When I debugged in similar situations, I would generate an equivalent piece of C code on the fly, compile it with -g as well, set the source file to that file, and then debug the function easily.

Original text

GDB is great for stepping through machine code to figure out what is going on. It uses debug information under the hood to present you with a tidy backtrace and also determine how much machine code to print when you type disassemble.

This debug information comes from your compiler. Clang, GCC, rustc, etc. all produce debug data in a format called DWARF and then embed that debug information inside the binary (ELF, Mach-O, …) when you pass -ggdb or equivalent.

Unfortunately, this means that by default, GDB has no idea what is going on if you break in a JIT-compiled function. You can step instruction-by-instruction and whatnot, but that's about it. This is because the current instruction pointer is nowhere to be found in any of the existing debug info tables from the host runtime code, so your terminal is filled with ??. See this example from the V8 docs:

#8  0x08281674 in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#9  0xf5cae28e in ?? ()
#10 0xf5cc3a0a in ?? ()
#11 0xf5cc38f4 in ?? ()
#12 0xf5cbef19 in ?? ()
#13 0xf5cb09a2 in ?? ()
#14 0x0809e0a5 in v8::internal::Invoke (...) at src/execution.cc:97

Fortunately, there is a JIT interface to GDB. If you implement a couple of functions in your JIT and run them every time you finish compiling a function, you can get the debugging niceties for your JIT code too. See again a V8 example:

#6  0x082857fc in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#7  0xf5cae28e in ?? ()
#8  0xf5cc3a0a in loop () at test.js:6
#9  0xf5cc38f4 in test.js () at test.js:13
#10 0xf5cbef19 in ?? ()
#11 0xf5cb09a2 in ?? ()
#12 0x0809e1f9 in v8::internal::Invoke (...) at src/execution.cc:97

Unfortunately, the GDB docs are somewhat sparse. So I went spelunking through a bunch of different projects to try and understand what is going on.

The big picture (and the old interface)

GDB expects your runtime to expose a function called __jit_debug_register_code and a global variable called __jit_debug_descriptor. GDB automatically adds its own internal breakpoints at this function, if it exists. Then, when you compile code, you call this function from your runtime.

In slightly more detail:

1. Compile a function in your JIT compiler. This gives you a function name, maybe other metadata, an executable code address, and a code size.
2. Generate an entire ELF/Mach-O/… object in-memory (!) for that one function, describing its name, code region, maybe other DWARF metadata such as line number maps.
3. Write a jit_code_entry linked list node that points at your object ("symfile").
4. Link it into the __jit_debug_descriptor linked list.
5. Call __jit_debug_register_code, which gives GDB control of the process so it can pick up the new function's metadata.
6. Optionally, break into (or crash inside) one of your JITed functions.
7. At some point, later, when your function gets GCed, unregister your code by editing the linked list and calling __jit_debug_register_code again.
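
The list-manipulation steps above can be sketched roughly as follows. This is a minimal sketch, not taken from any particular runtime: the struct layouts and the two `__jit_debug_*` symbols follow the declarations in the GDB manual, while `jit_register` and `jit_unregister` are hypothetical helper names, and the symfile bytes are assumed to have been built elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Declarations as described in the GDB manual. */
typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; /* holds a jit_actions_t value */
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

/* GDB sets a breakpoint here, so the call must not be inlined away. */
void __attribute__((noinline)) __jit_debug_register_code(void) {
  __asm__ volatile("" ::: "memory");
}

/* The version is set statically because GDB may read it before we run. */
struct jit_descriptor __jit_debug_descriptor = {1, JIT_NOACTION, NULL, NULL};

/* Hypothetical helper: register one in-memory object file with GDB. */
struct jit_code_entry *jit_register(const char *symfile, uint64_t size) {
  struct jit_code_entry *entry = calloc(1, sizeof *entry);
  assert(entry != NULL);
  entry->symfile_addr = symfile;
  entry->symfile_size = size;
  /* Push onto the head of the doubly linked list. */
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry != NULL) entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  /* Describe the change, then hit GDB's breakpoint. */
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
  return entry;
}

/* Hypothetical helper: unlink an entry, notify GDB, then free it. */
void jit_unregister(struct jit_code_entry *entry) {
  if (entry->prev_entry != NULL) entry->prev_entry->next_entry = entry->next_entry;
  else __jit_debug_descriptor.first_entry = entry->next_entry;
  if (entry->next_entry != NULL) entry->next_entry->prev_entry = entry->prev_entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_UNREGISTER_FN;
  __jit_debug_register_code();
  __jit_debug_descriptor.relevant_entry = NULL;
  free(entry);
}
```

The list bookkeeping is the easy part. The sketch leaves out step 2 entirely, and producing a valid object file with DWARF in memory is where most of the work goes.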

This is why you see compiler projects such as V8 including large swaths of code just to make object files.

Because this is a huge hassle, GDB also has a newer interface that does not require making an ELF/Mach-O/…+DWARF object.

Custom debug info (the new interface)

This new interface requires writing a binary format of your choice. You make the writer and you make the reader. Then, when you are in GDB, you load your reader as a shared object.

The reader must implement the interface specified by GDB:

GDB_DECLARE_GPL_COMPATIBLE_READER;
extern struct gdb_reader_funcs *gdb_init_reader (void);
struct gdb_reader_funcs
{
  /* Must be set to GDB_READER_INTERFACE_VERSION.  */
  int reader_version;

  /* For use by the reader.  */
  void *priv_data;

  gdb_read_debug_info *read;
  gdb_unwind_frame *unwind;
  gdb_get_frame_id *get_frame_id;
  gdb_destroy_reader *destroy;
};

The read function pointer does the bulk of the work and is responsible for matching code ranges to function names, line numbers, and more.
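
To make "a binary format of your choice" concrete, here is a minimal sketch of a writer and a matching parser for one made-up format. Everything in it is an assumption for illustration: the `toy_symfile_header` layout and both function names are hypothetical, and it relies on the writer (your JIT) and the parser (the reader plugin) running on the same machine so that the struct layout matches. In a real reader, the parsing half would run inside the read callback on the symfile bytes GDB passes in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symfile layout: a fixed header followed by the name bytes. */
struct toy_symfile_header {
  uint64_t code_addr; /* start of the JIT-compiled function */
  uint64_t code_size; /* size of its machine code in bytes */
  uint32_t name_len;  /* length of the name, without a terminator */
};

/* JIT side: serialize one function. Returns bytes written, or 0 if
   the buffer is too small. */
size_t toy_symfile_write(uint8_t *buf, size_t cap, uint64_t addr,
                         uint64_t size, const char *name) {
  assert(buf != NULL && name != NULL);
  struct toy_symfile_header h;
  memset(&h, 0, sizeof h); /* also zeroes any padding bytes */
  h.code_addr = addr;
  h.code_size = size;
  h.name_len = (uint32_t)strlen(name);
  size_t total = sizeof h + h.name_len;
  if (total > cap) return 0;
  memcpy(buf, &h, sizeof h);
  memcpy(buf + sizeof h, name, h.name_len);
  return total;
}

/* Reader side: parse the bytes back. Returns 0 on success, -1 if the
   input is truncated or the name does not fit in `name`. */
int toy_symfile_read(const uint8_t *buf, size_t len,
                     struct toy_symfile_header *out,
                     char *name, size_t name_cap) {
  if (len < sizeof *out) return -1;
  memcpy(out, buf, sizeof *out);
  if (out->name_len > len - sizeof *out) return -1;
  if (out->name_len >= name_cap) return -1;
  memcpy(name, buf + sizeof *out, out->name_len);
  name[out->name_len] = '\0';
  return 0;
}
```

A fixed header plus a length-prefixed name keeps the parser to a few bounds checks, which matters because the reader runs inside GDB and should not crash on malformed input.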

Here are some details from Sanjoy Das.

Only a few runtimes implement this interface. Most of them stub out the unwind and get_frame_id function pointers.

I think the reader, at least, must also declare itself GPL-compatible via the GDB_DECLARE_GPL_COMPATIBLE_READER macro.

Since I wrote about the perf map interface recently, I have it on my mind. Why can’t we reuse it in GDB?

Adapting to the Linux perf interface

I suppose it would be possible to try and upstream a patch to GDB to support the Linux perf map interface for JITs. After all, why shouldn’t it be able to automatically pick up symbols from /tmp/perf-...? That would be great baseline debug info for “free”.

In the meantime, maybe it is reasonable to create a re-usable custom debug reader:

1. When registering code, write the address and name to /tmp/perf-... as you normally would.
2. Write the filename as the symfile (does this make /tmp the magic number?).
3. Have the debug info reader just parse the perf map file.
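
The writing and parsing steps above could be sketched as below. The perf map format is one line per symbol: start address and size in hexadecimal without a 0x prefix, then the symbol name. The function names `perf_map_format` and `perf_map_parse` are hypothetical, and the sketch works on strings only, so choosing and opening the map file is left to the caller.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* JIT side: format one perf map line, "<start> <size> <name>\n".
   Returns the line length, or -1 if it does not fit in `out`. */
int perf_map_format(char *out, size_t cap, uint64_t start, uint64_t size,
                    const char *name) {
  assert(out != NULL && name != NULL);
  int n = snprintf(out, cap, "%" PRIx64 " %" PRIx64 " %s\n", start, size, name);
  return (n > 0 && (size_t)n < cap) ? n : -1;
}

/* Reader side: parse one line. The name runs to the end of the line
   and may contain spaces. Returns 0 on success, -1 on malformed input. */
int perf_map_parse(const char *line, uint64_t *start, uint64_t *size,
                   char *name, size_t name_cap) {
  int consumed = 0;
  if (sscanf(line, "%" SCNx64 " %" SCNx64 " %n", start, size, &consumed) != 2)
    return -1;
  if (consumed == 0) return -1;
  const char *p = line + consumed;
  size_t len = strcspn(p, "\n"); /* stop at the newline, if present */
  if (len == 0 || len >= name_cap) return -1;
  memcpy(name, p, len);
  name[len] = '\0';
  return 0;
}
```

A reader built this way would call something like `perf_map_parse` on each line of the file and report one code range per entry to GDB.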

It would be less flexible than either the DWARF approach or a fully custom reader: it would only be able to handle a symbol name and a code region. No embedding source code for GDB to display in your debugger. But maybe that is okay for a partial solution?

Update: Here is my small attempt at such a plugin.

The n-squared problem

V8 notes in their GDB JIT docs that because the JIT interface is a linked list and we only keep a pointer to the head, we get O(n²) behavior. Bummer. This becomes especially noticeable since they register additional code objects not just for functions, but also trampolines, cache stubs, etc.
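
One way to see where a quadratic total could come from, assuming (this is an assumption for illustration, not something the V8 docs spell out here) that handling each registration costs time proportional to the number of entries already in the list, with some constant c per entry:

```latex
\sum_{k=0}^{n-1} c\,k \;=\; c\,\frac{n(n-1)}{2} \;=\; O(n^2)
```

Under that assumption, registering 1,000 code objects costs on the order of 500,000 entry visits in total, versus 1,000 if each registration were constant time.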

Garbage collection

Since GDB expects the code pointer in your symbol object file not to move, you have to make sure to have a stable symbol file pointer and stable executable code pointer. To make this happen, V8 disables its moving GC.

Additionally, if your compiled function gets collected, you have to make sure to unregister the function. Instead of doing this eagerly, ART treats the GDB JIT linked list as a weakref and periodically removes dead code entries from it.
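
A periodic sweep in that spirit might look like the sketch below. It is not ART's implementation: `tracked_entry`, its `code_is_live` flag, `jit_track`, and `jit_sweep_dead_entries` are all hypothetical, and the sketch assumes the garbage collector clears the flag when the code dies. The GDB-facing structs follow the GDB manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};
struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; /* 0 = no action, 1 = register, 2 = unregister */
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};
void __attribute__((noinline)) __jit_debug_register_code(void) {
  __asm__ volatile("" ::: "memory");
}
struct jit_descriptor __jit_debug_descriptor = {1, 0, NULL, NULL};

/* Hypothetical wrapper; `entry` comes first so a list node can be cast back. */
struct tracked_entry {
  struct jit_code_entry entry;
  bool code_is_live; /* assumed to be cleared by the GC when the code dies */
};

/* Register a symfile and return the wrapper that tracks its liveness. */
struct tracked_entry *jit_track(const char *symfile, uint64_t size) {
  struct tracked_entry *t = calloc(1, sizeof *t);
  assert(t != NULL);
  t->entry.symfile_addr = symfile;
  t->entry.symfile_size = size;
  t->code_is_live = true;
  t->entry.next_entry = __jit_debug_descriptor.first_entry;
  if (t->entry.next_entry != NULL) t->entry.next_entry->prev_entry = &t->entry;
  __jit_debug_descriptor.first_entry = &t->entry;
  __jit_debug_descriptor.relevant_entry = &t->entry;
  __jit_debug_descriptor.action_flag = 1;
  __jit_debug_register_code();
  return t;
}

/* Walk the list once and unregister every entry whose code has died.
   Returns the number of entries removed. */
size_t jit_sweep_dead_entries(void) {
  size_t removed = 0;
  struct jit_code_entry *cur = __jit_debug_descriptor.first_entry;
  while (cur != NULL) {
    struct jit_code_entry *next = cur->next_entry; /* save before freeing */
    struct tracked_entry *t = (struct tracked_entry *)cur;
    if (!t->code_is_live) {
      if (cur->prev_entry != NULL) cur->prev_entry->next_entry = next;
      else __jit_debug_descriptor.first_entry = next;
      if (next != NULL) next->prev_entry = cur->prev_entry;
      __jit_debug_descriptor.relevant_entry = cur;
      __jit_debug_descriptor.action_flag = 2;
      __jit_debug_register_code();
      __jit_debug_descriptor.relevant_entry = NULL;
      free(t);
      removed++;
    }
    cur = next;
  }
  return removed;
}
```

Batching removals this way keeps the collector itself from having to notify GDB, at the cost of dead entries lingering in the list until the next sweep.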