解构汇编代码中的函数

解构汇编代码中的函数
When functions dissolve (2020)

原始链接: https://rubber-duck-typing.com/posts/2020-12-12-when-functions-dissolve.html

## 尾调用优化总结尾调用发生在函数的最后一个动作是调用另一个函数并直接返回其结果时。这与函数在调用函数返回*之后*进行计算的情况不同。重要的是，尾调用提供了优化的潜力。通常，每个函数调用都会在调用栈上添加一个新的返回地址。然而，在尾调用中，被调用函数返回后，当前函数没有进一步的计算需要执行。因此，当前函数的返回地址是不必要的。尾调用可以优化为简单的跳转到下一个函数，而不是将另一个返回地址压入栈中。这有效地将函数调用替换为分支，避免栈增长并提高性能。示例说明了`print_newline`如何可以直接跳转到`print_char`，从而无需从`print_newline`单独返回，并简化执行。这种优化对于递归函数尤其有价值。

黑客新闻新 | 过去 | 评论 | 提问 | 展示 | 招聘 | 提交登录当函数消解 (2020) (rubber-duck-typing.com) 4 点由 vitalnodo 1小时前 | 隐藏 | 过去 | 收藏 | 讨论指南 | 常见问题 | 列表 | API | 安全 | 法律 | 申请YC | 联系搜索：

原文

Tail call happens when the subroutine ends with a call to another subroutine. In the higher level language this happens in cases such as:

void g();

void f() {
    g();
    return;
}

This is also a tail call, because we just return what another function returns:

int g(int x);

int f() {
    return g(42);
}

This is not a tail call: after calling function we have to multiply its result to another number. We wait until fact(n-1) completes its execution, and then use its result in other computations. Note that in this example the function calls itself rather than some other function, but this is unimportant.

int fact( int n ) {
  if (n < 2) return 1;
  return n * fact(n-1);
}

On the assembly level, a tail call corresponds to the following pattern of instructions:

This pair of instructions is located in a function which we are going to call f. The control reaches this function from some other function, say, f_caller.

Let us follow the state of stack through the execution of f_caller, f and other. For that we expand this code to include f_caller and other functions.

f_caller:

...
   call f
   <next instruction> ; mark 4
...

f:            ; mark 1
  ...
  call other
  ret         ; mark 3
...
other:        ; mark 2
   ...
   ret

When we start executing f the stack holds the return address inside f_caller (we reach mark 1).
When we call other the stack holds the return addresses for f_caller, then f on top (we reach mark 2).
The subroutine other returns too, in this moment we have only return address for f_caller on top of the stack (we reach mark 3).
The subroutine f returns, popping return address of f_caller from the stack (we reach mark 4).

The last two steps were essentially popping two return addresses from stack consecutively. It suggests that the first one (the return address to f) is useless and we do not need to store it. We are indeed right.

The call other instruction is equivalent to the pair of pseudoinstructions:

push rip    ; rip is the program counter register
jmp other

If we do not need to push return address to f, then we can just substitute this instruction for jmp and get rid of one ret:

f_caller:

...
   call f
   <next instruction>
...

f:
  ...
  jmp other
...
other:
   ...
   ret     ; will return directly to f_caller

The subroutine other becomes essentially the continuation of the subroutine f; we pass from f to other via a simple branching instruction.

Coming back to our example, we can rewrite it like that:

print_char:
   ...
   ret

print_newline:
    mov rdi, 0xA          ; code
    jmp print_char

解构汇编代码中的函数 When functions dissolve (2020)

解构汇编代码中的函数
When functions dissolve (2020)