Tail call happens when the subroutine ends with a call to another subroutine. In the higher level language this happens in cases such as:
void g(); void f() { g(); return; }
This is also a tail call, because we just return what another function returns:
int g(int x); int f() { return g(42); }
This is not a tail call: after calling function we have to multiply its result
to another number.
We wait until fact(n-1) completes its execution, and then use its result in
other computations.
Note that in this example the function calls itself rather than some other
function, but this is unimportant.
int fact( int n ) { if (n < 2) return 1; return n * fact(n-1); }
On the assembly level, a tail call corresponds to the following pattern of instructions:
This pair of instructions is located in a function which we are going to call f.
The control reaches this function from some other function, say, f_caller.
Let us follow the state of stack through the execution of f_caller, f and other.
For that we expand this code to include f_caller and other functions.
f_caller: ... call f <next instruction> ; mark 4 ... f: ; mark 1 ... call other ret ; mark 3 ... other: ; mark 2 ... ret
- When we start executing
fthe stack holds the return address insidef_caller(we reach mark 1). - When we call
otherthe stack holds the return addresses forf_caller, thenfon top (we reach mark 2). - The subroutine
otherreturns too, in this moment we have only return address forf_calleron top of the stack (we reach mark 3). - The subroutine
freturns, popping return address off_callerfrom the stack (we reach mark 4).
The last two steps were essentially popping two return addresses from stack consecutively.
It suggests that the first one (the return address to f) is useless and we do not need to store it.
We are indeed right.
The call other instruction is equivalent to the pair of pseudoinstructions:
push rip ; rip is the program counter register jmp other
If we do not need to push return address to f, then we can just substitute this instruction for jmp and get rid of one ret:
f_caller: ... call f <next instruction> ... f: ... jmp other ... other: ... ret ; will return directly to f_caller
The subroutine other becomes essentially the continuation of the subroutine f; we pass from f to other via a simple branching instruction.
Coming back to our example, we can rewrite it like that:
print_char: ... ret print_newline: mov rdi, 0xA ; code jmp print_char