Paul Wayper writes about trampolines. This is something I've come across before,
when I was learning about function pointers with IA64.
Nested functions are a strange contraption, and probably best avoided
(people who should know agree). For example, here is a
completely obfuscated way to double a number.
#include <stdio.h>

int function(int arg) {
    int nested_function(int nested_arg) {
        return arg + nested_arg;
    }
    return nested_function(arg);
}

int main(void)
{
    printf("%d\n", function(10));
}
Let's disassemble that on IA64 (because it is so much easier to
understand).
0000000000000000 :
0: 00 10 15 08 80 05 [MII] alloc r34=ar.pfs,5,4,0
6: c0 80 33 7e 46 60 adds r12=-16,r12
c: 04 08 00 84 mov r35=r1
10: 01 00 00 00 01 00 [MII] nop.m 0x0
16: 10 02 00 62 00 80 mov r33=b0
1c: 04 00 01 84 mov r36=r32;;
20: 0a 70 40 18 00 21 [MMI] adds r14=16,r12;;
26: 00 00 39 20 23 e0 st4 [r14]=r32
2c: 01 70 00 84 mov r15=r14
30: 10 00 00 00 01 00 [MIB] nop.m 0x0
36: 00 00 00 02 00 00 nop.i 0x0
3c: 48 00 00 50 br.call.sptk.many b0=70
40: 0a 08 00 46 00 21 [MMI] mov r1=r35;;
46: 00 00 00 02 00 00 nop.m 0x0
4c: 20 02 aa 00 mov.i ar.pfs=r34
50: 00 00 00 00 01 00 [MII] nop.m 0x0
56: 00 08 05 80 03 80 mov b0=r33
5c: 01 61 00 84 adds r12=16,r12
60: 11 00 00 00 01 00 [MIB] nop.m 0x0
66: 00 00 00 02 00 80 nop.i 0x0
6c: 08 00 84 00 br.ret.sptk.many b0;;
0000000000000070 :
70: 0a 40 00 1e 10 10 [MMI] ld4 r8=[r15];;
76: 80 00 21 00 40 00 add r8=r32,r8
7c: 00 00 04 00 nop.i 0x0
80: 11 00 00 00 01 00 [MIB] nop.m 0x0
86: 00 00 00 02 00 80 nop.i 0x0
8c: 08 00 84 00 br.ret.sptk.many b0;;
Note that nothing funny happens there at all. The nested function looks
exactly like a real function, but it doesn't do any of the alloc or
anything to get a new stack frame; it uses the stack frame of the
enclosing function, reading arg through the pointer the caller left in
r15. No trampoline involved, just a straight branch.
So why do we need a trampoline? Because a nested function has an extra
property over a normal function: it must be given a pointer to the
stack frame of the function it is nested within. This means that if we
were to pass around pointers to the nested function, we would have to
keep track that this isn't a pointer to just any old function, but a
pointer to a special function that needs its stack set up to come from
the enclosing function. C doesn't work like this; there is only one type,
"pointer to function".
The easiest way is therefore to create a little function that sets the
stack to the right place and calls the code. This "trampoline" is a
normal function, so no need for trickery. For example, we can modify
this to use a function pointer to the nested function:
#include <stdio.h>

int function(int arg) {
    int nested_function(int nested_arg) {
        return arg + nested_arg;
    }
    int (*function_pointer)(int arg) = nested_function;
    return function_pointer(arg);
}

int main(void)
{
    printf("%d\n", function(10));
}
We can see that now that we are calling through a function pointer, we go
through __ia64_trampoline, a little code stub in libgcc.a that loads up
the right stack pointer and calls the "nested" function.
function:
.prologue 12, 33
.mmb
.save ar.pfs, r34
alloc r34 = ar.pfs, 1, 3, 1, 0
.fframe 48
adds r12 = -48, r12
nop 0
.mmi
addl r15 = @ltoffx(__ia64_trampoline#), r1
mov r35 = r1
.save rp, r33
mov r33 = b0
.body
;;
.mmi
nop 0
adds r8 = 24, r12
adds r14 = 16, r12
.mmi
ld8.mov r15 = [r15], __ia64_trampoline#
;;
st4 [r14] = r32
nop 0
.mmi
mov r14 = r8
;;
st8 [r14] = r15, 8
adds r15 = 40, r12
;;
.mfi
st8 [r14] = r15, 8
nop 0
addl r15 = @ltoff(@fptr(nested_function.1814#)), gp
;;
.mmi
ld8 r15 = [r15]
;;
st8 [r14] = r15, 8
adds r15 = 16, r12
;;
.mmb
st8 [r14] = r15
ld8 r14 = [r8], 8
nop 0
.mmi
ld4 r36 = [r15]
;;
nop 0
mov b6 = r14
.mbb
ld8 r1 = [r8]
nop 0
br.call.sptk.many b0 = b6
;;
.mii
mov r1 = r35
mov ar.pfs = r34
mov b0 = r33
.mib
nop 0
.restore sp
adds r12 = 48, r12
br.ret.sptk.many b0
.endp function#
.section .rodata.str1.8,"aMS",@progbits,1
.align 8
So, the summary is that if you take the address of a nested function,
gcc puts in the stub "trampoline" code so that pointers to nested
functions and pointers to ordinary functions can share a single type.