逆向工程——printf()

[TOC]

程序

三个参数

1
2
3
4
5
6
7
#include <stdio.h>

int main()
{
printf("a=%d; b=%d; c=%d", 1, 2, 3);
return 0;
}

九个参数

1
2
3
4
5
6
7
8
#include <stdio.h>

int main()
{
printf("a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n", 1, 2, 3, 4, 5, 6, 7, 8);
return 0;
};

我们用这个程序来演示它的参数传递过程。

x86

x86:传递3个参数

MSVC

使用 MSVC 2010 express 编译上述程序,可得到下列汇编指令:

1
2
3
4
5
6
7
8
9
$SG3830 DB		'a=%d; b=%d; c=%d’, 00H
...
push 3
push 2
push 1
push OFFSET $SG3830
call _printf
add esp, 16 ;00000010H

这与最初的 Hello World 程序相差不多。我们看到 printf()函数的参数以逆序存入栈里,第一个参数在最后入栈。
在 32 位环境下,32 位地址指针和 int 类型数据都占据 32 位/4 字节空间。所以,我们这里的四个参数 总共占用 4×4=16(字节)的存储空间。
在调用函数之后,ADD ESP, X指令修正ESP寄存器中的栈指针。通常情况下,我们可以通过call 之后的这条指令判断参数的数量:变量总数=X÷4
这种判断方法仅适用于调用约定为cdecl的程序。

如果某个程序连续地调用多个函数,且调用函数的指令之间不夹杂其他指令,那么编译器可能把释放参数存储空间的ADD ESP,X指令进行合并,一次性地释放所有空间。例如:

1
2
3
4
5
6
7
8
9
10
11
12
push a1 
push a2
call ...
...
push a1
call ...
...
push a1
push a2
push a3
call ...
add esp, 24
使用OllyDBG调试

将MSVC编译成的文件放进OD,并通过设置断点的方式,我们可以来到main()函数处(此前需要跳过多处ntdll、crt等代码)。main()函数如图:
屏幕快照 2017-12-04 下午12.13.00.png

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
004115D0    55              push ebp
004115D1 8BEC mov ebp,esp
004115D3 83EC 40 sub esp,0x40
004115D6 53 push ebx
004115D7 56 push esi
004115D8 57 push edi
004115D9 6A 03 push 0x3
004115DB 6A 02 push 0x2
004115DD 6A 01 push 0x1
004115DF 68 305B4100 push ConsoleA.00415B30 ; ASCII "a=%d, a=%d, c=%d"
004115E4 E8 83FCFFFF call ConsoleA.0041126C
004115E9 83C4 10 add esp,0x10
004115EC 33C0 xor eax,eax ; ucrtbase.__argc
004115EE 5F pop edi ; ConsoleA.00411A5E
004115EF 5E pop esi ; ConsoleA.00411A5E
004115F0 5B pop ebx ; ConsoleA.00411A5E
004115F1 8BE5 mov esp,ebp
004115F3 5D pop ebp ; ConsoleA.00411A5E

x64

x64:传递3个参数

使用radare2调试

使用gcc编译源代码,之后使用radare2进行调试,结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
root@kali:~/re# r2 a.out 
[0x00000540]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[x] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan)
[0x00000540]> s main
[0x0000064a]> pdf
;-- main:
/ (fcn) sym.main 43
| sym.main ();
| ; DATA XREF from 0x0000055d (entry0)
| 0x0000064a 55 push rbp
| 0x0000064b 4889e5 mov rbp, rsp
| 0x0000064e b903000000 mov ecx, 3
| 0x00000653 ba02000000 mov edx, 2
| 0x00000658 be01000000 mov esi, 1
| 0x0000065d 488d3da00000. lea rdi, qword str.a__d__b__d__c__d_n ; 0x704 ; "a=%d, b=%d, c=%d\n"
| 0x00000664 b800000000 mov eax, 0
| 0x00000669 e8b2feffff call sym.imp.printf ; int printf(const char *format)
| 0x0000066e b800000000 mov eax, 0
| 0x00000673 5d pop rbp
\ 0x00000674 c3 ret
[0x0000064a]>

x64:传递9个参数

GCC编译,使用radare2调试:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
root@kali:~/re# ./a.out 
a=1; b=2; c=3; d=4; e=5; f=6; g=7; h=8
root@kali:~/re# r2 ./a.out
[0x00000540]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[x] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan)
[0x00000540]> s main
[0x0000064a]> pdf
;-- main:
/ (fcn) sym.main 69
| sym.main ();
| ; DATA XREF from 0x0000055d (entry0)
| 0x0000064a 55 push rbp
| 0x0000064b 4889e5 mov rbp, rsp
| 0x0000064e 4883ec08 sub rsp, 8
| 0x00000652 6a08 push 8
| 0x00000654 6a07 push 7
| 0x00000656 6a06 push 6
| 0x00000658 41b905000000 mov r9d, 5
| 0x0000065e 41b804000000 mov r8d, 4
| 0x00000664 b903000000 mov ecx, 3
| 0x00000669 ba02000000 mov edx, 2
| 0x0000066e be01000000 mov esi, 1
| 0x00000673 488d3d9e0000. lea rdi, qword str.a__d__b__d__c__d__d__d__e__d__f__d__g__d__h__d_n ; 0x718 ; "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n"
| 0x0000067a b800000000 mov eax, 0
| 0x0000067f e89cfeffff call sym.imp.printf ; int printf(const char *format)
| 0x00000684 4883c420 add rsp, 0x20
| 0x00000688 b800000000 mov eax, 0
| 0x0000068d c9 leave
\ 0x0000068e c3 ret
[0x0000064a]>

其他系统

关于ARM、MIPS系统对于printf()函数的处理,请各位同学参考《RE4B》一书。有能力的同学可自行编译分析。

总结

调用函数的时候,程序参数的传递过程大体如下:

x86

1
2
3
4
5
6
...
push 3rd argument
push 2nd argument
push 1st argument
call function
; modify stack pointer if needed

x64 - msvc

1
2
3
4
5
6
7
8
mov	rcx,	1st argument
mov rdx, 2nd argument
mov r8, 3rd argument
mov r9, 4th argument
...
push 5th, 6th argument, etc
call function
; modify stack pointer if needed

x64 - gcc

1
2
3
4
5
6
7
8
9
10
mov	rdi, 1st argument
mov rsi, 2nd argument
mov rdx, 3rd argument
mov rcx, 4th argument
mov r8, 5th argument
mov r9, 6th argument
...
push 7th, 8th argument, etc
call function
; modify stack pointer if needed

ARM

1
2
3
4
5
6
7
mov	r0, 1st argument
mov r1, 2nd argument
mov r2, 3rd argument
mov r3, 4th argument
; pass 5th, 6th argument, etc
bl function
; modify stack pointer if needed, IN STACK

ARM64

1
2
3
4
5
6
7
8
9
10
mov	x0, 1st argument
mov x1, 2nd argument
mov x2, 3rd argument
mov x3, 4th argument
mov x4, 5th argument
mov x5, 6th argument
mov x6, 7th argument
; pass 9th, 10th argument, etc, IN STACK
bl function
; modify stack pointer if needed

MIPS - O32调用约定

1
2
3
4
5
6
7
li	$4, 1st  argument		;$a0
li $5, 2nd argument ;$a1
li $6, 3rd argument ;$a2
li $7, 4th argument ;$a3
; pass 5th, 6th argument, etc, IN STACK
lw temp_reg, address of function
jalr temp_reg

作业

选择一个源代码,在gcc或msvc上编译,并用ollydbgradare2gdbIDA进行分析。