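The first conditional jump instruction is JLE, "Jump if Less or Equal". If the first operand of the preceding CMP instruction is less than or equal to (that is, not greater than) the second, JLE jumps to the address given in the instruction; otherwise execution falls through to the next instruction, which in this example leads to the printf() call. The second conditional jump is JNE, "Jump if Not Equal": it is taken if the two operands of the preceding CMP are not equal. The third is JGE, "Jump if Greater or Equal": it is taken if the first operand of CMP is greater than or equal to (not less than) the second. If all three jumps are taken, printf() is never called; barring special intervention, however, that cannot happen.

Now let's look at the assembly of f_unsigned(). It is largely the same as f_signed(); the difference lies in the conditional jumps: f_unsigned() uses JBE ("Jump if Below or Equal") and JAE ("Jump if Above or Equal"), where f_signed() uses JLE and JGE. Compiling the program with GCC gives the following assembly for f_unsigned().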
    int my_abs (int i)
    {
        if (i<0)
            return -i;
        else
            return i;
    };
Optimizing MSVC
Listing: Optimizing MSVC 2012 x64

    i$ = 8
    my_abs  PROC
    ; ECX = input
            test    ecx, ecx
    ; check for sign of input value
    ; skip NEG instruction if sign is positive
            jns     SHORT $LN2@my_abs
    ; negate value
            neg     ecx
    $LN2@my_abs:
    ; prepare result in EAX:
            mov     eax, ecx
            ret     0
    my_abs  ENDP

GCC 4.9 produces almost the same code.
Optimizing Keil 6/2013: Thumb mode
Listing: Optimizing Keil 6/2013: Thumb mode

    my_abs PROC
            CMP      r0,#0
    ; is input value equal to zero or greater than zero?
    ; skip RSBS instruction then
            BGE      |L0.6|
    ; subtract input value from 0:
            RSBS     r0,r0,#0
    |L0.6|
            BX       lr
            ENDP

ARM has no dedicated negate instruction, so the Keil compiler uses "Reverse Subtract" (RSBS), a subtraction with its operands swapped, to subtract the value from zero. The effect is the same: the sign is flipped.
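The operand order of a reverse subtract is easy to get backwards, so here is a minimal C sketch of the arithmetic in the Thumb listing. The names `rsb`, `my_abs_rsb` and `my_abs_rsb_demo` are invented for this illustration, and the model covers only the computed value, not the flags that RSBS also updates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "RSB Rd, Rn, #imm": Rd = imm - Rn.
   Unsigned arithmetic is used so the result wraps like a 32-bit
   register instead of relying on signed overflow. */
uint32_t rsb(uint32_t rn, uint32_t imm)
{
    return imm - rn;
}

/* Mirrors the listing: CMP r0,#0 / BGE skip / RSBS r0,r0,#0 */
uint32_t my_abs_rsb(int32_t i)
{
    uint32_t r0 = (uint32_t)i;
    if (i >= 0)            /* BGE: skip the negation */
        return r0;
    return rsb(r0, 0);     /* r0 = 0 - r0 */
}

/* Usage sketch: a few sanity checks, callable from any main(). */
void my_abs_rsb_demo(void)
{
    assert(my_abs_rsb(-5) == 5u);
    assert(my_abs_rsb(7) == 7u);
    assert(my_abs_rsb(0) == 0u);
    /* The one input whose magnitude does not fit in a signed 32-bit value: */
    assert(my_abs_rsb(INT32_MIN) == 0x80000000u);
}
```

The result is returned as an unsigned value here only so that the `INT32_MIN` case stays well defined in C; the register contents are the same bit pattern either way.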
Optimizing Keil 6/2013: ARM mode
Because ARM mode supports conditional execution, enabling optimization gives the following code.

Listing: Optimizing Keil 6/2013: ARM mode

    my_abs PROC
            CMP      r0,#0
    ; execute "Reverse Subtract" instruction only if input value is less than 0:
            RSBLT    r0,r0,#0
            BX       lr
            ENDP

ARM64

    my_abs:
            sub     sp, sp, #16
            str     w0, [sp,12]
            ldr     w0, [sp,12]
    ; compare input value with contents of WZR register
    ; (which always holds zero)
            cmp     w0, wzr
            bge     .L2
            ldr     w0, [sp,12]
            neg     w0, w0
            b       .L3
    .L2:
            ldr     w0, [sp,12]
    .L3:
            add     sp, sp, 16
            ret
MIPS
Listing: Optimizing GCC 4.4.5 (IDA)

    my_abs:
    ; jump if $a0<0:
                    bltz    $a0, locret_10
    ; just return input value ($a0) in $v0:
                    move    $v0, $a0
                    jr      $ra
                    or      $at, $zero ; branch delay slot, NOP
    locret_10:
    ; negate input value and store it in $v0:
                    jr      $ra
    ; this is pseudoinstruction. in fact, this is "subu $v0,$zero,$a0" ($v0=0-$a0)
                    negu    $v0, $a0

Two new instructions appear here: BLTZ ("Branch if Less Than Zero") and the pseudo-instruction NEGU, which subtracts its operand from zero. Although the U suffix in SUBU and NEGU stands for "unsigned", what it means in practice is that no exception is raised on integer overflow; the operand here is an ordinary signed value.
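What "no exception on overflow" means can be sketched in C. The function names below are invented for this illustration: `negu` models the wrapping NEGU/SUBU behaviour, while `neg_trapping` stands in for the trapping SUB form by reporting the overflow through its return value instead of raising a CPU exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of NEGU/SUBU: wraps modulo 2^32, never traps. */
uint32_t negu(uint32_t rs)
{
    return 0u - rs;
}

/* Hypothetical model of the trapping form (SUB from $zero).
   Returns false exactly when 0 - rs does not fit in a signed
   32-bit register; in that case *rd is left untouched. */
bool neg_trapping(int32_t rs, int32_t *rd)
{
    if (rs == INT32_MIN)
        return false;      /* 0 - INT32_MIN overflows */
    *rd = -rs;
    return true;
}

/* Usage sketch: a few sanity checks, callable from any main(). */
void negu_demo(void)
{
    int32_t r = 0;

    assert(negu(5u) == 0xFFFFFFFBu);          /* bit pattern of -5 */
    assert(negu(0x80000000u) == 0x80000000u); /* wraps back onto itself */
    assert(neg_trapping(5, &r) && r == -5);
    assert(!neg_trapping(INT32_MIN, &r));     /* would trap on real hardware */
}
```

So with the wrapping form, my_abs() applied to the most negative 32-bit value quietly returns that same value rather than faulting.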
Conditional operator

The program:

    const char* f (int a)
    {
        return a==10 ? "it is ten" : "it is not ten";
    };

The x86 code MSVC generates for it:

    $SG746  DB      'it is ten', 00H
    $SG747  DB      'it is not ten', 00H

    tv65 = -4 ; this will be used as a temporary variable
    _a$ = 8
    _f      PROC
            push    ebp
            mov     ebp, esp
            push    ecx
    ; compare input value with 10
            cmp     DWORD PTR _a$[ebp], 10
    ; jump to $LN3@f if not equal
            jne     SHORT $LN3@f
    ; store pointer to the string into temporary variable:
            mov     DWORD PTR tv65[ebp], OFFSET $SG746 ; 'it is ten'
    ; jump to exit
            jmp     SHORT $LN4@f
    $LN3@f:
    ; store pointer to the string into temporary variable:
            mov     DWORD PTR tv65[ebp], OFFSET $SG747 ; 'it is not ten'
    $LN4@f:
    ; this is exit. copy pointer to the string from temporary variable to EAX.
            mov     eax, DWORD PTR tv65[ebp]
            mov     esp, ebp
            pop     ebp
            ret     0
    _f      ENDP

Optimizing MSVC 2008
Listing: Optimizing MSVC 2008

    $SG792  DB      'it is ten', 00H
    $SG793  DB      'it is not ten', 00H

    _a$ = 8 ; size = 4
    _f      PROC
    ; compare input value with 10
            cmp     DWORD PTR _a$[esp-4], 10
            mov     eax, OFFSET $SG792 ; 'it is ten'
    ; jump to $LN4@f if equal
            je      SHORT $LN4@f
            mov     eax, OFFSET $SG793 ; 'it is not ten'
    $LN4@f:
            ret     0
    _f      ENDP

Newer compilers generate more concise code.

Listing: Optimizing MSVC 2012 x64

    $SG1355 DB      'it is ten', 00H
    $SG1356 DB      'it is not ten', 00H

    a$ = 8
    f       PROC
    ; load pointers to the both strings
            lea     rdx, OFFSET FLAT:$SG1355 ; 'it is ten'
            lea     rax, OFFSET FLAT:$SG1356 ; 'it is not ten'
    ; compare input value with 10
            cmp     ecx, 10
    ; if equal, copy value from RDX ("it is ten")
    ; if not, do nothing. pointer to the string "it is not ten" is still in RAX as for now.
            cmove   rax, rdx
            ret     0
    f       ENDP
With optimization enabled, GCC 4.8 for x86 also uses a CMOVcc instruction. With optimization turned off, GCC 4.8 compiles the conditional operator using conditional jumps instead.
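The "prepare both results, then pick one" shape of the CMOVcc code can also be written by hand in C. This is only a sketch: `f_select` and `f_select_demo` are invented names, and a compiler is free to turn either this form or the original ternary into a branch or a conditional move as it sees fit.

```c
#include <assert.h>

/* Branchless formulation of f(): both pointers exist up front and the
   comparison result (0 or 1) picks one of them, similar in spirit to
   the LEA / LEA / CMP / CMOVE sequence. */
const char *f_select(int a)
{
    static const char *const answers[2] = {
        "it is not ten",   /* index 0: a != 10 */
        "it is ten"        /* index 1: a == 10 */
    };
    return answers[a == 10];
}

/* Usage sketch: a few sanity checks, callable from any main().
   Character 6 is 't' in "it is ten" and 'n' in "it is not ten". */
void f_select_demo(void)
{
    assert(f_select(10)[6] == 't');
    assert(f_select(9)[6] == 'n');
    assert(f_select(-10)[6] == 'n');
}
```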
ARM
With optimization enabled, Keil's ARM-mode code uses the conditionally executed ADRcc instructions:

    f PROC
    ; compare input value with 10
            CMP      r0,#0xa
    ; if comparison result is EQual, copy pointer to the "it is ten" string into R0
            ADREQ    r0,|L0.16| ; "it is ten"
    ; if comparison result is Not Equal, copy pointer to the "it is not ten" string into R0
            ADRNE    r0,|L0.28| ; "it is not ten"
            BX       lr
            ENDP

    |L0.16|
            DCB      "it is ten",0
    |L0.28|
            DCB      "it is not ten",0

Without manual intervention, ADREQ and ADRNE can never both execute during the same call.

For Thumb mode, Keil with optimization enabled falls back to conditional jumps, because Thumb mode has no load instruction that can be made conditional on the flags.

Listing: Optimizing Keil 6/2013 (Thumb mode)

    f PROC
    ; compare input value with 10
            CMP      r0,#0xa
    ; jump to |L0.8| if EQual
            BEQ      |L0.8|
            ADR      r0,|L0.12| ; "it is not ten"
            BX       lr
    |L0.8|
            ADR      r0,|L0.28| ; "it is ten"
            BX       lr
            ENDP

    |L0.12|
            DCB      "it is not ten",0
    |L0.28|
            DCB      "it is ten",0

MIPS

    $LC0:
            .ascii  "it is not ten\000"
    $LC1:
            .ascii  "it is ten\000"
    f:
            li      $2,10 # 0xa
    ; compare $a0 and 10, jump if equal:
            beq     $4,$2,$L2
            nop ; branch delay slot

    ; leave address of "it is not ten" string in $v0 and return:
            lui     $2,%hi($LC0)
            j       $31
            addiu   $2,$2,%lo($LC0)

    $L2:
    ; leave address of "it is ten" string in $v0 and return:
            lui     $2,%hi($LC1)
            j       $31
            addiu   $2,$2,%lo($LC1)
Rewriting it with if/else

    const char* f (int a)
    {
        if (a==10)
            return "it is ten";
        else
            return "it is not ten";
    };

The x86 code generated for it:

    .LC0:
            .string "it is ten"
    .LC1:
            .string "it is not ten"
    f:
    .LFB0:
    ; compare input value with 10
            cmp     DWORD PTR [esp+4], 10
            mov     edx, OFFSET FLAT:.LC1 ; "it is not ten"
            mov     eax, OFFSET FLAT:.LC0 ; "it is ten"
    ; if comparison result is Not Equal, copy EDX value to EAX
    ; if not, do nothing
            cmovne  eax, edx
            ret

Conclusion

With optimization enabled, compilers avoid conditional jumps wherever they can.
Getting minimal and maximal values

32-bit

The program:

    int my_max(int a, int b)
    {
        if (a>b)
            return a;
        else
            return b;
    };

    int my_min(int a, int b)
    {
        if (a<b)
            return a;
        else
            return b;
    };

The MSVC output:

    _a$ = 8
    _b$ = 12
    _my_min PROC
            push    ebp
            mov     ebp, esp
            mov     eax, DWORD PTR _a$[ebp]
    ; compare A and B:
            cmp     eax, DWORD PTR _b$[ebp]
    ; jump, if A is greater or equal to B:
            jge     SHORT $LN2@my_min
    ; reload A to EAX if otherwise and jump to exit
            mov     eax, DWORD PTR _a$[ebp]
            jmp     SHORT $LN3@my_min
            jmp     SHORT $LN3@my_min ; this is redundant JMP
    $LN2@my_min:
    ; return B
            mov     eax, DWORD PTR _b$[ebp]
    $LN3@my_min:
            pop     ebp
            ret     0
    _my_min ENDP

    _a$ = 8
    _b$ = 12
    _my_max PROC
            push    ebp
            mov     ebp, esp
            mov     eax, DWORD PTR _a$[ebp]
    ; compare A and B:
            cmp     eax, DWORD PTR _b$[ebp]
    ; jump if A is less or equal to B:
            jle     SHORT $LN2@my_max
    ; reload A to EAX if otherwise and jump to exit
            mov     eax, DWORD PTR _a$[ebp]
            jmp     SHORT $LN3@my_max
            jmp     SHORT $LN3@my_max ; this is redundant JMP
    $LN2@my_max:
    ; return B
            mov     eax, DWORD PTR _b$[ebp]
    $LN3@my_max:
            pop     ebp
            ret     0
    _my_max ENDP

The two functions differ only in the conditional jump: the first uses JGE ("Jump if Greater or Equal") and the second uses JLE ("Jump if Less or Equal"). Each function also contains one redundant JMP instruction, which is probably an MSVC quirk.
    my_max PROC
    ; R0=A
    ; R1=B
    ; compare A and B:
            CMP      r0,r1
    ; branch if A is greater than B:
            BGT      |L0.6|
    ; otherwise (A<=B) return R1 (B):
            MOVS     r0,r1
    |L0.6|
    ; return
            BX       lr
            ENDP

    my_min PROC
    ; R0=A
    ; R1=B
    ; compare A and B:
            CMP      r0,r1
    ; branch if A is less than B:
            BLT      |L0.14|
    ; otherwise (A>=B) return R1 (B):
            MOVS     r0,r1
    |L0.14|
    ; return
            BX       lr
            ENDP

The two functions differ in the branch instruction: one uses BGT, the other BLT. In ARM mode the compiler can use conditionally executed instructions, which need no branch at all and make the code shorter. Here Keil uses MOVcc, which executes only when its condition holds.

    my_max PROC
    ; R0=A
    ; R1=B
    ; compare A and B:
            CMP      r0,r1
    ; return B instead of A by placing B in R0
    ; this instruction will trigger only if A<=B (hence, LE - Less or Equal)
    ; if instruction is not triggered (in case of A>B), A is still in R0 register
            MOVLE    r0,r1
            BX       lr
            ENDP

    my_min PROC
    ; R0=A
    ; R1=B
    ; compare A and B:
            CMP      r0,r1
    ; return B instead of A by placing B in R0
    ; this instruction will trigger only if A>=B (hence, GE - Greater or Equal)
    ; if instruction is not triggered (in case of A<B), A value is still in R0 register
            MOVGE    r0,r1
            BX       lr
            ENDP

The same functions on x86, using CMOVcc:

    my_max:
            mov     edx, DWORD PTR [esp+4]
            mov     eax, DWORD PTR [esp+8]
    ; EDX=A
    ; EAX=B
    ; compare A and B:
            cmp     edx, eax
    ; if A>=B, load A value into EAX
    ; the instruction idle if otherwise (if A<B)
            cmovge  eax, edx
            ret

    my_min:
            mov     edx, DWORD PTR [esp+4]
            mov     eax, DWORD PTR [esp+8]
    ; EDX=A
    ; EAX=B
    ; compare A and B:
            cmp     edx, eax
    ; if A<=B, load A value into EAX
    ; the instruction idle if otherwise (if A>B)
            cmovle  eax, edx
            ret
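For comparison, a conditional move can be imitated in portable C with a bit mask. This is a well-known idiom rather than anything the compilers above emit; the function names are invented for this sketch, and it assumes two's-complement integers, so that `-1` has all bits set.

```c
#include <assert.h>

/* -(cond) is all ones when cond is 1 and zero when cond is 0,
   so the mask either keeps (a ^ b) or wipes it out. */
int my_max_mask(int a, int b)
{
    return b ^ ((a ^ b) & -(a > b));   /* a if a>b, otherwise b */
}

int my_min_mask(int a, int b)
{
    return b ^ ((a ^ b) & -(a < b));   /* a if a<b, otherwise b */
}

/* Usage sketch: a few sanity checks, callable from any main(). */
void minmax_mask_demo(void)
{
    assert(my_max_mask(3, 7) == 7);
    assert(my_max_mask(7, 3) == 7);
    assert(my_max_mask(-4, -9) == -4);
    assert(my_min_mask(3, 7) == 3);
    assert(my_min_mask(-4, -9) == -9);
    assert(my_min_mask(5, 5) == 5);
}
```

In practice the plain if/else version is usually the better choice, since an optimizing compiler can already select a conditional move on its own, as the listing shows.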
64-bit

The program:

    #include <stdint.h>

    int64_t my_max(int64_t a, int64_t b)
    {
        if (a>b)
            return a;
        else
            return b;
    };

    int64_t my_min(int64_t a, int64_t b)
    {
        if (a<b)
            return a;
        else
            return b;
    };

The x86-64 code, using CMOVcc:

    my_max:
    ; RDI=A
    ; RSI=B
    ; compare A and B:
            cmp     rdi, rsi
    ; prepare B in RAX for return:
            mov     rax, rsi
    ; if A>=B, put A (RDI) in RAX for return.
    ; this instruction is idle if otherwise (if A<B)
            cmovge  rax, rdi
            ret

    my_min:
    ; RDI=A
    ; RSI=B
    ; compare A and B:
            cmp     rdi, rsi
    ; prepare B in RAX for return:
            mov     rax, rsi
    ; if A<=B, put A (RDI) in RAX for return.
    ; this instruction is idle if otherwise (if A>B)
            cmovle  rax, rdi
            ret

MSVC 2013 does almost the same. ARM64 has the CSEL instruction, which works like MOVcc in ARM and CMOVcc in x86; only the name differs: "Conditional SELect".
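One way to picture CSEL is as a three-input select. The following C sketch models it under that reading; `csel`, `my_max64` and `my_min64` are invented names, and the condition is passed as an already evaluated 0/1 value rather than read from CPU flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "CSEL Xd, Xn, Xm, cond":
   Xd = Xn if the condition holds, otherwise Xm. */
int64_t csel(int64_t xn, int64_t xm, int cond)
{
    return cond ? xn : xm;
}

/* max: keep A when A >= B (condition GE), otherwise take B. */
int64_t my_max64(int64_t a, int64_t b)
{
    return csel(a, b, a >= b);
}

/* min: keep A when A <= B (condition LE), otherwise take B. */
int64_t my_min64(int64_t a, int64_t b)
{
    return csel(a, b, a <= b);
}

/* Usage sketch: a few sanity checks, callable from any main(). */
void csel_demo(void)
{
    assert(my_max64(3, 7) == 7);
    assert(my_max64(-1, -2) == -1);
    assert(my_min64(3, 7) == 3);
    assert(my_min64(INT64_MIN, 0) == INT64_MIN);
    assert(my_max64(INT64_MAX, 0) == INT64_MAX);
}
```

Unlike a conditional move, which either overwrites the destination or leaves it alone, this form always writes the destination and chooses between two explicit sources.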
Listing: Optimizing GCC 4.9.1 ARM64

    my_max:
    ; X0=A
    ; X1=B
    ; compare A and B:
            cmp     x0, x1
    ; select X0 (A) to X0 if X0>=X1 or A>=B (Greater or Equal)
    ; select X1 (B) to X0 if A<B
            csel    x0, x0, x1, ge
            ret

    my_min:
    ; X0=A
    ; X1=B
    ; compare A and B:
            cmp     x0, x1
    ; select X0 (A) to X0 if X0<=X1 or A<=B (Less or Equal)
    ; select X1 (B) to X0 if A>B
            csel    x0, x0, x1, le
            ret

MIPS

    my_max:
    ; set $v1 if $a1<$a0, or clear otherwise (if $a1>=$a0):
                    slt     $v1, $a1, $a0
    ; jump, if $v1 is 0 (or $a1>=$a0):
                    beqz    $v1, locret_10
    ; this is branch delay slot
    ; prepare $a1 in $v0 in case of branch triggered:
                    move    $v0, $a1
    ; no branch triggered, prepare $a0 in $v0:
                    move    $v0, $a0

    locret_10:
                    jr      $ra
                    or      $at, $zero ; branch delay slot, NOP

    ; the min() function is same, but input operands in SLT instruction are swapped:
    my_min:
                    slt     $v1, $a0, $a1
                    beqz    $v1, locret_28
                    move    $v0, $a1
                    move    $v0, $a0

    locret_28:
                    jr      $ra
                    or      $at, $zero ; branch delay slot, NOP

x86

    CMP register, register/value
    Jcc true ; cc=condition code
    false:
    ... some code to be executed if comparison result is false ...
    JMP exit
    true:
    ... some code to be executed if comparison result is true ...
    exit:
ARM
Listing: ARM

    CMP register, register/value
    Bcc true ; cc=condition code
    false:
    ... some code to be executed if comparison result is false ...
    B exit
    true:
    ... some code to be executed if comparison result is true ...
    exit:
MIPS
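Listing: Jump if zero

    BEQZ REG, label
    ...

Listing: Jump if less than zero

    BLTZ REG, label
    ...

Listing: Jump if the values are equal

    BEQ REG1, REG2, label
    ...

Listing: Jump if the values are not equal

    BNE REG1, REG2, label
    ...

Listing: Jump if the first value is less than the second (signed)

    SLT REG1, REG2, REG3 ; REG1 = 1 if REG2 < REG3, else 0
    BNEZ REG1, label
    ...

Listing: Jump if the first value is less than the second (unsigned)

    SLTU REG1, REG2, REG3 ; REG1 = 1 if REG2 < REG3, else 0
    BNEZ REG1, label
    ...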
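The last two skeletons differ only in SLT versus SLTU, and the difference matters because the same bit pattern compares differently as a signed and as an unsigned number. A minimal C sketch, with invented function names that model only the 0/1 result of each instruction:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of SLT: signed "set on less than". */
int slt(int32_t rs, int32_t rt)
{
    return rs < rt;
}

/* Hypothetical model of SLTU: unsigned "set on less than". */
int sltu(uint32_t rs, uint32_t rt)
{
    return rs < rt;
}

/* Usage sketch: a few sanity checks, callable from any main(). */
void slt_demo(void)
{
    /* For small non-negative values both agree. */
    assert(slt(2, 3) == 1);
    assert(sltu(2u, 3u) == 1);

    /* 0xFFFFFFFF is -1 when signed, but the largest value when unsigned. */
    assert(slt(-1, 1) == 1);
    assert(sltu(0xFFFFFFFFu, 1u) == 0);
}
```

The branch that follows only tests whether that result is zero, so the choice between signed and unsigned comparison is made entirely by which of the two instructions is used.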
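Branchless code

If the body of a conditional statement is very short, the compiler may use a conditional move instead of a branch:
MOVcc when compiling for ARM (ARM mode).
CSEL when compiling for ARM64.
CMOVcc when compiling for x86.

ARM

In ARM mode, the compiler may replace conditional jumps with conditionally executed instructions.

Listing: ARM (ARM mode)

    CMP register, register/value
    instr1_cc ; some instruction will be executed if condition code is true
    instr2_cc ; some other instruction will be executed if other condition code is true
    ... etc ...

As long as the executed instructions do not modify any flags, any number of conditionally executed instructions may follow the comparison. Thumb mode has the IT instruction, which makes up to four of the following instructions conditional; each one is marked to run either when the condition is true (T, "then") or when it is false (E, "else").

Listing: ARM (Thumb mode)

    CMP register, register/value
    ITEEE EQ ; set these suffixes: if-then-else-else-else
    instr1   ; instruction will be executed if condition is true
    instr2   ; instruction will be executed if condition is false
    instr3   ; instruction will be executed if condition is false
    instr4   ; instruction will be executed if condition is false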