reveal details of program, such as type cast, usage of register and memory.
register overview
Notes | 64-bit long | 32-bit int | 16-bit short | 8-bit char |
---|---|---|---|---|
Values are returned from functions in this register. Multiply instructions put the low bits of the result here too. | rax | eax | ax | ah and al |
Typical scratch register. Some instructions use it as a counter (such as SAL or REP). | rcx | ecx | cx | ch and cl |
Scratch register. Multiply instructions put the high bits of the result here. | rdx | edx | dx | dh and dl |
Preserved register: don’t use it without saving it! | rbx | ebx | bx | bh and bl |
The stack pointer. Points to the top of the stack | rsp | esp | sp | spl |
Preserved register. Sometimes used to store the old value of the stack pointer, or the “base”. | rbp | ebp | bp | bpl |
Scratch register. Also used to pass function argument #2 in 64-bit mode (on Linux). | rsi | esi | si | sil |
Scratch register. Function argument #1. | rdi | edi | di | dil |
Scratch register. These were added in 64-bit mode, so the names are slightly different. | r8 | r8d | r8w | r8b |
Scratch register. | r9 | r9d | r9w | r9b |
Scratch register. | r10 | r10d | r10w | r10b |
Scratch register. | r11 | r11d | r11w | r11b |
Preserved register. | r12 | r12d | r12w | r12b |
Preserved register. | r13 | r13d | r13w | r13b |
Preserved register. | r14 | r14d | r14w | r14b |
Preserved register. | r15 | r15d | r15w | r15b |
Like C++ variables, registers are actually available in several sizes:
rax is the 64-bit, “long” size register. It was added in 2003 during the transition to 64-bit processors. eax is the 32-bit, “int” size register. It was added in 1985 during the transition to 32-bit processors with the 80386 CPU. I’m in the habit of using this register size, since they also work in 32 bit mode, although I’m trying to use the longer rax registers for everything. ax is the 16-bit, “short” size register. It was added in 1979 with the 8086 CPU, but is used in DOS or BIOS code to this day. al and ah are the 8-bit, “char” size registers. al is the low 8 bits, ah is the high 8 bits. They’re pretty similar to the old 8-bit registers of the 8008 back in 1972. Curiously, you can write a 64-bit value into rax, then read off the low 32 bits from eax, or the low 16 bitx from ax, or the low 8 bits from al–it’s just one register, but they keep on extending it!
some interesting behavior when write to register: | Source Size | | | | | | |————-|————-|—————-|————–|————–|——————————————-| | | 64 bit rcx | 32 bit ecx | 16 bit cx | 8 bit cl | Notes | | 64 bit rax | mov rax,rcx | movsxd rax,ecx | movsx rax,cx | movsx rax,cl | Writes to whole register | | 32 bit eax | mov eax,ecx | mov eax,ecx | movsx eax,cx | movsx eax,cl | Top half of destination gets zeroed | | 16 bit ax | mov ax,cx | mov ax,cx | mov ax,cx | movsx ax,cl | Only affects low 16 bits, rest unchanged. | | 8 bit al | mov al,cl | mov al,cl | mov al,cl | mov al,cl | Only affects low 8 bits, rest unchanged. |
0x1122334455667788
================ rax (64 bits)
======== eax (32 bits)
==== ax (16 bits)
== ah (8 bits)
== al (8 bits)
when write, for register under 32bits, only visit the part of the register
mov eax, 0x11112222 ; eax = 0x11112222
mov ax, 0x3333 ; eax = 0x11113333 (works, only low 16 bits changed)
mov al, 0x44 ; eax = 0x11113344 (works, only low 8 bits changed)
mov ah, 0x55 ; eax = 0x11115544 (works, only high 8 bits changed)
xor ah, ah ; eax = 0x11110044 (works, only high 8 bits cleared)
mov eax, 0x11112222 ; eax = 0x11112222
xor al, al ; eax = 0x11112200 (works, only low 8 bits cleared)
mov eax, 0x11112222 ; eax = 0x11112222
xor ax, ax ; eax = 0x11110000 (works, only low 16 bits cleared)
```However, when write to 64 bit register, thing is very interesting
```asm
mov rax, 0x1111222233334444 ; rax = 0x1111222233334444
mov eax, 0x55556666 ; actual: rax = 0x0000000055556666
; expected: rax = 0x1111222255556666
; upper 32 bits seem to be lost!
mov rax, 0x1111222233334444 ; rax = 0x1111222233334444
mov ax, 0x7777 ; rax = 0x1111222233337777 (works!)
mov rax, 0x1111222233334444 ; rax = 0x1111222233334444
xor eax, eax ; actual: rax = 0x0000000000000000
; expected: rax = 0x1111222200000000
; again, it wiped whole register
when write from 32bit register to 64 bits register, the upper 32bit is zero-out.
register %eax
this register is where the value stored gonna C source –>
macro
eax is a 32-bit general-purpose register with two common uses: to store the return value of a function and as a special register for certain calculations
int main(){
// return 4;
return 5;
}
-O1
main:
mov eax,0x5
ret
cs nop WORD PTR [rax+rax*1+0x0]
-O0
main:
push rbp
mov rbp,rsp
mov eax,0x5
pop rbp
ret
cs nop WORD PTR [rax+rax*1+0x0]
nop DWORD PTR [rax+rax*1+0x0]
register %lea
struct Point
{
int xcoord;
int ycoord;
};
for
int y = points[i].ycoord;
asm is:
MOV EDX, [EBX + 8*EAX + 4] ; right side is "effective address"
for
int *p = &points[i].ycoord;
asm is:
LEA ESI, [EBX + 8*EAX + 4]
In this case, you don’t want the value of ycoord, but its address. That’s where LEA (load effective address) comes in. Instead of a MOV, the compiler can generate cited from What’s the purpose of the LEA instruction?
LEA, the only instruction that performs memory addressing calculations but doesn’t actually address memory. LEA accepts a standard memory addressing operand, but does nothing more than store the calculated memory offset in the specified register, which may be any general purpose register.
member function call and function call are actually has similar asm generated
https://www.godbolt.org/z/ExE6rvKv1
picture:
interesting: $rdi works like a register for function parameter, in this case, the this is stored here
C++ storage size
QWORD : 64bits, pointer DWORD : 32bits, int WORD: 16bits, short
.text # start of code indicator.
.globl _main # make the main function visible to the outside.
_main: # actually label this spot as the start of our main function.
push %rbp # save the base pointer to the stack.
mov %rsp, %rbp # put the previous stack pointer into the base pointer.
subl $8, %esp # Balance the stack onto a 16-byte boundary.
movl $0, %eax # Stuff 0 into EAX, which is where result values go.
leave # leave cleans up base and stack pointers again.
ret
- in x86, stack grow downward.