reveal details of program, such as type cast, usage of register and memory.

register overview

Notes 64-bit long 32-bit int 16-bit short 8-bit char
Values are returned from functions in this register. Multiply instructions put the low bits of the result here too. rax eax ax ah and al
Typical scratch register. Some instructions use it as a counter (such as SAL or REP). rcx ecx cx ch and cl
Scratch register. Multiply instructions put the high bits of the result here. rdx edx dx dh and dl
Preserved register: don’t use it without saving it! rbx ebx bx bh and bl
The stack pointer. Points to the top of the stack rsp esp sp spl
Preserved register. Sometimes used to store the old value of the stack pointer, or the “base”. rbp ebp bp bpl
Scratch register. Also used to pass function argument #2 in 64-bit mode (on Linux). rsi esi si sil
Scratch register. Function argument #1. rdi edi di dil
Scratch register. These were added in 64-bit mode, so the names are slightly different. r8 r8d r8w r8b
Scratch register. r9 r9d r9w r9b
Scratch register. r10 r10d r10w r10b
Scratch register. r11 r11d r11w r11b
Preserved register. r12 r12d r12w r12b
Preserved register. r13 r13d r13w r13b
Preserved register. r14 r14d r14w r14b
Preserved register. r15 r15d r15w r15b

Like C++ variables, registers are actually available in several sizes:

rax is the 64-bit, “long” size register. It was added in 2003 during the transition to 64-bit processors. eax is the 32-bit, “int” size register. It was added in 1985 during the transition to 32-bit processors with the 80386 CPU. I’m in the habit of using this register size, since they also work in 32 bit mode, although I’m trying to use the longer rax registers for everything. ax is the 16-bit, “short” size register. It was added in 1979 with the 8086 CPU, but is used in DOS or BIOS code to this day. al and ah are the 8-bit, “char” size registers. al is the low 8 bits, ah is the high 8 bits. They’re pretty similar to the old 8-bit registers of the 8008 back in 1972. Curiously, you can write a 64-bit value into rax, then read off the low 32 bits from eax, or the low 16 bitx from ax, or the low 8 bits from al–it’s just one register, but they keep on extending it!

some interesting behavior when write to register: | Source Size | | | | | | |————-|————-|—————-|————–|————–|——————————————-| | | 64 bit rcx | 32 bit ecx | 16 bit cx | 8 bit cl | Notes | | 64 bit rax | mov rax,rcx | movsxd rax,ecx | movsx rax,cx | movsx rax,cl | Writes to whole register | | 32 bit eax | mov eax,ecx | mov eax,ecx | movsx eax,cx | movsx eax,cl | Top half of destination gets zeroed | | 16 bit ax | mov ax,cx | mov ax,cx | mov ax,cx | movsx ax,cl | Only affects low 16 bits, rest unchanged. | | 8 bit al | mov al,cl | mov al,cl | mov al,cl | mov al,cl | Only affects low 8 bits, rest unchanged. |

0x1122334455667788
  ================ rax (64 bits)
          ======== eax (32 bits)
              ====  ax (16 bits)
              ==    ah (8 bits)
                ==  al (8 bits)

when write, for register under 32bits, only visit the part of the register

mov  eax, 0x11112222 ; eax = 0x11112222
mov  ax, 0x3333      ; eax = 0x11113333 (works, only low 16 bits changed)
mov  al, 0x44        ; eax = 0x11113344 (works, only low 8 bits changed)
mov  ah, 0x55        ; eax = 0x11115544 (works, only high 8 bits changed)
xor  ah, ah          ; eax = 0x11110044 (works, only high 8 bits cleared)
mov  eax, 0x11112222 ; eax = 0x11112222
xor  al, al          ; eax = 0x11112200 (works, only low 8 bits cleared)
mov  eax, 0x11112222 ; eax = 0x11112222
xor  ax, ax          ; eax = 0x11110000 (works, only low 16 bits cleared)
```However, when write to 64 bit register, thing is very interesting

```asm
mov  rax, 0x1111222233334444 ;           rax = 0x1111222233334444
mov  eax, 0x55556666         ; actual:   rax = 0x0000000055556666
                             ; expected: rax = 0x1111222255556666
                             ; upper 32 bits seem to be lost!
mov  rax, 0x1111222233334444 ;           rax = 0x1111222233334444
mov  ax, 0x7777              ;           rax = 0x1111222233337777 (works!)
mov  rax, 0x1111222233334444 ;           rax = 0x1111222233334444
xor  eax, eax                ; actual:   rax = 0x0000000000000000
                             ; expected: rax = 0x1111222200000000
                             ; again, it wiped whole register

when write from 32bit register to 64 bits register, the upper 32bit is zero-out.

register %eax

this register is where the value stored gonna C source –>
macro

eax is a 32-bit general-purpose register with two common uses: to store the return value of a function and as a special register for certain calculations

int main(){ 
    // return 4;
    return 5;
}

-O1

main:
 mov    eax,0x5
 ret    
 cs nop WORD PTR [rax+rax*1+0x0]

-O0

main:
 push   rbp
 mov    rbp,rsp
 mov    eax,0x5
 pop    rbp
 ret    
 cs nop WORD PTR [rax+rax*1+0x0]
 nop    DWORD PTR [rax+rax*1+0x0]

register %lea

struct Point
{
     int xcoord;
     int ycoord;
};

for

int y = points[i].ycoord;

asm is:

MOV EDX, [EBX + 8*EAX + 4]    ; right side is "effective address"

for

int *p = &points[i].ycoord;

asm is:

LEA ESI, [EBX + 8*EAX + 4]

In this case, you don’t want the value of ycoord, but its address. That’s where LEA (load effective address) comes in. Instead of a MOV, the compiler can generate cited from What’s the purpose of the LEA instruction?

LEA, the only instruction that performs memory addressing calculations but doesn’t actually address memory. LEA accepts a standard memory addressing operand, but does nothing more than store the calculated memory offset in the specified register, which may be any general purpose register.

member function call and function call are actually has similar asm generated

https://www.godbolt.org/z/ExE6rvKv1

picture:

interesting: $rdi works like a register for function parameter, in this case, the this is stored here

C++ storage size

QWORD : 64bits, pointer DWORD : 32bits, int WORD: 16bits, short

.text                                           # start of code indicator.
.globl _main                                    # make the main function visible to the outside.
_main:                                          # actually label this spot as the start of our main function.
    push    %rbp                            # save the base pointer to the stack.
    mov     %rsp, %rbp                      # put the previous stack pointer into the base pointer.
    subl    $8, %esp                        # Balance the stack onto a 16-byte boundary.
    movl    $0, %eax                        # Stuff 0 into EAX, which is where result values go.
    leave                                   # leave cleans up base and stack pointers again.
    ret
  1. in x86, stack grow downward.