PWN Cheatsheet
x86_64 Assembly
- %rax is a reference to the rax register.
- $0x4 is a constant.
- 0(%esp) or (%esp) is the value loaded from the memory address stored in %esp.
- 12(%esp) dereferences the memory 12 bytes above the address contained in ESP.
- mov A,B copies the value of A into B; A is left unchanged.
- This is AT&T x86 syntax (what GDB uses by default). In this syntax the destination operand comes last. In RISC-V assembly the opposite is true: the destination comes first.
- push A adds A as the last element of the stack and decreases the value of rsp/esp by one word (since the stack grows from high to low).
- pop rax/eax removes the last element of the stack, copies it into rax/eax, and increases the value of rsp/esp by one word.
- jmp target_address sets rip to target_address.
- call dst_func is equivalent to push rip (the return address) followed by jmp dst_func.
- ret is equivalent to pop rip (the saved return address), which jumps back into src_func.

Because the stack grows from high addresses to low addresses:
- sub $8,%rsp increases the stack size.
- add $8,%rsp decreases the stack size.
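The push/pop rules above can be checked on a toy model. This is a minimal sketch, not how a real CPU is implemented: toy_cpu, cpu_init, push_word, pop_word and stack_bytes_in_use are hypothetical names, the stack is a 64-byte array, and rsp is an offset into that array rather than a real address. There is no bounds checking.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64
#define WORD sizeof(uint64_t)   /* one word on a 64-bit system */

/* Hypothetical toy machine: a byte array as stack memory plus rsp. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint64_t rsp;   /* offset of the last pushed element; STACK_BYTES means empty */
} toy_cpu;

void cpu_init(toy_cpu *cpu) {
    memset(cpu->mem, 0, sizeof cpu->mem);
    cpu->rsp = STACK_BYTES;   /* the stack starts at the highest address */
}

/* push A: rsp moves down one word, then A is stored at the new top. */
void push_word(toy_cpu *cpu, uint64_t value) {
    cpu->rsp -= WORD;
    memcpy(&cpu->mem[cpu->rsp], &value, WORD);
}

/* pop: read the word at the top, then rsp moves up one word. */
uint64_t pop_word(toy_cpu *cpu) {
    uint64_t value;
    memcpy(&value, &cpu->mem[cpu->rsp], WORD);
    cpu->rsp += WORD;
    return value;
}

/* Bytes in use: grows as rsp moves toward lower addresses. */
uint64_t stack_bytes_in_use(const toy_cpu *cpu) {
    return STACK_BYTES - cpu->rsp;
}
```

Starting from an empty stack, one push_word leaves rsp 8 bytes lower and stack_bytes_in_use at 8; the matching pop_word returns the same value and restores rsp. Lowering rsp by hand (the effect of sub $8,%rsp) grows the stack in the same way.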
Stack
Background
- Each byte is 8 bits.
- Each word is 4 bytes in 32-bit systems and 8 bytes in 64-bit systems.
- Each register can store one word.
Registers
Pointing to the code section
- rip/eip is the instruction pointer, and it stores the address of the next machine instruction to be executed (in the code section). In RISC-V, this register is called the PC (program counter).
Pointing to the stack section
- rbp/ebp is the base pointer, and it stores the address of the base of the current stack frame. In RISC-V systems, this register is called the FP (frame pointer).
- rsp/esp is the stack pointer, and it stores the address of the last element of the current stack frame. In RISC-V, this register is called the SP (stack pointer).
General Purpose Registers
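- 16 general-purpose registers in 64-bit systems: RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, and R8 to R15 (RBP and RSP are normally dedicated to the stack, as described above).
- 8 general-purpose registers in 32-bit systems: EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP (again, EBP and ESP are normally dedicated to the stack).

Under the System V x86-64 calling convention, the first six integer or pointer arguments are passed in these registers: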
| Argument | Register (64-bit) |
|---|---|
| Arg 1 | %rdi |
| Arg 2 | %rsi |
| Arg 3 | %rdx |
| Arg 4 | %rcx |
| Arg 5 | %r8 |
| Arg 6 | %r9 |
Little Endian or Big Endian
- Address numbers point to bytes.
- Using little endian (which x86 and x86-64 do), 0x44332211 will be stored as 0x11 0x22 0x33 0x44 in memory, meaning that the most significant byte has the highest address (if the value occupies addresses 0x0 to 0x3, then 0x11 is stored at 0x0 and 0x44 is stored at 0x3).
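The byte order described above can be reproduced with a few lines of C. This is a small sketch under stated assumptions: store_le32 and host_is_little_endian are hypothetical helper names. The first writes a value in little-endian order regardless of the machine it runs on; the second inspects how the current machine actually lays out the same value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value into buf in little-endian order:
   least significant byte at the lowest address. */
void store_le32(uint8_t buf[4], uint32_t value) {
    for (size_t i = 0; i < 4; i++) {
        buf[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Return 1 if the machine running this code is little endian, else 0. */
int host_is_little_endian(void) {
    uint32_t value = 0x44332211;
    uint8_t first_byte;
    memcpy(&first_byte, &value, 1);   /* the byte stored at the lowest address */
    return first_byte == 0x11;
}
```

Calling store_le32(buf, 0x44332211) fills buf with 0x11, 0x22, 0x33, 0x44 in that order. An address written into a payload for a little-endian target needs the same layout.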
Function Call in x86
See the photos in UC Berkeley's CS161 Textbook
1. Function args are pushed onto the stack (or placed in general-purpose registers such as rdi, rsi, rdx, rcx under the System V x86-64 ABI).
2. The return address (old rip) is pushed onto the stack.
3. The function jump happens (rip = new rip).
4. The new function pushes the old rbp onto the stack.
5. rbp is changed to become the current rsp (the old function's last element is the new function's bottom).
6. rsp is decreased so the stack grows to fit the new function's local variables.
7. The new function does whatever it does.
8. At the end of the new function's run, its rbp still holds what rsp was right after Step 4. We change rsp to become the current rbp.
9. The old rbp is popped back.
10. The old rip is popped back.
11. The caller decreases the size of the stack by the size of the args (only needed when the args were pushed onto the stack).
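Steps 2 to 10 can be traced on a toy model. This is a sketch under stated assumptions, not real machine behavior: toy_frame_cpu and the toy_* functions are hypothetical names, the stack is an array of 16 word-sized slots, and rsp/rbp are slot indices rather than addresses. There is no bounds checking.

```c
#include <stdint.h>

#define SLOTS 16   /* toy stack size, in words */

/* Hypothetical toy machine; rsp and rbp are slot indices, not real addresses. */
typedef struct {
    uint64_t stack[SLOTS];
    uint64_t rsp, rbp, rip;
} toy_frame_cpu;

void toy_push(toy_frame_cpu *m, uint64_t v) { m->stack[--m->rsp] = v; }
uint64_t toy_pop(toy_frame_cpu *m) { return m->stack[m->rsp++]; }

/* Steps 2-3: the call instruction. */
void toy_call(toy_frame_cpu *m, uint64_t return_addr, uint64_t callee) {
    toy_push(m, return_addr);   /* step 2: save the old rip */
    m->rip = callee;            /* step 3: jump */
}

/* Steps 4-6: the callee's prologue. */
void toy_prologue(toy_frame_cpu *m, uint64_t local_words) {
    toy_push(m, m->rbp);        /* step 4: push %rbp */
    m->rbp = m->rsp;            /* step 5: mov %rsp,%rbp */
    m->rsp -= local_words;      /* step 6: sub $N,%rsp (N counted in words here) */
}

/* Steps 8-10: the callee's epilogue. */
void toy_epilogue(toy_frame_cpu *m) {
    m->rsp = m->rbp;            /* step 8: mov %rbp,%rsp */
    m->rbp = toy_pop(m);        /* step 9: pop %rbp */
    m->rip = toy_pop(m);        /* step 10: ret */
}
```

After toy_call and toy_prologue, the slot at rbp holds the saved rbp and the slot directly above it holds the return address; toy_epilogue restores rsp, rbp and rip to their values from before the call.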
Example C code
void foo(int a, int b) {
    int bar[4];
}

int main(void) {
    foo(1, 2);
    return 0;
}

Which gives the following assembly:
0000000000001129 <foo>:
    1129: 55                push   %rbp              // Step 4. RBP is saved on the stack right below the return address (old RIP, saved in Step 2).
    112a: 48 89 e5          mov    %rsp,%rbp         // Step 5. The new stack base is the previous stack pointer.
    112d: 89 7d ec          mov    %edi,-0x14(%rbp)  // foo saves its first arg in its own stack frame to free up %edi.
    1130: 89 75 e8          mov    %esi,-0x18(%rbp)  // foo saves its second arg in its own stack frame to free up %esi.
    1133: 90                nop                      // No-op. Step 6 (sub $N,%rsp) and Step 7 would go here if the function did any work.
                                                     // Step 6 didn't take place: foo is a leaf function, so gcc keeps its locals in the
                                                     // red zone below rsp instead of moving rsp. Since rsp didn't change,
                                                     // mov %rbp,%rsp (Step 8) is not needed either.
                                                     // This behavior can be changed with the gcc flag -mno-red-zone.
    1134: 5d                pop    %rbp              // Step 9. Old RBP is restored.
    1135: c3                ret                      // == pop %rip, which is Step 10. Old RIP is restored.
                                                     // Step 11 doesn't happen: in the x86-64 ABI the caller passed the
                                                     // arguments in registers, so there is nothing on the stack for
                                                     // the caller to clean up.
0000000000001136 <main>:
    1136: 55                push   %rbp              // Step 4 again: main is not the first code to run; it is itself called by the startup code (_start).
    1137: 48 89 e5          mov    %rsp,%rbp         // This is also Step 5, but we don't analyze main.
    113a: be 02 00 00 00    mov    $0x2,%esi         // Step 1. Saving args (from right to left): 2
    113f: bf 01 00 00 00    mov    $0x1,%edi         // Step 1. Saving args (from right to left): 1
    1144: e8 e0 ff ff ff    call   1129 <foo>        // Step 2. The return address (0x1149) is pushed on the stack. Step 3. RIP is changed to the address of foo (0x1129).
    1149: b8 00 00 00 00    mov    $0x0,%eax         // Return value of main (return 0).
    114e: 5d                pop    %rbp
    114f: c3                ret

Binary Background Check
| Purpose | Command |
|---|---|
| Check security properties (NX bit, dynamic vs static) | rabin2 -I a.out |
| List functions imported from shared libraries | rabin2 -i a.out |
| Find functions likely written by the programmer | rabin2 -qs a.out piped to grep -ve imp -e ' 0 ' |
| Find strings | strings a.out or (better) rabin2 -z a.out |
Misc
- PLT (Procedure Linkage Table): a small trampoline in the binary used for calls to external (dynamically linked) functions. Calls in the code go to func@plt; the PLT entry jumps to the real function address found in the GOT (or invokes the dynamic resolver on first call).
- GOT (Global Offset Table): a writable table of addresses in the binary. Each GOT slot holds the runtime address of a library function (or other dynamic symbol). The dynamic linker/loader fills GOT entries the first time a PLT entry is used (or at load time, depending on RELRO).
- Together, the PLT and GOT let the loader resolve and redirect calls to shared-library functions without embedding fixed library addresses in the code (this is what lets dynamic linking work with ASLR).
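The lazy-binding idea behind the PLT and GOT can be imitated with a function pointer. This is only an analogy in plain C, not what the linker actually emits: lib_double, got_slot, resolve_and_call, double_via_plt and resolver_calls are hypothetical names. The GOT slot starts out pointing at a resolver stub; the first call patches the slot with the real function's address, and later calls go straight through.

```c
typedef int (*func_ptr)(int);

/* Stand-in for a function that lives in a shared library. */
int lib_double(int x) { return 2 * x; }

int resolver_calls = 0;          /* counts how often the slow path runs */
int resolve_and_call(int x);     /* forward declaration of the resolver stub */

/* "GOT": one writable slot, initially pointing at the resolver stub. */
func_ptr got_slot = resolve_and_call;

/* "Dynamic resolver": finds the real address, patches the GOT slot,
   then forwards the call. */
int resolve_and_call(int x) {
    resolver_calls++;
    got_slot = lib_double;
    return got_slot(x);
}

/* "PLT entry": callers always go through this; it jumps via the GOT slot. */
int double_via_plt(int x) {
    return got_slot(x);
}
```

The first double_via_plt call runs the resolver once; every later call reaches lib_double directly. Because the slot is ordinary writable data in this sketch, whatever address it holds is where the next call lands.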
Sources
Disclaimer: Some of the sentences here are a direct copy of the links below, so I don't own their copyright. Follow the original sources.