- Part 1: The Foundation (16-Bit / 8086) - Registers, Segmentation, and Basic Logic.
- Part 2: The Expansion (32-Bit / x86 / IA-32) - E-Registers, Flat Memory, and Stack Frames.
- Part 3: The Modern Era (64-Bit / x86-64 / AMD64) - R-Registers, New Calling Conventions, and RIP-Relative addressing.
- Premium Quick Reference Card - Register Hierarchy, Instructions, and Addressing Modes.
This guide is structured chronologically to help you understand the evolution of the architecture. Understanding the 16-bit roots makes the 64-bit complexity much easier to digest.
Note on Syntax: We will use Intel Syntax (NASM style), where the format is Instruction Destination, Source.
The era of DOS and Real Mode.
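In the beginning, registers were 16 bits wide. A single 16-bit register can address only 64KB of RAM, so the 8086 reached its full 1MB address space through "Segmentation": every address is a Segment:Offset pair, and the CPU computes the physical address as Segment * 16 + Offset.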
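To make the Segment:Offset arithmetic concrete, here is a minimal sketch that writes one character into text-mode video memory. It assumes a DOS-style environment (a .COM program, color text mode at segment 0xB800, and the int 0x21 exit service), none of which comes from the text above. The only point is that the CPU forms the physical address as Segment * 16 + Offset.

```nasm
; Assumed build: nasm -f bin seg.asm -o seg.com  (run under DOS or an emulator)
bits 16
org 0x100                   ; .COM programs are loaded at offset 0x100

start:
    mov ax, 0xB800          ; segment of color text-mode video memory (assumption)
    mov es, ax              ; segment registers cannot be loaded with an immediate
    xor di, di              ; offset 0 = top-left character cell
    mov byte [es:di], 'A'   ; physical address = 0xB800 * 16 + 0 = 0xB8000

    mov ax, 0x4C00          ; DOS service 0x4C: terminate with exit code 0
    int 0x21
```

The same physical byte can be reached through many different Segment:Offset pairs, which is a large part of why segmentation was considered awkward.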
The four general-purpose registers are the workhorses. Each can be accessed as a whole (16-bit) or split into a high and a low byte (8-bit).
| Register | Name | 16-bit Access | High 8-bit | Low 8-bit | Primary Use |
|---|---|---|---|---|---|
| A | Accumulator | AX | AH | AL | Math, logic, I/O results. |
| B | Base | BX | BH | BL | Memory indexing (base pointer). |
| C | Counter | CX | CH | CL | Loops and string operations. |
| D | Data | DX | DH | DL | I/O, multiply/divide helper. |
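
A short sketch of how the high and low halves alias the full register. It is written as a DOS .COM program (an assumed environment, as is the int 0x21 exit call); the comments track the value of AX after each instruction.

```nasm
; Assumed build: nasm -f bin halves.asm -o halves.com
bits 16
org 0x100

start:
    mov ax, 0x1234      ; AH = 0x12, AL = 0x34
    mov al, 0xFF        ; only the low byte changes:  AX = 0x12FF
    mov ah, 0x00        ; only the high byte changes: AX = 0x00FF
    xchg ah, al         ; swap the two halves:        AX = 0xFF00

    mov ax, 0x4C00      ; DOS service 0x4C: terminate with exit code 0
    int 0x21
```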
The pointer and index registers are 16-bit only (they have no separate high and low byte halves) and mainly hold memory addresses.
- SP (Stack Pointer): Points to the top of the stack.
- BP (Base Pointer): Used to access function parameters on the stack.
- SI (Source Index): Source for string copies.
- DI (Destination Index): Destination for string copies.
- IP (Instruction Pointer): Points to the next instruction to execute (cannot be modified directly).
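As an illustration of SI and DI working as a source/destination pair, the sketch below copies a five-byte string with rep movsb. The labels src, dst, and src_len are hypothetical, and the DOS .COM setup (where DS and ES already point at the program's own segment) is an assumption.

```nasm
; Assumed build: nasm -f bin copy.asm -o copy.com
bits 16
org 0x100

start:
    cld                 ; clear the direction flag so SI and DI count upward
    mov si, src         ; SI = offset of the source bytes (in DS)
    mov di, dst         ; DI = offset of the destination buffer (in ES)
    mov cx, src_len     ; CX = number of bytes to copy
    rep movsb           ; copy CX bytes from DS:SI to ES:DI

    mov ax, 0x4C00      ; DOS service 0x4C: terminate with exit code 0
    int 0x21

src:     db 'HELLO'
src_len  equ $ - src
dst:     times src_len db 0
```

Note that CX plays its "Counter" role here: rep decrements it once per byte until it reaches zero.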
```nasm
mov ax, 5       ; Load 5 into AX
mov bx, 3       ; Load 3 into BX
add ax, bx      ; AX = AX + BX (AX is now 8)
sub ax, 1       ; AX = AX - 1 (AX is now 7)
```

The era of Windows 95/XP and Linux. Introduction of 32-bit Protected Mode.
With the 80386, registers expanded to 32 bits. The prefix 'E' stands for Extended.
- Memory became a "Flat Model" (up to 4GB): operating systems set every segment to cover the whole address space, which removes the headache of segmentation.
The old 16-bit registers (AX) are now the lower half of the 32-bit registers (EAX).
| 32-bit Register | Contains 16-bit | Description |
|---|---|---|
| EAX | AX | Extended Accumulator |
| EBX | BX | Extended Base |
| ECX | CX | Extended Counter |
| EDX | DX | Extended Data |
| ESP | SP | Extended Stack Pointer |
| EBP | BP | Extended Base Pointer |
| ESI | SI | Extended Source Index |
| EDI | DI | Extended Dest Index |
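In 32-bit assembly, function arguments are almost always passed via the Stack. (Linux system calls made with int 0x80 are the exception: they take their arguments in EBX, ECX, and EDX.)
- Push the arguments onto the stack.
- Call the function.
- The function uses `EBP` to read the arguments.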
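The three steps above can be sketched as a complete 32-bit Linux program. The function name add_two is hypothetical, and the right-to-left push order with caller cleanup assumes the common cdecl convention, which the text does not name explicitly.

```nasm
; Assumed build: nasm -f elf32 frame.asm && ld -m elf_i386 frame.o -o frame
section .text
global _start

add_two:                    ; hypothetical function: returns a + b in EAX
    push ebp                ; save the caller's frame pointer
    mov  ebp, esp           ; establish our own stack frame
    mov  eax, [ebp + 8]     ; first argument ([ebp] = saved EBP, [ebp + 4] = return address)
    add  eax, [ebp + 12]    ; second argument
    pop  ebp                ; restore the caller's frame pointer
    ret

_start:
    push 3                  ; second argument (pushed first: right to left)
    push 4                  ; first argument
    call add_two            ; EAX = 4 + 3
    add  esp, 8             ; caller removes both arguments (cdecl)

    mov  ebx, eax           ; use the result as the exit status
    mov  eax, 1             ; sys_exit
    int  0x80
```

Running the program and then `echo $?` should print 7, the sum returned in EAX.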
```nasm
section .data
    msg db 'Hello World', 0xA   ; String with newline

section .text
    global _start

_start:
    ; write(fd, buf, count)
    mov eax, 4      ; Syscall number for sys_write
    mov ebx, 1      ; File Descriptor 1 (stdout)
    mov ecx, msg    ; Pointer to the message
    mov edx, 12     ; Length of message
    int 0x80        ; Interrupt kernel to execute

    ; exit(status)
    mov eax, 1      ; Syscall number for sys_exit
    mov ebx, 0      ; Exit code 0
    int 0x80
```

Current Standard. Massive memory, more registers.
Registers expanded to 64 bits with the prefix 'R'. We also gained 8 completely new registers (R8-R15).
The most important concept to visualize is how the registers nest inside one another:
- RAX is 64 bits.
- EAX is the lower 32 bits of RAX.
- AX is the lower 16 bits of EAX.
- AL is the lower 8 bits of AX.
New Registers (R8 - R15):
Access style: R8 (64b), R8D (32b), R8W (16b), R8B (8b).
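A minimal sketch of that nesting, assuming a 64-bit Linux system with NASM and ld. The program loads one 64-bit value and then reads its lowest byte through R8B, returning it as the exit status so the result is observable.

```nasm
; Assumed build: nasm -f elf64 nest.asm && ld nest.o -o nest
section .text
global _start

_start:
    mov   rax, 0x1122334455667788   ; fill all 64 bits of RAX
    mov   r8, rax                   ; copy into one of the new registers
    ; The same storage can now be read at four widths:
    ;   EAX / R8D = 0x55667788
    ;   AX  / R8W = 0x7788
    ;   AL  / R8B = 0x88
    movzx edi, r8b                  ; EDI = 0x88, zero-extended
    mov   eax, 60                   ; sys_exit
    syscall
```

The exit status is 136 (0x88), the low byte of the original value.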
In 64-bit, we stop using the stack for the first few arguments. We use Registers, which is much faster.
- Linux (System V ABI): First 6 args go in `RDI`, `RSI`, `RDX`, `RCX`, `R8`, `R9`.
- Windows (Microsoft x64): First 4 args go in `RCX`, `RDX`, `R8`, `R9`.
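Here is a small sketch of a register-based call under the System V convention on Linux. The function add3 is hypothetical; it takes three integer arguments in RDI, RSI, and RDX and returns their sum in RAX.

```nasm
; Assumed build: nasm -f elf64 call.asm && ld call.o -o call
section .text
global _start

add3:                       ; hypothetical function: returns a + b + c in RAX
    lea  rax, [rdi + rsi]   ; RAX = a + b
    add  rax, rdx           ; RAX = a + b + c
    ret

_start:
    mov  edi, 10            ; first argument  (RDI)
    mov  esi, 20            ; second argument (RSI)
    mov  edx, 12            ; third argument  (RDX)
    call add3               ; no pushes needed for the arguments

    mov  edi, eax           ; use the result as the exit status
    mov  eax, 60            ; sys_exit
    syscall
```

The exit status is 42. Compared with the 32-bit stack convention, the caller performs no pushes and no stack cleanup for the arguments.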
You can now access data relative to the current instruction pointer (RIP), enabling "Position Independent Code" (PIC), crucial for modern shared libraries.
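In NASM, RIP-relative addressing is requested with the rel keyword (or globally with the default rel directive). The sketch below assumes 64-bit Linux; the label answer is hypothetical.

```nasm
; Assumed build: nasm -f elf64 riprel.asm && ld riprel.o -o riprel
default rel                 ; make plain [symbol] operands RIP-relative

section .data
answer: dq 42

section .text
global _start

_start:
    lea  rsi, [rel answer]  ; address encoded as RIP + displacement, not as an absolute value
    mov  rdi, [rsi]         ; RDI = 42
    mov  eax, 60            ; sys_exit
    syscall
```

Because the instruction stores only a displacement from RIP, the code keeps working wherever the loader places it in memory. The exit status is 42.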
Notice the register changes (R prefix) and the syscall instruction (syscall instead of int 0x80).
```nasm
section .data
    msg db 'Hello 64-bit World', 0xA

section .text
    global _start

_start:
    ; sys_write uses RDI, RSI, RDX
    mov rax, 1      ; Syscall ID for write (it is 1 in 64-bit, 4 in 32-bit!)
    mov rdi, 1      ; File Descriptor (stdout)
    mov rsi, msg    ; Address of string
    mov rdx, 19     ; Length
    syscall         ; Invoke Kernel

    ; sys_exit
    mov rax, 60     ; Syscall ID for exit
    xor rdi, rdi    ; Exit code 0 (xor rdi, rdi encodes shorter than mov rdi, 0)
    syscall
```

| 64-bit | 32-bit | 16-bit | High 8 | Low 8 |
|---|---|---|---|---|
| RAX | EAX | AX | AH | AL |
| RBX | EBX | BX | BH | BL |
| RCX | ECX | CX | CH | CL |
| RDX | EDX | DX | DH | DL |
| RSI | ESI | SI | - | SIL |
| RDI | EDI | DI | - | DIL |
| RBP | EBP | BP | - | BPL |
| RSP | ESP | SP | - | SPL |
| R8 | R8D | R8W | - | R8B |
| ... | ... | ... | ... | ... |
| R15 | R15D | R15W | - | R15B |

| Instruction | Syntax | Description |
|---|---|---|
| MOV | `mov dest, src` | Copies data from src to dest. |
| XCHG | `xchg op1, op2` | Swaps the values of two operands. |
| PUSH | `push src` | Pushes value onto the stack (RSP decrements). |
| POP | `pop dest` | Pops value from stack (RSP increments). |
| LEA | `lea reg, [mem]` | Load Effective Address. Calculates the address, doesn't read memory. |
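
A brief sketch of LEA used as an arithmetic shortcut, followed by a push/pop pair that moves the result through the stack. It assumes 64-bit Linux; nothing here reads memory except the pop.

```nasm
; Assumed build: nasm -f elf64 lea.asm && ld lea.o -o lea
section .text
global _start

_start:
    mov  rbx, 5
    lea  rax, [rbx + rbx*4] ; RAX = 5 + 5*4 = 25; only the address is computed
    push rax                ; RSP decreases by 8, 25 is stored on the stack
    pop  rdi                ; RDI = 25, RSP increases by 8

    mov  eax, 60            ; sys_exit
    syscall
```

The exit status is 25.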

| Instruction | Syntax | Description |
|---|---|---|
| ADD | `add dest, src` | dest = dest + src |
| SUB | `sub dest, src` | dest = dest - src |
| INC | `inc dest` | dest = dest + 1. Leaves the carry flag (CF) unchanged. |
| DEC | `dec dest` | dest = dest - 1. Leaves the carry flag (CF) unchanged. |
| IMUL | `imul dest, src` | Signed multiplication. |
| IDIV | `idiv src` | Signed division. Divides RDX:RAX by src; quotient goes to RAX, remainder to RDX. |
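
IDIV is the instruction most likely to trip people up, because the dividend is implicit. The sketch below (assuming 64-bit Linux) multiplies 6 by 7 and then divides the result by 5, using cqo to sign-extend RAX into RDX before the division.

```nasm
; Assumed build: nasm -f elf64 math.asm && ld math.o -o math
section .text
global _start

_start:
    mov  rax, 6
    mov  rcx, 7
    imul rax, rcx           ; RAX = 42

    cqo                     ; sign-extend RAX into RDX:RAX (the implicit dividend)
    mov  rcx, 5
    idiv rcx                ; RAX = 8 (quotient), RDX = 2 (remainder)

    mov  rdi, rdx           ; use the remainder as the exit status
    mov  eax, 60            ; sys_exit
    syscall
```

The exit status is 2. Skipping the cqo step leaves stale data in RDX, which typically produces a wrong result or a divide error.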

| Instruction | Syntax | Description |
|---|---|---|
| AND / OR / XOR | `and dest, src` | Bitwise operations. `xor rax, rax` zeros a register. |
| CMP | `cmp op1, op2` | Computes op1 - op2 without storing the result. Sets status flags (Zero, Sign, Carry, Overflow). |
| JMP | `jmp label` | Unconditional jump. |
| JE / JZ | `je label` | Jump if Equal / Zero (ZF=1). |
| JNE / JNZ | `jne label` | Jump if Not Equal / Not Zero (ZF=0). |
| JG / JL | `jg label` | Jump if Greater / Less (Signed). |
| CALL | `call label` | Push next instruction address to stack, jump to label. |
| RET | `ret` | Pop address from stack, jump to it. |
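
CMP and the conditional jumps are almost always used together. As a sketch (assuming 64-bit Linux), the loop below sums the integers 1 through 10; the local label .loop is a hypothetical name.

```nasm
; Assumed build: nasm -f elf64 loop.asm && ld loop.o -o loop
section .text
global _start

_start:
    xor  edi, edi           ; sum = 0
    mov  ecx, 1             ; i = 1
.loop:
    add  edi, ecx           ; sum += i
    inc  ecx                ; i++
    cmp  ecx, 11            ; computes i - 11 and sets the flags
    jne  .loop              ; ZF = 0 while i != 11, so keep looping

    mov  eax, 60            ; sys_exit, status = sum
    syscall
```

The exit status is 55 (1 + 2 + ... + 10).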
How to access memory values in [].
- Direct: `mov rax, [var_name]`
- Indirect: `mov rax, [rbx]` (Address is in RBX)
- Base + Offset: `mov rax, [rbx + 8]`
- Indexed: `mov rax, [rbx + rcx*4]` (Great for arrays: Base + Index * Scale)
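The four modes can be tried against a small array. This sketch assumes 64-bit Linux and a statically linked, non-position-independent binary (so the absolute address of the hypothetical label arr can be used directly). The elements are 32-bit, which is why the scale factor is 4.

```nasm
; Assumed build: nasm -f elf64 modes.asm && ld modes.o -o modes
section .data
arr: dd 10, 20, 30, 40      ; four 32-bit integers

section .text
global _start

_start:
    mov  eax, [arr]         ; direct:        EAX = 10
    mov  rbx, arr           ; RBX = address of the array
    mov  eax, [rbx]         ; indirect:      EAX = 10
    mov  eax, [rbx + 8]     ; base + offset: EAX = 30 (byte offset 8 = element 2)
    mov  ecx, 3
    mov  edi, [rbx + rcx*4] ; indexed:       EDI = arr[3] = 40

    mov  eax, 60            ; sys_exit
    syscall
```

The exit status is 40, the element selected by the indexed access.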
In x86-64, if you write to a 32-bit register, the CPU automatically zeros the upper 32 bits.
- `mov rax, 0xFFFFFFFFFFFFFFFF` (RAX is all 1s)
- `mov eax, 0xFFFFFFFF` (Now, RAX is `0x00000000FFFFFFFF`)
This does not happen with 8-bit or 16-bit writes (they preserve the upper bits).
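The difference between 32-bit and 16-bit writes can be observed directly. The sketch below (assuming 64-bit Linux) fills two registers with all 1s, overwrites one through EAX and the other through BX, and then inspects bit 63 of each.

```nasm
; Assumed build: nasm -f elf64 zext.asm && ld zext.o -o zext
section .text
global _start

_start:
    mov  rax, 0xFFFFFFFFFFFFFFFF    ; RAX is all 1s
    mov  eax, 0xFFFFFFFF            ; 32-bit write: RAX = 0x00000000FFFFFFFF

    mov  rbx, 0xFFFFFFFFFFFFFFFF    ; RBX is all 1s
    mov  bx, 1                      ; 16-bit write: RBX = 0xFFFFFFFFFFFF0001

    shr  rax, 63                    ; RAX = 0 (bit 63 was cleared by the 32-bit write)
    shr  rbx, 63                    ; RBX = 1 (bit 63 survived the 16-bit write)

    mov  rdi, rbx
    sub  rdi, rax                   ; exit status = 1 - 0
    mov  eax, 60                    ; sys_exit
    syscall
```

The exit status is 1, confirming that only the 32-bit write cleared the upper half.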