@suntong
Created January 17, 2026 04:31

Table of Contents

  1. Part 1: The Foundation (16-Bit / 8086) - Registers, Segmentation, and Basic Logic.
  2. Part 2: The Expansion (32-Bit / x86 / IA-32) - E-Registers, Flat Memory, and Stack Frames.
  3. Part 3: The Modern Era (64-Bit / x86-64 / AMD64) - R-Registers, New Calling Conventions, and RIP-Relative addressing.
  4. Premium Quick Reference Card - Register Hierarchy, Instructions, and Addressing Modes.

Premium Guide to x86 Assembly

This guide is structured chronologically to help you understand the evolution of the architecture. Understanding the 16-bit roots makes the 64-bit complexity much easier to digest.

Note on Syntax: We will use Intel Syntax (NASM style), where the format is Instruction Destination, Source.


Part 1: The Foundation (16-Bit / 8086)

The era of DOS and Real Mode.

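In the beginning, registers were 16 bits wide. Memory was accessed using "Segmentation" (Segment:Offset), because a single 16-bit register could only address 64KB of RAM. The CPU computes the physical address as Segment * 16 + Offset, which yields a 20-bit address and reaches 1MB.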
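
The Segment:Offset arithmetic can be sketched with a tiny program. This is an illustration only: it assumes a DOS .COM program running in real mode with the color text mode active, where video memory sits at segment 0xB800.

```nasm
; Sketch only: assumes a DOS .COM program in real mode, color text mode active.
; Assemble (assumed command): nasm -f bin demo.asm -o demo.com
        org 0x100               ; .COM programs are loaded at offset 0x100

        mov ax, 0xB800          ; segment value for color text-mode memory
        mov es, ax              ; segment registers cannot take an immediate, so go via AX
        xor di, di              ; offset 0 = top-left character cell
        mov byte [es:di], 'A'   ; physical address = 0xB800 * 16 + 0 = 0xB8000

        mov ax, 0x4C00          ; DOS function 4Ch: terminate with return code 0
        int 0x21
```

The same physical byte can be reached through many Segment:Offset pairs, for example 0xB000:0x8000, because only the sum Segment * 16 + Offset matters.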

1. The General Purpose Registers

These 4 registers are the workhorses. They can be accessed as a whole (16-bit) or split into high and low bytes (8-bit).

| Register | Name | 16-bit Access | High 8-bit | Low 8-bit | Primary Use |
|---|---|---|---|---|---|
| A | Accumulator | AX | AH | AL | Math, logic, I/O results. |
| B | Base | BX | BH | BL | Memory indexing (base pointer). |
| C | Counter | CX | CH | CL | Loops and string operations. |
| D | Data | DX | DH | DL | I/O, multiply/divide helper. |
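
A short sketch of how the high and low bytes relate to the full register. It assumes the same DOS .COM setting as the rest of Part 1; the values are arbitrary.

```nasm
; Sketch only: assumes a DOS .COM program (nasm -f bin).
        org 0x100

        mov ax, 0x1234          ; AX = 0x1234, so AH = 0x12 and AL = 0x34
        mov al, 0xFF            ; only the low byte changes:  AX = 0x12FF
        add al, 1               ; AL wraps to 0x00; the carry does not spill into AH
                                ; AX = 0x1200
        mov ah, 0               ; only the high byte changes: AX = 0x0000

        mov ax, 0x4C00          ; DOS function 4Ch: terminate with return code 0
        int 0x21
```

AH and AL behave as independent 8-bit registers that happen to share storage with AX.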

2. Pointer and Index Registers

These are primarily 16-bit and used for memory addresses.

  • SP (Stack Pointer): Points to the top of the stack.
  • BP (Base Pointer): Used to access function parameters on the stack.
  • SI (Source Index): Source for string copies.
  • DI (Destination Index): Destination for string copies.
  • IP (Instruction Pointer): Points to the next instruction to execute. It cannot be read or written with mov; only control-flow instructions such as jmp, call, and ret change it.
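
The SI/DI pairing can be illustrated with a string copy. This is a minimal sketch that assumes a DOS .COM program, where DS and ES already point at the same segment; the labels src, dst, and src_len are made up for the example.

```nasm
; Sketch only: assumes a DOS .COM program (nasm -f bin), so DS = ES on entry.
        org 0x100

        cld                     ; direction flag = 0: SI and DI count upward
        mov si, src             ; DS:SI -> source bytes
        mov di, dst             ; ES:DI -> destination buffer
        mov cx, src_len         ; CX = number of bytes to copy
        rep movsb               ; copy CX bytes, advancing SI and DI each step

        mov ax, 0x4C00          ; DOS function 4Ch: terminate with return code 0
        int 0x21

src:    db 'hello'
src_len equ $ - src             ; length computed by the assembler
dst:    times src_len db 0      ; zero-filled destination of the same size
```

Note how CX plays its "Counter" role here: rep repeats movsb until CX reaches zero.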

3. Basic Logic (16-bit Example)

mov ax, 5       ; Load 5 into AX
mov bx, 3       ; Load 3 into BX
add ax, bx      ; AX = AX + BX (AX is now 8)
sub ax, 1       ; AX = AX - 1  (AX is now 7)

Part 2: The Expansion (32-Bit / x86 / IA-32)

The era of Windows 95/XP and Linux. Introduction of 32-bit Protected Mode (the 80286 already had a 16-bit protected mode).

With the 80386, registers expanded to 32 bits. The prefix 'E' stands for Extended.

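  • Memory became a "Flat Model" (up to 4GB). Segmentation still exists in hardware, but operating systems set the code, data, and stack segments to cover the whole address space, so a 32-bit offset acts as the full address.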

1. The Extended Registers

The old 16-bit registers (AX) are now the lower half of the 32-bit registers (EAX).

| 32-bit Register | Contains 16-bit | Description |
|---|---|---|
| EAX | AX | Extended Accumulator |
| EBX | BX | Extended Base |
| ECX | CX | Extended Counter |
| EDX | DX | Extended Data |
| ESP | SP | Extended Stack Pointer |
| EBP | BP | Extended Base Pointer |
| ESI | SI | Extended Source Index |
| EDI | DI | Extended Dest Index |

2. The Stack Frame

In 32-bit code, the common calling conventions (cdecl, stdcall) pass arguments on the stack.

  1. Push arguments onto the stack.
  2. Call the function.
  3. Function uses EBP to read arguments.
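
The three steps above can be sketched as a small Linux program. It assumes the cdecl convention (caller cleans up the stack) and a 32-bit toolchain; add_two is a hypothetical function invented for the example.

```nasm
; Sketch only. Assumed build: nasm -f elf32 frame.asm && ld -m elf_i386 -o frame frame.o
section .text
        global _start

; add_two(a, b): hypothetical cdecl function, returns a + b in EAX
add_two:
        push ebp                ; save the caller's frame pointer
        mov  ebp, esp           ; EBP now anchors this function's frame
        mov  eax, [ebp + 8]     ; first argument (above saved EBP and return address)
        add  eax, [ebp + 12]    ; second argument
        pop  ebp                ; restore the caller's frame pointer
        ret

_start:
        push 3                  ; step 1: push arguments, last one first
        push 5
        call add_two            ; step 2: call pushes the return address and jumps
        add  esp, 8             ; cdecl: the caller removes its two arguments

        mov  ebx, eax           ; use the result (5 + 3 = 8) as the exit status
        mov  eax, 1             ; sys_exit
        int  0x80
```

Inside the function, [ebp] holds the saved EBP and [ebp + 4] the return address, which is why the first argument starts at [ebp + 8].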

3. 32-bit Example (Linux Syscall)

section .data
    msg db 'Hello World', 0xA  ; String with newline

section .text
    global _start

_start:
    ; write(fd, buf, count)
    mov eax, 4      ; Syscall number for sys_write
    mov ebx, 1      ; File Descriptor 1 (stdout)
    mov ecx, msg    ; Pointer to the message
    mov edx, 12     ; Length of message
    int 0x80        ; Interrupt kernel to execute

    ; exit(status)
    mov eax, 1      ; Syscall number for sys_exit
    mov ebx, 0      ; Exit code 0
    int 0x80

Part 3: The Modern Era (64-Bit / x86-64 / AMD64)

Current Standard. Massive memory, more registers.

Registers expanded to 64 bits with the prefix 'R'. We also gained 8 completely new registers (R8-R15).

1. The Register Hierarchy

This is the most important concept to visualize.

  • RAX is 64 bits.
  • EAX is the lower 32 bits of RAX.
  • AX is the lower 16 bits of EAX.
  • AL is the lower 8 bits of AX.

New Registers (R8 - R15): Access style: R8 (64b), R8D (32b), R8W (16b), R8B (8b).

2. Calling Convention Changes

In 64-bit, we stop using the stack for the first few arguments. We use Registers, which is generally faster than going through memory.

  • Linux (System V ABI): First 6 integer or pointer args go in RDI, RSI, RDX, RCX, R8, R9. (Raw syscalls use R10 in place of RCX.)
  • Windows (Microsoft x64): First 4 integer or pointer args go in RCX, RDX, R8, R9.
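
A minimal sketch of a System V style call on Linux. The function sum3 is hypothetical; the point is only to show which registers carry the arguments and the return value.

```nasm
; Sketch only. Assumed build: nasm -f elf64 call.asm && ld -o call call.o
section .text
        global _start

; sum3(a, b, c): hypothetical function, returns a + b + c in RAX
sum3:
        lea rax, [rdi + rsi]    ; a + b
        add rax, rdx            ; + c
        ret

_start:
        mov edi, 10             ; 1st argument -> RDI
        mov esi, 20             ; 2nd argument -> RSI
        mov edx, 12             ; 3rd argument -> RDX
        call sum3               ; result comes back in RAX (42)

        mov edi, eax            ; exit status = 42
        mov eax, 60             ; sys_exit
        syscall
```

Under the Microsoft x64 convention the same three arguments would travel in RCX, RDX, and R8 instead.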

3. RIP-Relative Addressing

You can now access data relative to the instruction pointer (RIP, which holds the address of the next instruction), enabling "Position Independent Code" (PIC), crucial for modern shared libraries.
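
In NASM source, RIP-relative addressing is requested with the rel keyword or the default rel directive. The sketch below assumes a Linux target; the variable name counter is invented for the example.

```nasm
; Sketch only. Assumed build: nasm -f elf64 rip.asm && ld -o rip rip.o
default rel                     ; NASM: treat [label] as [rel label]

section .data
counter: dq 41

section .text
        global _start
_start:
        lea rbx, [counter]      ; RBX = address of counter, encoded as RIP + displacement
        inc qword [rbx]         ; counter = 42
        mov rdi, [counter]      ; RIP-relative load of the updated value
        mov eax, 60             ; sys_exit
        syscall                 ; exit status 42
```

Because the instruction stores only a distance from RIP, the code keeps working wherever the loader places it.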

4. 64-bit Example (Linux Syscall)

Notice the register changes (R prefix) and the syscall instruction (syscall instead of int 0x80).

section .data
    msg db 'Hello 64-bit World', 0xA

section .text
    global _start

_start:
    ; sys_write uses RDI, RSI, RDX
    mov rax, 1       ; Syscall ID for write (it is 1 in 64-bit, 4 in 32-bit!)
    mov rdi, 1       ; File Descriptor (stdout)
    mov rsi, msg     ; Address of string
    mov rdx, 19      ; Length
    syscall          ; Invoke Kernel

    ; sys_exit
    mov rax, 60      ; Syscall ID for exit
    xor rdi, rdi     ; Exit code 0 (xor encodes in fewer bytes than mov rdi, 0)
    syscall

Premium Quick Reference Card

Register Hierarchy Map

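| 64-bit | 32-bit | 16-bit | High 8 | Low 8 |
|---|---|---|---|---|
| RAX | EAX | AX | AH | AL |
| RBX | EBX | BX | BH | BL |
| RCX | ECX | CX | CH | CL |
| RDX | EDX | DX | DH | DL |
| RSI | ESI | SI | - | SIL |
| RDI | EDI | DI | - | DIL |
| RBP | EBP | BP | - | BPL |
| RSP | ESP | SP | - | SPL |
| R8 | R8D | R8W | - | R8B |
| ... | ... | ... | ... | ... |
| R15 | R15D | R15W | - | R15B |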

Data Movement

| Instruction | Syntax | Description |
|---|---|---|
| MOV | mov dest, src | Copies data from src to dest. |
| XCHG | xchg op1, op2 | Swaps the values of two operands. |
| PUSH | push src | Pushes value onto the stack (RSP decrements). |
| POP | pop dest | Pops value from stack (RSP increments). |
| LEA | lea reg, [mem] | Load Effective Address. Calculates the address, doesn't read memory. |

Arithmetic

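| Instruction | Syntax | Description |
|---|---|---|
| ADD | add dest, src | dest = dest + src |
| SUB | sub dest, src | dest = dest - src |
| INC | inc dest | dest = dest + 1 (does not modify the Carry flag) |
| DEC | dec dest | dest = dest - 1 (does not modify the Carry flag) |
| IMUL | imul dest, src | Signed multiplication: dest = dest * src. |
| IDIV | idiv src | Signed division of RDX:RAX by src. Quotient goes to RAX, remainder to RDX. |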
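
IDIV is the easiest instruction in this table to get wrong, because the dividend is implicit. The sketch below assumes a Linux target and uses cqo (sign-extend RAX into RDX) to prepare RDX:RAX; the numbers are arbitrary.

```nasm
; Sketch only. Assumed build: nasm -f elf64 div.asm && ld -o div div.o
section .text
        global _start
_start:
        mov rax, -17            ; dividend goes in RAX
        cqo                     ; sign-extend RAX into RDX, so RDX:RAX = -17
        mov rcx, 5              ; divisor
        idiv rcx                ; RAX = -3 (quotient), RDX = -2 (remainder)

        ; Check the identity quotient * divisor + remainder = dividend
        imul rax, rcx           ; -3 * 5 = -15
        add  rax, rdx           ; -15 + (-2) = -17
        add  rax, 17            ; 0 if the identity holds

        mov rdi, rax            ; exit status 0
        mov eax, 60             ; sys_exit
        syscall
```

Forgetting the cqo step leaves stale data in RDX, which usually produces a wrong quotient or a divide error.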

Logic & Control Flow

| Instruction | Syntax | Description |
|---|---|---|
| AND / OR / XOR | and dest, src | Bitwise operations. xor rax, rax zeros a register. |
| CMP | cmp op1, op2 | Computes op1 - op2, discards the result, and sets the status flags (Zero, Sign, Overflow, Carry). |
| JMP | jmp label | Unconditional jump. |
| JE / JZ | je label | Jump if Equal / Zero (ZF=1). |
| JNE / JNZ | jne label | Jump if Not Equal / Not Zero (ZF=0). |
| JG / JL | jg label | Jump if Greater / Less (Signed). |
| CALL | call label | Push next instruction address to stack, jump to label. |
| RET | ret | Pop address from stack, jump to it. |
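
CMP and a conditional jump almost always appear as a pair. The following sketch, assuming a Linux target, picks the larger of two signed values; the inputs are arbitrary.

```nasm
; Sketch only. Assumed build: nasm -f elf64 max.asm && ld -o max max.o
section .text
        global _start
_start:
        mov rdi, -5             ; first value
        mov rsi, 7              ; second value

        mov rax, rdi            ; start by assuming the first value is larger
        cmp rdi, rsi            ; set flags from rdi - rsi
        jg  .done               ; signed comparison: keep RDI if RDI > RSI
        mov rax, rsi            ; otherwise RSI is the maximum
.done:
        mov rdi, rax            ; exit status = 7
        mov eax, 60             ; sys_exit
        syscall
```

The jump reads only the flags, so any instruction placed between cmp and jg that changes flags would break the test.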

Essential Memory Addressing Modes

How to access memory values in [].

  1. Direct: mov rax, [var_name]
  2. Indirect: mov rax, [rbx] (Address is in RBX)
  3. Base + Offset: mov rax, [rbx + 8]
  4. Indexed: mov rax, [rbx + rcx*4] (Great for arrays: Base + Index * Scale)
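
The indexed mode is what makes array loops compact. A minimal sketch, assuming a Linux target and a made-up array named values of four 32-bit integers:

```nasm
; Sketch only. Assumed build: nasm -f elf64 sum.asm && ld -o sum sum.o
section .data
values: dd 10, 20, 30, 40       ; four 32-bit integers

section .text
        global _start
_start:
        lea rbx, [rel values]   ; RBX = base address of the array
        xor ecx, ecx            ; RCX = index 0
        xor eax, eax            ; EAX = running sum
.next:
        add eax, [rbx + rcx*4]  ; base + index * 4 bytes per element
        inc rcx
        cmp rcx, 4              ; processed all four elements?
        jne .next

        mov edi, eax            ; exit status = 100
        mov eax, 60             ; sys_exit
        syscall
```

The scale factor must be 1, 2, 4, or 8, which matches the common element sizes.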

Pro-Tip: The "Zero Extension" Quirk

In x86-64, if you write to a 32-bit register, the CPU automatically zeros the upper 32 bits.

  • mov rax, 0xFFFFFFFFFFFFFFFF (RAX is all 1s)
  • mov eax, 0xFFFFFFFF (Now, RAX is 0x00000000FFFFFFFF)

This does not happen with 8-bit or 16-bit writes (they preserve the upper bits).
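
The difference between the write sizes can be traced step by step. This sketch assumes a Linux target; the comments show the full value of RAX after each instruction.

```nasm
; Sketch only. Assumed build: nasm -f elf64 zext.asm && ld -o zext zext.o
section .text
        global _start
_start:
        mov rax, -1             ; RAX = 0xFFFFFFFFFFFFFFFF
        mov ax, 0               ; 16-bit write: RAX = 0xFFFFFFFFFFFF0000
        mov al, 0x12            ; 8-bit write:  RAX = 0xFFFFFFFFFFFF0012
        mov eax, eax            ; 32-bit write: RAX = 0x00000000FFFF0012

        mov edi, eax            ; exit status is the low 8 bits: 0x12 = 18
        mov eax, 60             ; sys_exit
        syscall
```

Even the seemingly pointless mov eax, eax clears the upper half, because any 32-bit destination write triggers the zero extension.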

1. The Two Main Flavors: Intel vs. AT&T

Note: mov %rdx, %r14 is NOT the same as mov rdx, r14. It is the reverse.

  • Intel Syntax (Windows/NASM): Instruction Destination, Source (Think: Dest = Source)
  • AT&T Syntax (Linux/GAS): Instruction Source, Destination (Think: Source -> Destination)

Quick Comparison Matrix

| Feature | Intel Syntax (NASM) | AT&T Syntax (GAS) | Note |
|---|---|---|---|
| Operand Order | mov dest, src | mov src, dest | CRITICAL DIFFERENCE |
| Registers | rax, rbx | %rax, %rbx | AT&T uses % prefix. |
| Constants | 5, 0x10 | $5, $0x10 | AT&T uses $ for immediates. |
| Memory | [rbp - 4] | -4(%rbp) | AT&T uses () and displacements outside. |
| Operand Size | Explicit keyword: dword (NASM), dword ptr (MASM) | Mnemonic suffix: movl, movq | q=64, l=32, w=16, b=8. |

2. Example Snippets

Here is a translation of the sample code from AT&T syntax into Intel syntax, with the meaning of each instruction.

The Register Move

AT&T: mov %rdx, %r14

  • Intel equivalent: mov r14, rdx
  • Meaning: Copy the value from register RDX into register R14.

The Code Block Analysis

1. mov (%r14), %eax

  • Intel: mov eax, [r14]
  • Meaning: Go to the memory address stored in R14. Read 32 bits (because destination is EAX, which is 32-bit). Store that value in EAX.

2. movl $0x6, 0x30(%rsp)

  • Intel: mov dword [rsp + 0x30], 6 (MASM and most disassemblers print dword ptr; NASM uses plain dword)
  • Meaning: Take the number 6 (immediate). Store it into the memory address located at RSP plus 0x30 bytes (48 decimal).
  • Note: movl indicates a 32-bit move (Long).

3. mov %eax, 0x20(%rsp)

  • Intel: mov [rsp + 0x20], eax
  • Meaning: Take the value currently in EAX. Write it to the Stack Pointer address plus offset 0x20.

4. movl $0x6, 0x50(%rsp)

  • Intel: mov dword [rsp + 0x50], 6
  • Meaning: Store the number 6 into stack memory at offset 0x50.

5. lea 0x32c0145(%rip), %rdx

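  • Intel: lea rdx, [rip + 0x32c0145] (as disassemblers print it; in NASM source you would write lea rdx, [rel some_label] and let the assembler compute the displacement)
  • Meaning: Calculate the address: (address of the next instruction + 0x32c0145). Store that address in RDX.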
  • Context: This is RIP-Relative addressing, used heavily in 64-bit code to access global variables regardless of where the program is loaded in memory.

6. lea 0x20(%rsp), %rcx

  • Intel: lea rcx, [rsp + 0x20]
  • Meaning: Calculate the address RSP + 0x20. Store that address in RCX. It does not read memory; it just does math on the pointer.

3. AT&T Addressing Modes (The "Full Cover")

The previous guide covered the logic (Base + Index), but the syntax works differently.

In AT&T syntax, the general format for memory access is: Displacement(Base, Index, Scale)

This calculates the address: Base + (Index * Scale) + Displacement.

Examples of Variations

1. Indirect (Just the pointer)

  • AT&T: (%rax)
  • Intel: [rax]
  • Logic: Access memory at address in RAX.

2. Displacement (Pointer + Offset)

  • AT&T: 8(%rax)
  • Intel: [rax + 8]
  • Logic: Access memory at RAX + 8.

3. Indexed (Base + Index)

  • AT&T: (%rax, %rcx)
  • Intel: [rax + rcx]
  • Logic: Access memory at RAX + RCX.

4. Scaled Index (Array Access)

  • AT&T: (%rax, %rcx, 4)
  • Intel: [rax + rcx*4]
  • Logic: Access memory at RAX + (RCX * 4). (Useful for 32-bit integer arrays).

5. The "Full House" (Complex)

  • AT&T: -8(%rbp, %rdx, 8)
  • Intel: [rbp + rdx*8 - 8]
  • Logic: Base RBP, add RDX * 8, then subtract 8.

4. Instruction Suffixes (Size Matters)

This explains snippets like movl. In AT&T syntax, when no register operand reveals the operand size (for example, when storing an immediate to memory), the assembler needs a hint about how much data to move. It takes that hint from a letter appended to the instruction mnemonic.

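| Suffix | Size | Name | Intel Equivalent |
|---|---|---|---|
| b | 8 bits | Byte | byte (MASM: byte ptr) |
| w | 16 bits | Word | word (MASM: word ptr) |
| l | 32 bits | Long | dword (MASM: dword ptr) |
| q | 64 bits | Quad | qword (MASM: qword ptr) |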

Example:

  • movl $1, (%rax) means "Write the 32-bit number 1 to the address in RAX."
  • movq $1, (%rax) means "Write the 64-bit number 1 to the address in RAX."
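
For comparison, here is how the same two stores look in NASM, which spells the size as a plain keyword (dword, qword) with no ptr. The sketch assumes a Linux target; buf is a made-up 8-byte variable preset to all 1 bits so the effect of each store is visible.

```nasm
; Sketch only. Assumed build: nasm -f elf64 size.asm && ld -o size size.o
section .data
buf:    dq -1                   ; 0xFFFFFFFFFFFFFFFF

section .text
        global _start
_start:
        lea rax, [rel buf]
        mov dword [rax], 1      ; like movl $1, (%rax): buf = 0xFFFFFFFF00000001
        mov qword [rax], 1      ; like movq $1, (%rax): buf = 0x0000000000000001
        ; mov [rax], 1          ; rejected by NASM: operation size not specified

        mov rdi, [rax]          ; no keyword needed: RDI implies 64 bits
        mov eax, 60             ; sys_exit
        syscall                 ; exit status 1
```

The commented-out line shows the case both syntaxes must resolve: an immediate and a memory operand alone say nothing about the width of the store.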