A map of the codebase for learning how PostgreSQL works from the inside out.
Client
↓ (TCP/Unix socket)
postmaster — forks a new backend per connection
↓
backend startup — auth, GUC init, shared memory attach
↓
tcop/postgres.c — ReadCommand() loop: read wire protocol message
↓
parser/ — raw lex/parse → raw parse tree
↓
parser/analyze.c — semantic analysis → Query node
↓
rewrite/ — apply rules → list of Query nodes
↓
optimizer/ — Query → PlannedStmt (cheapest Path → Plan tree)
↓
executor/ — execute Plan tree, produce tuples
↓
tcop/dest.c — send result rows back to client (libpq)
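The staged handoff above can be sketched as a toy model — not PostgreSQL code, just four functions mirroring the shape of `exec_simple_query()`, where each stage consumes the previous stage's output. All `Toy*` names are invented for illustration:

```c
/* Toy model of the query pipeline: parse -> analyze -> plan -> execute,
 * each stage consuming the previous stage's output struct. */
#include <assert.h>

typedef struct { const char *sql; }            ToyRawStmt;     /* parser output   */
typedef struct { ToyRawStmt raw; int ncols; }  ToyQuery;       /* analyzer output */
typedef struct { ToyQuery q; double cost; }    ToyPlannedStmt; /* planner output  */

static ToyRawStmt toy_parse(const char *sql)
{
    ToyRawStmt r = { sql };
    return r;
}

static ToyQuery toy_analyze(ToyRawStmt r)
{
    ToyQuery q = { r, 1 };
    for (const char *p = r.sql; *p; p++)
        if (*p == ',')
            q.ncols++;          /* crude stand-in for target-list analysis */
    return q;
}

static ToyPlannedStmt toy_plan(ToyQuery q)
{
    ToyPlannedStmt p = { q, 1.0 };
    return p;
}

static int toy_execute(ToyPlannedStmt p)
{
    return p.q.ncols;           /* "result width" stands in for real rows */
}

/* End-to-end: SQL text in, result out -- one pass through all four stages. */
int toy_run(const char *sql)
{
    return toy_execute(toy_plan(toy_analyze(toy_parse(sql))));
}
```

The point is only the data flow: each arrow in the diagram is a function whose output type is the next stage's input type.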
The master daemon. It listens for connections, forks a backend for each one, and manages the auxiliary processes (autovacuum launcher, WAL writer, bgwriter, checkpointer, etc.). It deliberately avoids depending on shared-memory contents, so when a backend crashes it can survive, reinitialize shared memory, and restart all backends.
Key files:
- `postmaster.c` — main loop, fork-on-connect, signal handling
- `autovacuum.c` — autovacuum launcher and worker logic
- `bgwriter.c`, `checkpointer.c`, `walwriter.c` — background I/O workers
Key structs:
- `PGPROC` (`storage/proc.h`) — one per backend in shared memory; holds XID, lock wait state, latch
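The fork-per-connection pattern can be sketched with plain POSIX sockets. This is a generic sketch, not postmaster.c's actual logic (which also handles authentication timeouts, signals, and process bookkeeping):

```c
/* Fork-per-connection server sketch: the parent only accepts; each client
 * gets a dedicated child process, as the postmaster forks one backend per
 * connection. */
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Create a listening TCP socket on localhost (port 0 = kernel-chosen). */
int listen_socket(int port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };

    if (fd < 0)
        return -1;
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons((unsigned short) port);
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        return -1;
    if (listen(fd, 8) < 0)
        return -1;
    return fd;
}

/* Accept loop: the parent never talks to clients itself. */
void serve(int lfd)
{
    for (;;)
    {
        int cfd = accept(lfd, NULL, NULL);

        if (cfd < 0)
            continue;
        if (fork() == 0)        /* child = "backend" */
        {
            close(lfd);         /* a backend doesn't accept new clients */
            /* ... authenticate, then run the message loop ... */
            close(cfd);
            _exit(0);
        }
        close(cfd);             /* parent keeps only the listen socket */
    }
}
```

Key detail mirrored from the real design: after `fork()`, the child closes the listen socket and the parent closes the client socket, so each process holds exactly the descriptors it needs.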
The per-backend main loop. Reads messages from the client over the wire protocol, dispatches to parser → optimizer → executor, and writes results back. "tcop" stands for traffic cop.
Key files:
- `postgres.c` — `PostgresMain()` message loop; `exec_simple_query()` ties together parse/plan/execute
- `pquery.c` — `PortalRun()`, executing a portal (cursor abstraction)
- `utility.c` — handles non-optimizable statements (DDL, COPY, VACUUM, etc.)
- `dest.c` — output destination abstraction; swappable sinks (client, `/dev/null`, file)
Key structs:
- `Portal` (`utils/portal.h`) — represents an open cursor or active query; holds the PlannedStmt + executor state
- `DestReceiver` — vtable for tuple output (print to client, store to tuplestore, etc.)
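The `DestReceiver` idea is a classic C vtable. A minimal sketch (the real struct in `tcop/dest.h` has more callbacks, e.g. `rStartup`; all `Toy*` names here are invented):

```c
/* DestReceiver-style vtable: the executor pushes tuples through function
 * pointers and never knows which concrete sink it is feeding. */
#include <assert.h>

typedef struct ToyDestReceiver
{
    /* mimics receiveSlot: return 0 to tell the executor to stop */
    int  (*receive)(struct ToyDestReceiver *self, int tuple);
    void (*shutdown)(struct ToyDestReceiver *self);
    int   count;                /* per-receiver state */
} ToyDestReceiver;

/* One concrete sink: discard tuples but count them (like DestNone). */
int devnull_receive(ToyDestReceiver *self, int tuple)
{
    (void) tuple;
    self->count++;
    return 1;
}

void noop_shutdown(ToyDestReceiver *self)
{
    (void) self;
}

/* The "executor" sees only the vtable. */
int run_query(ToyDestReceiver *dest, const int *tuples, int n)
{
    for (int i = 0; i < n; i++)
        if (!dest->receive(dest, tuples[i]))
            break;
    dest->shutdown(dest);
    return dest->count;
}

/* Demo: three tuples through a fresh /dev/null-style sink. */
int demo_count_three(void)
{
    ToyDestReceiver d = { devnull_receive, noop_shutdown, 0 };
    int rows[3] = { 10, 20, 30 };

    return run_query(&d, rows, 3);
}
```

Swapping the sink (client wire protocol, tuplestore, discard) means swapping the function pointers, not touching the executor.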
Converts SQL text into an internal `Query` node. Two phases:
- Raw parse (`gram.y` + `scan.l`) → raw parse tree with `RawStmt` nodes
- Analyze (`analyze.c`) → resolves names, type-checks, builds `Query`
Key files:
- `scan.l` — flex lexer (tokenizes SQL text)
- `gram.y` — bison grammar (800+ rules); outputs a raw parse tree
- `analyze.c` — top-level semantic analysis, calls out to `parse_*.c` helpers
- `parse_expr.c`, `parse_clause.c`, `parse_coerce.c`, `parse_agg.c` — expression/clause analysis
Key structs (all in `src/include/nodes/parsenodes.h`):
- `RawStmt` — wrapper around a raw parse tree node with source location
- `Query` — the main output; holds `commandType`, `rtable` (range table), `jointree`, `targetList`, `havingQual`, etc.
- `RangeTblEntry` (RTE) — one entry per table/subquery/function in the FROM clause
- `TargetEntry` — one column in the SELECT list (wraps an `Expr`, has `resno`, `resname`)
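To make the raw-parse phase concrete, here is a drastically simplified tokenizer in the spirit of `scan.l`. The real lexer also handles quoting, comments, multi-character operators, and keyword lookup; this sketch only splits text into word/number/punctuation tokens:

```c
/* Toy SQL tokenizer: classify each token as 'w' (word), 'n' (number),
 * or 'p' (punctuation). A stand-in for the flex lexer's token stream. */
#include <assert.h>
#include <ctype.h>

int toy_lex(const char *sql, char *kinds, int max)
{
    int n = 0;
    const char *p = sql;

    while (*p && n < max)
    {
        if (isspace((unsigned char) *p))
        {
            p++;
        }
        else if (isalpha((unsigned char) *p) || *p == '_')
        {
            while (isalnum((unsigned char) *p) || *p == '_')
                p++;
            kinds[n++] = 'w';       /* keyword or identifier */
        }
        else if (isdigit((unsigned char) *p))
        {
            while (isdigit((unsigned char) *p))
                p++;
            kinds[n++] = 'n';       /* numeric literal */
        }
        else
        {
            p++;
            kinds[n++] = 'p';       /* comma, paren, operator char, ... */
        }
    }
    return n;
}

/* Demo: "SELECT id, 42 FROM t" -> w w p n w w = 6 tokens. */
int demo_token_count(void)
{
    char kinds[16];

    return toy_lex("SELECT id, 42 FROM t", kinds, 16);
}
```

The grammar in `gram.y` then assembles this token stream into `RawStmt` trees; only `analyze.c` later decides which words are tables, columns, or functions.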
Applies stored rewrite rules (CREATE RULE) to a Query, potentially producing multiple Queries. Also handles view expansion (views are rules).
Key files:
- `rewriteHandler.c` — `QueryRewrite()` entry point
- `rewriteManip.c` — tree manipulation helpers
Key structs:
- `RewriteRule` / `RuleLock` (`rewrite/prs2lock.h`) — cached rule definitions on a relation
Takes a Query and returns a PlannedStmt. Organized into subdirectories:
- `prep/` — preprocessing: sublink flattening, subquery pull-up, partition pruning
- `path/` — builds `Path` trees (all possible access/join strategies); cost estimation lives here
- `plan/` — converts the winning `Path` tree into a `Plan` tree
- `geqo/` — genetic algorithm for join ordering when there are many tables
- `util/` — `RelOptInfo`, `PathKey`, restriction/join clause utilities
The planner builds a `RelOptInfo` for each base relation and each candidate join. For each `RelOptInfo` it generates all plausible `Path`s, estimates their cost, selects the cheapest, and recurses. See `src/backend/optimizer/README` for the full algorithm.
Key files:
- `optimizer/plan/planner.c` — `planner()` entry point
- `optimizer/path/allpaths.c` — `make_one_rel()`, generates all access paths
- `optimizer/path/costsize.c` — all cost formulas (`cost_seqscan`, `cost_index`, `cost_hashjoin`, …)
- `optimizer/path/joinpath.c` — generates NestLoop, MergeJoin, HashJoin paths
- `optimizer/plan/createplan.c` — `create_plan()` turns the winning Path into Plan nodes
Key structs (`src/include/nodes/pathnodes.h`, `plannodes.h`):
- `PlannerInfo` — planner's working state for a single query level (holds `simple_rel_array`, `join_rel_list`, etc.)
- `RelOptInfo` — represents a base relation or join rel during planning; has `pathlist`, `cheapest_total_path`, `rows` estimate
- `Path` — base type; subtypes: `IndexPath`, `BitmapHeapPath`, `NestPath`, `MergePath`, `HashPath`, `AppendPath`
- `PlannedStmt` — final output: has `planTree` (root Plan node), `rtable`, `resultRelations`
- Plan node types (`plannodes.h`): `SeqScan`, `IndexScan`, `BitmapHeapScan`, `NestLoop`, `MergeJoin`, `HashJoin`, `Hash`, `Agg`, `Sort`, `Limit`, `ModifyTable`, etc.
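The "estimate every path, keep the cheapest" loop can be illustrated with back-of-envelope versions of the seqscan and index-scan cost formulas. The constants below are the real GUC defaults (`seq_page_cost`, `random_page_cost`, `cpu_tuple_cost`), but the formulas are heavily simplified — the real ones in `costsize.c` also model per-operator CPU cost, index pages, correlation, and caching:

```c
/* Toy cost model: compare a sequential scan against an index scan that
 * does one random heap fetch per matching tuple. */
#include <assert.h>

static const double seq_page_cost    = 1.0;    /* default GUC values */
static const double random_page_cost = 4.0;
static const double cpu_tuple_cost   = 0.01;

double toy_cost_seqscan(double pages, double tuples)
{
    return seq_page_cost * pages + cpu_tuple_cost * tuples;
}

double toy_cost_indexscan(double tuples, double selectivity)
{
    double matches = tuples * selectivity;

    return random_page_cost * matches + cpu_tuple_cost * matches;
}

/* Keep the cheaper path, like add_path() keeping cheapest_total_path. */
int index_wins(double pages, double tuples, double selectivity)
{
    return toy_cost_indexscan(tuples, selectivity)
         < toy_cost_seqscan(pages, tuples);
}
```

Even this crude model shows the crossover: a highly selective predicate makes the index path cheap, while reading most of the table makes random fetches lose to one sequential pass.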
Walks the Plan tree using a demand-pull (volcano/iterator) model. Each plan node has `ExecInitFoo()` / `ExecFoo()` / `ExecEndFoo()` functions. Calling `ExecProcNode()` on the root returns one tuple at a time.
Key files:
- `execMain.c` — `ExecutorStart()`, `ExecutorRun()`, `ExecutorFinish()`, `ExecutorEnd()`
- `execProcnode.c` — dispatch table: `ExecInitNode()`, `ExecProcNode()`, `ExecEndNode()`
- `execExpr.c` / `execExprInterp.c` — expression evaluation engine (JIT-compilable)
- `nodeSeqscan.c`, `nodeIndexscan.c`, `nodeNestloop.c`, `nodeHashjoin.c`, `nodeAgg.c`, `nodeSort.c` — one file per plan node type
Key structs (src/include/nodes/execnodes.h):
- `EState` — executor state for the whole query: snapshot, result relations, tuple table
- `PlanState` — mirrors each Plan node; subtypes: `SeqScanState`, `IndexScanState`, `NestLoopState`, `AggState`, etc.
- `ExprContext` — holds the current tuple(s) being evaluated (`ecxt_scantuple`, `ecxt_innertuple`, `ecxt_outertuple`)
- `ExprState` — compiled expression: array of `ExprEvalStep` instructions interpreted by `ExecInterpExpr()`
- `TupleTableSlot` — in-memory tuple representation; may be heap, minimal, or virtual
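The demand-pull model fits in a few lines of C. This toy executor (invented `Toy*` names, one callback instead of the real init/exec/end triple) runs `SELECT ... LIMIT 2` over a 5-row in-memory "table":

```c
/* Volcano-style executor: each node's next() returns one tuple, or -1 for
 * end-of-stream, as ExecProcNode() returns one TupleTableSlot at a time. */
#include <assert.h>

typedef struct ToyNode
{
    int (*next)(struct ToyNode *self);   /* pull one tuple */
    struct ToyNode *child;
    const int *data;                     /* SeqScan: the "table" */
    int pos;                             /* scan cursor / tuples emitted */
    int nitems;                          /* table size / LIMIT count */
} ToyNode;

/* Leaf node: sequential scan over an array. */
int seqscan_next(ToyNode *self)
{
    if (self->pos >= self->nitems)
        return -1;
    return self->data[self->pos++];
}

/* Inner node: Limit pulls from its child until its quota is used up. */
int limit_next(ToyNode *self)
{
    if (self->pos >= self->nitems)       /* nitems doubles as the limit */
        return -1;
    self->pos++;
    return self->child->next(self->child);
}

/* Demo: LIMIT 2 over {1,2,3,4,5}; the portal-style loop sums the output. */
int demo_limit_sum(void)
{
    static const int table[5] = { 1, 2, 3, 4, 5 };
    ToyNode scan  = { seqscan_next, 0, table, 0, 5 };
    ToyNode limit = { limit_next, &scan, 0, 0, 2 };
    int t, sum = 0;

    while ((t = limit.next(&limit)) != -1)
        sum += t;
    return sum;
}
```

Note that rows 3–5 are never produced at all: the Limit node simply stops pulling, which is the whole point of demand-pull execution.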
The physical format of table rows on disk. Each row is a HeapTuple on a page. MVCC is implemented here via xmin/xmax fields in every tuple header.
- `heapam.c` — `heap_insert()`, `heap_update()`, `heap_delete()`, `heap_getnext()` (sequential scan)
- `hio.c` — finding free space and writing tuples to pages
- `heapam_visibility.c` — `HeapTupleSatisfiesVisibility()` and friends (MVCC visibility checks)
Key structs:
- `HeapTupleHeaderData` (`access/htup_details.h`) — on-disk tuple header: `t_xmin`, `t_xmax`, `t_infomask`, `t_hoff` (header length), null bitmap
- `HeapTupleData` — in-memory wrapper: pointer to `HeapTupleHeaderData` + `t_self` (`ItemPointerData` = block + offset)
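The xmin/xmax visibility rule can be shown in miniature. This is a deliberately simplified sketch of the MVCC check: it assumes every XID below the snapshot's `xmin` committed, whereas the real code in `heapam_visibility.c` also consults the commit log and infomask hint bits to handle aborted transactions:

```c
/* Toy MVCC visibility: a tuple is visible if its inserting transaction is
 * "committed for this snapshot" and its deleting transaction (if any)
 * is not. */
#include <assert.h>

typedef unsigned int Xid;

typedef struct { Xid xmin, xmax; } ToyTuple;   /* xmax == 0: never deleted */

typedef struct
{
    Xid xmin, xmax;            /* XIDs < xmin finished; >= xmax not started */
    const Xid *xip;            /* in-progress XIDs in between */
    int nxip;
} ToySnapshot;

static int xid_invisible(const ToySnapshot *s, Xid x)
{
    if (x >= s->xmax)
        return 1;              /* started after the snapshot was taken */
    for (int i = 0; i < s->nxip; i++)
        if (s->xip[i] == x)
            return 1;          /* still running when snapshot was taken */
    return 0;
}

int tuple_visible(const ToyTuple *t, const ToySnapshot *s)
{
    if (xid_invisible(s, t->xmin))
        return 0;              /* insert not visible to us */
    if (t->xmax == 0)
        return 1;              /* never deleted */
    return xid_invisible(s, t->xmax);   /* delete not visible => row lives */
}

/* Demo: inserted by xid 90, deleted by still-in-progress xid 105
 * -> the deletion is invisible to us, so the row is still visible. */
int demo_visible(void)
{
    static const Xid xip[1] = { 105 };
    ToySnapshot s = { 100, 110, xip, 1 };
    ToyTuple t = { 90, 105 };

    return tuple_visible(&t, &s);
}
```

This is why readers need no locks: a concurrent delete just sets `xmax`, and every snapshot independently decides whether that delete exists yet.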
Lehman-Yao concurrent B-tree. Supports equality and range queries. Every index page has a "high key" and a right-link pointer enabling safe concurrent splits.
- `nbtree.c` — AM interface (`btinsert`, `btgettuple`, `btbuild`, …)
- `nbtsearch.c` — descend the tree to find a leaf, scan forward/backward
- `nbtpage.c` — page-level operations
- `nbtinsert.c` — insert and page-split logic
Key structs:
- `BTPageOpaqueData` — extra data at the end of each B-tree page: `btpo_prev`, `btpo_next`, `btpo_level`, `btpo_flags`
- `BTScanOpaqueData` — scan state: current page/offset, scan keys
For full-text search and array/jsonb containment. Stores (key → posting list) pairs.
Framework for geometric and other custom index types (PostGIS uses this).
Stores min/max (or other summaries) per range of heap pages. Very compact; good for naturally ordered large tables.
Equality-only index based on linear hashing.
For space-partitioning structures (quad-trees, k-d trees, radix tries).
The heart of MVCC. Manages transaction IDs, commit log, snapshots, WAL logging.
- `xact.c` — `StartTransaction()`, `CommitTransaction()`, `AbortTransaction()`
- `clog.c` — commit log (`pg_xact`): tracks commit/abort status of every XID as 2 bits
- `subtrans.c` — subtransaction parent tracking (`pg_subtrans`)
- `multixact.c` — `MultiXactId` for row-level locking by multiple transactions
- `xlog.c` — WAL write path; `XLogInsert()`, `XLogFlush()`
- `xlogrecovery.c` — WAL replay on startup / standby
Key structs:
- `TransactionId` (uint32) — XID; wraps around after ~4 billion
- `SnapshotData` (`utils/snapshot.h`) — `xmin`, `xmax`, `xip[]` (in-progress XIDs): defines what a transaction can see
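XID wraparound means "older than" cannot be a plain `<` comparison. The trick used by `TransactionIdPrecedes()` is modular signed arithmetic, shown here in a standalone form:

```c
/* Circular XID comparison: decide ordering by the signed 32-bit distance,
 * which works as long as live XIDs span less than 2^31 -- the reason
 * anti-wraparound vacuum must keep the XID horizon moving forward. */
#include <assert.h>
#include <stdint.h>

int xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}
```

With this rule, an XID just below the 2^32 wraparound point correctly precedes a small XID just after it.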
Abstraction layer introduced in PG12 so pluggable storage engines can implement the table interface. The heap uses this.
Sequence object implementation.
The buffer pool: shared memory cache of disk pages. All access to table/index data goes through this.
- `bufmgr.c` — `ReadBuffer()`, `ReleaseBuffer()`, `MarkBufferDirty()`, `LockBuffer()`
- `buf_table.c` — hash table mapping (RelFileLocator, BlockNumber) → buffer slot
- `freelist.c` — clock-sweep buffer replacement algorithm
Key structs:
- `BufferDesc` (`storage/buf_internals.h`) — one per buffer slot: tag (which page), state (pin count, flags), content lock
- `Buffer` (int32) — index into the buffer array; positive = shared, negative = local, `InvalidBuffer` = 0
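Clock-sweep victim selection is simple enough to sketch directly. This is a simplification of `freelist.c`: the real code packs usage count and pin count into an atomic state word and saturates the usage count at 5:

```c
/* Clock sweep: the hand circles the buffer ring, skipping pinned buffers
 * and decrementing usage counts ("second chance"); the first unpinned
 * buffer found with usage 0 becomes the eviction victim. */
#include <assert.h>

typedef struct
{
    int usage;      /* bumped on access, decayed by the sweep */
    int pinned;     /* pinned buffers are never evicted */
} ToyBufferDesc;

int clock_sweep(ToyBufferDesc *buf, int nbuf, int *hand)
{
    for (;;)
    {
        int idx = *hand;

        *hand = (*hand + 1) % nbuf;
        if (buf[idx].pinned)
            continue;
        if (buf[idx].usage > 0)
        {
            buf[idx].usage--;           /* second chance */
            continue;
        }
        return idx;                     /* victim found */
    }
}

/* Demo: buffer 0 gets a second chance, buffer 1 is pinned,
 * buffer 2 is unpinned with usage 0 -> it is evicted. */
int demo_victim(void)
{
    ToyBufferDesc buf[4] = { {1, 0}, {3, 1}, {0, 0}, {2, 0} };
    int hand = 0;

    return clock_sweep(buf, 4, &hand);
}
```

The caller must guarantee some buffer is evictable, otherwise this loop spins; the real code errors out if every buffer is pinned.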
Thin abstraction over the OS filesystem. Currently only one implementation (md.c = "magnetic disk").
- `smgr.c` — dispatch layer
- `md.c` — actual file open/read/write/sync; manages relation "forks" (main, FSM, VM)
Four lock levels: spinlocks, LWLocks, regular locks, predicate locks.
- `lock.c` — regular (heavyweight) lock table in shared memory; deadlock detection
- `lwlock.c` — lightweight locks; reader/writer semantics; used for short critical sections
- `lmgr.c` — relation-level and tuple-level lock helpers
- `predicate.c` — SSI (serializable snapshot isolation) predicate locks
- `proc.c` — `PGPROC` management; lock wait queues
Key structs:
- `LOCK` — one per distinct locked object in the shared lock table
- `PROCLOCK` — connects a `PGPROC` to a `LOCK`; tracks held/requested modes
- `LWLock` — lightweight lock: `state` word encodes exclusive/shared holders and waiters
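Heavyweight lock compatibility is table-driven: each mode carries a bitmask of the modes it conflicts with. A sketch covering four of the eight real table-lock modes (conflict pairs taken from the documented lock compatibility matrix):

```c
/* Lock-mode conflict check in the style of lock.c's conflict table:
 * modes conflict iff the requested mode's mask has the held mode's bit. */
#include <assert.h>

enum { ACCESS_SHARE, ROW_EXCLUSIVE, SHARE, ACCESS_EXCLUSIVE, NMODES };

static const int conflict_tab[NMODES] = {
    /* ACCESS_SHARE     */ (1 << ACCESS_EXCLUSIVE),
    /* ROW_EXCLUSIVE    */ (1 << SHARE) | (1 << ACCESS_EXCLUSIVE),
    /* SHARE            */ (1 << ROW_EXCLUSIVE) | (1 << ACCESS_EXCLUSIVE),
    /* ACCESS_EXCLUSIVE */ (1 << ACCESS_SHARE) | (1 << ROW_EXCLUSIVE) |
                           (1 << SHARE) | (1 << ACCESS_EXCLUSIVE),
};

/* Like DoLockModesConflict(); symmetric by construction of the table. */
int modes_conflict(int held, int requested)
{
    return (conflict_tab[requested] & (1 << held)) != 0;
}
```

So a plain SELECT (ACCESS SHARE) coexists with INSERT/UPDATE (ROW EXCLUSIVE), but nothing coexists with ACCESS EXCLUSIVE — which is why DDL blocks everything.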
Every heap and index page is 8 KB (default BLCKSZ). The page has a header, an item pointer array growing forward, free space in the middle, and tuple data growing backward.
Key structs:
- `PageHeaderData` (`storage/bufpage.h`) — `pd_lsn` (WAL position), `pd_lower`/`pd_upper` (free space boundaries), `pd_special` (index-specific area offset), `pd_pagesize_version`, `pd_checksum`
- `ItemIdData` — 4-byte item pointer on the page: offset, length, flags (normal/redirect/dead/unused)
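The `pd_lower`/`pd_upper` bookkeeping is the classic slotted-page design. A sketch of the arithmetic (header size and alignment are simplified; the real `PageAddItem()` also handles alignment padding and line-pointer reuse):

```c
/* Slotted page: line pointers grow forward from pd_lower, tuple data grows
 * backward from pd_upper; the page is full when they would cross. */
#include <assert.h>

#define BLCKSZ    8192
#define ITEMID_SZ 4            /* one line pointer */
#define HEADER_SZ 24           /* illustrative page header size */

typedef struct { int pd_lower, pd_upper; } ToyPage;

void page_init(ToyPage *p)
{
    p->pd_lower = HEADER_SZ;
    p->pd_upper = BLCKSZ;
}

/* Returns the 1-based item offset number, or 0 if the tuple doesn't fit. */
int page_add_item(ToyPage *p, int tuple_len)
{
    if (p->pd_upper - tuple_len < p->pd_lower + ITEMID_SZ)
        return 0;                       /* boundaries would cross */
    p->pd_lower += ITEMID_SZ;           /* new line pointer at the front */
    p->pd_upper -= tuple_len;           /* tuple data at the back */
    return (p->pd_lower - HEADER_SZ) / ITEMID_SZ;
}

int page_free_space(const ToyPage *p)
{
    return p->pd_upper - p->pd_lower;
}

/* Demo: two 100-byte tuples cost 2 * (100 + 4) bytes of free space. */
int demo_free_after_two(void)
{
    ToyPage p;

    page_init(&p);
    page_add_item(&p, 100);
    page_add_item(&p, 100);
    return page_free_space(&p);
}
```

The indirection through line pointers is what lets VACUUM compact tuple data without changing the (block, offset-number) TIDs that indexes store.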
A separate file fork per relation storing a free-space estimate for each heap page (one byte per page, in units of 1/256 of the page size). Used by the heap insert path to find pages with room.
Shared memory setup, semaphores, signal handling, PGPROC latch (condition variable for waiting processes).
Infrastructure for async read/write, enabling prefetching and parallel I/O.
The system catalogs are regular tables (pg_class, pg_attribute, pg_type, pg_proc, etc.) stored in the database. This directory has the C code that creates and manipulates them.
- `pg_class.c`, `pg_attribute.c`, `pg_type.c`, `pg_proc.c` — manipulation routines for each catalog
- `heap.c` — `heap_create_with_catalog()`: creating a new table updates many catalogs atomically
- `index.c` — index creation
- `dependency.c` — dependency tracking (DROP CASCADE logic)
- `namespace.c` — schema/namespace search path resolution
Key structs:
- `Form_pg_class` — C struct mapping a `pg_class` row (table/index/view metadata)
- `Form_pg_attribute` — column descriptor from `pg_attribute`
- `Form_pg_type` — type descriptor from `pg_type`
DDL and utility commands: CREATE TABLE, ALTER TABLE, COPY, VACUUM, ANALYZE, CLUSTER, etc.
Key files:
- `tablecmds.c` — CREATE/ALTER/DROP TABLE (very large file)
- `vacuum.c` — VACUUM entry point; the heap-specific pass (dead tuple removal, FSM update, visibility map update) lives in `vacuumlazy.c`, now under `access/heap/`
- `analyze.c` — ANALYZE; sampling and statistics collection into `pg_statistic`
- `copy.c` — COPY TO/FROM; bulk data load/export
- `indexcmds.c` — CREATE INDEX, REINDEX
- `explain.c` — EXPLAIN output formatting
- `sequence.c` — CREATE SEQUENCE, nextval/currval/setval
- `trigger.c` — trigger management and firing
- `event_trigger.c` — DDL event triggers
All parse/plan/executor tree nodes inherit from Node (a struct whose first field is NodeTag). This directory provides generic tree operations.
- `nodes.c` — `newNode()`, `nodeTag()`
- `copyfuncs.c` — `copyObject()`: deep copy any node tree
- `equalfuncs.c` — `equal()`: structural equality of two node trees
- `outfuncs.c` — `nodeToString()`: serialize a node to text (used for debugging and for storing rules/views in the catalogs)
- `readfuncs.c` — `stringToNode()`: deserialize (used when reading stored rules/views back)
- `list.c` — the ubiquitous `List` type (an expandable array since PG 13, with a list-style API)
Key structs:
- `Node` — base: `{ NodeTag type; }`
- `List` (`nodes/pg_list.h`) — generic list; elements can be pointers, ints, or OIDs
- `Bitmapset` (`nodes/bitmapset.h`) — compact set of non-negative integers (used for column/rel sets)
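The `Node` discipline is worth seeing in miniature: because every node struct's first field is its tag, any pointer can be cast to the base type to find out what it really is — which is exactly how `nodeTag()` and `IsA()` work. A standalone sketch with invented `Toy*` names:

```c
/* Tagged node structs: the first field identifies the concrete type,
 * enabling generic tree-walking code to dispatch on the tag. */
#include <assert.h>

typedef enum { T_ToyVar, T_ToyConst } ToyNodeTag;

typedef struct { ToyNodeTag type; } ToyNode;                  /* base */

typedef struct { ToyNodeTag type; int varattno; }   ToyVar;   /* column ref */
typedef struct { ToyNodeTag type; int constvalue; } ToyConst; /* literal */

#define TOY_TAG(nodeptr)    (((const ToyNode *) (nodeptr))->type)
#define TOY_ISA(nodeptr, t) (TOY_TAG(nodeptr) == T_##t)

/* Generic evaluator: inspect the tag, then cast to the concrete struct. */
int toy_eval(const void *node, int var_value)
{
    switch (TOY_TAG(node))
    {
        case T_ToyVar:
            return var_value;
        case T_ToyConst:
            return ((const ToyConst *) node)->constvalue;
    }
    return 0;
}

int demo_eval_const(void)
{
    ToyConst c = { T_ToyConst, 42 };

    return TOY_ISA(&c, ToyConst) ? toy_eval(&c, 7) : -1;
}
```

`copyObject()`, `equal()`, and `nodeToString()` are all large switch statements over this same tag.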
Caches frequently-accessed catalog rows to avoid repeated catalog scans.
- `catcache.c` — `SearchSysCache()`: hash cache of individual catalog tuples, invalidated by SI messages
- `relcache.c` — `RelationIdGetRelation()`: relation descriptor cache; holds `RelationData` structs
- `inval.c` — cache invalidation via a shared-memory message queue (SI = shared invalidation)
Key structs:
- `RelationData` (`utils/rel.h`) — relation descriptor: file locator, TupleDesc, index list, trigger list, locking info, rule info — the central struct for any open table or index
- `TupleDesc` (`access/tupdesc.h`) — array of `Form_pg_attribute`; describes a row's columns
PostgreSQL almost never calls malloc directly. Everything goes through memory contexts, which are arena allocators arranged in a tree; deleting a context frees all of its children too. Key context lifetimes: `TopMemoryContext` (lives forever), `MessageContext` (per client message/query), `PortalContext` (per portal). `CurrentMemoryContext` is the global pointer to whichever context `palloc()` currently allocates from.
- `aset.c` — `AllocSet`: the default allocator (free lists with power-of-two size classes, carved out of larger malloc'd blocks)
- `mcxt.c` — context tree management: `AllocSetContextCreate()`, `MemoryContextSwitchTo()`, `palloc()`, `pfree()`
Key struct:
- `MemoryContextData` (`nodes/memnodes.h`) — base struct for all context types; forms a parent/child tree
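The essential contract — allocations belong to a context, and deleting a context recursively frees its children — fits in a short sketch. This toy version tracks chunks per context but omits all of the AllocSet block/freelist machinery:

```c
/* Toy memory contexts: a parent/child tree where deleting a context frees
 * every allocation in it and in all of its descendants. */
#include <assert.h>
#include <stdlib.h>

typedef struct ToyContext
{
    struct ToyContext *parent, *first_child, *next_sibling;
    void **chunks;              /* every allocation made in this context */
    int nchunks;
} ToyContext;

ToyContext *ctx_create(ToyContext *parent)
{
    ToyContext *c = calloc(1, sizeof(ToyContext));

    c->parent = parent;
    if (parent)
    {
        c->next_sibling = parent->first_child;
        parent->first_child = c;
    }
    return c;
}

/* palloc-style: the chunk is owned by the context, not the caller. */
void *ctx_alloc(ToyContext *c, size_t sz)
{
    c->chunks = realloc(c->chunks, (c->nchunks + 1) * sizeof(void *));
    c->chunks[c->nchunks] = malloc(sz);
    return c->chunks[c->nchunks++];
}

/* MemoryContextDelete-style: children first, then own chunks; returns the
 * total number of chunks freed, for demonstration. */
int ctx_delete(ToyContext *c)
{
    int freed = 0;

    while (c->first_child)
    {
        ToyContext *child = c->first_child;

        c->first_child = child->next_sibling;
        freed += ctx_delete(child);
    }
    freed += c->nchunks;
    for (int i = 0; i < c->nchunks; i++)
        free(c->chunks[i]);
    free(c->chunks);
    free(c);
    return freed;
}

/* Demo: 2 chunks in a "query" context, 1 in its child -> one delete, 3 freed. */
int demo_freed_count(void)
{
    ToyContext *query = ctx_create(NULL);
    ToyContext *per_tuple = ctx_create(query);

    ctx_alloc(query, 64);
    ctx_alloc(query, 64);
    ctx_alloc(per_tuple, 64);
    return ctx_delete(query);
}
```

This is why error handling is simple: on abort, deleting the right context reclaims everything the failed query allocated, with no per-allocation cleanup code.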
`elog()` / `ereport()` macros build an error report in an `ErrorData` struct and `longjmp` to the nearest `PG_CATCH()` handler.
- `elog.c` — `errstart()`, `errfinish()`
- `pg_lzcompress.c` — TOAST compression
How PostgreSQL calls SQL functions from C. The V1 call convention wraps arguments in a `FunctionCallInfoBaseData` struct.
- `fmgr.c` — `fmgr_info()` (look up a function), `FunctionCallInvoke()` (call it)
Key structs:
- `FmgrInfo` — cached function lookup: function pointer, function OID, nargs
- `FunctionCallInfoBaseData` — call frame: `flinfo`, `nargs`, `args[]` (each a `NullableDatum`)
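The V1 convention's shape — arguments as (value, isnull) pairs in a call frame, invoked through a cached pointer — can be sketched standalone. Note one simplification: here the callee checks for NULL itself, whereas real strict functions never see NULLs because fmgr short-circuits the call first. All `Toy*` names are invented:

```c
/* V1-style calling convention in miniature: a call frame of NullableDatum
 * pairs, an output isnull flag, and invocation through a function pointer. */
#include <assert.h>

typedef struct { long value; int isnull; } ToyNullableDatum;

typedef struct ToyFcinfo
{
    int nargs;
    ToyNullableDatum args[2];
    int isnull;                /* set by the callee to return SQL NULL */
} ToyFcinfo;

typedef long (*ToyPGFunction)(ToyFcinfo *fcinfo);

/* An int4pl-style addition that propagates NULL (strictness inlined
 * here for illustration; real fmgr handles it before the call). */
long toy_int4pl(ToyFcinfo *fcinfo)
{
    if (fcinfo->args[0].isnull || fcinfo->args[1].isnull)
    {
        fcinfo->isnull = 1;
        return 0;
    }
    return fcinfo->args[0].value + fcinfo->args[1].value;
}

/* FunctionCallInvoke-style: build the frame, call through the pointer. */
long toy_call2(ToyPGFunction fn, long a, long b)
{
    ToyFcinfo fcinfo = { 2, { { a, 0 }, { b, 0 } }, 0 };

    return fn(&fcinfo);
}
```

The indirection is what lets the executor call built-ins, PL/pgSQL functions, and C extensions through one uniform interface.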
Implementations of all built-in types: int, float, text, bytea, date, timestamp, numeric, json, jsonb, array, range, etc. Each type provides input/output functions and operators.
- `tuplesort.c` — external merge sort used by `ORDER BY`, `CREATE INDEX`, `DISTINCT`
- `tuplestore.c` — in-memory (spill-to-disk) tuple buffer used by CTEs, window functions
- `logtape.c` — virtual tape files for merge-sort spill
- `snapmgr.c` — `GetTransactionSnapshot()`, `GetLatestSnapshot()`, snapshot export/import
- `combocid.c` — combo command IDs for tuples inserted and deleted in the same transaction
- `guc.c`, `guc_tables.c` — Grand Unified Configuration: all `postgresql.conf` parameters, `SET`, `SHOW`
- `pgstat.c` — cumulative statistics infrastructure
- `wait_event.c` — wait event definitions used in `pg_stat_activity.wait_event`
Tracks resources (buffer pins, lock holds, file handles, tuplestore refs) that must be released on error. Arranged in a tree parallel to transaction/portal structure.
- `walsender.c` — streams WAL to standbys; handles the replication protocol
- `walreceiver.c` — receives WAL on a standby; writes it to local WAL files
- `logical/` — logical decoding (change data capture): decodes WAL back into row changes
- `slot.c` — replication slots: prevent WAL removal until a consumer has caught up
Implements the PostgreSQL frontend/backend protocol (version 3). Handles authentication, SSL, message framing.
- `auth.c` — authentication methods (md5, scram, gss, cert, …)
- `pqcomm.c` — socket I/O, message send/receive
- `pqformat.c` — `pq_sendint()`, `pq_sendstring()`, etc.
LLVM-based JIT for expression evaluation and tuple deforming. Activated for long-running queries where the compilation cost is worth it.
- `jit.c` — provider interface
- `llvm/` — LLVM backend: `llvmjit.c`, `llvmjit_expr.c` (compile `ExprState` to native code)
Tracks multi-column statistics (ndistinct, functional dependencies, MCV lists) stored in pg_statistic_ext. Used by the planner to get better row count estimates when columns are correlated.
Headers are organized to mirror `src/backend/`:
| Header | What it defines |
|---|---|
| `nodes/nodes.h` | `NodeTag` enum; `makeNode()` macro |
| `nodes/parsenodes.h` | `Query`, `RangeTblEntry`, `TargetEntry`, all parse tree node types |
| `nodes/primnodes.h` | Expression nodes: `Var`, `Const`, `FuncExpr`, `OpExpr`, `SubLink`, etc. |
| `nodes/plannodes.h` | `PlannedStmt`, `Plan` and all plan node subtypes |
| `nodes/pathnodes.h` | `PlannerInfo`, `RelOptInfo`, `Path` and subtypes |
| `nodes/execnodes.h` | `EState`, `PlanState` subtypes, `ExprContext`, `TupleTableSlot` |
| `access/htup_details.h` | `HeapTupleHeaderData` (on-disk tuple header layout) |
| `access/xact.h` | Transaction state, isolation levels, `StartTransaction()` etc. |
| `storage/bufmgr.h` | `ReadBuffer()`, `Buffer` type, buffer access API |
| `storage/bufpage.h` | `PageHeaderData`, `ItemIdData`, page layout macros |
| `storage/lock.h` | `LOCK`, `PROCLOCK`, lock mode constants |
| `storage/proc.h` | `PGPROC` — per-backend shared memory struct |
| `utils/rel.h` | `RelationData` — relation descriptor |
| `utils/snapshot.h` | `SnapshotData` — MVCC snapshot |
| `utils/memutils.h` | Memory context macros (`palloc`, `pfree`, `MemoryContextSwitchTo`) |
| Struct | File | Description |
|---|---|---|
| `Query` | `nodes/parsenodes.h` | Output of semantic analysis; input to planner |
| `PlannedStmt` | `nodes/plannodes.h` | Output of planner; input to executor |
| `PlannerInfo` | `nodes/pathnodes.h` | Planner's per-query-level working state |
| `RelOptInfo` | `nodes/pathnodes.h` | A base or join relation during planning |
| `Path` | `nodes/pathnodes.h` | An access strategy under consideration |
| `EState` | `nodes/execnodes.h` | Executor's per-query state |
| `PlanState` | `nodes/execnodes.h` | Executor's per-plan-node state |
| `TupleTableSlot` | `nodes/execnodes.h` | An in-memory tuple being processed |
| `HeapTupleHeaderData` | `access/htup_details.h` | On-disk tuple header (MVCC fields) |
| `RelationData` | `utils/rel.h` | Relation descriptor (relcache entry) |
| `TupleDesc` | `access/tupdesc.h` | Column definitions for a relation or expression |
| `BufferDesc` | `storage/buf_internals.h` | Buffer pool slot descriptor |
| `PGPROC` | `storage/proc.h` | Per-backend shared memory block |
| `SnapshotData` | `utils/snapshot.h` | MVCC visibility snapshot |
| `MemoryContextData` | `nodes/memnodes.h` | Memory context (arena allocator) |
| `List` | `nodes/pg_list.h` | Generic list type |
| Topic | Start here |
|---|---|
| How a SELECT executes end-to-end | `tcop/postgres.c:exec_simple_query()` |
| How the planner picks a join order | `optimizer/README`, then `optimizer/path/allpaths.c` |
| How cost estimation works | `optimizer/path/costsize.c` |
| How MVCC visibility is checked | `access/heap/heapam_visibility.c` |
| How a page is read from disk | `storage/buffer/bufmgr.c:ReadBuffer()` |
| How a WAL write works | `access/transam/xlog.c:XLogInsert()` |
| How a B-tree index scan works | `access/nbtree/nbtsearch.c` |
| How locks prevent conflicts | `storage/lmgr/lock.c:LockAcquire()` |
| How memory is managed | `utils/mmgr/aset.c`, `utils/mmgr/mcxt.c` |
| How expressions are evaluated | `executor/execExprInterp.c` |
| How VACUUM removes dead tuples | `access/heap/vacuumlazy.c` |
| How replication works | `replication/walsender.c`, `replication/README` |
- **MVCC** — Every row stores `xmin` (inserting XID) and `xmax` (deleting XID). A snapshot's visibility check reads these against its own `xmin`/`xmax`/`xip[]`. No locks needed for reads.
- **Node trees** — Parse trees, plan trees, and expression trees are all trees of `Node`-derived structs. The `NodeTag` in the first field tells you what struct it really is. Use `IsA(node, Type)` to check.
- **Memory contexts** — Never call `free()` directly. Every allocation is in a context. When a query ends, `MemoryContextDelete(MessageContext)` frees everything at once.
- **WAL-first** — Every change to a data page is logged: the WAL record must reach disk before the modified page can be flushed, which is what makes crash recovery possible. `XLogInsert()` + `MarkBufferDirty()` is the pattern.
- **The buffer pin** — Before touching any page you must `ReadBuffer()` (pin it). The buffer manager won't evict a pinned buffer. `ReleaseBuffer()` drops the pin.
- **Resource owners** — Pin counts, lock holds, etc. are registered with a `ResourceOwner`. On `ERROR`, resource owner cleanup releases everything even if the code didn't explicitly do so.
- **Demand-pull execution** — `ExecProcNode(node)` returns one tuple. The top of the tree is called by the portal loop; it calls its children; leaves call the storage layer. No tuple is computed until needed.