AI writes well, but AI writes a lot. Readers come from different contexts, skill levels, and experience, especially in game dev. So is over-explaining and hitting the reader over the head a good thing? Depends. I've read through the sections and I think they're mostly fine; we're not writing haiku here. Communicating math/CS concepts is hard, and I don't enjoy doing it.
I'm building a Rust/Bevy reimplementation of Cataclysm: Dark Days Ahead. While baking overmap tiles into chunk textures, I hit CDDA's oversized terrain tiles — trees that don't fit the 32×32 grid. So I bake the standard ones and spawn the oversized ones as regular entities. Two rendering paths for the same logical layer.
This keeps happening. I had the same problem in Unity on other projects. Once you build a real game on top of a "tilemap system," the abstraction fragments almost immediately. I think that's because "tilemap" isn't actually one thing.
When someone says "we need a tilemap system," they're bundling at least four orthogonal concerns: a rendering batcher (sprites from an atlas onto a grid), a spatial data grid (O(1) lookup by coordinate, terrain enums, collision flags), a chunk/streaming system (memory management, dirty-flagging, serialization), and an authoring abstraction (tile palettes, paintbrush editors, rule-based autotiling).
the thesis:
What you actually have in most real games is a grid-aligned spatial convention that multiple independent systems happen to share. The grid is the common ground. "Tilemap" implies a unified thing that doesn't exist once complexity shows up.
The pitch is: "hand us your grid data, we'll render it efficiently." But "efficiently" depends on questions the system can't answer generically. How often does the data change? Are tiles uniform size? How many layers? Animated tiles? Per-tile lighting?
The answers determine whether you want a baked texture atlas, GPU-instanced meshes, a vertex buffer rebuilt per chunk, a compute shader splatting indices, or just raw sprite entities. So the renderer can't be generic over "tilemap" — it has to know intimate details about the data. Your tilemap system is actually a specific renderer for a specific kind of grid data wearing a familiar name.
this is where the concept of stacking tilemaps comes in
And it leaks immediately. Oversized tiles don't render correctly. Animated tiles flicker when the chunk rebakes. You need per-tile opacity for fog of war but the batch doesn't support it. The maintainer either adds special cases until the system is a forest of flags, or says "that's not supported, use regular sprites for those." Either way, the abstraction has a boundary — it's just never documented because acknowledging it would undermine the premise.
The engine-level TileMap component that tries to own both data and rendering is two or three systems in a trenchcoat.
maybe database developers aren't boring accountants after all
These are the same tensions databases have been solving for decades. You have a big collection of structured records that multiple consumers need to access with different patterns. The renderer wants contiguous chunks in draw order. Gameplay wants point lookups. Pathfinding wants rectangular region iteration. Serialization wants to stream by chunk.
Databases solved this by separating the logical model (tables, rows, queries) from the physical model (page layout, indexes, caching, materialized views). A tilemap system that tries to be "one thing" refuses to make that separation. It's a database where the table layout is the index is the query plan.
Database theory has the RUM conjecture: optimize for Read, Update, or Memory — pick two.
- Fast reads + low memory → expensive updates. Sorted arrays. Binary search is fast, storage is compact, inserting means shifting everything.
- Fast reads + fast updates → high memory. Hash maps with extra indexes, multiple materialized views.
- Fast updates + low memory → slow reads. Unsorted append log. Writes are trivial, finding anything means scanning.
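The corners of that triangle are easy to make concrete. Here's a sketch of two of them as toy tile stores (all names are illustrative, not from any real tilemap API; `std::collections::HashMap` covers the third corner, fast reads and fast updates at a per-entry memory cost):

```rust
/// Read-optimized + memory-compact: sorted pairs, binary search, O(n) insert.
struct SortedStore(Vec<(u32, u16)>); // (cell index, tile id)

impl SortedStore {
    fn get(&self, cell: u32) -> Option<u16> {
        self.0
            .binary_search_by_key(&cell, |&(c, _)| c)
            .ok()
            .map(|i| self.0[i].1)
    }
    fn set(&mut self, cell: u32, tile: u16) {
        match self.0.binary_search_by_key(&cell, |&(c, _)| c) {
            Ok(i) => self.0[i].1 = tile,
            Err(i) => self.0.insert(i, (cell, tile)), // shifts the tail: the update cost
        }
    }
}

/// Update-optimized + memory-compact: append-only log, O(1) write, O(n) read.
struct LogStore(Vec<(u32, u16)>);

impl LogStore {
    fn set(&mut self, cell: u32, tile: u16) {
        self.0.push((cell, tile));
    }
    fn get(&self, cell: u32) -> Option<u16> {
        // Latest write wins, so scan from the back.
        self.0.iter().rev().find(|&&(c, _)| c == cell).map(|&(_, t)| t)
    }
}
```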
Every tilemap design picks a point in this triangle. A baked chunk texture is maximally read-optimized: one draw call, zero per-tile overhead, but changing one tile means rebaking. Entity-per-tile optimizes for updates but costs memory and slows bulk reads. A flat array of tile indices is memory-efficient and great for iteration, but awkward once you need per-tile components or sparse data.
Nobody frames it this way, so people argue about which design is "right" when they just have different workloads.
same tension as above, now at the chunk level
Chunks exist because rebuilding one giant mesh for the entire map on every tile change is unacceptable. Partition the grid, rebuild only dirty chunks. The tradeoff between chunk size and draw call count only makes sense if tiles change infrequently — the rebuild cost amortizes over many clean frames.
An animated tile breaks this. Nobody wrote to the tile. No game event happened. But the chunk is dirty every frame because the visual representation is a function of time. So you either rebake the chunk every frame (defeating the purpose), shrink chunks (punishing static tiles to accommodate animated ones), or pull animated tiles out of the chunk system entirely.
Every engine ends up at option three. That's a hot/cold partition: one logical dataset split into two physical representations based on update frequency. Oversized tiles do the same thing for a different reason — they escape because they violate the spatial assumption (uniform cell size) rather than the temporal one. But the pattern is identical: the "tilemap" keeps shedding special cases into separate systems until what's left is just boring static uniform tiles. And for those, a flat texture bake was always the obvious answer. No "system" required.
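A minimal sketch of that hot/cold partition at chunk-build time (all types here are illustrative, not Bevy API):

```rust
// Split a chunk's tiles by update frequency: static tiles go into the bake
// input, animated tiles escape as ordinary sprite entities.

struct TileDef {
    sprite: u16,
    animated: bool,
}

/// Cold path: flat sprite indices handed to the chunk baker, rebuilt rarely.
struct BakeInput(Vec<u16>);

/// Hot path: tiles that will be spawned as regular per-frame sprites.
struct SpriteSpawn {
    cell: usize,
    sprite: u16,
}

const EMPTY: u16 = 0; // placeholder left in the bake where a tile escaped

fn partition_chunk(tiles: &[TileDef]) -> (BakeInput, Vec<SpriteSpawn>) {
    let mut baked = Vec::with_capacity(tiles.len());
    let mut hot = Vec::new();
    for (cell, t) in tiles.iter().enumerate() {
        if t.animated {
            baked.push(EMPTY); // hole in the static bake
            hot.push(SpriteSpawn { cell, sprite: t.sprite });
        } else {
            baked.push(t.sprite);
        }
    }
    (BakeInput(baked), hot)
}
```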
"oh but just" - ok you sit in gamedev discord and explain to every new gamedev who tries to make minecraft by spawning GameObjects
Forget tilemaps. Model the real dimensions, starting from the hardware up.
The GPU wants tightly packed buffers accessed in parallel. The CPU wants cache-line-friendly structures with random access. Crossing the bus is expensive. A baked chunk texture lives on the GPU — great for rendering, useless for gameplay queries without a readback. An ECS component array lives on the CPU — great for queries, but the renderer has to upload changes.
Dwarf Fortress was pure CPU with no GPU tilemap until the Steam release bolted one on. Factorio has a CPU-authoritative sim with a separate GPU-optimized render layer.
Dense (flat array) gives O(1) lookup and perfect cache coherence but wastes memory on empty cells. Sparse (hash map, sorted list) scales with occupancy but fragments access patterns.
Minecraft is dense — every block in a 16³ section exists, even air. CDDA's terrain is fully dense, but its item layer is extremely sparse. Terraria is dense for its block grid but sparse for entity placement.
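The same logical layer under both storage choices, sketched minimally (illustrative types, not any engine's API):

```rust
use std::collections::HashMap;

// Dense pays for every cell up front; sparse scales with occupancy.

struct DenseLayer {
    width: usize,
    cells: Vec<u16>, // every cell exists, even "air" (index 0)
}

struct SparseLayer(HashMap<(i32, i32), u16>); // only occupied cells

impl DenseLayer {
    fn get(&self, x: usize, y: usize) -> u16 {
        self.cells[y * self.width + x] // O(1), cache-coherent for scans
    }
}

impl SparseLayer {
    fn get(&self, x: i32, y: i32) -> Option<u16> {
        self.0.get(&(x, y)).copied() // O(1) average, scattered in memory
    }
    /// Memory scales with what's actually there, not with grid area.
    fn occupied(&self) -> usize {
        self.0.len()
    }
}
```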
RPG Maker maps are nearly immutable at runtime — authored in an editor, loaded wholesale, extreme read-optimization is obvious. Factorio's belt system mutates every tick as items move, so sim storage is write-optimized. Spelunky and Noita are an interesting middle: terrain is static until something explodes, then large regions mutate at once, favoring chunked write batches.
When one tile changes, how much work do you redo? Per-tile: cheap updates, more bookkeeping, potentially more draw calls. Per-chunk: batched cost, but overpaying on small changes. Independent of mutation cost — this is about the blast radius of a single write.
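Per-chunk invalidation can be sketched as a dirty set keyed by chunk coordinate (a hypothetical tracker with an assumed 32-tile chunk size):

```rust
use std::collections::HashSet;

// One write dirties one chunk; the rebake pass then pays per dirty chunk,
// not per write, however many writes landed in the same chunk.

const CHUNK: i32 = 32;

#[derive(Default)]
struct DirtyChunks(HashSet<(i32, i32)>);

impl DirtyChunks {
    /// Record a tile write. div_euclid maps negative coords correctly.
    fn mark(&mut self, x: i32, y: i32) {
        self.0.insert((x.div_euclid(CHUNK), y.div_euclid(CHUNK)));
    }
    /// Drain once per frame; rebake only what's listed.
    fn take(&mut self) -> Vec<(i32, i32)> {
        self.0.drain().collect()
    }
}
```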
Not every consumer needs the current truth. This is a spectrum, not a binary. Collision checks need frame-accurate data — a stale collision grid kills gameplay. Pathfinding can recompute costs from a snapshot every N ticks. A minimap can lag a few frames. Fog of war reveal state can batch-update once per turn. And at the far end: the data isn't in memory at all, and the consumer is working from a coarse summary of what's on disk.
This is the database read-replica problem. A consumer that tolerates staleness can read from a snapshot while the authoritative data mutates freely, and a chunk that doesn't need to rebake immediately can defer to next frame or next N frames. Staleness tolerance changes the entire invalidation cost calculus — the blast radius matters less when you control when you pay for it.
Factorio's pollution/logistics overlays update on slower tick rates than the core belt simulation. CDDA's scent map propagates on a schedule rather than per-action. CDDA's overmap is the extreme case: a permanent coarse summary that outlives the fully-loaded data, giving AI and map rendering something to work with when the reality bubble has moved on and the real tile data is back on disk. All three are accepting different degrees of staleness to avoid coupling every consumer to every write.
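The read-replica pattern in miniature (illustrative names; the point is that the replica chooses when to pay for the copy):

```rust
// The authoritative grid mutates freely; a consumer that tolerates
// staleness refreshes its copy only every N ticks.

struct Authoritative {
    cells: Vec<u16>,
    version: u64, // bumped on every write
}

impl Authoritative {
    fn set(&mut self, i: usize, v: u16) {
        self.cells[i] = v;
        self.version += 1;
    }
}

struct Replica {
    cells: Vec<u16>,
    version: u64,
    refresh_every: u64, // declared staleness tolerance, in ticks
}

impl Replica {
    fn maybe_refresh(&mut self, src: &Authoritative, tick: u64) {
        if tick % self.refresh_every == 0 && self.version != src.version {
            self.cells.clone_from(&src.cells); // pay the copy on our schedule
            self.version = src.version;
        }
    }
}
```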
netcode also exists; much voodoo involved there
Is each cell a u16 sprite index or arbitrary typed metadata? Partly constrained by authority — the GPU can't work with arbitrary component data. Thin data packs better and can live on either side. Rich data is CPU-only but enables gameplay logic without indirection.
Super Mario Bros: tile index is the entire truth, collision is derived from index. Rimworld: each cell has terrain type, roof, zone, room assignment, beauty score, temperature.
Row-major is simple but cache-hostile for vertical/diagonal traversals. Z-order (Morton) or Hilbert curves keep 2D neighbors closer in memory, which matters for pathfinding, flood fills, and GPU texture sampling (which uses space-filling curves internally via tile swizzle hardware).
Most 2D games use row-major and never think about it. Minecraft uses Y-sliced sections (XZY order) because vertical neighbor access is critical for lighting propagation. Voxel engines like Teardown use octrees/SVOs where a cell's address encodes spatial locality.
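A minimal Morton encoder, using the standard bit-interleaving magic constants (nothing engine-specific):

```rust
// Interleave the bits of x and y so 2D neighbors land near each other in
// the 1D address space.

/// Spread the low 16 bits of `n` so each lands in an even bit position.
fn part1by1(mut n: u32) -> u32 {
    n &= 0x0000_ffff;
    n = (n | (n << 8)) & 0x00ff_00ff;
    n = (n | (n << 4)) & 0x0f0f_0f0f;
    n = (n | (n << 2)) & 0x3333_3333;
    n = (n | (n << 1)) & 0x5555_5555;
    n
}

/// x bits in even positions, y bits in odd positions.
fn morton(x: u32, y: u32) -> u32 {
    part1by1(x) | (part1by1(y) << 1)
}
```

Note how every 2×2 block maps to four consecutive indices, which is exactly the locality that flood fills and texture caches want.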
Nearly every engine's built-in tilemap occupies the same narrow point:
| Engine | Authority | Density | Mutation | Invalidation | Semantics | Spatial |
|---|---|---|---|---|---|---|
| Unity Tilemap | CPU → GPU | Dense/chunk | Read-opt (mesh bake) | Per-chunk | Index + flags | Row-major |
| Godot 4 TileMap | CPU → GPU | Sparse (quadrants) | Read-biased (quadrant rebake) | Per-quadrant | Index + custom data | Quadrant-grouped |
| GameMaker Tile Layer | GPU-leaning | Dense/layer | Read-opt | Per-tile | Index-only (visual) | Row-major |
| RPG Maker Map | CPU → GPU | Dense (fixed) | Maximally read-opt | Full map | Index + autotile | Row-major |
| Unreal Paper2D | CPU → GPU | Dense | Read-opt | Per-chunk | Index | Row-major |
| Bevy (current) | CPU → GPU | Dense/chunk | Read-opt (mesh bake) | Per-chunk | Index | Row-major |
They're all basically the same product: a baked chunk mesh renderer for uniform 2D grids.
Every engine that ships a real grid-based game quietly builds systems at different tradeoff points:
| Engine | System | What It Actually Is |
|---|---|---|
| Unity | DOTS/ECS | CPU, write-friendly, arbitrary data — what you actually use for gameplay tile data |
| Unity | Terrain | GPU heightmap+splatmap — a tilemap for 3D landscapes that nobody calls one |
| Godot | GridMap | 3D tilemap with explicitly different tradeoffs (sparse, octree) |
| Godot | MultiMeshInstance | GPU-driven escape hatch for when TileMap rendering can't keep up |
| Unreal | World Partition | Sparse, chunk-granular streaming — solves what tilemaps punt on |
| Unreal | ISM/HISM | GPU instancing — what you actually use instead of Paper2D for large grids |
| GameMaker | ds_grid | CPU-only, dense, write-friendly, arbitrary data — the actual gameplay data layer |
| Factorio | Chunk system | Explicitly split: render layer is read-opt bake, sim layer is write-opt per-tick |
| Minecraft | Chunk sections | Dense 16³, palette-compressed, greedy-meshed, per-section rebuild |
| Dwarf Fortress | Map blocks | CPU-only, dense, rich structs, no GPU tilemap until Steam bolted one on |
The Alternative (all of these already exist, and programmers compose them all the time; so what would a TileMap actually be?)
The useful framing: we need composable primitives for grid-aligned spatial data with different read/write/memory tradeoffs, and one of those primitives happens to be a batch sprite renderer.
What that could look like concretely:
- A dense grid store — flat `Vec<T>` with coordinate indexing, chunk-partitioned, generic over `T`. This is the workhorse for any layer where every cell has data: terrain, temperature, lighting. Generic over `T` because the renderer needs `u16` sprite indices, gameplay needs terrain enums, and pathfinding needs cost values — same spatial structure, different payloads. Chunk partitioning isn't a rendering concern here, it's a memory and serialization boundary.
- A sparse grid store — hash map with coordinate keys, for layers where most cells are empty. Items, furniture, traps, decorations. A flat array for CDDA's item layer would waste enormous memory when 95% of tiles have nothing on them. Sparse stores trade cache coherence for occupancy-proportional memory, which is the right call when density is low and access patterns are point lookups rather than scans.
- An async chunk streamer — given a chunk coordinate, produce the data eventually. Handles load prioritization (near-player first), cancellation (player changed direction, drop the request), and the transition between unloaded, summarized, and fully-loaded states. This is what fills grid stores — not the stores themselves. Any world bigger than RAM needs this, and the policy for what unloaded regions are to gameplay is a design decision the streamer makes explicit rather than hiding.
- A chunk baker — takes a dense grid of sprite indices, produces a mesh or texture. A rendering primitive, nothing more. It doesn't own the data, doesn't know about gameplay, doesn't manage streaming. It takes an input grid, outputs a renderable, and gets called again when the grid changes. Separating this from the grid store is the whole point — the bake strategy (texture atlas, vertex-colored mesh, GPU compute) can change without touching gameplay data.
- A per-tile GPU uploader (this is just the sprite component! but it's also what gets used for "big sprites") — for layers that mutate frequently enough that rebaking chunks every frame defeats the purpose. This is just regular sprites. Every engine already has this, and every engine already reaches for it when the chunk bake breaks down — it's the escape hatch that proves the abstraction has a boundary. Naming it as an explicit primitive alongside the chunk baker just acknowledges what's already happening: sprites are the write-optimized endpoint, baked chunks are the read-optimized endpoint, and most real games need both. Animated tiles, real-time lighting updates, dynamic liquid flow — anything where the update frequency exceeds the amortization window of a chunk rebuild belongs here instead of fighting the bake system.
- A spatial query interface over any grid store — regional iteration, neighbor lookup, coordinate mapping. Queries can hit unloaded regions and need a fallback policy: block, return a default, or return a coarse summary. The consumer declares its staleness tolerance, the query interface returns the best data available, and the streamer backfills. CDDA's overmap is exactly this pattern — a lightweight summary that stays loaded when the reality bubble moves away, giving pathfinding and map rendering something to work with while the real data is on disk.
- Grid coordinate utilities — chunk↔tile math, hex/iso projections as coordinate transforms rather than rendering modes. These get bundled into tilemap systems as if they're inseparable from rendering, but they're pure math. A hex grid projection is a coordinate transform. An isometric projection is a coordinate transform. Neither has anything to do with how you store or draw tiles — they belong in a shared math layer that any primitive can use.
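The dense store and the coordinate utilities can be sketched together in plain Rust, since the chunk math is pure functions either way (illustrative names, not Bevy API):

```rust
// Flat Vec<T>, coordinate-indexed, generic over the payload, plus the
// chunk<->tile math. div_euclid/rem_euclid handle negative coordinates
// correctly, which plain `/` and `%` do not.

const CHUNK: i32 = 32;

fn tile_to_chunk(t: i32) -> i32 {
    t.div_euclid(CHUNK)
}
fn tile_in_chunk(t: i32) -> i32 {
    t.rem_euclid(CHUNK)
}

struct DenseGrid<T> {
    width: i32,
    cells: Vec<T>,
}

impl<T: Clone + Default> DenseGrid<T> {
    fn new(width: i32, height: i32) -> Self {
        Self {
            width,
            cells: vec![T::default(); (width * height) as usize],
        }
    }
    fn get(&self, x: i32, y: i32) -> &T {
        &self.cells[(y * self.width + x) as usize]
    }
    fn set(&mut self, x: i32, y: i32, v: T) {
        self.cells[(y * self.width + x) as usize] = v;
    }
}
```

The same structure holds a `DenseGrid<u16>` for the baker and, say, a terrain enum for gameplay; only the payload changes.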
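The baker as a pure function: sprite indices and an atlas in, one flattened chunk image out (toy sizes and single-channel pixels; a real bake targets a GPU texture or mesh, but the shape of the primitive is the point):

```rust
const TILE: usize = 4; // tile edge in pixels (toy value)
const CHUNK_W: usize = 2; // chunk edge in tiles (toy value)

/// `atlas` is one horizontal strip of `atlas_tiles` TILE x TILE tiles.
fn bake_chunk(indices: &[u16], atlas: &[u8], atlas_tiles: usize) -> Vec<u8> {
    assert_eq!(indices.len(), CHUNK_W * CHUNK_W);
    let out_w = CHUNK_W * TILE;
    let mut out = vec![0u8; out_w * out_w];
    for (cell, &idx) in indices.iter().enumerate() {
        let (cx, cy) = (cell % CHUNK_W, cell / CHUNK_W);
        for py in 0..TILE {
            for px in 0..TILE {
                // Copy one pixel from the atlas tile into the chunk image.
                let src = py * (atlas_tiles * TILE) + idx as usize * TILE + px;
                let dst = (cy * TILE + py) * out_w + cx * TILE + px;
                out[dst] = atlas[src];
            }
        }
    }
    out
}
```

Note it owns nothing: the grid store hands it indices, and it hands back a renderable.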
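And the query interface with its fallback policy, sketched as a two-tier lookup (illustrative types; `Stale` stands in for the coarse-summary answer a consumer opted into):

```rust
use std::collections::HashMap;

// Loaded chunks answer exactly; unloaded regions fall back to a coarse,
// always-resident summary instead of blocking, CDDA-overmap style.

#[derive(Debug, PartialEq)]
enum Answer<T> {
    Exact(T),
    Stale(T), // from the summary; the caller declared it tolerates this
}

struct QueryGrid<T> {
    loaded: HashMap<(i32, i32), T>,  // chunk coord -> authoritative data
    summary: HashMap<(i32, i32), T>, // coarse fallback that never unloads
}

impl<T: Clone> QueryGrid<T> {
    fn query(&self, chunk: (i32, i32)) -> Option<Answer<T>> {
        if let Some(v) = self.loaded.get(&chunk) {
            return Some(Answer::Exact(v.clone()));
        }
        self.summary.get(&chunk).map(|v| Answer::Stale(v.clone()))
    }
}
```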
None of these need to know about each other. You compose them based on where your game sits in the tradeoff space. The streamer fills the stores, the baker reads from them, the query interface abstracts over loaded and unloaded regions, and each consumer's staleness tolerance determines whether it blocks, defers, or works from a summary.
editor's note:
sorry for the AI slop