@mrkybe
Last active February 25, 2026 04:40
What I actually want a Game Engine to handle for me: i.e., make it easy to do the right thing, make it hard to block the render thread

Core Abstraction: Grid Layers

The central type is a GridLayer — a chunk-partitioned spatial data structure, generic over cell type, whose lifecycle the engine manages. Each layer declares its tradeoff profile upfront so the engine can make scheduling decisions.

app.register_grid_layer::<Terrain>(GridLayerConfig {
    chunk_size: UVec2::new(16, 16),
    density: Density::Dense,
    mutation: MutationHint::Rare,
    loader: some_async_loader,
});

app.register_grid_layer::<Items>(GridLayerConfig {
    chunk_size: UVec2::new(16, 16),
    density: Density::Sparse,
    mutation: MutationHint::Frequent,
    loader: some_other_loader,
});

// more realistically:
GridLayer::<Terrain>::builder()
    .chunk_size(UVec2::new(16, 16))
    .dense()
    .mutation_hint(MutationHint::Rare)
    .load_radius(3)
    .unload_radius(5)
    .max_loaded_chunks(256)
    .max_inflight_loads(8)
    .default_value(Terrain::Void)
    .dirty_propagation(Neighbors::Eight) // for autotiling
    .loader(terrain_loader)
    .build()

// with sane defaults:
GridLayer::<Terrain>::builder()
    .dense()
    .loader(terrain_loader)
    .build()

Why generic over T: The renderer needs u16 sprite indices, gameplay needs terrain enums, pathfinding needs cost values. Same spatial structure, different payloads. Making T generic means the engine manages chunk lifecycle and spatial scheduling without knowing what's in the cells. The user provides a mapping to visuals separately (or not at all — a pathfinding cost grid has no render representation).

Why MutationHint and not a hard contract: The engine uses this for scheduling heuristics (how aggressively to dirty-check, whether to double-buffer), but gameplay code can always mutate whenever it wants. If you declared Rare and then mutate every frame, performance degrades gracefully rather than violating an invariant. The hint is advisory, not a type-level constraint. The alternative is an Immutable / Mutable type-level split, which is cleaner but means you can't have a layer that's usually static but occasionally explodes (Noita).

Render Integration

Render config is separate from the layer declaration. A layer can exist with no visual representation at all.

app.register_grid_render::<Terrain>(GridRenderConfig {
    strategy: RenderStrategy::ChunkBake,
    sprite_map: terrain_to_sprite, // fn(&Terrain) -> SpriteIndex
    staleness_tolerance: Duration::ZERO,
    z_layer: 0.0,
});

app.register_grid_render::<Items>(GridRenderConfig {
    strategy: RenderStrategy::PerTileSprite,
    sprite_map: items_to_sprite,
    staleness_tolerance: Duration::ZERO,
    z_layer: 1.0,
});

// No render registration for PathCost — it's gameplay-only

Why separate from layer declaration: The essay's whole argument is that storage and rendering are orthogonal. A dense grid of terrain enums is useful to gameplay even if you never draw it. And you might want multiple render representations of the same layer (full-detail view + minimap). Welding them together is the mistake current TileMap components make.

RenderStrategy is the hot/cold split made explicit:

  • ChunkBake — the engine batches the chunk into a mesh/texture. Read-optimized. The engine schedules rebakes when the chunk dirties, respecting staleness_tolerance.
  • PerTileSprite — each occupied cell gets a sprite entity. Write-optimized. No batching, no rebake cost, but more draw calls.
  • The user picks this per-layer instead of discovering by accident that animated tiles break their chunk bake.

staleness_tolerance drives render scheduling: A fog-of-war overlay that tolerates 100ms staleness lets the engine defer its rebake and batch it with other work. A terrain layer with zero tolerance rebakes the same frame it dirties. The engine uses this to prioritize GPU upload bandwidth — zero-tolerance layers go first, tolerant ones fill remaining budget.
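As a sketch of what that prioritization could look like — hypothetical names, plain tuples standing in for glam's `IVec2`, and a simple "remaining slack" ordering:

```rust
use std::time::Duration;

// Hypothetical sketch: pick which dirty chunks to rebake this frame.
// Zero-tolerance chunks sort first; tolerant ones fill the remaining
// budget in order of how close they are to exceeding their tolerance.
#[derive(Debug, Clone, Copy)]
struct DirtyChunk {
    coord: (i32, i32),
    staleness_tolerance: Duration, // from the layer's render config
    dirty_for: Duration,           // how long this chunk has been dirty
}

fn schedule_rebakes(mut dirty: Vec<DirtyChunk>, budget: usize) -> Vec<(i32, i32)> {
    // Sort by remaining slack: tolerance minus time already spent dirty.
    // Zero-tolerance chunks have zero slack and always come first.
    dirty.sort_by_key(|c| c.staleness_tolerance.saturating_sub(c.dirty_for));
    // Take greedily until the per-frame rebake budget runs out.
    dirty.into_iter().take(budget).map(|c| c.coord).collect()
}
```

A real scheduler would also weight by distance to camera, but the slack ordering is the part `staleness_tolerance` buys you.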

The sprite_map function is where autotiling lives. And this is where it gets tricky. A simple fn(&T) -> SpriteIndex works for basic tiles but autotiling needs neighbor context. So the real signature is probably:

sprite_map: fn(&GridNeighborhood<T>) -> SpriteIndex

where GridNeighborhood gives you the cell and its 8 neighbors. The engine provides this during bake — it already has the chunk data loaded. But: neighbors at chunk edges require the adjacent chunk to be loaded. The engine either needs to guarantee neighbor chunks are loaded before baking (ordering dependency in what's supposed to be async), or the sprite_map needs to handle Option<&T> for neighbors that aren't loaded yet. Both are annoying. The ordering dependency is probably the right call since you don't want visual pop-in at chunk seams, but it means the engine's chunk scheduling needs to understand this dependency graph.
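A minimal sketch of the `Option`-neighbor variant, with a standard 4-bit cardinal autotile mask on top (all names hypothetical; the engine would build the neighborhood during bake):

```rust
// Hypothetical neighborhood handed to sprite_map during bake. Neighbors are
// Option<&T> because the adjacent chunk may not be loaded yet — the tradeoff
// discussed above.
struct GridNeighborhood<'a, T> {
    center: &'a T,
    // The 3x3 ring flattened row-major, center (index 4) excluded.
    neighbors: [Option<&'a T>; 8],
}

impl<'a, T> GridNeighborhood<'a, T> {
    fn neighbor(&self, dx: i32, dy: i32) -> Option<&'a T> {
        let idx = (dx + 1) + (dy + 1) * 3; // position in the 3x3 block
        match idx {
            0..=3 => self.neighbors[idx as usize],
            4 => Some(self.center),
            5..=8 => self.neighbors[idx as usize - 1],
            _ => None, // offset outside the 3x3 ring
        }
    }
}

// Classic 4-bit autotile mask: one bit per matching cardinal neighbor,
// selecting one of 16 sprite variants. Unloaded neighbors count as
// non-matching here, which is exactly the seam risk described above.
fn autotile_mask<T: PartialEq>(n: &GridNeighborhood<T>) -> u8 {
    [(0, -1), (1, 0), (0, 1), (-1, 0)]
        .iter()
        .enumerate()
        .fold(0u8, |mask, (bit, &(dx, dy))| match n.neighbor(dx, dy) {
            Some(t) if t == n.center => mask | (1 << bit),
            _ => mask,
        })
}
```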

Chunk Loading / Async Streaming

// User provides a loader — the engine calls it when a chunk needs data
fn terrain_loader(chunk_coord: IVec2) -> impl Future<Output = ChunkData<Terrain>> {
    async move {
        let bytes = read_from_disk(chunk_coord).await;
        deserialize_terrain_chunk(bytes)
    }
}

Why the engine owns the scheduling: The engine knows which chunks are near the camera, which are approaching based on movement, which are visible but far. The user shouldn't be manually prioritizing load requests — that's wiring up spatial relevance signals by hand, which is exactly what the essay argues against.
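One plausible relevance score — hypothetical, with an arbitrary 0.5s lookahead — is distance from where the camera *will* be, so chunks in the direction of travel pre-load:

```rust
// Hypothetical sketch: lower score = loads sooner. The engine would sort
// pending load requests by this instead of making the user prioritize.
fn load_priority(
    chunk: (i32, i32),
    camera_chunk: (f32, f32),
    camera_velocity: (f32, f32), // in chunks per second
) -> f32 {
    let lookahead_secs = 0.5; // arbitrary illustration value
    // Predict where the camera will be, then rank by distance to that point.
    let px = camera_chunk.0 + camera_velocity.0 * lookahead_secs;
    let py = camera_chunk.1 + camera_velocity.1 * lookahead_secs;
    let dx = chunk.0 as f32 - px;
    let dy = chunk.1 as f32 - py;
    (dx * dx + dy * dy).sqrt()
}
```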

Chunk lifecycle states:

Unloaded → Loading → Loaded → Dirty → Rebaking → Clean
                                ↑___________________|

The engine manages these transitions. Unloaded chunks have no data — queries against them return None (or a summary, see below). Loading means an async task is in flight. Loaded means data is in memory. Dirty means data changed since last render sync. The render system only touches Loaded/Dirty/Clean chunks — it never blocks on Loading.
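The transitions are small enough to write out as an explicit state machine — a sketch with hypothetical event names, where illegal transitions (rendering a `Loading` chunk, mutating an `Unloaded` one) simply have no match arm:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ChunkState {
    Unloaded,
    Loading,  // async load task in flight
    Loaded,   // data in memory, not yet rendered
    Dirty,    // data changed since last render sync
    Rebaking, // async rebake in flight; renderer shows the old bake
    Clean,    // render data matches chunk data
}

#[derive(Debug, Clone, Copy)]
enum ChunkEvent {
    LoadRequested,
    LoadFinished,
    Mutated,
    RebakeStarted,
    RebakeFinished,
    Evicted,
}

fn transition(state: ChunkState, event: ChunkEvent) -> Option<ChunkState> {
    use ChunkEvent::*;
    use ChunkState::*;
    match (state, event) {
        (Unloaded, LoadRequested) => Some(Loading),
        (Loading, LoadFinished) => Some(Loaded),
        (Loaded | Dirty | Clean, Mutated) => Some(Dirty),
        (Dirty, RebakeStarted) => Some(Rebaking),
        (Rebaking, RebakeFinished) => Some(Clean),
        // Mutation mid-rebake re-dirties; the engine would also discard
        // the now-stale in-flight bake result.
        (Rebaking, Mutated) => Some(Dirty),
        (Loaded | Dirty | Clean, Evicted) => Some(Unloaded),
        _ => None, // illegal transition
    }
}
```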

Cancellation: If the player reverses direction, in-flight loads for chunks they're moving away from should be cancellable. This means the loader future needs to be cancel-safe (in Rust, dropping the future cancels it, which is fine for disk IO but tricky if the loader has side effects like decompression into a shared buffer).

Eviction: The engine needs a policy for unloading chunks when memory pressure hits. LRU by last access time is the obvious default, but the user might need to pin chunks (the chunk the player is standing in should never evict). Something like:

app.register_grid_layer::<Terrain>(GridLayerConfig {
    // ...
    eviction: EvictionPolicy::LRU { max_loaded_chunks: 256 },
});
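The eviction pass itself is straightforward if access times are tracked as a frame counter. A sketch (hypothetical names, plain tuples for chunk coordinates):

```rust
use std::collections::HashMap;

// Hypothetical per-chunk bookkeeping: last_access is a frame counter the
// engine bumps on every query; pinned chunks are never eviction candidates.
struct ChunkEntry {
    last_access: u64,
    pinned: bool,
}

fn pick_evictions(
    chunks: &HashMap<(i32, i32), ChunkEntry>,
    max_loaded: usize,
) -> Vec<(i32, i32)> {
    if chunks.len() <= max_loaded {
        return Vec::new();
    }
    // Unpinned chunks, oldest access first.
    let mut candidates: Vec<_> = chunks
        .iter()
        .filter(|(_, e)| !e.pinned)
        .map(|(&coord, e)| (e.last_access, coord))
        .collect();
    candidates.sort();
    candidates
        .into_iter()
        .take(chunks.len() - max_loaded)
        .map(|(_, coord)| coord)
        .collect()
}
```

Note that pinning can defeat the cap: if everything over budget is pinned, this sketch keeps it loaded rather than violating the pin.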

The summary problem: The essay's staleness section argues that "unloaded" is just the extreme end of the staleness spectrum. Some consumers need something for unloaded regions — CDDA's overmap, a pathfinding passability summary, a minimap color. This suggests a two-tier storage:

app.register_grid_layer::<Terrain>(GridLayerConfig {
    // ...
    summary: Some(SummaryConfig {
        summary_type: TerrainSummary, // coarser type, stays in memory
        summarize: fn(ChunkData<Terrain>) -> ChunkSummary<TerrainSummary>,
    }),
});

The summary persists after eviction and can be queried. Concern: this adds significant API complexity for something not every game needs. Maybe it's a separate opt-in extension rather than core to GridLayerConfig. But if you don't build it in, every large-world game reinvents it.
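What a summarize step might look like in practice — a hypothetical reduction of a full chunk to a dominant terrain (for the minimap) plus a passability bit (for coarse pathfinding over unloaded regions):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Terrain { Void, Grass, Water, Rock }

#[derive(Debug, Clone, Copy, PartialEq)]
struct TerrainSummary {
    dominant: Terrain,  // most common cell, e.g. for a minimap color
    any_passable: bool, // coarse passability for unloaded-region pathfinding
}

// Hypothetical summarize fn: runs once before eviction, result outlives
// the chunk data. Treats only Grass as passable for illustration.
fn summarize(cells: &[Terrain]) -> TerrainSummary {
    let mut counts = [0usize; 4];
    for &t in cells {
        counts[t as usize] += 1;
    }
    let variants = [Terrain::Void, Terrain::Grass, Terrain::Water, Terrain::Rock];
    let dominant =
        variants[counts.iter().enumerate().max_by_key(|&(_, &c)| c).unwrap().0];
    let any_passable = cells.iter().any(|&t| matches!(t, Terrain::Grass));
    TerrainSummary { dominant, any_passable }
}
```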

Query API

fn gameplay_system(terrain: GridQuery<Terrain>, items: GridQuery<Items>) {
    // Point lookup
    let tile: Option<&Terrain> = terrain.get(IVec2::new(5, 3));
    
    // Regional iteration
    for (coord, terrain) in terrain.iter_region(bounds) {
        // ...
    }
    
    // Multi-layer lookup
    if let (Some(t), Some(i)) = (terrain.get(pos), items.get(pos)) {
        // ...
    }
}

Why Option<&T> everywhere: Chunks might not be loaded. The query API forces you to handle this at every call site. This is annoying but correct — the alternative is panicking on unloaded access (dangerous) or silently returning a default (subtle bugs). Rust's type system makes this the natural choice.

Concern: ergonomics of pervasive Option. Every single grid access being Option<&T> is noisy. In practice, gameplay code usually runs inside the "reality bubble" where chunks are guaranteed loaded. So you probably want a scoped API:

fn gameplay_system(terrain: GridQuery<Terrain>) {
    // Scoped access — panics if any chunk in the region isn't loaded
    // Engine guarantees this for the active simulation region
    let region = terrain.expect_region(simulation_bounds);
    let tile: &Terrain = region.get(IVec2::new(5, 3)); // no Option
}

This is the engine providing a spatial relevance guarantee: "I promise these chunks are loaded because they're in the simulation region." Outside that region, you're back to Option. The boundary between guaranteed and best-effort is explicit in the type system.

Cross-layer queries are inherently multiple lookups. The essay argues layers are separate because they have different tradeoff profiles. The cost is that "what's at (5, 3)" is N lookups instead of one struct access. In practice this is probably fine — they're all O(1) hash/array lookups in the same cache neighborhood — but if profiling shows it's hot, you'd want a derived/materialized view that combines layers. Which brings us to:

Derived Layers

Some layers aren't authored — they're computed from other layers.

app.register_derived_layer::<PathCost>(DerivedLayerConfig {
    sources: (TypeId::of::<Terrain>(), TypeId::of::<Furniture>()),
    compute: |terrain: &Terrain, furniture: Option<&Furniture>| -> PathCost {
        compute_pathfinding_cost(terrain, furniture)
    },
    staleness_tolerance: Duration::from_millis(200),
    chunk_size: UVec2::new(32, 32), // can differ from source layers
});

Why this matters: Pathfinding cost, room temperature, scent propagation — these are all derived from primary layers but have their own staleness tolerances and chunk sizes. The engine can schedule recomputation intelligently: if terrain hasn't changed and the tolerance hasn't expired, don't recompute.

Concern: dependency tracking complexity. If PathCost depends on Terrain and Furniture, and Terrain in chunk (3, 2) dirties, which PathCost chunks need recomputation? If chunk sizes differ between source and derived layers, one source chunk might overlap multiple derived chunks. The engine needs a dependency graph with spatial overlap resolution. This is tractable but not trivial — it's essentially incremental computation over spatial data, and getting it wrong means either stale derived data or wasted recomputation.

Concern: the chunk_size mismatch. Allowing derived layers to have different chunk sizes from their sources is powerful (pathfinding might want bigger chunks for less overhead) but complicates the mapping. A 32×32 PathCost chunk depends on four 16×16 Terrain chunks. If one of those four isn't loaded yet, the PathCost chunk can't be fully computed. Partial computation? Block until all sources are loaded? Return the stale value? Every choice has implications.
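The spatial overlap resolution itself is just interval math over cell space — a sketch of mapping a dirtied source chunk to the derived chunks it invalidates, working in either direction (source smaller or larger than derived):

```rust
// Hypothetical sketch: given a dirtied source chunk, which derived-layer
// chunks need recomputation? Coordinates are chunk indices; sizes are in
// cells per chunk along each axis.
fn affected_derived_chunks(
    source_chunk: (i32, i32),
    source_size: (i32, i32),
    derived_size: (i32, i32),
) -> Vec<(i32, i32)> {
    // Cell-space extent of the dirtied source chunk, inclusive.
    let min_x = source_chunk.0 * source_size.0;
    let min_y = source_chunk.1 * source_size.1;
    let max_x = min_x + source_size.0 - 1;
    let max_y = min_y + source_size.1 - 1;
    // div_euclid floors toward negative infinity, so negative chunk
    // coordinates map correctly too.
    let (dx0, dx1) = (min_x.div_euclid(derived_size.0), max_x.div_euclid(derived_size.0));
    let (dy0, dy1) = (min_y.div_euclid(derived_size.1), max_y.div_euclid(derived_size.1));
    let mut out = Vec::new();
    for y in dy0..=dy1 {
        for x in dx0..=dx1 {
            out.push((x, y));
        }
    }
    out
}
```

The inverse query ("which source chunks does this derived chunk need?") is the same function with the arguments swapped, which is what makes the dependency graph tractable.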

The Hard Problems

1. Render thread blocking

The whole premise is "don't block the render thread." In Bevy, the render world extracts from the main world. For grid layers, this means:

  • ChunkBake layers: The engine needs to upload baked meshes/textures to the render world. If a chunk dirties and needs rebaking, that rebake should happen on an async task, not during extract. The render world keeps displaying the old mesh until the new one is ready. This means double-buffering chunk render data — memory cost, but the render thread never stalls.
  • PerTileSprite layers: Sprite entities are already in the ECS, so extract handles them normally. But spawning/despawning thousands of sprite entities when a chunk loads or unloads could cause a frame spike. Batching entity operations across frames (load N tiles per frame) trades latency for smoothness.

The engine should handle both of these automatically based on the declared render strategy. The user should never be in a position where they accidentally block extract by mutating a grid.

2. Bevy's entity model vs. chunk data

Should chunks be entities? Probably yes — Entity with a GridChunk<T> component, so you can query them, add markers, etc. But cells should not be entities (except for PerTileSprite render, where the engine spawns sprite entities as an implementation detail). The essay's "entity-per-tile is expensive" point means the primary data lives in the chunk component as a flat array or hash map, not in the ECS.

Tension with Bevy's philosophy: Bevy is heavily ECS-oriented. "Data should be components on entities" is a core principle. Grid cells as non-entity data inside a chunk component is an escape from that principle, justified by performance. This is defensible but will get pushback from people who want to attach arbitrary components to individual tiles. You'd need to provide a blessed pattern for per-tile ECS data (sparse component maps indexed by coordinate?) that interops with the grid layer without requiring an entity per cell.
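One shape that blessed pattern could take — a sparse coordinate-to-entity map kept outside the chunk data, with a reverse index so despawns stay cheap. Everything here is hypothetical; `Entity` is a `u64` stand-in for Bevy's entity id:

```rust
use std::collections::HashMap;

type Entity = u64; // stand-in for bevy::ecs::entity::Entity

// Hypothetical sparse per-tile extension data: most cells have no entity,
// so a hash map beats an entity per cell. Lives alongside, not inside,
// the dense chunk storage.
#[derive(Default)]
struct SparseTileMap {
    by_coord: HashMap<(i32, i32), Entity>,
    by_entity: HashMap<Entity, (i32, i32)>, // reverse index for despawn
}

impl SparseTileMap {
    fn attach(&mut self, coord: (i32, i32), entity: Entity) {
        self.by_coord.insert(coord, entity);
        self.by_entity.insert(entity, coord);
    }
    fn at(&self, coord: (i32, i32)) -> Option<Entity> {
        self.by_coord.get(&coord).copied()
    }
    fn detach(&mut self, entity: Entity) {
        if let Some(coord) = self.by_entity.remove(&entity) {
            self.by_coord.remove(&coord);
        }
    }
}
```

With this, "attach arbitrary components to a tile" becomes "spawn an entity, attach it here" — the grid layer never learns about it.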

3. What about entities that live on the grid?

CDDA has terrain (dense, static), furniture (sparse, semi-static), items (sparse, mutable), and monsters/NPCs (sparse, moving every turn). The first three fit the grid layer model. Monsters don't — they're entities that happen to be on the grid, not grid data. But gameplay constantly asks "what entity is at (5, 3)?"

You need a spatial index over entities that shares the grid coordinate system but isn't a grid layer. Something like:

fn system(spatial: GridSpatialIndex<Monster>, terrain: GridQuery<Terrain>) {
    let monster: Option<Entity> = spatial.get(pos);
    let tile: Option<&Terrain> = terrain.get(pos);
}

This is not a GridLayer — it doesn't own the entities, it's an index that stays in sync as entities move. But it shares coordinate space and chunk boundaries. Whether this should be part of the grid API or a separate spatial indexing crate that happens to interop is an open design question. Bundling it risks re-creating the monolith. Separating it risks "everyone rolls their own and it's always slightly wrong."

4. Coordinate system flexibility

The essay mentions hex and iso as coordinate transforms, not rendering modes. The API needs to be generic over coordinate systems without making the common case (square orthographic) pay for it. Probably:

app.register_grid_coordinate_system(CoordinateSystem::Orthographic {
    tile_size: Vec2::new(32.0, 32.0),
});
// or
app.register_grid_coordinate_system(CoordinateSystem::Isometric {
    tile_size: Vec2::new(64.0, 32.0),
});

This is a global transform that all layers share — you probably don't want hex terrain with orthographic items. The engine uses it for world↔grid coordinate conversion, render positioning, and mouse picking. Concern: "global" breaks down if you have multiple grids (overworld + dungeon interior with different tile sizes). Probably needs to be per-grid-instance rather than truly global, which adds another layer of indirection.
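The grid→world half of that conversion, sketched for the two systems above. The isometric mapping is the common 2:1 diamond projection; other conventions differ only by axis signs (plain tuples stand in for glam's `Vec2`/`IVec2`):

```rust
#[derive(Clone, Copy)]
enum CoordinateSystem {
    Orthographic { tile_size: (f32, f32) },
    Isometric { tile_size: (f32, f32) },
}

impl CoordinateSystem {
    // Hypothetical sketch: cell coordinate to world position of the
    // cell origin. Render positioning and mouse picking both hang off
    // this transform (picking being the inverse).
    fn grid_to_world(&self, cell: (i32, i32)) -> (f32, f32) {
        let (x, y) = (cell.0 as f32, cell.1 as f32);
        match *self {
            CoordinateSystem::Orthographic { tile_size: (tw, th) } => (x * tw, y * th),
            CoordinateSystem::Isometric { tile_size: (tw, th) } => {
                // Diamond layout: +x steps right-down, +y steps left-down.
                ((x - y) * tw * 0.5, (x + y) * th * 0.5)
            }
        }
    }
}
```

Making this a field on a per-grid-instance handle (rather than a global resource) is what resolves the overworld-vs-dungeon concern: each grid carries its own transform.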

5. Serialization

Chunk data needs to be serializable for save/load. If T is generic, the user provides Serialize/Deserialize impls. But the engine also needs to serialize chunk metadata (dirty flags, load state, summary data). The save format needs to be chunk-granular so you can save/load individual chunks without touching the whole map — which matters for streaming saves in large worlds.

Should the engine own the save format? Probably not — that's too opinionated. But it should provide hooks: "here's every loaded chunk and its data, do what you want." And on load: "here's chunk data for coordinate X, ingest it." The async loader already handles the load side. Save is the reverse — an async writer that the engine feeds dirty chunks to.

6. Thread safety and mutation

Gameplay systems in Bevy can run in parallel. Two systems mutating the same grid layer on different chunks is fine (different data). Two systems mutating the same chunk is a data race. Bevy's existing Query system handles this for entities via access conflict detection. GridQuery<T> needs the same treatment — mutable access to a layer should be exclusive, or chunk-granular locking should allow parallel writes to different chunks.

Chunk-granular locking is more performant but way harder to implement correctly, especially with cross-chunk operations (an explosion that damages a 5-tile radius spanning two chunks needs to lock both). Exclusive mutable access per layer is the safe starting point, with chunk-granular parallelism as a future optimization.
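The classic fix for the cross-chunk deadlock risk is canonical lock ordering: always acquire chunk locks in sorted coordinate order, so two systems can never hold the same pair in opposite orders. A sketch (hypothetical API, `std::sync::Mutex` standing in for whatever the engine would actually use):

```rust
use std::collections::HashMap;
use std::sync::{Mutex, MutexGuard};

// Hypothetical chunk store with one lock per chunk. Cross-chunk operations
// (the 5-tile-radius explosion) go through with_chunks, which sorts the
// requested coordinates before locking to prevent lock-order inversion.
struct ChunkStore<T> {
    chunks: HashMap<(i32, i32), Mutex<Vec<T>>>,
}

impl<T> ChunkStore<T> {
    fn with_chunks<'s, R>(
        &'s self,
        coords: &[(i32, i32)],
        f: impl FnOnce(Vec<MutexGuard<'s, Vec<T>>>) -> R,
    ) -> R {
        let mut sorted: Vec<_> = coords.to_vec();
        sorted.sort(); // canonical order: deadlock-free by construction
        sorted.dedup();
        let guards: Vec<_> = sorted
            .iter()
            .filter_map(|c| self.chunks.get(c)) // unloaded chunks skipped
            .map(|m| m.lock().unwrap())
            .collect();
        f(guards)
    }
}
```

Bevy's access-conflict detection wouldn't see inside this, which is part of why exclusive per-layer access is the safer starting point the paragraph above argues for.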
