ramdisk is a scaling primitive for local agent orchestration

local development used to be human-paced: one developer, one editor, occasional builds and tests. consumer ssd endurance assumptions were built around that pattern.

agent-driven development changes the load profile. when you run 4-32 local agents in parallel, each doing build, test, validation, and coding loops, write pressure scales horizontally just like cpu and memory demand.

graph LR;
    a["agent count"] --> e["daily host writes"];
    b["cycles per agent per day"] --> e;
    c["logical writes per cycle"] --> e;
    d["host overhead factor"] --> e;
    e --> f["annual host writes"];
    f --> g["TBW budget check"];
    g -->|over| h["move hot paths to ramdisk"];
    g -->|within| i["safe headroom to scale"];

scaling math

use a simple model:

daily_host_writes_GB = agents * cycles_per_agent_per_day * logical_writes_per_cycle_GB * host_overhead_factor

optional (for nand-wear reasoning, not TBW comparison):

daily_nand_writes_GB = daily_host_writes_GB * device_wa_factor

for continuous operation, define:

cycles_per_agent_per_day = cycles_per_hour * 24

typical write ranges per cycle in real dev loops:

| task | write range per cycle |
| --- | --- |
| build artifacts and object files | 1-6 GB |
| test temp files and coverage/log output | 0.2-2 GB |
| ml numerical correctness validation (intermediate tensors, traces, eval output) | 1-10 GB |
| coding overhead (indexing, logs, git/object churn) | 0.05-0.3 GB |

moderate orchestrated setup:

  • agents = 8
  • cycles_per_agent_per_day = 20
  • logical_writes_per_cycle_GB = 5
  • host_overhead_factor = 1.3 (filesystem metadata, journaling, copy-on-write overhead)

result:

  • daily_host_writes_GB = 8 * 20 * 5 * 1.3 = 1040 GB/day
  • annual_host_writes_TB ~= 380 TB/year
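
to make the arithmetic easy to rerun, here is a minimal python sketch of the model above (function names are illustrative, not from any library); plugging in the moderate setup reproduces the same numbers:

# minimal sketch of the write model; names are illustrative
def daily_host_writes_gb(agents, cycles_per_agent_per_day,
                         logical_writes_per_cycle_gb, host_overhead_factor):
    return (agents * cycles_per_agent_per_day
            * logical_writes_per_cycle_gb * host_overhead_factor)

def annual_host_writes_tb(daily_gb):
    # decimal units: 1 TB = 1000 GB
    return daily_gb * 365 / 1000

if __name__ == "__main__":
    daily = daily_host_writes_gb(agents=8, cycles_per_agent_per_day=20,
                                 logical_writes_per_cycle_gb=5,
                                 host_overhead_factor=1.3)
    print(f"daily host writes: {daily:.0f} GB/day")                            # 1040 GB/day
    print(f"annual host writes: {annual_host_writes_tb(daily):.0f} TB/year")   # ~380 TB/year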

compare that with endurance. if a 1 TB drive is rated at 600 TBW, a 10-year budget is:

600 TB / 3650 days ~= 0.164 TB/day ~= 164 GB/day

TBW means terabytes written: the total cumulative host writes the drive is rated to absorb over its warranted endurance life.

TBW is a warranty/rating figure, not a hard failure cliff. real endurance varies with workload and operating conditions, but TBW is still a useful planning budget.

consumer warranties are often much shorter than 10 years; the 10-year view is a planning horizon, not a warranty promise.

explicit 24/7 lifetime math

to make the continuous effect concrete, assume:

  • each agent runs 2 cycles/hour continuously
  • each cycle writes 2.5 GB
  • host_overhead_factor = 1.3
  • drive endurance is 600 TBW (common consumer class)

then:

  • cycles_per_agent_per_day = 2 * 24 = 48
  • daily_host_writes_per_agent_GB = 48 * 2.5 * 1.3 = 156 GB/day
  • daily_host_writes_GB = agents * daily_host_writes_per_agent_GB
  • ssd_lifetime_years = (drive_TBW * 1000) / (daily_host_writes_GB * 365)

note: calculations here use decimal storage units (the same convention used by drive vendors): 1 TB = 1000 GB.

when comparing to TBW, use host writes. do not multiply by device_wa_factor for the TBW check.

| agents (24/7) | daily writes | annual writes | lifetime for 600 TBW ssd |
| --- | --- | --- | --- |
| 1 | 156 GB/day | 56.9 TB/year | 10.5 years |
| 4 | 624 GB/day | 227.8 TB/year | 2.6 years |
| 8 | 1248 GB/day | 455.5 TB/year | 1.3 years |
| 16 | 2496 GB/day | 911.0 TB/year | 0.66 years (~8 months) |
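
the table can be regenerated with a short python sketch using exactly the stated assumptions (constants below are those assumptions, not measurements):

# reproduce the 24/7 lifetime table; constants are the stated assumptions
CYCLES_PER_HOUR = 2
WRITES_PER_CYCLE_GB = 2.5
HOST_OVERHEAD_FACTOR = 1.3
DRIVE_TBW = 600  # terabytes written rating, decimal TB

per_agent_daily_gb = CYCLES_PER_HOUR * 24 * WRITES_PER_CYCLE_GB * HOST_OVERHEAD_FACTOR  # 156 GB/day

for agents in (1, 4, 8, 16):
    daily_gb = agents * per_agent_daily_gb
    annual_tb = daily_gb * 365 / 1000
    lifetime_years = DRIVE_TBW * 1000 / (daily_gb * 365)
    print(f"{agents:>2} agents: {daily_gb:.0f} GB/day, "
          f"{annual_tb:.1f} TB/year, {lifetime_years:.2f} years to 600 TBW")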

the key point: under a fixed per-agent workload, lifetime scales inversely with agent count (if agents double, expected lifetime roughly halves). contention, cache behavior, and io throttling can bend real-world results above or below this baseline.

why ramdisk matters

ramdisk (tmpfs) shifts high-churn ephemeral writes into dram:

  • near-zero ssd wear for throwaway artifacts
  • lower latency for compile/test loops
  • less io queue contention under concurrent agents

this is no longer a micro-optimization. it is a durability and throughput control. tmpfs is volatile and memory-backed, so size it conservatively to avoid swap pressure that can reintroduce ssd writes.

implicit vs explicit ramdisk

both ubuntu and macos already use an implicit in-memory file cache (page cache / unified buffer cache). this helps performance, but it is not the same as an explicit ramdisk mount.

implicit (os-managed cache):

  • reads and writes are often served from ram first
  • dirty pages are often flushed later (writeback, journal checkpoints, fsync), especially for longer-lived files
  • eviction is global and workload-agnostic under memory pressure
  • files deleted before writeback may avoid full data flush, but metadata/journal traffic still tends to persist
  • sustained high-churn paths can still produce substantial long-term TBW consumption

explicit (tmpfs or mounted ramdisk):

  • writes to that mount are memory-backed by design
  • per-path control is deterministic (TMPDIR, CARGO_TARGET_DIR, SCCACHE_DIR)
  • easy per-agent isolation (/mnt/ramdisk/agent-<id>)
  • hard quotas via per-agent mounts (tmpfs -o size=...) or cgroup/systemd memory limits
  • predictable cleanup by unmount/delete

why explicit ramdisk with sccache:

  • page cache can mask latency, but it does not guarantee write avoidance on persistent filesystems
  • the biggest write sink is often build output materialization (CARGO_TARGET_DIR) even on cache hits
  • with many continuous agents, local sccache metadata/object churn can still create steady writeback pressure
  • setting SCCACHE_DIR to explicit ramdisk makes local cache-write avoidance deterministic for hot entries

practical guidance:

  • if reboot persistence matters most, keep SCCACHE_DIR on ssd (or remote backend), and put build/test scratch on ramdisk
  • if minimizing local ssd wear matters most, place SCCACHE_DIR on ramdisk with a bounded size (SCCACHE_CACHE_SIZE) and accept cache loss on reboot
  • best of both: keep a hot local ramdisk tier and use a remote sccache backend for durability/sharing
graph TB;
    subgraph p[persistent ssd tier]
      s1["source repos"];
      s2["dependency cache kept across reboots"];
      s3["final artifacts"];
    end

    subgraph r[ramdisk tmpfs tier]
      r1["build scratch"];
      r2["test temp and coverage scratch"];
      r3["ml intermediates and throwaway checkpoints"];
      r4["agent temp and logs"];
    end

    a1["agent 1"] --> r1;
    a2["agent 2"] --> r2;
    r1 --> s3;
    r2 --> s3;

practical storage tiering

keep persistent/reproducible state on ssd:

  • source repos
  • dependency caches you want to keep
  • final artifacts

move high-churn ephemeral paths to ramdisk:

  • build scratch (target, dist, temp object dirs)
  • test temp + coverage scratch
  • ml validation intermediates
  • agent-local temp/log workdirs

minimal setup pattern:

AGENT_ID=${AGENT_ID:-0}

# mount a dedicated tmpfs; keep the size well below free ram to avoid swap pressure
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=24G,mode=1777 tmpfs /mnt/ramdisk

# per-agent scratch directories and env overrides
mkdir -p /mnt/ramdisk/agent-$AGENT_ID/{tmp,target,sccache}
export TMPDIR=/mnt/ramdisk/agent-$AGENT_ID/tmp
export CARGO_TARGET_DIR=/mnt/ramdisk/agent-$AGENT_ID/target

# optional: memory-backed local sccache (bounded size, lost on reboot)
export SCCACHE_DIR=/mnt/ramdisk/agent-$AGENT_ID/sccache
# export SCCACHE_CACHE_SIZE=10G

for multiple agents, set a unique AGENT_ID per worker to prevent cross-agent contention.

note: some systems already mount /tmp as tmpfs. a dedicated mount is still useful for deterministic sizing and per-agent isolation.

verify with real counters

validate the model with host-write counters during a normal agent run:

  • nvme: compare data_units_written before/after (nvme smart-log)
  • smart: compare host-write attributes before/after (smartctl -A)

using counter deltas helps calibrate host_overhead_factor for your actual workload.
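
a small helper sketch for the nvme case: it converts data_units_written deltas to GB (the nvme convention is 1000 x 512-byte units per count) and derives an overhead factor; the counter values and the 100 GB logical figure below are placeholders, not measurements:

# sketch: turn nvme data_units_written deltas into GB and calibrate host_overhead_factor
# nvme smart-log reports data_units_written in units of 1000 * 512 bytes (512,000 bytes)
NVME_DATA_UNIT_BYTES = 1000 * 512

def host_writes_gb(data_units_before, data_units_after):
    return (data_units_after - data_units_before) * NVME_DATA_UNIT_BYTES / 1e9

def calibrate_overhead_factor(measured_host_gb, logical_writes_gb):
    # logical_writes_gb: what the agents believe they wrote (build/test/validation output)
    return measured_host_gb / logical_writes_gb

if __name__ == "__main__":
    # placeholder counter readings; take them from nvme smart-log before and after a run
    before, after = 1_000_000, 1_250_000
    measured = host_writes_gb(before, after)   # 128 GB of host writes
    print(f"measured host writes: {measured:.1f} GB")
    print(f"host_overhead_factor ~= {calibrate_overhead_factor(measured, 100.0):.2f}")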

rule of thumb

before increasing local agent count, compute projected writes and compare with a 10-year daily budget:

daily_budget_GB ~= (drive_TBW * 1000) / 3650

if projected writes are already a large fraction of that budget, add ramdisk first and then scale agents.
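
the same check as a tiny python sketch (the 0.5 "large fraction" threshold is illustrative, not a standard):

# sketch of the pre-scale check: projected writes vs a 10-year daily TBW budget
def daily_budget_gb(drive_tbw):
    return drive_tbw * 1000 / 3650       # decimal units, 10-year horizon

def add_ramdisk_first(projected_daily_gb, drive_tbw, max_fraction=0.5):
    # max_fraction is an illustrative threshold for "a large fraction of the budget"
    return projected_daily_gb > max_fraction * daily_budget_gb(drive_tbw)

print(f"{daily_budget_gb(600):.0f} GB/day budget")   # ~164 GB/day
print(add_ramdisk_first(1040, 600))                   # True: tier storage before scaling agents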

in agent-native development, horizontal scaling without storage tiering is a hidden reliability bug.
