Date: January 6, 2025
Worker: clusterduck-worker-spot-stage-whz7 (c3-standard-22)
Zone: us-west1-b
GCS Bucket: clusterduck-zfs-snapshots-stage (same region)
ZFS Pool: spaces (encrypted AES-256-GCM, compression=zstd-3)
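The pool setup itself is not part of the benchmark output; as a rough sketch, a pool and dataset with the properties above could be created like this (device path, dataset name, and key location are assumptions, not taken from the benchmark):

```bash
# Illustrative only: device, dataset name, and key location are assumptions.
zpool create -o ashift=12 spaces /dev/nvme0n2

zfs create \
  -o encryption=aes-256-gcm \
  -o keyformat=passphrase \
  -o keylocation=file:///etc/zfs/spaces.key \
  -o compression=zstd-3 \
  spaces/space1
```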
ZFS send has two modes:
- Standard send: Decompresses data before sending, recompresses on receive
- Raw send (`-w`): Sends compressed+encrypted blocks directly

Always use the `-w` flag for encrypted/compressed pools!
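As a sketch of what that looks like in practice (snapshot and object names are illustrative), a raw send can be streamed straight into the GCS bucket:

```bash
# Raw send (-w): the on-disk zstd-3-compressed, AES-256-GCM-encrypted blocks are
# streamed unchanged. A plain `zfs send` would emit logical (decompressed,
# decrypted) data instead, roughly 15x more bytes for this workload.
zfs snapshot spaces/space1@base
zfs send -w spaces/space1@base \
  | gsutil cp - gs://clusterduck-zfs-snapshots-stage/space1/base.zfs
```

The transfer sizes below are all for raw sends, so the lz4 vs zstd-3 comparison reflects the pool's compression property, not the send mode.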
| Logical Size | lz4 Transfer | zstd-3 Transfer | Improvement |
|---|---|---|---|
| 100 MB | 19 MB | 6.6 MB | 2.9x smaller |
| 500 MB | 95 MB | 33 MB | 2.9x smaller |
| 2 GB | 380 MB | 131 MB | 2.9x smaller |
| 10 GB | 1.9 GB | 657 MB | 2.9x smaller |
zstd-3 achieves ~15x compression on JSON data (vs ~5x for lz4).
| Scenario | Base Size | Incrementals | Compression | Actual Transfer |
|---|---|---|---|---|
| small | 100 MB | 5 × 1 MB appends | ~15x | 6.6 MB base, 70 KB/incr |
| medium | 500 MB | 5 × 5 MB appends | ~15x | 33 MB base, 340 KB/incr |
| large | 2 GB | 5 × 10 MB appends | ~15x | 131 MB base, 670 KB/incr |
| xlarge | 10 GB | 5 × 50 MB appends | ~15x | 657 MB base, 3.3 MB/incr |
| Logical Size | Compressed Size | Send Time | Recv Time | Send MB/s | Recv MB/s |
|---|---|---|---|---|---|
| 100 MB | 6.6 MB | 269 ms | 203 ms | 39 | 33 |
| 500 MB | 33 MB | 400 ms | 534 ms | 83 | 62 |
| 2 GB | 131 MB | 1.2 sec | 1.8 sec | 110 | 71 |
| 10 GB | 657 MB | 6.0 sec | 8.5 sec | 111 | 77 |
| Logical Append | Compressed Size | Send Time | Recv Time |
|---|---|---|---|
| 1 MB | 70 KB | 55 ms | 93 ms |
| 5 MB | 340 KB | 55 ms | 97 ms |
| 10 MB | 670 KB | 60 ms | 97 ms |
| 50 MB | 3.3 MB | 103 ms | 127 ms |
For a complete restore (base + 5 incrementals):
| Space Size (Logical) | Base Recv | 5 Incrementals | Total Restore |
|---|---|---|---|
| 100 MB | 0.2 sec | 0.5 sec | < 1 sec |
| 500 MB | 0.5 sec | 0.5 sec | ~1 sec |
| 2 GB | 1.8 sec | 0.5 sec | ~2.3 sec |
| 10 GB | 8.5 sec | 0.6 sec | ~9 sec |
| 50 GB (estimated) | ~43 sec | ~3 sec | ~46 sec |
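A restore of that shape is just the base receive followed by each incremental in snapshot order; a minimal sketch, assuming the illustrative object and dataset names used earlier:

```bash
# Base stream first, then the incrementals in order (names are illustrative).
gsutil cp gs://clusterduck-zfs-snapshots-stage/space1/base.zfs - \
  | zfs receive spaces/space1

for i in 1 2 3 4 5; do
  gsutil cp "gs://clusterduck-zfs-snapshots-stage/space1/incr-${i}.zfs" - \
    | zfs receive spaces/space1
done
```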
With zstd-3, JSON data compresses ~15x:
- 10GB logical = 657MB actual transfer
- This is 2.9x better than lz4
With append-style writes (like SQLite WAL):
- 1MB logical append = 70KB transfer = 93ms restore
- 50MB logical append = 3.3MB transfer = 127ms restore
Even tiny incrementals take 50-90ms due to:
- GCS API round-trip (~30-50ms)
- ZFS snapshot overhead (~10-20ms)
- Process spawn overhead
- Send: 80-110 MB/s of compressed data
- Recv: 60-77 MB/s of compressed data (disk write bound)
The benchmark generates realistic JSON data to simulate SQLite/document workloads:
{"id":1736198400001,"ts":1736198400001,"uid":"u1234","sid":"s56789","act":"click","pg":"/dashboard","dur":1500,"ok":true,"tg":["web","mobile"],"ref":"r123","v":"1.0"}Each record contains:
- Timestamps and sequential IDs (high entropy)
- User/session identifiers (medium entropy, repeated patterns)
- Action types and page paths (low entropy, from fixed sets)
- Boolean flags and arrays (mixed)
This produces ~15x compression with zstd-3, which is better than typical production workloads (expect 5-10x for real SQLite data).
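The exact generator is not reproduced here; a hypothetical sketch with the same entropy mix (field set trimmed, record count and output path made up):

```bash
# Hypothetical generator: sequential ids/timestamps (high entropy), small fixed
# vocabularies for actions and pages (low entropy), repeated short identifiers.
acts=(click view edit save)
pages=(/dashboard /settings /reports /search)
for i in $(seq 1 100000); do
  printf '{"id":%s,"ts":%s,"uid":"u%s","act":"%s","pg":"%s","ok":true}\n' \
    "$((1736198400000 + i))" "$((1736198400000 + i))" "$((RANDOM % 5000))" \
    "${acts[RANDOM % 4]}" "${pages[RANDOM % 4]}"
done >> /spaces/space1/events.json
```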
Incrementals append data to the file (like SQLite WAL writes) rather than overwriting. This means ZFS only needs to transfer the new blocks, resulting in small incremental snapshots proportional to the append size.
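In ZFS terms, each append becomes one raw incremental between adjacent snapshots (snapshot and object names are illustrative):

```bash
# Only the blocks written since @base end up in the stream.
zfs snapshot spaces/space1@incr-1
zfs send -w -i @base spaces/space1@incr-1 \
  | gsutil cp - gs://clusterduck-zfs-snapshots-stage/space1/incr-1.zfs
```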
All benchmarks use `zfs send -w` (raw mode), which:
- Preserves on-disk compression during transfer
- Preserves encryption (no key needed on receiving side until mount)
- Transfers actual block data, not logical data
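In practice, a restored space can sit fully encrypted on the standby node until failover actually mounts it; a sketch assuming the same illustrative key-file location as above:

```bash
# The raw streams were received without the key; it is only needed at mount time.
zfs load-key -L file:///etc/zfs/spaces.key spaces/space1
zfs mount spaces/space1
```

Recommendations that follow from the numbers above: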
- Use zstd-3 compression: ~3x better than lz4 for JSON/SQLite data
- Use raw sends (`-w`): Required for compressed/encrypted pools
- Frequent small incrementals: 1-10MB appends restore in <100ms
- Pre-warm for large spaces: 10GB+ takes ~10 seconds to restore
We previously benchmarked Litestream GCS recovery on similar hardware. Here's how they compare:
| Database Size | Litestream (c3-88, Hyperdisk) | ZFS (c3-22, Local NVMe) | ZFS Advantage |
|---|---|---|---|
| 100 MB | 1.4 sec | 0.2 sec | 7x faster |
| 500 MB | 5.6 sec | 0.5 sec | 11x faster |
| 1 GB | 11.0 sec | ~1.0 sec (est) | 11x faster |
| 2 GB | 21.6 sec | 1.8 sec | 12x faster |
| 5 GB | 51.6 sec | ~4.5 sec (est) | 11x faster |
| 10 GB | ~103 sec (est) | 8.5 sec | 12x faster |
| Metric | Litestream | ZFS (zstd-3) |
|---|---|---|
| Peak throughput | 90-99 MB/s | 77-111 MB/s |
| Effective throughput (logical data) | 90-99 MB/s | ~1,150 MB/s* |
*ZFS transfers compressed data, so 77 MB/s of compressed data = ~1,150 MB/s of logical data at 15x compression.
Why ZFS restores faster:
- Compression in transit: ZFS raw send (`-w`) transfers compressed blocks. With zstd-3, 10GB logical = 657MB actual transfer. Litestream transfers uncompressed SQLite data.
- Block-level incrementals: ZFS incrementals only transfer changed blocks. Litestream WAL segments contain logical operations that must be replayed.
- No replay overhead: ZFS receive writes blocks directly to disk. Litestream must decompress and replay WAL operations against SQLite.
- Native encryption: ZFS encrypted blocks transfer as-is. No decrypt/re-encrypt overhead.
Where Litestream still has advantages:
- Simpler infrastructure: No ZFS pool management, works with any filesystem
- Point-in-time recovery: Can restore to any WAL position, not just snapshots
- Smaller storage footprint: WAL compression can be more efficient for certain workloads
- Cross-platform: Works anywhere SQLite runs
Caveats on the comparison:
- Different instance size: ZFS tested on c3-standard-22; Litestream best results on c3-standard-88
- Same disk type: Both tested on pd-balanced (ZFS via NVMe interface)
- Different test data: ZFS used synthetic JSON; Litestream used real SQLite databases
- Hyperdisk potential: Litestream saw 35-40% improvement with Hyperdisk; ZFS would likely see similar gains
For ClusterDuck's use case (fast failover, encrypted data at rest):
- ZFS wins decisively on restore speed due to compressed/encrypted block transfer
- 10GB space restores in ~9 seconds vs ~100+ seconds with Litestream
- Instance: c3-standard-22 (22 vCPUs, 88GB RAM, 23 Gbps network)
- Storage: pd-balanced 500GB (via NVMe interface)
- Encryption: AES-256-GCM (native ZFS encryption)
- Compression: zstd-3
- GCS: Standard storage, same region (us-west1)