A phase-gated bootstrap guide for running large models across two Apple Silicon nodes using EXO. Battle-tested on Mac Studio M4 Max (128 GB unified) pairs over Thunderbolt 5.
Reference performance (Qwen3.5-122B-A10B-4bit):
- ~52 tok/s sustained at 512–1024 token range
- Stable at concurrency=2 (p95 ~10.37s)