Date: 2026-03-08
Model: google/gemma-3-1b-it (262,144-token vocab)
Eval dataset: wikitext-2-raw-v1 validation (254,828 positions)
Code: voltropy/shortlist@8168cac
We found that a static, frequency-ranked token shortlist with a simple margin-based fallback to full-vocabulary scoring matches the full-vocabulary baseline more closely than a trained neural router, while requiring zero parameters, zero training, and zero inference-time routing.
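A minimal sketch of the idea, on a toy vocabulary: score only the top-K most frequent tokens, and fall back to the full vocabulary whenever the top-1/top-2 margin within the shortlist is small. The shortlist size, margin threshold, and `predict` helper here are illustrative assumptions, not the repository's actual values.

```python
import numpy as np

VOCAB = 1000       # toy vocabulary (the real model has 262,144 tokens)
SHORTLIST = 64     # top-K tokens by corpus frequency (assumed precomputed)
MARGIN = 2.0       # fallback threshold (hypothetical value)

# Assume the shortlist is simply the K most frequent token ids.
shortlist_ids = np.arange(SHORTLIST)

def predict(logits: np.ndarray) -> tuple[int, bool]:
    """Return (predicted token id, used_fallback)."""
    short = logits[shortlist_ids]
    top2 = np.sort(short)[-2:]        # two largest shortlist logits
    margin = top2[1] - top2[0]
    if margin >= MARGIN:
        # Confident: argmax over the shortlist only.
        return int(shortlist_ids[short.argmax()]), False
    # Low margin: fall back to scoring the full vocabulary.
    return int(logits.argmax()), True

demo = np.zeros(VOCAB)
demo[10] = 5.0                        # strong in-shortlist winner
print(predict(demo))                  # → (10, False)
```

Because the shortlist is static and the fallback is a single comparison, there is nothing to train and no per-token routing network to run.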