Last updated: 2026-03-02
Written by Claude; analysis and optimization strategy by Gemini Deep Think; implementation and modification by Claude
Three plans informed the fix for performance regressions on the perf-optimization branch. Gemini Deep Think v1 made the original optimizations: batch search improved substantially (60-64% in Rust, 5.7x in Python), but single-query APX regressed by up to 383% and some exact cases regressed by up to 58%. Gemini v2 was a revised plan written after the regressions surfaced. Claude independently diagnosed the root causes and produced the final 6-step implementation plan.
The core story: Gemini Deep Think had the high-level strategy largely right for batch, but missed that its own Change 4 ("Remove ndarray allocations from ANN reranking") caused the APX regression. It did not account for blas-accelerate being enabled in Python builds, which means ndarray::dot() dispatches to Apple Accelerate, which runs on the AMX coprocessor, rather than to the generic Rust matrixmultiply crate. Removing
