Generated: January 26, 2026
Models: V2 (fixed training: BatchNorm, Dropout, proper 'both' spread)
Universe: S&P 500 stock pairs
V2 models fix critical bugs in the original training code:
| Issue | V1 (Broken) | V2 (Fixed) |
|---|---|---|
| Double-softmax | Loss stuck at 0.3133, no learning | Loss decreases, models learn |
| Regularization | None (instant overfit) | BatchNorm + Dropout(0.3) |
| 'both' spread | Used only log features (90) | Uses both features (180) |
| Probability calibration | 0.99-1.0 (overconfident) | 0.50-0.90 (realistic) |
Key change: V2 probabilities are properly calibrated. A 0.75 probability now means something real, unlike V1's meaningless 0.99.
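The 0.3133 plateau in the table is consistent with what a double softmax does in a two-class problem. If the network already outputs softmax probabilities and the loss applies softmax again, the true-class probability seen by the loss can never exceed e/(1+e) ≈ 0.731, so cross-entropy cannot fall below ln(1 + e^-1) ≈ 0.3133. The sketch below reproduces that floor in plain Python. It is an illustration of the arithmetic, not the original training code, and the helper names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, target):
    """Cross-entropy that applies softmax internally, as most framework losses do."""
    return -math.log(softmax(logits)[target])

# A network that is as confident as possible about class 0.
logits = [50.0, -50.0]

# Correct usage: raw logits go into the loss.
correct_loss = cross_entropy_from_logits(logits, target=0)

# Buggy usage: softmax output is fed into a loss that applies softmax again.
buggy_loss = cross_entropy_from_logits(softmax(logits), target=0)

# Analytical floor for the two-class double-softmax case.
loss_floor = math.log(1 + math.exp(-1))

print(round(correct_loss, 4), round(buggy_loss, 4), round(loss_floor, 4))
```

Even a perfectly confident model bottoms out at the floor in the buggy path, which matches a loss curve that appears stuck.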
These pairs appear in 4+ models with p>0.70 and cointegration confirmed:
| Pair | Models | Avg Prob | Z-Score | Action |
|---|---|---|---|---|
| CPB/GIS | 4 | 0.772 | -1.36 | MONITOR (wait for z < -2) |
| AVB/EQR | 4 | 0.775 | -2.24 | BUY SPREAD (Long AVB, Short EQR) |
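The consensus rule above (4+ models, p>0.70, cointegration confirmed) can be sketched as a small aggregation over per-model signal rows. This is a minimal sketch under assumptions: the record layout and the `pair` and `model` keys are hypothetical, and the sample rows are made up for illustration rather than taken from the model outputs.

```python
from collections import defaultdict

def consensus_pairs(signals, min_models=4, min_prob=0.70):
    """Return {pair: (model_count, average_probability)} for pairs that
    clear the probability threshold and are cointegrated in enough models."""
    by_pair = defaultdict(dict)
    for s in signals:
        if s["is_cointegrated"] and s["compression_probability"] > min_prob:
            # Keyed by model so a model is counted once per pair.
            by_pair[s["pair"]][s["model"]] = s["compression_probability"]
    result = {}
    for pair, probs in by_pair.items():
        if len(probs) >= min_models:
            result[pair] = (len(probs), sum(probs.values()) / len(probs))
    return result

# Illustrative rows only (hypothetical tickers and probabilities).
example_signals = [
    {"pair": "AAA/BBB", "model": f"m{i}",
     "compression_probability": 0.72 + 0.01 * i, "is_cointegrated": True}
    for i in range(4)
] + [
    {"pair": "CCC/DDD", "model": "m0",
     "compression_probability": 0.90, "is_cointegrated": True},
]

print(consensus_pairs(example_signals))
```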
Buy-spread signals (negative z-score: long the first ticker, short the second):

| Pair | Model | Prob | Z-Score | Trade |
|---|---|---|---|---|
| AVB/EQR | both_100ep | 0.774 | -2.24 | Long AVB, Short EQR |
| EQR/UDR | coint_25ep | 0.758 | -3.18 | Long EQR, Short UDR |
| GEHC/RVTY | coint_25ep | 0.787 | -2.89 | Long GEHC, Short RVTY |
| HBAN/MTB | both_50ep | 0.731 | -3.34 | Long HBAN, Short MTB |
| ETN/GNRC | coint_25ep | 0.704 | -2.08 | Long ETN, Short GNRC |
| ALL/TRV | both_100ep | 0.735 | -1.68 | Long ALL, Short TRV |
Sell-spread signals (positive z-score: short the first ticker, long the second):

| Pair | Model | Prob | Z-Score | Trade |
|---|---|---|---|---|
| MCHP/NXPI | both_50ep | 0.745 | +2.04 | Short MCHP, Long NXPI |
| PPL/WEC | both_50ep | 0.717 | +1.97 | Short PPL, Long WEC |
| DG/DLTR | both_50ep | 0.729 | +1.69 | Short DG, Long DLTR |
| Model | Cointegrated Pairs | Prob Range | Best For |
|---|---|---|---|
| both_100ep_v2 | 9 | 0.71-0.82 | Most signals, balanced |
| both_50ep_v2 | 7 | 0.72-0.83 | Highest individual probs |
| log_50ep_v2 | 6 | 0.71-0.78 | Traditional pairs |
| log_75ep_v2 | 4 | 0.71-0.79 | Conservative |
| coint_25ep_v2 | 3 | 0.70-0.79 | Extreme z-scores |
| coint_10ep_v2 | 3 | 0.71-0.74 | Quick validation |
Recommendation: Use both_100ep_v2 as the primary model; it detects the most cointegrated pairs (9).
V1 Models (BROKEN):
- Probabilities: 0.99-1.0 for everything
- Interpretation: Model not learning, outputting constant predictions

V2 Models (FIXED):
- both_100ep_v2: 0.50-0.87, mean=0.67
- both_50ep_v2: 0.50-0.90, mean=0.66
- log_75ep_v2: 0.50-0.92, mean=0.65
- log_50ep_v2: 0.50-0.88, mean=0.63
- coint_25ep_v2: 0.50-0.90, mean=0.62
- coint_10ep_v2: 0.50-0.81, mean=0.59

V2 probabilities are properly calibrated - use p>0.70 threshold (not p>0.85).
| Column | What It Means | How to Use It |
|---|---|---|
| ticker_a / ticker_b | The two stocks in the pair | Trade both simultaneously |
| spread_zscore | How far the spread is from its historical mean, in standard deviations | < -2: A is cheap relative to B; > +2: A is expensive relative to B |
| compression_probability | Model-estimated probability that the spread will narrow | > 0.70 = actionable signal (V2) |
| is_cointegrated | Statistical test indicates the spread between the two stocks is mean-reverting | True = much safer to trade |
| hedge_ratio | Shares of B per share of A | Use for position sizing |
| Z-Score | Interpretation | Action |
|---|---|---|
| -3.0 | Extremely oversold | STRONG BUY spread |
| -2.0 | Significantly oversold | BUY spread |
| -1.5 | Entry threshold | Consider entry |
| 0.0 | Normal | EXIT position |
| +1.5 | Entry threshold | Consider entry |
| +2.0 | Significantly overbought | SELL spread |
| +3.0 | Extremely overbought | STRONG SELL spread |
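As a rough illustration of how the z-score levels above translate into actions, the sketch below computes a z-score from a window of spread values and maps it to the table's labels. The lookback window, the use of the sample standard deviation, and the way values between the listed levels are bucketed are all assumptions; the function names are hypothetical.

```python
import statistics

def spread_zscore(spread_history):
    """Z-score of the latest spread value against the window's mean and
    sample standard deviation (window choice is an assumption)."""
    mean = statistics.mean(spread_history)
    std = statistics.stdev(spread_history)
    return (spread_history[-1] - mean) / std

def zscore_action(z):
    """Map a z-score to the action labels used in the table.
    Negative z means the spread is oversold (buy), positive means overbought (sell)."""
    side = "BUY" if z < 0 else "SELL"
    size = abs(z)
    if size >= 3.0:
        return f"STRONG {side} spread"
    if size >= 2.0:
        return f"{side} spread"
    if size >= 1.5:
        return "Consider entry"
    return "No new entry"

for z in (-3.18, -2.24, -1.36, 1.69, 2.04):
    print(z, zscore_action(z))
```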
Current Signal:
- Z-Score: -2.24 (AVB undervalued vs EQR)
- Probability: 0.775 (77.5% confidence spread compresses)
- Cointegrated: Yes (statistically confirmed relationship)
- Hedge Ratio: 0.99
Trade Setup ($10k per leg):
- LONG: AVB - Buy $10,000 worth
- SHORT: EQR - Short $10,000 worth (hedge_ratio ≈ 1.0)
- Entry: Z-score = -2.24
- Exit: Z-score crosses 0 (spread normalized)
- Stop: Z-score < -4.0 (relationship may be broken)
Expected Outcome: If the z-score returns to 0, the combined long/short position profits from the spread converging, even if one of the two legs loses money on its own.
- is_cointegrated = True
- compression_probability > 0.70 (V2 threshold)
- |spread_zscore| > 1.5 (meaningful divergence)
- Sector makes sense (same industry preferred)
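The numeric entry checks can be sketched as a single predicate over one signal row. This is a minimal sketch that assumes each row is a dict keyed by the column names described earlier; the function name is hypothetical, and the sector check is left to manual review because the checklist only states a preference.

```python
def passes_entry_checklist(row, min_prob=0.70, min_abs_z=1.5):
    """True when a signal row clears the three numeric entry checks.
    The sector sanity check is not automated here."""
    return (
        bool(row["is_cointegrated"])
        and row["compression_probability"] > min_prob
        and abs(row["spread_zscore"]) > min_abs_z
    )

# Values taken from the report's AVB/EQR and CPB/GIS rows.
avb_eqr = {"ticker_a": "AVB", "ticker_b": "EQR", "spread_zscore": -2.24,
           "compression_probability": 0.775, "is_cointegrated": True}
cpb_gis = {"ticker_a": "CPB", "ticker_b": "GIS", "spread_zscore": -1.36,
           "compression_probability": 0.772, "is_cointegrated": True}

print(passes_entry_checklist(avb_eqr), passes_entry_checklist(cpb_gis))
```

CPB/GIS fails only on the divergence check, which agrees with its MONITOR status.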
- Profit exit: Z-score crosses 0 (spread normalized)
- Time exit: 20 trading days max hold
- Stop loss: |Z-score| > 4.0 (relationship breakdown)
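The three exit rules can be combined into one check per open position. The sketch below is one possible reading: "crosses 0" is treated as the z-score reaching zero or changing sign relative to entry, and the order in which the rules are evaluated is an assumption. The function name and return labels are hypothetical.

```python
def exit_reason(entry_z, current_z, days_held, max_days=20, stop_abs_z=4.0):
    """Return which exit rule fires, or None to keep holding."""
    # Profit exit: spread has normalized (z reached zero or flipped sign).
    if current_z == 0 or (entry_z < 0) != (current_z < 0):
        return "profit_exit"
    # Stop loss: divergence has grown past the breakdown level.
    if abs(current_z) > stop_abs_z:
        return "stop_loss"
    # Time exit: position has been open for the maximum holding period.
    if days_held >= max_days:
        return "time_exit"
    return None

print(exit_reason(-2.24, 0.10, 7))    # spread normalized
print(exit_reason(-2.24, -4.10, 3))   # relationship may be broken
print(exit_reason(-2.24, -1.00, 20))  # held too long
print(exit_reason(-2.24, -0.50, 5))   # still holding
```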
- Max 5% of portfolio per pair
- Use hedge_ratio for dollar-neutral positioning
- Never more than 3 pairs from same sector
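The position limits can be checked before opening a new pair. This is a sketch under assumptions: the 5% cap is applied to the combined notional of both legs, each open position is a dict with a hypothetical `sector` key, and hedge-ratio sizing is not modeled here.

```python
def can_open_pair(open_pairs, new_sector, pair_notional, portfolio_value,
                  max_pair_fraction=0.05, max_pairs_per_sector=3):
    """True when a new pair respects the per-pair size cap and the
    per-sector concentration limit."""
    if pair_notional > max_pair_fraction * portfolio_value:
        return False
    same_sector = sum(1 for p in open_pairs if p["sector"] == new_sector)
    return same_sector < max_pairs_per_sector

# Illustrative book: three open pairs in one sector (hypothetical labels).
open_pairs = [{"sector": "REIT"}, {"sector": "REIT"}, {"sector": "REIT"}]

print(can_open_pair(open_pairs, "Banks", 20_000, 500_000))  # allowed
print(can_open_pair(open_pairs, "REIT", 20_000, 500_000))   # sector limit reached
print(can_open_pair(open_pairs, "Banks", 30_000, 500_000))  # exceeds 5% cap
```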
KatanaEquities is built on KatanaDNN, a proprietary 5-layer neural network originally developed for fixed income markets.
Background:
- Originated from Katana Labs (ING spinout, 2019)
- Analyzed 200M+ bond pairs with 91% accuracy
- Integrated on Bloomberg Terminal (325K+ users)
- IP acquired in 2022, adapted for equity markets
Architecture: 90→90→90→90→90→2 (or 180→90→... for 'both' spread type)
You can't vibe code a 5-layer DNN - this represents production-grade ML infrastructure, not weekend prompt engineering.
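To give a sense of scale for the layer sizes quoted above, the sketch below counts weights and biases for a stack of fully connected layers. It assumes plain dense layers with biases and ignores BatchNorm parameters, whose placement the report does not specify; the helper name is hypothetical.

```python
def linear_param_count(layer_sizes):
    """Total weights plus biases for consecutive fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Layer sizes as written in the report: 90→90→90→90→90→2.
single_spread = [90, 90, 90, 90, 90, 2]
print(linear_param_count(single_spread))

# For the 'both' spread type only the first layer's input changes (180→90).
print(linear_param_count([180, 90]))
```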
- Pairs Trading Parameter Optimization (arXiv) - Optimal entry 1.42, exit 0.37
- Cointegration vs Distance Methods - Cointegration provides stable returns
- Neural Networks for Pairs Trading
Generated by KatanaEquities V2 - Properly trained models with BatchNorm + Dropout