FCU is very fast — negligible compared to newPayload:
| Metric | Value |
| --- | --- |
| Mean | 0.54ms |
| Median | 0.53ms |
| Max | 0.86ms |
Key Observations
1. Execution dominates at 68% of NP time
On these ~10M gas blocks, EVM execution takes 68.4% of newPayload time while state root takes 31.6%. This is different from the pattern we saw on very small blocks (<5M gas) where state root dominated.
2. State root is well-optimized for this workload
P50 state root of 6.85ms is good for blocks touching ~200 accounts / ~240 storage slots
The parallel state root task (reth_trie_state_root_task) is working effectively
Proof task worker idle times (P50: 2ms) suggest good parallelism utilization
3. P99 tail latency driven by outlier blocks
P99 NP latency is 92ms vs P50 of 21ms — a 4.4x ratio
The max (205ms) suggests occasional blocks with unusual state access patterns
These outliers are worth investigating individually
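As a starting point for that investigation, the sketch below recomputes the P99/P50 ratio from the reported figures and flags blocks whose latency exceeds a multiple of the median. The `slow_blocks` helper, the 4x threshold, and the sample latencies are hypothetical illustrations, not part of the benchmark tooling or its data.

```python
from statistics import median

def slow_blocks(latency_ms_by_block, factor=4.0):
    """Return (block, latency_ms) pairs slower than factor x the median, slowest first."""
    med = median(latency_ms_by_block.values())
    slow = [(blk, ms) for blk, ms in latency_ms_by_block.items() if ms > factor * med]
    return sorted(slow, key=lambda item: item[1], reverse=True)

# Ratio implied by the reported figures (P50 = 20.84 ms, P99 = 92 ms).
P50_MS, P99_MS = 20.84, 92.0
p99_over_p50 = P99_MS / P50_MS  # roughly 4.4

# Hypothetical per-block latencies (ms), keyed by made-up block ids; NOT measured data.
sample = {1: 20.0, 2: 21.0, 3: 22.0, 4: 19.0, 5: 95.0}
flagged = slow_blocks(sample)  # only block 5 exceeds 4 x the median of 21 ms
```

Once the slow blocks are identified, their gas usage and touched account/slot counts can be compared against typical blocks to see what distinguishes them.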
4. Transaction scheduling overhead is minimal
Tx wait time (0.064s total) is <1% of execution — the processor spawn is not a bottleneck
Bottleneck Analysis
For ~10M gas blocks on mainnet (current state on main):
NP latency breakdown (P50 = 20.84ms):
┌─────────────────────┬─────────┬───────┐
│ EVM Execution       │ 10.32ms │ 49.5% │
│ State Root          │  6.85ms │ 32.9% │
│ Overhead/scheduling │  3.67ms │ 17.6% │
└─────────────────────┴─────────┴───────┘
At P95 (37.78ms):
┌─────────────────────┬─────────┬───────┐
│ EVM Execution       │ 23.55ms │ 62.3% │
│ State Root          │ 11.95ms │ 31.6% │
│ Overhead/scheduling │  2.28ms │  6.0% │
└─────────────────────┴─────────┴───────┘
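The percentages in both breakdowns can be reproduced from the component latencies. The sketch below is only a cross-check of the arithmetic; the `breakdown` helper and the dictionary keys are illustrative names, and the overhead figure is taken as the residual between total NP latency and execution plus state root.

```python
def breakdown(components_ms):
    """Return (total_ms, {component: share of total in percent})."""
    total = sum(components_ms.values())
    shares = {name: 100.0 * ms / total for name, ms in components_ms.items()}
    return total, shares

# Component latencies as reported in the two breakdowns (milliseconds).
p50 = {"evm_execution": 10.32, "state_root": 6.85, "overhead": 3.67}
p95 = {"evm_execution": 23.55, "state_root": 11.95, "overhead": 2.28}

total50, shares50 = breakdown(p50)  # total close to 20.84 ms
total95, shares95 = breakdown(p95)  # total close to 37.78 ms
```

One caveat when reading these numbers: percentiles are not additive in general, so the P50 of each component need not come from the same block as the P50 of total latency. The overhead row is therefore best read as an approximate residual rather than a directly measured quantity.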
Recommendations
High Impact
EVM execution optimization — At 68% of time, this is the biggest lever. Profile individual opcode hotspots in the EVM execution path. The 0.108ms/tx average suggests room for improvement in state access patterns during execution.
Investigate P99 outliers — The 92ms P99 and 206ms max are 4-10x above median. Understanding what makes these blocks slow could help improve tail latency.
Medium Impact
State root parallelism — While P50 is good at 6.85ms, the proof task worker idle times (P90: 11.2ms, P99: 28.8ms) suggest some work distribution imbalance. The storage worker idle time (94s total across all workers) indicates potential for better work balancing.
Reduce overhead gap — There's ~3.7ms of unaccounted time at P50 between execution+state_root and total NP latency. This could be FCU-related overhead, notification, or other post-execution work.
Lower Priority for This Workload
Persistence — Not a bottleneck for ~10M gas blocks with --wait-for-persistence
CSV Data
Results are saved at:
v2 Results (with warmup — fair comparison)
Methodology: Each chunk size gets a full warmup pass (a 345-block replay to warm the OS page cache), then an unwind followed by a measured pass. This eliminates the cold-cache bias present in v1.
Setup: 345 blocks (24463559–24463903), ~10M gas each, MDBX (non-edge), commit f5cf90227b, jemalloc+asm-keccak
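The warmup-then-measure procedure can be sketched as a small harness. Everything here is an assumption made for illustration: `replay`, `unwind`, and `clock` are hypothetical callables supplied by the caller, not reth commands or flags, and the extra unwind after the measured pass (so the next chunk size starts from the same state) is inferred rather than stated in the methodology.

```python
import time

def measure_chunk_sizes(chunk_sizes, replay, unwind, total_gas, clock=time.perf_counter):
    """Return {chunk_size: throughput in Ggas/s} using a warmup pass before each measurement.

    replay(chunk) -- hypothetical: replays the full block range with the given chunk size
    unwind()      -- hypothetical: rolls the node back to the start of the block range
    total_gas     -- total gas across the replayed blocks
    """
    results = {}
    for chunk in chunk_sizes:
        replay(chunk)   # warmup pass: warms the OS page cache, timing discarded
        unwind()        # back to the starting block

        start = clock()
        replay(chunk)   # measured pass
        elapsed = clock() - start

        unwind()        # assumed: reset state before the next chunk size
        results[chunk] = total_gas / elapsed / 1e9
    return results
```

For the setup described here, `total_gas` would be on the order of 345 blocks x ~10M gas, and `chunk_sizes` would cover the values compared in the results (for example 15, 30, 60, 120, 240).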
Best: chunk=15 and chunk=30 are essentially tied at ~1.197 Ggas/s (+1.2% over default)
Worst: chunk=240 at 1.0647 Ggas/s (−10% vs default)
Default (60) is reasonable but 15–30 offer marginal improvement
Larger chunks hurt: 120 (−2.7%) and 240 (−10%) show clear degradation
v1 spread was mostly cold-cache artifact: v1 showed 2.4x spread (0.52–1.21), v2 shows only 12% spread (1.06–1.20)
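The relative figures above can be cross-checked from the two reported throughputs. In the sketch below, `pct_change` is an illustrative helper, and the default-chunk throughput is back-derived from the "+1.2% over default" statement, so it is an approximation rather than a reported measurement.

```python
def pct_change(value, baseline):
    """Percent change of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline

best = 1.197    # Ggas/s, chunk=15 and chunk=30 (reported as approximate)
worst = 1.0647  # Ggas/s, chunk=240

# Spread between best and worst in v2, relative to the worst result.
spread_pct = pct_change(best, worst)  # roughly 12

# Derived, not reported: default (chunk=60) throughput implied by "+1.2% over default".
implied_default = best / 1.012
worst_vs_default = pct_change(worst, implied_default)  # roughly -10
```

Both values agree with the reported 12% spread and the 10% drop for chunk=240, which suggests the percentages were computed against the default chunk size of 60.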
Recommendation
The default of 60 is fine. Chunk sizes 15–30 offer ~1% improvement but the difference is within noise. Avoid chunk sizes ≥120 which show measurable degradation. The key finding is that cold OS page cache was the primary variable in v1, not chunk size.