yongkangc/execution-cache-guard-review.md

## execution-cache-guard-review.md

      
Execution Cache Usage Guard Review (mattsse comment)

Context

The PR introduces SavedCache::usage_guard, an Arc<()> that is meant to ensure the cached execution state for a parent block cannot be reused while prewarm tasks are still mutating it. ExecutionCache::get_cache_for only returns a cache when SavedCache::is_available() sees the guard's strong count at 1.
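The availability check described above can be modelled in a few lines. This is a minimal sketch, not the PR's code: the struct below is a stand-in that keeps only the guard, and it assumes the check is a plain strong-count comparison, as the description suggests.

```rust
use std::sync::Arc;

/// Simplified stand-in for the PR's `SavedCache`. The real struct also
/// carries the caches and metrics, which are omitted here.
#[derive(Clone, Default)]
pub struct SavedCache {
    /// Shared marker: every clone of `SavedCache` bumps the strong count.
    usage_guard: Arc<()>,
}

impl SavedCache {
    /// True only when no clone other than `self` is alive.
    pub fn is_available(&self) -> bool {
        Arc::strong_count(&self.usage_guard) == 1
    }
}
```

Under this model the stored instance reports itself unavailable for exactly as long as some clone of it is alive, and no longer.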

Why the guard never trips today


1. PayloadProcessor::spawn_caching_with calls self.cache_for(env.parent_hash) to fetch the cached state.
2. Immediately after, .split() consumes the SavedCache and returns (ExecutionCache, CachedStateMetrics) so we can wire them into the PrewarmContext.
3. Dropping that SavedCache clone releases the usage_guard Arc, so the strong count on the shared instance in ExecutionCache.inner falls back to 1.
4. The prewarm task only keeps the raw ExecutionCache and metrics clones. Those clones share the underlying moka caches, but they do not hold on to the guard anymore.
5. Any concurrent caller can now invoke execution_cache.get_cache_for(parent_hash) and is_available() will happily return true, because the only live guard is the one in ExecutionCache.inner itself.
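The sequence above can be reproduced with a small model. This is a sketch under assumptions: Caches and Metrics are hypothetical stand-ins for the two values that .split() returns, and guard_released_by_split is an illustrative helper, not a function from the PR.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the cache handle returned by `.split()`.
#[derive(Clone, Default)]
struct Caches;

/// Hypothetical stand-in for the metrics returned by `.split()`.
#[derive(Clone, Default)]
struct Metrics;

#[derive(Clone, Default)]
struct SavedCache {
    caches: Caches,
    metrics: Metrics,
    usage_guard: Arc<()>,
}

impl SavedCache {
    fn is_available(&self) -> bool {
        Arc::strong_count(&self.usage_guard) == 1
    }

    /// Consumes `self`. Only the caches and metrics are moved out, so the
    /// guard clone owned by `self` is dropped when this function returns.
    fn split(self) -> (Caches, Metrics) {
        (self.caches, self.metrics)
    }
}

/// Returns the stored cache's availability (before split, after split).
fn guard_released_by_split() -> (bool, bool) {
    let stored = SavedCache::default(); // the shared instance
    let checked_out = stored.clone(); // what the payload processor fetches
    let before = stored.is_available();
    // The prewarm side keeps only these two values; the guard is gone.
    let (_caches, _metrics) = checked_out.split();
    let after = stored.is_available();
    (before, after)
}
```

In this model guard_released_by_split() returns (false, true): the stored cache looks free again while the split-off values are still alive and in use.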

This is exactly what mattsse pointed out by linking:

- payload_processor/mod.rs:317 – where we call .split() and drop the guard.
- payload_processor/prewarm.rs:249 – where the PrewarmContext stores only the cloned caches/metrics.
- payload_processor/prewarm.rs:158-162 – where save_cache later updates the shared cache, by which point the guard is long gone.

Consequences


- Two prewarm tasks can operate concurrently on the same cached state if another payload for the same parent hash sneaks in. The guard was supposed to prevent that; as it stands, nothing protects the cache against corruption.
- Because the guard field is visible to the whole crate (pub(crate) usage_guard), other modules could still clone or drop it even if we fixed the lifetime issue, which makes exclusivity harder to reason about.

What needs to change


- Hold on to the guard for as long as prewarm execution uses the cache. The easiest fix is to store either the SavedCache itself or the guard Arc inside PrewarmContext, which lives until save_cache runs.
- Make the guard inaccessible from outside the cached-state module and expose a dedicated API (e.g. SavedCache::lease()) so future call sites cannot accidentally drop it too early.
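One possible shape for the two changes, as a sketch only: the CacheLease type, the cached_state module layout, and the PrewarmContext::new signature are all assumptions made for illustration, and the real structs carry more fields than shown.

```rust
mod cached_state {
    use std::sync::Arc;

    /// Hypothetical opaque token. Holding it keeps the cache marked as in
    /// use; the inner Arc is private, so callers cannot clone or inspect it.
    pub struct CacheLease {
        _guard: Arc<()>,
    }

    /// Simplified stand-in for `SavedCache` with a private guard field.
    #[derive(Default)]
    pub struct SavedCache {
        usage_guard: Arc<()>,
    }

    impl SavedCache {
        pub fn is_available(&self) -> bool {
            Arc::strong_count(&self.usage_guard) == 1
        }

        /// The only way to take a reference on the guard from outside.
        pub fn lease(&self) -> CacheLease {
            CacheLease { _guard: Arc::clone(&self.usage_guard) }
        }
    }
}

use cached_state::{CacheLease, SavedCache};

/// Stand-in for `PrewarmContext`: it owns the lease for its whole lifetime,
/// so the cache stays unavailable until the context is dropped.
pub struct PrewarmContext {
    _lease: CacheLease,
}

impl PrewarmContext {
    pub fn new(cache: &SavedCache) -> Self {
        Self { _lease: cache.lease() }
    }
}
```

With this layout, is_available() on the stored cache stays false from PrewarmContext::new until the context is dropped, and no code outside cached_state can release the guard early by accident.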

Once that is in place, SavedCache::is_available() will only return true after prewarm is completely done and has either updated or discarded that cached state.