Skip to content

Instantly share code, notes, and snippets.

@willkelly
Created December 4, 2025 19:50
Show Gist options
  • Select an option

  • Save willkelly/f4d3a5b26c63f3346c4dcf8109d55696 to your computer and use it in GitHub Desktop.

Select an option

Save willkelly/f4d3a5b26c63f3346c4dcf8109d55696 to your computer and use it in GitHub Desktop.
traverseDAG performance fix - diamond problem solution with WeakMap memoization

traverseDAG Performance Fix

Problem

CPU pegging at 100% when displaying many charms (7+) due to excessive traverseDAG calls. The system was making 18,623 traversal calls to traverse only 338 unique objects - a 55x repetition factor.

Root Cause: The Diamond Problem

CharmA ──→ SharedDocX
     │
CharmB ──→ SharedDocX  (traversed AGAIN!)
     │
CharmC ──→ SharedDocX  (traversed AGAIN!)

When multiple charms link to shared documents, the old CompoundCycleTracker used a stack-based approach with using statements:

// OLD: Stack-based tracking (cleared on scope exit)
using t = tracker.include(doc.value, schemaContext);
// Symbol.dispose removes entry when scope exits

This prevented cycles (A→B→A) but NOT re-traversals via different paths. Each charm independently traversed the same shared documents.

Solution

Two-Tier Tracking Strategy

Replace stack-based cycle detection with persistent memoization:

  1. WeakMap for cycle detection (inProgress) - tracks objects currently being traversed
  2. WeakMap for result caching (completed) - stores result references (not copies)
  3. Schema key deduplication - caches JSON.stringify results to reduce string memory
export class CompoundCycleTracker<PartialKey, ExtraKey> {
  private inProgress = new WeakMap<object, Set<string>>();  // Cycle detection
  private completed = new WeakMap<object, Map<string, unknown>>();  // Memoization
  private schemaKeyCache = new WeakMap<object, string>();  // Dedup schema keys

  include(partialKey, extraKey, context): DisposableWithCache | null {
    const obj = partialKey as object;
    const schemaKey = this.getSchemaKey(extraKey);

    // Return cached result if available
    if (this.completed.get(obj)?.has(schemaKey)) {
      return { cached: this.completed.get(obj)!.get(schemaKey), [Symbol.dispose]: () => {} };
    }

    // Detect cycles
    if (this.inProgress.get(obj)?.has(schemaKey)) {
      return null;  // Cycle detected
    }

    // First visit - mark in-progress
    this.inProgress.get(obj)!.add(schemaKey);
    return {
      setCachedResult: (result) => this.completed.get(obj)!.set(schemaKey, result),
      [Symbol.dispose]: () => this.inProgress.get(obj)?.delete(schemaKey)
    };
  }
}

Key Design Decisions

Why WeakMap?

  • Automatic garbage collection when objects are no longer referenced
  • Uses object identity rather than serialization
  • No manual cache cleanup needed

Why cache result references, not copies?

  • Memory efficient - shares underlying objects
  • Traversal produces immutable results, safe to reuse references
  • Side effects (document loading, schema tracking) are idempotent

How does cycle detection still work?

  • inProgress tracks objects currently in the call stack
  • Still detects true cycles (A→B→A)
  • But allows revisiting completed nodes via different paths

How are $alias vs cross-document links handled?

  • $alias (intra-document): Same document, different path - always re-traverse (path-dependent)
  • Cross-document links: Different document ID - use document-level caching with includeDocument()
// In traverseDAG - check if we crossed document boundaries
const crossedDocBoundary = newDoc.address.id !== doc.address.id;
if (crossedDocBoundary) {
  // Use document-level tracking to avoid diamond problem
  const tracker_result = tracker.includeDocument(
    newDoc.address.id,
    newDoc.address.path,
    SchemaAll
  );
  if (tracker_result === null) return null; // Already visited
  const result = this.traverseDAG(newDoc, tracker, schemaTracker);
  tracker_result.setCachedResult(result);
  return result;
}
// Intra-document reference ($alias) - always re-traverse
return this.traverseDAG(newDoc, tracker, schemaTracker);

Results

Metric Before After Improvement
traverseDAG calls 18,623 700 26.6x fewer
Profile file size 21 MB 8.2 MB 2.6x smaller
String memory (JSON.stringify) High duplication Deduplicated 38.5% reduction
Repetition factor ~55x ~2x Much less redundant work

The fix eliminates redundant re-traversals of shared documents while maintaining correct cycle detection and handling both intra-document ($alias) and cross-document references correctly.

Files Modified

  • /home/wkelly/src/commontoolsinc/labs/packages/runner/src/traverse.ts
    • CompoundCycleTracker class: Replaced Map-based stack with WeakMap-based memoization
    • traverseDAG method: Added caching for arrays/objects and cross-document tracking
    • SchemaObjectTraverser.traversePointerWithSchema: Added cross-document tracking
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment