@johnsoncodehk
Last active January 21, 2026 15:12

These are some notes on the performance work that went into alien-signals. I'm sharing them not as a definitive guide, but as a log of a few key discoveries. The hope is that some of these findings might be useful to others tackling similar problems in high-performance JavaScript.

The Origin: Push-Pull-Push

My journey into the depths of reactivity performance began with Vue. I was trying to solve a specific problem in Vue 3.4: even if a computed's value didn't change, it would still trigger downstream computations and effects. This seemed inefficient. My attempt to fix this resulted in a pull request (vuejs/core#5912) that, after a year of discussion, was eventually merged. This PR introduced the Push-Pull-Push model to Vue 3.4, a model also adopted by libraries like Reactively.

For a time, I thought this was near-perfect. Then I saw the plans for Vue 3.5, which adopted a doubly-linked list but also moved to a pure pull-based model. I was still convinced that Push-Pull-Push had untapped potential, so I started alien-signals as a playground to continue exploring this model's limits.

Why stick with Push-Pull-Push when Vue moved on? Pure pull models often rely on a global version counter to detect changes. This creates a hidden performance trap:

[Diagram: Pull vs Push-Pull-Push comparison]

An unrelated signal update can increment the global version, forcing every subsequent read in the application to perform a deep, expensive check up its dependency chain. Push-Pull-Push avoids this "one-bad-apple" problem by propagating dirty states precisely along the dependency graph, without relying on global versions.
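To make the trap concrete, here is a deliberately naive sketch of a pure pull model built around a global version counter. The API and names here are hypothetical (not alien-signals or any real library); the point is only to show how an unrelated write forces an unnecessary re-check.

```typescript
// Hypothetical pure-pull model: every cached computed is invalidated
// by ANY write anywhere, because staleness is tracked with one global counter.
let globalVersion = 0;

interface PullSignal<T> { value: T; }

function pullSignal<T>(value: T): PullSignal<T> {
  return {
    get value() { return value; },
    set value(v: T) { value = v; globalVersion++; }, // every write bumps the global
  };
}

let depWalks = 0; // count how often we re-walk dependencies

function pullComputed<T>(fn: () => T): { readonly value: T } {
  let cached!: T;
  let lastSeen = -1;
  return {
    get value(): T {
      if (lastSeen !== globalVersion) {
        // Stale by the global counter: must re-check dependencies,
        // even if none of OUR deps actually changed.
        depWalks++;
        cached = fn();
        lastSeen = globalVersion;
      }
      return cached;
    },
  };
}

const a = pullSignal(1);
const b = pullSignal(10); // completely unrelated to `double`
const double = pullComputed(() => a.value * 2);

double.value;  // first read: one walk
double.value;  // cached: no walk
b.value = 11;  // unrelated write bumps globalVersion...
double.value;  // ...so this read re-walks anyway: depWalks is now 2
```

Push-Pull-Push instead marks only the actual subscribers of `a` dirty when `a` changes, so writing `b` would leave `double`'s cache untouched.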

What follows are some of the key lessons learned along the way.

On Data Structures

Initially, I used Set to manage subscribers for each signal. It's the obvious, idiomatic choice in modern JavaScript. The code was clean and easy to understand. I was stuck in this mindset until I saw Vue 3.5's plans to adopt a doubly-linked list for its dependency graph.

Wait, we can do that?

This approach was pioneered by the Preact team [1]. The key insight is that each dependency-subscriber relationship can be represented by a single Link node that lives in two linked lists simultaneously: one for the signal's list of subscribers, and one for the subscriber's list of dependencies. This "four-way linked" structure means:

  • Adding a subscription is O(1)—just append to the tail.
  • Removing a subscription is O(1)—just unlink the node, no searching required.
  • No hashing, no memory allocation for internal buckets, no GC pressure from discarded Set iterators.

Once I adopted this structure, the performance gains were substantial.
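To make the structure concrete, here is a minimal sketch of such a Link node with O(1) insert and O(1) unlink. The field names echo the snippets later in this post, but the code is illustrative, not alien-signals' actual implementation.

```typescript
// Illustrative four-way linked Link node (not the real alien-signals code).
interface Dep { subs?: Link; subsTail?: Link; }   // a signal's subscriber list
interface Sub { deps?: Link; depsTail?: Link; }   // a computed/effect's dependency list

interface Link {
  dep: Dep;        // the signal being depended on
  sub: Sub;        // the computed/effect depending on it
  prevSub?: Link;  // doubly-linked list of the dep's subscribers
  nextSub?: Link;
  nextDep?: Link;  // singly-linked list of the sub's dependencies
}

// O(1) append: subscribe `sub` to `dep` by linking one node into both lists.
function addLink(dep: Dep, sub: Sub): Link {
  const link: Link = { dep, sub, prevSub: dep.subsTail };
  if (dep.subsTail !== undefined) dep.subsTail.nextSub = link;
  else dep.subs = link;
  dep.subsTail = link;
  if (sub.depsTail !== undefined) sub.depsTail.nextDep = link;
  else sub.deps = link;
  sub.depsTail = link;
  return link;
}

// O(1) removal from the dep's subscriber list: no searching, no hashing.
function removeFromSubs(link: Link): void {
  const { dep, prevSub, nextSub } = link;
  if (prevSub !== undefined) prevSub.nextSub = nextSub;
  else dep.subs = nextSub;
  if (nextSub !== undefined) nextSub.prevSub = prevSub;
  else dep.subsTail = prevSub;
}
```

Because the node is already in both lists, removing a subscription touches only its neighbors, whereas a `Set` would need to hash the key and maintain internal buckets.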

On Writing JIT-Friendly Code

Another area of investigation was functions whose performance looked good on average but was inconsistent between runs. This led me to a deeper dive into how V8's JIT compiler, TurboFan, optimizes code. The most impactful concept I learned was monomorphism, which Vyacheslav Egorov explains brilliantly in his post, "What's up with monomorphism?" [2].

The core idea is that an operation is "monomorphic" if the objects it operates on always have the same internal "shape" or Hidden Class. The JIT can heavily optimize these monomorphic operations. As soon as an operation starts seeing objects with different shapes (polymorphic), it has to de-optimize and use a slower, more generic implementation.

A function in alien-signals for linking a dependency to a subscriber was a prime example of a polymorphic operation. It had a single if/else block to handle two different scenarios: creating a new link or updating an existing one. These two paths involved objects and operations of different shapes.

To fix this, I split the single function into two: one that only creates new links, and one that only updates existing ones. The calling code now performs a simple check to decide which function to invoke. This ensures that each function is now monomorphic. The JIT can trust the shape of the objects it will receive and apply its most aggressive optimizations.

Another V8-related constraint I ran into was the limit on in-object properties. V8 stores the first several properties of an object directly inside the object itself ("in-object"), which allows for very fast access. Properties beyond this limit spill over into a separate backing store, which is slower. According to V8's slack tracking mechanism [3], the engine pre-allocates slots based on the number of properties detected at compile time, plus an additional 8 slots as a buffer.

In the early days of alien-signals, my mental model of the reactivity system wasn't fully simplified yet. The classes had more properties than they needed—some for edge cases I later realized didn't require dedicated fields, others from abstractions that turned out to be unnecessary. I found myself constantly fighting to keep the property count under the threshold for fast in-object access.

Over time, as I understood the problem better, the model got simpler. Properties were merged or removed. What started as a constraint imposed by the engine eventually stopped being a concern—not because I found a workaround, but because the cleaner design simply had fewer properties. It was a good reminder that sometimes the best optimization is simplification.

To illustrate, here is a comparison of the Subscriber class from an early, Map-based implementation, and the final ReactiveNode interface:

Early Subscriber Class (pre-linked list):

class Subscriber {
    dirtyLevel = DirtyLevels.Dirty;
    version = 0;
    running = 0;
    depsLength = 0;
    deps: Subscribers[] = [];
    depsDirty = false;
    constructor(
        public subscribers: Subscribers | undefined,
        public effect?: () => void,
    ) { }
    // ... getter for dirty
}

Current ReactiveNode Interface:

interface ReactiveNode {
    deps?: Link;
    depsTail?: Link;
    subs?: Link;
    subsTail?: Link;
    flags: ReactiveFlags;
}

The early Subscriber class had 8 properties. The current ReactiveNode has only 5. Properties like version, running, depsLength, and depsDirty were either merged into the flags bitmask or became redundant due to the simplified model. This evolution not only improved performance but also made the core logic cleaner and more elegant.
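As a sketch of the bitmask technique, here is how several boolean and state fields collapse into one `flags` field. The flag names mirror the snippets in this post, but the exact bit values are illustrative:

```typescript
// Several former fields (dirty state, pending state, tracking state)
// fold into one number, read and written with bitwise operations.
const ReactiveFlags = {
  None:          0,
  Mutable:       1 << 0,
  Watching:      1 << 1,
  RecursedCheck: 1 << 2,
  Recursed:      1 << 3,
  Dirty:         1 << 4,
  Pending:       1 << 5,
} as const;

let flags: number = ReactiveFlags.Mutable | ReactiveFlags.Dirty;

// Test a flag:
const isDirty = (flags & ReactiveFlags.Dirty) !== 0;
// Clear a flag:
flags &= ~ReactiveFlags.Dirty;
// Set a flag:
flags |= ReactiveFlags.Pending;
```

Beyond saving property slots, a single numeric field keeps the object's hidden class stable and lets several state checks be combined into one masked comparison.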

On Control Flow: The Agony and Ecstasy of Manual DFS

The propagation of changes in a reactive graph is a classic Depth-First Search (DFS). The textbook implementation is recursive, and for good reason: it's elegant and maps directly to the problem's structure. But in JavaScript, elegance comes with a price: the risk of a stack overflow for deep graphs, and the small but non-zero overhead of each function call.
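The trade-off can be seen in a minimal, generic traversal, far simpler than the real propagate: the recursive form spends one call-stack frame per level and can overflow on a deep chain, while the iterative form keeps the whole traversal state in an explicit stack.

```typescript
// Generic DFS sketch (not the real propagate): recursion vs explicit stack.
interface Node { children: Node[]; visited?: boolean; }

function visitRecursive(node: Node): void {
  node.visited = true;
  // One call frame per level of depth: deep graphs risk a stack overflow.
  for (const child of node.children) visitRecursive(child);
}

function visitIterative(root: Node): void {
  // The traversal state lives in this array, not the call stack,
  // so depth is bounded only by heap memory.
  const stack: Node[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    node.visited = true;
    for (const child of node.children) stack.push(child);
  }
}
```

On a linear chain 100,000 nodes deep, the recursive version would overflow in most engines, while the iterative version completes without issue.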

For a system designed for extreme performance and stability, relying on the call stack was not an option. This led to one of the most challenging parts of the project: manually implementing an iterative DFS for both propagate and checkDirty. It took me two or three days of intense effort to get the logic right. The initial, correct implementation was a complex beast involving a nested while loop and a goto-like label statement to correctly simulate the recursive behavior of descending and ascending the graph.

Here's a look at an early version of the iterative propagate from v1.0.0:

// Early iterative propagate (v1.0.0)
function propagate(link: Link): void {
	let targetFlag = SubscriberFlags.Dirty;
	let subs = link;
	let stack = 0;

	top: do {
		const sub = link.sub;
		const subFlags = sub.flags;

		if (
			(
				!(subFlags & (SubscriberFlags.Tracking | SubscriberFlags.Recursed | SubscriberFlags.Propagated))
				&& (sub.flags = subFlags | targetFlag | SubscriberFlags.Notified, true)
			)
			|| (
				(subFlags & SubscriberFlags.Recursed)
				&& !(subFlags & SubscriberFlags.Tracking)
				&& (sub.flags = (subFlags & ~SubscriberFlags.Recursed) | targetFlag | SubscriberFlags.Notified, true)
			)
			|| (
				!(subFlags & SubscriberFlags.Propagated)
				&& isValidLink(link, sub)
				&& (
					sub.flags = subFlags | SubscriberFlags.Recursed | targetFlag | SubscriberFlags.Notified,
					(sub as Dependency).subs !== undefined
				)
			)
		) {
			const subSubs = (sub as Dependency).subs;
			if (subSubs !== undefined) {
				if (subSubs.nextSub !== undefined) {
					subSubs.prevSub = subs;
					link = subs = subSubs;
					targetFlag = SubscriberFlags.PendingComputed;
					++stack;
				} else {
					link = subSubs;
					targetFlag = subFlags & SubscriberFlags.Effect
						? SubscriberFlags.PendingEffect
						: SubscriberFlags.PendingComputed;
				}
				continue;
			}
			if (subFlags & SubscriberFlags.Effect) {
				if (queuedEffectsTail !== undefined) {
					queuedEffectsTail.depsTail!.nextDep = sub.deps;
				} else {
					queuedEffects = sub;
				}
				queuedEffectsTail = sub;
			}
		} else if (!(subFlags & (SubscriberFlags.Tracking | targetFlag))) {
			sub.flags = subFlags | targetFlag | SubscriberFlags.Notified;
			if ((subFlags & (SubscriberFlags.Effect | SubscriberFlags.Notified)) === SubscriberFlags.Effect) {
				if (queuedEffectsTail !== undefined) {
					queuedEffectsTail.depsTail!.nextDep = sub.deps;
				} else {
					queuedEffects = sub;
				}
				queuedEffectsTail = sub;
			}
		} else if (
			!(subFlags & targetFlag)
			&& (subFlags & SubscriberFlags.Propagated)
			&& isValidLink(link, sub)
		) {
			sub.flags = subFlags | targetFlag;
		}

		if ((link = subs.nextSub!) !== undefined) {
			subs = link;
			targetFlag = stack
				? SubscriberFlags.PendingComputed
				: SubscriberFlags.Dirty;
			continue;
		}

		while (stack) {
			--stack;
			const dep = subs.dep;
			const depSubs = dep.subs!;
			subs = depSubs.prevSub!;
			depSubs.prevSub = undefined;
			if ((link = subs.nextSub!) !== undefined) {
				subs = link;
				targetFlag = stack
					? SubscriberFlags.PendingComputed
					: SubscriberFlags.Dirty;
				continue top;
			}
		}

		break;
	} while (true);
}

This early version was functional but complex. The logic for managing the stack depth, storing return paths in prevSub, and correctly restoring the propagation flag when ascending the graph was convoluted. Over time, through constant refinement, the logic was simplified dramatically. Here is the current implementation:

// Current iterative propagate
function propagate(link: Link): void {
	let next = link.nextSub;
	let stack: Stack<Link | undefined> | undefined;

	top: do {
		const sub = link.sub;
		let flags = sub.flags;

		if (!(flags & (ReactiveFlags.RecursedCheck | ReactiveFlags.Recursed | ReactiveFlags.Dirty | ReactiveFlags.Pending))) {
			sub.flags = flags | ReactiveFlags.Pending;
		} else if (!(flags & (ReactiveFlags.RecursedCheck | ReactiveFlags.Recursed))) {
			flags = ReactiveFlags.None;
		} else if (!(flags & ReactiveFlags.RecursedCheck)) {
			sub.flags = (flags & ~ReactiveFlags.Recursed) | ReactiveFlags.Pending;
		} else if (!(flags & (ReactiveFlags.Dirty | ReactiveFlags.Pending)) && isValidLink(link, sub)) {
			sub.flags = flags | (ReactiveFlags.Recursed | ReactiveFlags.Pending);
			flags &= ReactiveFlags.Mutable;
		} else {
			flags = ReactiveFlags.None;
		}

		if (flags & ReactiveFlags.Watching) {
			notify(sub);
		}

		if (flags & ReactiveFlags.Mutable) {
			const subSubs = sub.subs;
			if (subSubs !== undefined) {
				const nextSub = (link = subSubs).nextSub;
				if (nextSub !== undefined) {
					stack = { value: next, prev: stack };
					next = nextSub;
				}
				continue;
			}
		}

		if ((link = next!) !== undefined) {
			next = link.nextSub;
			continue;
		}

		while (stack !== undefined) {
			link = stack.value!;
			stack = stack.prev;
			if (link !== undefined) {
				next = link.nextSub;
				continue top;
			}
		}

		break;
	} while (true);
}

The journey from the complex initial version to the refined final one was a lesson in itself. But the real payoff of this arduous conversion to an iterative DFS was an unexpected optimization it unlocked.

In a recursive implementation, each function call is a black box. In an iterative implementation, the entire state of the traversal is explicit in the stack variable. This allowed me to inspect the graph's structure during traversal. I noticed that in many cases, a node has only one subscriber, forming a simple linear chain. In these cases, the manual stack operations (pushing the next node, then immediately popping it) were redundant.

I added a "fast path": if the current node (link) has no nextSub, it means we are in a linear part of the graph. Instead of pushing to the stack, the loop can just continue with the next node directly. This simple check, only possible because of the iterative approach, significantly sped up updates in common "chained computed" scenarios.

[Diagram: Propagate optimization, before vs after]

The diagram above illustrates the key insight. On the left ("Before"), propagate always used the stack to fall back to the previous node after visiting a subscriber—even in a simple linear chain ("Deep"). The purple arrows show these backward traversals. On the right ("After"), the "Broad" case with branches still requires stack operations to handle multiple subscribers. But in the "Deep" case—a linear chain where each node has only one subscriber—there are no backward arrows. The traversal simply follows the chain forward, skipping all stack operations entirely. This fast path is only possible with an iterative implementation where the traversal state is explicit and inspectable.

The Art of Re-subscription: Reusing the Dependency List

One of the most significant, yet subtle, optimizations in alien-signals lies in how it handles re-subscription. In many reactivity systems, when a computed or effect needs to re-execute, it completely discards its old list of dependencies and builds a new one from scratch. This approach is straightforward but leads to significant memory churn, creating and destroying numerous objects, which in turn puts pressure on the garbage collector.

I initially followed this conventional path. However, I realized that in most real-world applications, the dependency graph of a computed or effect remains remarkably stable between executions. A component usually depends on the same set of signals every time it renders. So, why discard and rebuild the dependency list if it's going to be nearly identical?

This led to a pivotal optimization: reusing the dependency linked list instead of recreating it.

Here’s how it works. When a computed or effect re-executes, it doesn't immediately clear its dependencies. Instead, it simply resets a pointer, depsTail, to undefined. This pointer tracks the last-used dependency in the list.

// in `updateComputed` and `run` (for effects)
function updateComputed(c: ComputedNode): boolean {
  // ...
  c.depsTail = undefined; // Reset the tail pointer
  c.flags = ReactiveFlags.Mutable | ReactiveFlags.RecursedCheck;
  const prevSub = setActiveSub(c);
  try {
    // ... re-run the getter ...
    return oldValue !== (c.value = c.getter(oldValue));
  } finally {
    activeSub = prevSub;
    c.flags &= ~ReactiveFlags.RecursedCheck;
    purgeDeps(c); // Clean up unused dependencies
  }
}

As the getter function runs and accesses signals, the link function is called for each dependency. This is where the magic happens. The link function checks if the next dependency in the old list is the same one it's currently trying to subscribe to.

// in `link` function (system.ts)
function link(dep: ReactiveNode, sub: ReactiveNode, version: number): void {
  const prevDep = sub.depsTail;
  // ...

  // Is the next dependency in the old list the same as the current one?
  const nextDep = prevDep !== undefined ? prevDep.nextDep : sub.deps;
  if (nextDep !== undefined && nextDep.dep === dep) {
    // YES! It's a match. Just update the version and move the tail pointer.
    nextDep.version = version;
    sub.depsTail = nextDep;
    return; // Fast path: No allocation needed!
  }

  // NO. It's a new or different dependency. Create a new link.
  const newLink =
    (sub.depsTail =
    dep.subsTail =
      {
        /* ... create new Link object ... */
      });
  // ... logic to insert the newLink into the doubly-linked list ...
}

If it's a match, alien-signals takes a fast path: it simply updates the version on the existing Link object and moves the depsTail pointer forward to this node. No new objects are created. No memory is allocated. The existing list structure is reused.

After the getter function completes, there might be leftover nodes at the end of the old dependency list that were not used in this execution (e.g., due to conditional logic like if (a()) { b() }). The purgeDeps function handles this cleanup.

// in `purgeDeps` function (index.ts)
function purgeDeps(sub: ReactiveNode) {
  const depsTail = sub.depsTail;
  // Start from the first unused dependency
  let dep = depsTail !== undefined ? depsTail.nextDep : sub.deps;
  while (dep !== undefined) {
    // Unlink and remove it
    dep = unlink(dep, sub);
  }
}

This strategy of reusing the dependency list is a cornerstone of alien-signals' performance. For stable dependency graphs, re-executions become incredibly cheap, approaching near-zero allocation. This not only reduces GC pressure but also contributes to keeping functions monomorphic by promoting object stability. It elegantly sidesteps the need for an object pool in most common scenarios, because the objects were never discarded in the first place.

"You can't improve what you can't measure."

A significant part of alien-signals' success can be attributed to the excellent community infrastructure around reactivity benchmarks, specifically Milo Mighdoll's js-reactivity-benchmark. This project provides a standardized, comprehensive suite of tests that covers a wide range of scenarios, from simple updates to complex, deep, and broad dependency graphs.

It allowed me to get immediate, quantitative feedback on every change. An optimization that looked good in theory could be instantly validated (or invalidated) against dozens of different use cases. It turns a vague goal like "make it faster" into a concrete set of numbers to improve. Without this solid foundation for measurement, the process would have been a series of shots in the dark.

Looking Back

When I started this project, I thought I understood reactive systems pretty well after working on Vue's reactivity for years. I was wrong. The process of building alien-signals from scratch, with performance as the primary constraint, forced me to question every assumption.

Some lessons were technical: linked lists can beat hash maps; the JIT compiler rewards predictable code; an iterative DFS opens doors that recursion keeps closed. But the deeper lesson was about simplification. Many of the biggest performance wins came not from adding clever optimizations, but from refining the core abstractions. The final ReactiveNode interface with its 5 properties is not just faster than the early 8-property Subscriber class—it's a more distilled, more accurate model of what a reactive node actually needs to be.

There are still ideas I haven't been able to realize. An "end-to-end" propagation model, where effects subscribe directly to all their upstream signals, could theoretically eliminate the multi-level traversal entirely. But the memory cost of flattening the subscription graph makes it impractical. Sometimes the best optimization is knowing when to stop.

I hope these notes are useful to anyone working on similar problems. The code is open source at github.com/stackblitz/alien-signals.

@colelawrence

Thank you so much for the write up. It was immensely interesting to read through from the perspective of someone contributing to and tweaking reactivity libraries like theatre/dataverse, jotai, and LiveStore.
