Exploring whether Mosaik -- Flashbots' self-organizing p2p runtime -- could serve as the foundation for a high-performance mempool (in the style of Commonware or CometBFT) that collapses transaction dissemination, order matching, and block production into a single pipeline.
In CometBFT and most blockchain architectures, the pipeline is segmented:
Tx submission -> Mempool gossip (opaque bytes) -> Block proposer selects txs
-> Consensus on block -> Execute (ABCI app) -> Commit
Key properties of this model:
- Transactions are opaque byte blobs to the mempool and consensus layers
- No type-aware routing -- every validator receives every transaction via flood gossip
- Matching/auction logic lives behind the ABCI boundary (PrepareProposal/ProcessProposal in CometBFT v0.38+)
- Strict layering -- consensus engine knows nothing about application semantics
Commonware -- an independent company founded by Patrick O'Grady (ex-Ava Labs, ex-Coinbase) -- takes a different approach with its "anti-framework" library of composable blockchain primitives. Commonware provides Simplex BFT consensus, authenticated p2p, and pluggable components, but, like CometBFT, its consensus layer still fundamentally orders messages that are passed to application logic for execution.
Mosaik's primitives suggest a fundamentally different architecture:
Instead of gossiping raw bytes, transactions are typed at the network layer:
#[derive(Serialize, Deserialize)]
struct SwapOrder { pair: TradingPair, side: Side, amount: u64, limit_price: u64 }
#[derive(Serialize, Deserialize)]
struct BridgeIntent { source_chain: ChainId, dest_chain: ChainId, asset: Asset, amount: u64 }
#[derive(Serialize, Deserialize)]
struct StakeTransaction { validator: PeerId, amount: u64 }

Each type gets its own stream ID (blake3 hash of the Rust type name). Nodes subscribe only to the transaction types they care about. A DEX matching engine doesn't need to see staking transactions. A bridge relayer doesn't need swap orders.
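As a rough sketch of how a type-scoped stream ID can be derived (the exact name normalization Mosaik applies is an assumption here), hashing the Rust type name with the blake3 crate yields a stable per-type identifier:

// Sketch only: assumes the stream ID is the blake3 hash of the type name;
// Mosaik's actual normalization of the name may differ.
fn stream_id<T>() -> blake3::Hash {
    // type_name() returns the fully qualified path, e.g. "demo::SwapOrder".
    blake3::hash(std::any::type_name::<T>().as_bytes())
}

fn main() {
    // A node that never subscribes to the StakeTransaction stream simply
    // never sees staking traffic on the wire.
    println!("SwapOrder stream id: {}", stream_id::<SwapOrder>());
    println!("StakeTransaction stream id: {}", stream_id::<StakeTransaction>());
}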
CometBFT mempool: every validator gets every transaction.
Mosaik streams: nodes subscribe with predicates:
// DEX matcher only consumes swap orders from authorized sources
let orders = network.streams()
.consumer::<SwapOrder>()
.subscribe_if(|peer| peer.tags().contains("orderflow-source"))
.build();
// Bridge relayer only consumes bridge intents for its chain
let intents = network.streams()
.consumer::<BridgeIntent>()
.subscribe_if(|peer| peer.tags().contains("chain:near"))
  .build();

This is closer to how a real high-performance exchange works -- order routers direct flow to the right matching engine rather than broadcasting everything everywhere.
A critical feature for mempool use: when peer tags change (e.g., proposer rotation), subscribe_if predicates are automatically re-evaluated and stream connections re-route to newly matching peers. This enables direct-to-proposer transaction routing without application-level reconnection logic.
// Tx sources automatically follow the current proposer
let proposer_tag = Tag::from("proposer");
let consumer = network.streams()
.consumer::<Transaction>()
.subscribe_if(move |peer| peer.tags().contains(&proposer_tag))
  .build();

When proposer A loses the "proposer" tag and proposer B gains it, the stream consumer disconnects from A and connects to B automatically.
In CometBFT, the consensus engine orders transactions, then hands them to the application via ABCI for execution. The matching/auction logic is behind a process boundary.
In Mosaik, the Group RSM can BE the matching engine:
struct OrderBook {
bids: BTreeMap<Price, VecDeque<Order>>,
asks: BTreeMap<Price, VecDeque<Order>>,
fills: Vec<Fill>,
block_height: u64,
}
impl StateMachine for OrderBook {
type Command = OrderBookCommand; // PlaceOrder, CancelOrder, EndBlock
type Query = OrderBookQuery; // GetBBO, GetDepth, GetFills
fn apply(&mut self, cmd: OrderBookCommand) {
match cmd {
PlaceOrder(order) => self.match_and_insert(order),
CancelOrder(id) => self.remove(id),
EndBlock => {
self.block_height += 1;
self.emit_fills(); // produces Stream<Fill>
}
}
}
}

No ABCI boundary. No serialization/deserialization between consensus and application. The Raft log IS the order sequence. The state machine application IS the matching.
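For concreteness, here is a sketch of what match_and_insert could do for price-time priority matching. Only the bids/asks/fills fields come from the struct above; the Order and Fill field names, the Side enum, and Price being a Copy integer type are assumptions.

// Sketch only: Order (id, side, amount, limit_price), Fill (price, amount, maker, taker),
// Side (Buy/Sell), and Price as a Copy integer are illustrative assumptions.
impl OrderBook {
    fn match_and_insert(&mut self, mut order: Order) {
        loop {
            // Best opposite level: lowest ask for a buy, highest bid for a sell,
            // but only if it crosses the incoming order's limit price.
            let best = match order.side {
                Side::Buy => self.asks.keys().next().copied().filter(|p| *p <= order.limit_price),
                Side::Sell => self.bids.keys().next_back().copied().filter(|p| *p >= order.limit_price),
            };
            let Some(price) = best else { break };
            let book = if order.side == Side::Buy { &mut self.asks } else { &mut self.bids };
            let queue = book.get_mut(&price).expect("level exists");
            // Fill against resting orders in arrival (time) order.
            while let Some(resting) = queue.front_mut() {
                let qty = order.amount.min(resting.amount);
                self.fills.push(Fill { price, amount: qty, maker: resting.id, taker: order.id });
                order.amount -= qty;
                resting.amount -= qty;
                if resting.amount == 0 { queue.pop_front(); }
                if order.amount == 0 { break; }
            }
            if queue.is_empty() { book.remove(&price); }
            if order.amount == 0 { return; }
        }
        // Rest any remainder at its limit price.
        let side_book = if order.side == Side::Buy { &mut self.bids } else { &mut self.asks };
        side_book.entry(order.limit_price).or_default().push_back(order);
    }
}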
We built a concrete implementation that demonstrates three layers of this architecture.
Each "validator" is a 3-node Mosaik Raft cluster, not a single node:
Validator A Cluster: [A0 leader] [A1 follower] [A2 follower]
Validator B Cluster: [B0 leader] [B1 follower] [B2 follower]
Validator C Cluster: [C0 leader] [C1 follower] [C2 follower]
Benefits:
- Failover: If the cluster leader crashes, a follower takes over via Raft election. The validator never misses its proposer slot.
- Load balancing: Followers serve read queries (CheckTx validation, pending count) at Consistency::Weak (see the sketch below).
- Stream stability: All cluster nodes share the same tags. If the leader fails, stream connections remain on surviving nodes -- zero reconnection delay.
This extends CometBFT's existing sentry node topology: sentries protect against DDoS, but if the validator process crashes, manual intervention is needed. Raft clusters automate failover.
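A sketch of the follower read path, reusing the OrderBook RSM from above for illustration. The group.query(query, consistency) call shape, the Group type signature, and the response type are assumptions; the Consistency levels and the committed-index wait come from the lessons-learned notes later in this write-up.

// Sketch only: `query(..., Consistency)` and the response type are guessed API shapes.
async fn serve_bbo_read(group: &Group, committed_target: u64) {
    // Ensure this follower has applied the log up to `committed_target`
    // before answering (see "Use Consistency::Weak on followers" below).
    group.when().committed().reaches(committed_target).await;
    // Weak reads are served locally; Strong reads forward to the leader and
    // can panic when issued on a follower.
    let bbo = group.query(OrderBookQuery::GetBBO, Consistency::Weak).await;
    println!("best bid/offer served from follower: {bbo:?}");
}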
The deterministic proposer schedule (weighted round-robin, same algorithm as CometBFT) determines which cluster is the current proposer. All 3 nodes in the proposer cluster get the "proposer" tag. Tx source consumers use subscribe_if to route transactions directly to the proposer:
subscribe_if(|peer| peer.tags().contains(&Tag::from("proposer")))When the proposer rotates, the cluster-aware tracker updates tags on the entire cluster, Mosaik re-evaluates stream predicates, and transaction flows automatically re-route.
With reactive tag updates alone, there is a 100-400ms delay between a proposer rotation and stream reconnection. In fast BFT protocols where proposers rotate every 500ms, this means stream connections may not be established until the proposer's slot is nearly over.
Solution: pre-tag upcoming proposer clusters using the deterministic schedule.
| Tag | Meaning | Lead Time |
|---|---|---|
| proposer | Current slot's block builder | 0ms |
| proposer-next | Next slot's proposer | 500ms |
| proposer-soon | Slot+2 proposer | 1000ms |
Stream consumers use a compound predicate:
subscribe_if(|peer| {
peer.tags().contains(&Tag::from("proposer"))
|| peer.tags().contains(&Tag::from("proposer-next"))
|| peer.tags().contains(&Tag::from("proposer-soon"))
})

By the time a validator's slot starts, stream connections have been warm for 1000ms. Combined with cluster-level tagging (all 3 nodes get the same tags), intra-cluster failover doesn't disrupt stream connections.
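A sketch of the pre-tagging computation, using a plain round-robin as a stand-in for the weighted schedule (the demo uses CometBFT's weighted round-robin) and leaving the actual tag application to Mosaik's discovery/tag machinery:

// Sketch only: plain round-robin stands in for the weighted schedule; how the
// returned tags are pushed into the discovery catalog is elided here.
type ClusterId = usize;

fn proposer_at(slot: u64, clusters: &[ClusterId]) -> ClusterId {
    clusters[(slot as usize) % clusters.len()]
}

// Every node in a cluster carries the same tags, so intra-cluster failover
// never disturbs pre-warmed stream connections.
fn desired_tags(slot: u64, cluster: ClusterId, clusters: &[ClusterId]) -> Vec<&'static str> {
    let mut tags = Vec::new();
    if proposer_at(slot, clusters) == cluster { tags.push("proposer"); }           // 0ms lead
    if proposer_at(slot + 1, clusters) == cluster { tags.push("proposer-next"); }  // ~500ms lead
    if proposer_at(slot + 2, clusters) == cluster { tags.push("proposer-soon"); }  // ~1000ms lead
    tags
}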
The mempool implements the core CometBFT ABCI transaction lifecycle as a Raft RSM:
struct MempoolStateMachine {
pending: Vec<Transaction>,
seen_ids: HashSet<u64>,
nonce_tracker: HashMap<String, u64>,
max_block_size: usize,
max_block_gas: u64,
}

Commands: AddTransaction (with inline CheckTx), BuildBlock (PrepareProposal with gas-aware packing), ClearPending, RecheckPending (post-block nonce revalidation).
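Below is a sketch of the command side, mirroring the StateMachine shape used for the OrderBook above. The Transaction fields (id, sender, nonce, gas, size), the MempoolQuery type, and the exact CheckTx rules are simplified assumptions.

// Sketch only: Transaction's fields and MempoolQuery are illustrative; the
// trait shape mirrors the OrderBook example above.
enum MempoolCommand {
    AddTransaction(Transaction), // inline CheckTx
    BuildBlock,                  // PrepareProposal-style gas-aware packing
    ClearPending,
    RecheckPending,              // post-block nonce revalidation
}

impl StateMachine for MempoolStateMachine {
    type Command = MempoolCommand;
    type Query = MempoolQuery; // e.g. PendingCount, GetPending (assumed)

    fn apply(&mut self, cmd: MempoolCommand) {
        match cmd {
            MempoolCommand::AddTransaction(tx) => {
                // Inline CheckTx: reject duplicates and stale nonces.
                let min_nonce = self.nonce_tracker.get(&tx.sender).copied().unwrap_or(0);
                if self.seen_ids.contains(&tx.id) || tx.nonce < min_nonce {
                    return;
                }
                self.seen_ids.insert(tx.id);
                self.pending.push(tx);
            }
            MempoolCommand::BuildBlock => {
                // Gas- and size-aware packing; whatever doesn't fit stays pending.
                let (max_size, max_gas) = (self.max_block_size, self.max_block_gas);
                let (mut size, mut gas, mut block) = (0usize, 0u64, Vec::new());
                self.pending.retain(|tx| {
                    if size + tx.size <= max_size && gas + tx.gas <= max_gas {
                        size += tx.size;
                        gas += tx.gas;
                        block.push(tx.clone());
                        false // consumed into the block
                    } else {
                        true // left in the mempool
                    }
                });
                // In the demo, `block` would be emitted on Stream<Block> here.
            }
            MempoolCommand::ClearPending => self.pending.clear(),
            MempoolCommand::RecheckPending => {
                // Post-block revalidation: drop anything whose nonce is now stale.
                let nonces = self.nonce_tracker.clone();
                self.pending
                    .retain(|tx| tx.nonce >= nonces.get(&tx.sender).copied().unwrap_or(0));
            }
        }
    }
}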
We implemented proposer-aware routing directly in a fork of CometBFT's Go codebase.
New: mempool/proposer_schedule.go -- ProposerOracle interface:
type ProposerOracle interface {
ProposerAt(height int64, round int32) types.Address
UpcomingProposers(height int64, n int) []types.Address
IsUpcomingProposer(addr types.Address, lookahead int) bool
CurrentHeight() int64
CurrentRound() int32
}

Modified: mempool/reactor.go -- Soft priority routing in broadcastTxRoutine: non-proposer peers receive a 50ms delay per transaction. All peers still receive all transactions (safety preserved). Backward compatible: nil oracle = existing flood behavior.
Implementation: zmanian/cometbft@feature/proposer-aware-mempool (~200 lines across 4 files, all existing tests pass)
CometBFT model:
User -> Mempool (bytes) -> All validators -> Proposer -> PrepareProposal
-> Consensus on block -> ProcessProposal -> ABCI App (matching here)
-> Commit -> Events
Mosaik model:
User -> Stream<Transaction> -> Proposer cluster only (via subscribe_if)
-> Group RSM (matching + consensus unified)
-> Stream<Block> -> Settlement consumers
| Traditional Layer | Mosaik Equivalent | What Changes |
|---|---|---|
| Mempool gossip | Typed streams | Type-aware routing, selective subscription |
| Transaction selection (PrepareProposal) | RSM command intake | No separate proposal step -- commands flow directly into RSM |
| Block consensus | Raft group | Consensus on command ordering, not block blobs |
| ABCI execution | RSM apply() | No process boundary -- matching runs inline with consensus |
| Event emission | Output streams | Fills/settlements as typed streams, not opaque events |
| Aspect | CometBFT Status Quo | This Design |
|---|---|---|
| Tx routing | Flood gossip to all validators | Priority routing to proposer + next 2 |
| Bandwidth | O(validators * txs) | O(3 * txs) for proposer tier |
| Proposer mempool freshness | Depends on gossip convergence | Direct delivery, always freshest |
| Validator failover | Manual restart | Automatic Raft election (~500ms) |
| Tx durability | Independent per-node, can be lost | Raft-replicated within cluster |
| Leader transition | 2s+ timeout if proposer fails | Pre-connected, sub-500ms |
| Code change in CometBFT | N/A | ~200 lines (4 files modified) |
From examining Mosaik internals:
Strengths:
- Serialize-once fanout -- the producer serializes a datum once and clones the Bytes buffer to all consumers.
- No batching delay -- the Raft leader appends commands immediately on arrival and broadcasts them in the same tick. Commit latency is ~2 RTTs.
- Independent consumer loops -- each subscriber has its own channel (default 1024 buffer) and sender task. Slow consumers don't block fast ones.
- QUIC multiplexing -- lightweight per-stream QUIC connections, no head-of-line blocking between streams.
Limitations to address:
- All-to-all bond mesh in groups -- O(n^2) connections. Groups are capped at 5 voting members. Fine for per-validator clusters, doesn't scale to hundreds.
- Serial state machine application -- apply() is called one command at a time. For a high-throughput order book, you'd want batch matching per block.
- 500ms heartbeat / 2s election timeout defaults -- tunable, but the defaults are oriented toward availability groups, not sub-second block times.
- Raft, not BFT -- crash fault tolerant only. Sufficient for per-validator clusters (trusted infrastructure) but not for open validator sets.
Commonware's Simplex consensus achieves 2-hop block times (vs CometBFT's 6-delta view changes), and its threshold BLS signatures produce ~240-byte certificates regardless of validator set size. The same proposer-aware routing concept applies -- Simplex uses view-based leader rotation, which is equally deterministic and pre-computable.
We built a parallel implementation using Commonware's published crates.
Implementation: zmanian/commonware-mempool -- Implements Automaton, CertifiableAutomaton, and Relay traits for Simplex BFT with leader schedule pre-computation.
Mosaik and Commonware come from different organizations with different design philosophies. Mosaik is Flashbots' experimental self-organizing p2p runtime. Commonware is an independent company founded by Patrick O'Grady (formerly VP of Platform Engineering at Ava Labs, where he led HyperSDK development) that builds an open-source "anti-framework" library of Rust blockchain primitives. Commonware raised $9M seed (Haun Ventures, Dragonfly) and $25M strategic (from Tempo, backed by Stripe and Paradigm), has ~4 customers each generating >$1M ARR, and its Simplex consensus is being adopted by Solana (via Alpenglow/Votor) and Tempo.
| Aspect | Commonware | Mosaik |
|---|---|---|
| Organization | Independent company (Patrick O'Grady, ex-Ava Labs) | Flashbots project |
| Philosophy | Anti-framework: decoupled primitives, no opinions on block format/state/fees | Self-organizing runtime: typed streams, auto-discovery, integrated RSMs |
| Consensus | Simplex BFT (formally verified, ~300ms finality) + Threshold Simplex with embedded BLS12-381 DKG for cross-chain | Raft (CFT, ~2 RTT commit) |
| Networking | Custom authenticated p2p | iroh (QUIC + relay + mDNS) |
| Message model | Generic bytes ordered by consensus | Typed streams with auto-discovery and selective subscription |
| Crypto primitives | BLS12-381 threshold signatures, DKG, resharing, VRF, timelock encryption | None built-in |
| Self-organization | Manual topology | Gossip + tags + auto-discovery |
| Maturity | Production (17 primitives, 50+ dialects, 93% test coverage, 1500 daily benchmarks) | Experimental |
| Target | General-purpose blockchain infrastructure | Distributed systems / MEV infrastructure |
Mosaik's typed streams as a dissemination layer feeding Commonware consensus. Mosaik's type-aware routing and selective subscription could serve as a more efficient transaction dissemination mechanism for a Commonware-based chain.
Commonware's Simplex replacing Mosaik's Raft for BFT. Mosaik's biggest limitation for trustless environments is Raft's crash-fault-only model. Commonware's Simplex provides BFT with ~300ms finality. The RSM abstraction in both systems is similar enough that a Simplex backend could potentially be integrated into Mosaik's group primitive.
Threshold Simplex for cross-group bridging. Commonware's Threshold Simplex embeds BLS12-381 threshold cryptography directly into consensus, producing ~240-byte cross-chain certificates -- addressing cross-group atomicity.
These patterns emerged from building the working implementations:
- Producer ordering matters -- Create stream producers BEFORE calling discover_all(). If peers sync catalogs before producers exist, the catalog entries won't include stream IDs and consumers won't find producers.
- Tag propagation is not instant -- discovery.feed(signed_entry) updates the local catalog only. For other nodes to see tag changes immediately, broadcast the signed entry to each node directly: for other_net in &other_networks { other_net.discovery().feed(signed_entry.clone()); }
- Group is not Clone -- You can't share a Raft group handle across tasks. Designate one node as the group operator; others communicate with it via streams.
- Use Consistency::Weak on followers -- Consistency::Strong queries forward to the leader but can panic on followers. Wait for group.when().committed().reaches(target_index), then query with Consistency::Weak.
- Create consumers before sending data -- If data is produced before the consumer is connected, it will be missed. Create consumers and wait for consumer.when().subscribed() before the producing side sends data (see the sketch below).
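Putting the ordering lessons together, a minimal setup sketch. The consumer builder, subscribe_if, and when().subscribed() appear in the examples above; the Network type name, the producer::<T>() builder, send(), and the discover_all() call sites are assumptions about the API shape.

// Sketch only: producer::<T>(), send(), and discover_all() placement are assumed;
// the consumer side mirrors the earlier examples.
async fn wire_up(producer_net: &Network, consumer_net: &Network, first_order: SwapOrder) {
    // 1. Create producers BEFORE discovery so catalog entries carry stream IDs.
    let orders = producer_net.streams().producer::<SwapOrder>().build();

    // 2. Only then let peers sync catalogs.
    producer_net.discover_all().await;
    consumer_net.discover_all().await;

    // 3. Create the consumer and wait until the subscription is live...
    let consumer = consumer_net.streams()
        .consumer::<SwapOrder>()
        .subscribe_if(|peer| peer.tags().contains("orderflow-source"))
        .build();
    consumer.when().subscribed().await;

    // 4. ...and only then produce, so nothing is sent before anyone is listening.
    orders.send(first_order).await;
}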
| Repo | Description |
|---|---|
| zmanian/cometbft-mempool | 9-phase Mosaik demo: 3 validator clusters (9 nodes), per-validator Raft, pipeline pre-connection, gas-aware block building |
| zmanian/cometbft | CometBFT Go fork with ProposerOracle interface and soft priority routing (~200 lines) |
| zmanian/commonware-mempool | Simplex BFT mempool with leader schedule pre-computation |
Mosaik Network
+--------------------------------------------------+
| |
| Stream<SwapOrder> Stream<BridgeIntent> ... |
| | | |
| v v |
| +------------------------------------------+ |
| | Proposer Cluster (3-node Raft group) | |
| | | |
| | OrderBook BridgeQueue StakingState | |
| | | | | | |
| | v v v | |
| | Unified Block Builder | |
| | | | |
| | EndBlock: match, clear, settle | |
| +-------------|----------------------------+ |
| | |
| v |
| Stream<Block> Stream<Fill> Stream<Settlement> |
+--------------------------------------------------+
|
v
Chain submission / DA layer
- BFT upgrade path -- Can Mosaik's group abstraction be extended to support BFT consensus? The StateMachine trait is consensus-agnostic. Commonware's Simplex would be the natural candidate.
- Cross-group atomicity -- If swaps and bridges run in different RSMs, how do you get atomic cross-group transactions? Commonware's Threshold Simplex (~240-byte cross-chain certificates) could address this.
- Peer-to-validator address mapping -- CometBFT P2P node keys differ from validator addresses. The mapping is deployment-topology dependent and needs a pluggable resolver.
- Sentry node awareness -- In production topologies where validators sit behind sentries, priority routing needs to propagate through sentries, or sentries need their own schedule awareness.
- ADR-118 mempool lanes interaction -- How does proposer-aware routing compose with the upcoming lane-based mempool? Lanes could provide the priority mechanism natively.
- MEV in the RSM -- The Raft leader sees all commands before followers. Mitigations: encrypted command submission, threshold decryption, or TEE-based leaders.
- Censorship resistance -- With typed streams and selective subscription, a matcher could censor specific order types. Need either forced inclusion rules in the RSM or a separate inclusion protocol.
Mosaik's typed streams + Raft RSM primitives are a natural fit for building a proposer-aware mempool that eliminates flood gossip overhead and automates validator failover. The per-validator Raft cluster architecture provides the reliability of sentry node topologies with the added benefit of automatic failover and state replication. Pipeline pre-connection solves the latency problem for fast BFT rotation. And the ProposerOracle interface in CometBFT shows this can be integrated into existing infrastructure with minimal code changes (~200 lines) and full backward compatibility.
Mosaik (Flashbots) and Commonware (Patrick O'Grady / ex-Ava Labs) are independent projects with complementary strengths: Mosaik excels at type-aware dissemination and self-organizing network topology, while Commonware provides production-grade BFT consensus (Simplex), threshold cryptography for cross-chain attestation, and high-performance authenticated storage. A system combining Mosaik's typed stream layer with Commonware's consensus and crypto primitives could address most of the open questions above.