zboralski/trigram_clustering_results.md

## trigram_clustering_results.md

      
MidALU Trigram Clustering Results (2026-03-06)

Method

For every occurrence of a target tag in the corpus (1.49M HiALU + all MidALU), extract a trigram window:
[prev_form] TARGET [next_form]

Where forms are: LowALU, MidALU, HiALU, MemHi, MemOp, Reg, ExtALU, Branch, MidCatch, CF.
Represent each tag as a normalized frequency vector over form-level trigrams, then compute cosine similarity.
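
The method above can be sketched in Go, the language of the tools listed under Files. This is a minimal illustration, not the code in tools/trigram_cluster.go: the `Instr` type and the `TrigramVectors` and `Cosine` functions are hypothetical names. It assumes each instruction already carries its form class, and that occurrences at the very start or end of a stream are skipped.

```go
package main

import (
	"fmt"
	"math"
)

// Instr is a hypothetical decoded instruction: its tag (e.g. "mid_alu_d1")
// and its coarse form class (e.g. "MidALU", "Reg", "Branch").
type Instr struct {
	Tag  string
	Form string
}

// TrigramVectors maps each target tag to a normalized frequency vector over
// "prevForm|nextForm" contexts. Occurrences at the stream edges are skipped.
func TrigramVectors(stream []Instr, targets map[string]bool) map[string]map[string]float64 {
	counts := map[string]map[string]float64{}
	totals := map[string]float64{}
	for i := 1; i+1 < len(stream); i++ {
		tag := stream[i].Tag
		if !targets[tag] {
			continue
		}
		key := stream[i-1].Form + "|" + stream[i+1].Form
		if counts[tag] == nil {
			counts[tag] = map[string]float64{}
		}
		counts[tag][key]++
		totals[tag]++
	}
	// Convert raw counts to relative frequencies per tag.
	for tag, vec := range counts {
		for k := range vec {
			vec[k] /= totals[tag]
		}
	}
	return counts
}

// Cosine returns the cosine similarity of two sparse vectors (0 if either is empty).
func Cosine(a, b map[string]float64) float64 {
	var dot, na, nb float64
	for k, v := range a {
		dot += v * b[k]
		na += v * v
	}
	for _, v := range b {
		nb += v * v
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	// Toy stream, not corpus data.
	stream := []Instr{
		{"low_alu_0005", "LowALU"}, {"mid_alu_ad", "MidALU"}, {"hi_alu_81", "HiALU"},
		{"low_alu_0004", "LowALU"}, {"mid_alu_78", "MidALU"}, {"hi_alu_81", "HiALU"},
		{"mid_alu_03", "MidALU"}, {"mid_alu_d1", "MidALU"}, {"reg", "Reg"},
	}
	targets := map[string]bool{"mid_alu_ad": true, "mid_alu_78": true, "mid_alu_d1": true}
	vecs := TrigramVectors(stream, targets)
	fmt.Printf("ad vs 78: %.3f\n", Cosine(vecs["mid_alu_ad"], vecs["mid_alu_78"]))
	fmt.Printf("ad vs d1: %.3f\n", Cosine(vecs["mid_alu_ad"], vecs["mid_alu_d1"]))
}
```

In the toy stream, mid_alu_ad and mid_alu_78 share the single context LowALU|HiALU, so their cosine is 1, while mid_alu_d1 only appears in MidALU|Reg, so its cosine with either is 0.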
Dataset


260 disasm files (imac_flat, macos26, system, xcode_beta, icon_composer, gpu, docs)
11 unprobed MidALU tags analyzed against 16 known baseline tags

Cluster Results

Cluster 1: Standard ALU Pipeline

Tags that sit in generic MidALU↔MidALU, MidALU↔LowALU, MidALU↔HiALU neighborhoods. High cosine (0.908 to 0.959) with known arithmetic tags.


| Unprobed | Base | Modifier | Closest Known | Cosine | Family |
| --- | --- | --- | --- | --- | --- |
| mid_alu_b5 | 0x15 | mod 5 | mid_alu_03 | 0.959 | ALU general |
| mid_alu_b9 | 0x19 | mod 5 | mid_alu_1d | 0.958 | ALU general |
| mid_alu_f9 | 0x19 | mod 7 | mid_alu_04 | 0.932 | ALU general |
| mid_alu_ff | 0x1f | mod 7 | mid_alu_03 | 0.908 | ALU general |

b5 and b9 cluster together at 0.977 cosine, which suggests they belong to the same functional family despite having different base opcodes. Their mod-0 bases (0x15, 0x19) also cluster tightly (0.970).
Cluster 2: Data Movement / Memory-Adjacent


| Unprobed | Base | Modifier | Closest Known | Cosine | Family |
| --- | --- | --- | --- | --- | --- |
| mid_alu_2a | 0x0a | mod 1 | mid_alu_0a | 0.984 | ALU feed (nearly identical to base) |
| mid_alu_d3 | 0x13 | mod 6 | mid_alu_03 | 0.896 | ALU with heavy MemHi context |
| mid_alu_36 | 0x16 | mod 1 | mid_alu_03 | 0.907 | ALU config (more HiALU context) |

mid_alu_d3 is distinct: 8.8% of its trigrams involve MemHi↔MemHi patterns (memory load/store chains). Top prevs include mem_hi_6=274, mem_hi_e=208.
Cluster 3: Branch-Adjacent / Control Flow


| Unprobed | Base | Modifier | Closest Known | Cosine | Family |
| --- | --- | --- | --- | --- | --- |
| mid_alu_9b | 0x1b | mod 4 | mid_alu_11 | 0.859 | Branch-heavy context |
| mid_alu_bd | 0x1d | mod 5 | mid_alu_01 | 0.810 | Compare+branch pattern |


mid_alu_9b signature: 8.9% branch→TARGET→branch trigrams. Top prev: branch_9a=3451. Top next: branch_9a=4010. Chains with itself: mid_alu_9b→mid_alu_9b→branch_9a.
mid_alu_bd signature: 5.8% TARGET→Branch, 14% TARGET→HiALU. Top pattern: low_alu_002b→mid_alu_bd→branch_10 (5.8%).
Cluster 4: Register Pipeline (Outlier)


| Unprobed | Base | Modifier | Closest Known | Cosine | Family |
| --- | --- | --- | --- | --- | --- |
| mid_alu_d1 | 0x11 | mod 6 | mid_alu_19 | 0.508 | Unique: register sync |


Genuine outlier. Trigram profile:

22.9% MidALU → TARGET → Reg
15.0% HiALU → TARGET → Reg
8.7% Branch → TARGET → Reg
53% of next instructions are reg words

This is consistent with TG_FENCE_M6 — a threadgroup fence/barrier that requires register file synchronization. The following reg word likely encodes barrier metadata.
Cluster 5: ALU Pipeline Step


| Unprobed | Base | Modifier | Closest Known | Cosine | Family |
| --- | --- | --- | --- | --- | --- |
| mid_alu_ad | 0x0d | mod 5 | mid_alu_11 | 0.756 | Mid-pipeline step |


Unique signature: low_alu_0005→TARGET→hi_alu_81 (10.3%), low_alu_0004→TARGET→hi_alu_81 (6.0%). A MidALU step that sits between LowALU setup and HiALU consumer. Top next: hi_alu_81=1987.
Specific Behavioral Signatures

mid_alu_ff (FDIV_COMPARE_AUX)

64.6% of next instructions are mid_alu_0c. This is the tightest pairing in the corpus.
TOP NEXT: mid_alu_0c=3892 (64.6%), low_alu_000e=266, low_alu_001e=212
TOP PREV: low_alu_0000=734, reg=624, low_alu_catch=315

Consistent with a divider pipeline micro-op that usually feeds into mid_alu_0c (64.6% of successors, so the pairing is strong but not universal).
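
A TOP NEXT line like the one above can be produced by counting the immediate successors of a target tag and ranking them. The sketch below is an assumption about how such a ranking might be computed, not the project's tool: `Neighbor` and `TopNext` are hypothetical names and the input is a toy tag stream.

```go
package main

import (
	"fmt"
	"sort"
)

// Neighbor is one successor tag with its count and its share of all successors.
type Neighbor struct {
	Tag   string
	Count int
	Share float64
}

// TopNext ranks the tags that immediately follow target in the tag stream,
// most frequent first (ties broken by tag name for stable output).
func TopNext(tags []string, target string) []Neighbor {
	counts := map[string]int{}
	total := 0
	for i := 0; i+1 < len(tags); i++ {
		if tags[i] == target {
			counts[tags[i+1]]++
			total++
		}
	}
	out := make([]Neighbor, 0, len(counts))
	for tag, n := range counts {
		out = append(out, Neighbor{tag, n, float64(n) / float64(total)})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Tag < out[j].Tag
	})
	return out
}

func main() {
	// Toy stream, not corpus data.
	tags := []string{
		"mid_alu_ff", "mid_alu_0c",
		"mid_alu_ff", "mid_alu_0c",
		"mid_alu_ff", "low_alu_000e",
	}
	for _, nb := range TopNext(tags, "mid_alu_ff") {
		fmt.Printf("%s=%d (%.1f%%)\n", nb.Tag, nb.Count, 100*nb.Share)
	}
}
```

The same loop with `tags[i-1]` in place of `tags[i+1]` would give the TOP PREV ranking.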
mid_alu_9b (ALU_CHAIN_SETUP_M4)

Heavily self-chaining in branch contexts:
8.9%  branch_9a → mid_alu_9b → branch_9a
6.0%  mid_alu_9b → mid_alu_9b → branch_9a
4.5%  branch_9a → mid_alu_9b → mid_alu_9b
2.6%  mid_alu_9b → mid_alu_9b → mid_alu_9b

A conditional chain setup that runs in loops with branches. 23,914 corpus occurrences.
mid_alu_d3 (ALU_REG_CONFIG_M6)

Heavy memory context:
4.0%  mem_hi_6 → TARGET → mem_hi_4
3.5%  mem_hi_e → TARGET → mem_hi_e
TOP PREV: mem_hi_6=274, mem_hi_e=208

A register bank reconfiguration between memory operations. 2,753 corpus occurrences.
mid_alu_2a (ALU_FEED_M1)

Nearly identical to its base mid_alu_0a (cosine=0.984):
TOP PREV: mid_alu_1a=5635, mid_alu_3a=4233, mid_alu_5a=3733
TOP NEXT: mid_alu_34=5653, low_alu_catch=5243, mem_hi_4=4636

Same pipeline position as 0x0a, just a modifier variant. 73,868 corpus occurrences, so it is not rare.
Cosine Similarity Matrix (top pairs, >0.95)

mid_alu_13  ↔ mid_alu_15   0.992
mid_alu_15  ↔ mid_alu_1b   0.991
mid_alu_13  ↔ mid_alu_1b   0.984
mid_alu_0a  ↔ mid_alu_2a   0.984   ← unprobed matches base
mid_alu_15  ↔ mid_alu_1d   0.984
mid_alu_13  ↔ mid_alu_1d   0.984
mid_alu_1b  ↔ mid_alu_1d   0.981
mid_alu_b5  ↔ mid_alu_b9   0.977   ← unprobed pair clusters
mid_alu_03  ↔ mid_alu_04   0.974
mid_alu_03  ↔ mid_alu_b5   0.959   ← unprobed matches known
mid_alu_1d  ↔ mid_alu_b9   0.958   ← unprobed matches known

Conclusions


7 of 11 unprobed tags are standard ALU pipeline variants (Clusters 1 and 2, cosine ≥0.896 with known tags; 6 of the 7 exceed 0.90). They don't need oracle probing: they're modifier variants of well-understood base opcodes.


mid_alu_d1 is the most interesting: a genuine outlier that is followed by reg words 53% of the time. The TG_FENCE_M6 label is well-supported.


mid_alu_ff feeds mid_alu_0c in 64.6% of occurrences, which supports the divider pipeline reading.


mid_alu_9b lives in branch loops — conditional chain setup confirmed.


Trigram clustering recovers instruction families without knowing instruction semantics, purely from compiler scheduling patterns.


K-Means Clustering (k=5, 106 MidALU tags, ≥100 occurrences)

Form-level trigram vectors, L2-normalized, k-means++ initialization, 20 trials.
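
For readers unfamiliar with the setup named above, here is a compact Go sketch of k-means with k-means++ seeding and multiple trials. It is an illustration under assumptions, not the code in tools/trigram_kmeans.go: the function names, the 100-iteration cap, and the choice to keep the trial with the lowest within-cluster sum of squares are all hypothetical. Input vectors are assumed to be L2-normalized already.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

func sqDist(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += (a[i] - b[i]) * (a[i] - b[i])
	}
	return s
}

// seedPlusPlus picks the first centroid uniformly, then each later one with
// probability proportional to squared distance from the nearest chosen centroid.
func seedPlusPlus(pts [][]float64, k int, rng *rand.Rand) [][]float64 {
	cents := [][]float64{append([]float64(nil), pts[rng.Intn(len(pts))]...)}
	for len(cents) < k {
		d := make([]float64, len(pts))
		var sum float64
		for i, p := range pts {
			d[i] = math.Inf(1)
			for _, c := range cents {
				d[i] = math.Min(d[i], sqDist(p, c))
			}
			sum += d[i]
		}
		r, pick := rng.Float64()*sum, len(pts)-1
		for i := range d {
			if r -= d[i]; r < 0 {
				pick = i
				break
			}
		}
		cents = append(cents, append([]float64(nil), pts[pick]...))
	}
	return cents
}

// KMeans runs Lloyd's algorithm from `trials` k-means++ seedings and returns
// the labeling with the lowest within-cluster sum of squared distances.
func KMeans(pts [][]float64, k, trials int, seed int64) (best []int, bestCost float64) {
	rng := rand.New(rand.NewSource(seed))
	bestCost = math.Inf(1)
	for t := 0; t < trials; t++ {
		cents := seedPlusPlus(pts, k, rng)
		labels := make([]int, len(pts))
		for iter := 0; iter < 100; iter++ {
			changed := false
			sums := make([][]float64, k)
			n := make([]int, k)
			for i, p := range pts {
				b := 0
				for c := range cents {
					if sqDist(p, cents[c]) < sqDist(p, cents[b]) {
						b = c
					}
				}
				changed = changed || labels[i] != b
				labels[i] = b
				if sums[b] == nil {
					sums[b] = make([]float64, len(p))
				}
				n[b]++
				for j, x := range p {
					sums[b][j] += x
				}
			}
			for c := range cents { // empty clusters keep their old centroid
				for j := range sums[c] {
					cents[c][j] = sums[c][j] / float64(n[c])
				}
			}
			if iter > 0 && !changed {
				break
			}
		}
		var cost float64
		for i, p := range pts {
			cost += sqDist(p, cents[labels[i]])
		}
		if cost < bestCost {
			best, bestCost = labels, cost
		}
	}
	return best, bestCost
}

func main() {
	// Four unit-length toy vectors forming two obvious groups.
	pts := [][]float64{{1, 0, 0}, {0.8, 0.6, 0}, {0, 0, 1}, {0, 0.6, 0.8}}
	labels, cost := KMeans(pts, 2, 20, 1)
	fmt.Println(labels, cost)
}
```

Running several trials matters because a single seeding can settle in a poor local optimum; keeping the lowest-cost trial is one common way to pick among them.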
Cluster 0: LowALU Feed Chain (11 tags)

Signature: [LowALU] TARGET [MidALU] dominant (83%)
Tags: mid_alu_{3a, 5a, 72, 7a, 7f, 92, b2, d2, da, f2, fa}
These are MidALU instructions that primarily receive input from LowALU — pipeline feed operations. All modifier variants (mod 1-7) of a small set of base opcodes.
Cluster 1: General ALU (78 tags)

Signature: [MidALU] TARGET [MidALU] dominant (75%)
The largest cluster. Contains most arithmetic, logic, and configuration tags, including 9 of the 11 unprobed targets (all except mid_alu_ad and mid_alu_d1). These sit in MidALU↔MidALU chains: standard ALU pipeline instructions.
Cluster 2: Load-Execute Bridge (7 tags)

Signature: [LowALU] TARGET [HiALU] dominant (62%)
Tags: mid_alu_{78, 95, ad, b0, b8, f0, f8}
MidALU instructions that bridge LowALU setup to HiALU execution. mid_alu_ad (ALU_MID_STEP_M5) falls here — confirmed as pipeline step between source setup and ALU consumer.
Cluster 3: Register Sync (4 tags)

Signature: [MidALU] TARGET [Reg] dominant (51%)
Tags: mid_alu_{34, b1, d1, d5}
MidALU instructions that are frequently followed by reg words (the [MidALU] TARGET [Reg] trigram alone accounts for 51% of the cluster signature). mid_alu_d1 (TG_FENCE_M6) confirmed here. mid_alu_34 is the high-volume anchor (101K occurrences). This cluster represents operations requiring register file metadata words.
Cluster 4: LowALU Interleave (6 tags)

Signature: [LowALU] TARGET [LowALU] dominant (78%)
Tags: mid_alu_{09, 0b, 50, 70, a0, a4}
MidALU instructions sandwiched between LowALU instructions. These are inline data or barrier-like tags that sit within LowALU instruction streams without disrupting them. mid_alu_a0 is the high-volume member (66K occurrences).
Inter-Cluster Distances


| Pair | Cosine |
| --- | --- |
| C1 ↔ C2 | 0.705 |
| C0 ↔ C1 | 0.691 |
| C1 ↔ C4 | 0.653 |
| C1 ↔ C3 | 0.604 |
| C0 ↔ C2 | 0.528 |

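Clusters 3 (Reg Sync) and 4 (LowALU Interleave) are the most distinctive, with the lowest similarity to other clusters. The table lists five of the ten cluster pairs; every pair it omits (C0↔C3, C0↔C4, C2↔C3, C2↔C4, C3↔C4) involves Cluster 3 or 4.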
Modifier Interaction Matrix (2026-03-06)

Method

For every MidALU instruction in the corpus, extract base = tag[4:0] and modifier = tag[7:5]. Build:

Base × Modifier frequency table
Per base+mod: predecessor/successor form distribution
Modifier transition matrix (consecutive MidALU pairs)
Clause patterns (modifier sequences of length 2-3)

Tool: tools/modifier_matrix.go
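
The bit split and the transition matrix can be sketched as follows. This is an illustration rather than the contents of tools/modifier_matrix.go: `SplitTag` and `TransitionMatrix` are hypothetical names, and the input is assumed to be a single run of consecutive MidALU tag bytes. How the real tool handles run boundaries is not stated here.

```go
package main

import "fmt"

// SplitTag splits an 8-bit MidALU tag into base = tag[4:0] and modifier = tag[7:5].
func SplitTag(tag uint8) (base, mod uint8) {
	return tag & 0x1f, tag >> 5
}

// TransitionMatrix counts modifier transitions between consecutive MidALU tags
// and returns row-normalized percentages plus the number of transitions per row.
func TransitionMatrix(tags []uint8) (pct [8][8]float64, n [8]int) {
	var counts [8][8]int
	for i := 0; i+1 < len(tags); i++ {
		_, prev := SplitTag(tags[i])
		_, next := SplitTag(tags[i+1])
		counts[prev][next]++
		n[prev]++
	}
	for r := range counts {
		if n[r] == 0 {
			continue // no transitions out of this modifier
		}
		for c := range counts[r] {
			pct[r][c] = 100 * float64(counts[r][c]) / float64(n[r])
		}
	}
	return pct, n
}

func main() {
	base, mod := SplitTag(0xd1)
	fmt.Printf("mid_alu_d1: base=0x%02x mod=%d\n", base, mod)

	// Toy run of MidALU tags, not corpus data.
	pct, n := TransitionMatrix([]uint8{0x03, 0x2a, 0x03, 0x04, 0x9b, 0x9b})
	for r := range pct {
		if n[r] > 0 {
			fmt.Printf("mod%d %v (n=%d)\n", r, pct[r], n[r])
		}
	}
}
```

The first line printed is `mid_alu_d1: base=0x11 mod=6`, matching the Base and Modifier columns given for that tag in the cluster tables.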
Base × Modifier Structure

Two tiers of base opcodes:

Simple bases (0x00-0x0f): only mod0 + mod1 (plus occasional mod5). Core scalar ALU.
Rich bases (0x10-0x1f): up to 8 modifiers. Complex pipeline ops (FMA, conversion, compare, fence).

Widest modifier spread: base 0x1a (459K instances, 7 modifiers), 0x12 (283K, all 8 mods), 0x14 (154K, 7 mods).
Modifier Transition Matrix

prev\next  mod0   mod1   mod2   mod3   mod4   mod5   mod6   mod7
mod0      64.1%  24.6%   1.9%   1.6%   2.4%   3.7%   0.9%   0.9%  (n=957K)
mod1      52.4%  35.6%   2.6%   1.6%   2.4%   3.5%   0.9%   1.0%  (n=405K)
mod2      59.9%  33.8%   3.0%   0.8%   0.8%   1.2%   0.3%   0.3%  (n=67K)
mod3      68.7%  24.5%   0.7%   2.3%   0.6%   2.5%   0.4%   0.3%  (n=53K)
mod4      57.6%  22.9%   1.4%   0.8%  11.8%   4.0%   0.4%   1.1%  (n=42K)
mod5      60.6%  25.8%   2.1%   1.3%   2.5%   6.9%   0.4%   0.4%  (n=54K)
mod6      58.0%  38.3%   1.2%   0.3%   0.2%   0.7%   1.2%   0.1%  (n=29K)
mod7      67.0%  28.9%   0.6%   0.5%   0.7%   1.3%   0.1%   1.0%  (n=34K)

Key patterns:

mod0→mod0 dominates (64.1%) — mod0 is the default scheduling slot
mod1 is the secondary slot (23-38% as successor)
mod4 has 11.8% self-affinity and runs in chains (branch-loop pattern)
mod6→mod1 elevated (38.3%): mod6 almost always transitions to mod0/mod1 (96.3% combined)

Pipeline Position Gradient (Base 0x12)

The most informative base shows modifier encodes pipeline position:


| Modifier | LowALU prev | LowALU next | Reg next | Interpretation |
| --- | --- | --- | --- | --- |
| mod0 | 11.3% | 28.1% | 9.3% | Standard middle-pipe |
| mod1 | 11.0% | 28.5% | 5.7% | Similar to mod0 |
| mod2 | 25.7% | 9.2% | 8.5% | More LowALU-fed |
| mod3 | 36.8% | 7.4% | 8.9% | Heavy LowALU source |
| mod4 | 40.9% | 9.2% | 12.8% | LowALU→Reg bridge |
| mod5 | 48.0% | 5.5% | 15.2% | Source setup stage |
| mod6 | 47.9% | 1.7% | 19.3% | Near-terminal |
| mod7 | 61.0% | 0.5% | 25.7% | Pipeline terminus |


mod7 is a pipeline terminus: 61% LowALU predecessors, 0.5% LowALU successors, 25.7% Reg successors. The modifier gradient encodes where in the LowALU→MidALU→HiALU clause the instruction sits.
Clause Patterns

Top patterns (consecutive MidALU modifier sequences):


| Pattern | Count |
| --- | --- |
| m0→m0 | 614K |
| m0→m1 | 236K |
| m1→m0 | 212K |
| m0→m0→m0 | 205K |
| m1→m1 | 144K |
| m0→m1→m0 | 69K |
| m1→m0→m1 | 52K |

The m0↔m1 alternation accounts for the bulk of scheduling. Higher modifiers (2-7) are specialist inserts in predominantly mod0/mod1 streams.
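
Counts like these can be gathered by sliding windows of length 2 and 3 over a run of modifiers. The Go sketch below shows one way to do that. `ClausePatterns` is a hypothetical name, and treating the input as one uninterrupted MidALU run is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ClausePatterns counts every modifier sequence of length 2 and 3 in a run of
// modifiers, keyed in the "m0→m1" notation used in the table.
func ClausePatterns(mods []uint8) map[string]int {
	counts := map[string]int{}
	for n := 2; n <= 3; n++ {
		for i := 0; i+n <= len(mods); i++ {
			parts := make([]string, n)
			for j, m := range mods[i : i+n] {
				parts[j] = fmt.Sprintf("m%d", m)
			}
			counts[strings.Join(parts, "→")]++
		}
	}
	return counts
}

func main() {
	// Toy modifier run, not corpus data.
	counts := ClausePatterns([]uint8{0, 0, 0, 1, 0})

	// Print patterns by descending count, then alphabetically.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if counts[keys[i]] != counts[keys[j]] {
			return counts[keys[i]] > counts[keys[j]]
		}
		return keys[i] < keys[j]
	})
	for _, k := range keys {
		fmt.Println(k, counts[k])
	}
}
```

Because length-2 and length-3 windows are counted in the same table, a pair such as m0→m0 is also counted inside every m0→m0→m0 triple, which is why the pair counts are larger.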
Confirmations


Base 0x11 mod6 (mid_alu_d1 = TG_FENCE_M6): 53.3% Reg next, 9.4% Branch prev. This independently confirms the register sync role from trigram clustering.
Base 0x14 mod1: 74.7% MidALU prev, 35.0% Reg next — distinctive register pipeline stage.
Base 0x10 mod7: 49.8% HiALU next, 30.6% Reg prev — HiALU feeder from register file.

Conclusions


Modifier = pipeline position, not instruction variant. In the base 0x12 gradient, higher modifiers sit closer to the pipeline terminus (LowALU source → Reg/HiALU sink).
mod0/mod1 are the scheduling backbone: 88.7% of transitions out of mod0 (and 88.0% out of mod1) land on mod0 or mod1.
mod4 is a loop-body specialist — 11.8% self-transition (branch-loop confirmed by mid_alu_9b chains).
mod6/mod7 are pipeline-terminal — they feed into Reg words or HiALU consumers with minimal LowALU continuation.
The AGX clause grammar is: (mod0|mod1)* [mod2-7_specialist]? (mod0|mod1)* — specialist modifiers are injected into mod0/mod1 streams.
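
The clause grammar above can be checked mechanically by encoding a modifier sequence as one digit per modifier and matching it against a regular expression. The sketch below uses Go's standard `regexp` package. The one-digit encoding and the names `Encode` and `MatchesClause` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// clauseRE encodes (mod0|mod1)* [mod2-7]? (mod0|mod1)* over a string with one
// digit per modifier.
var clauseRE = regexp.MustCompile(`^[01]*[2-7]?[01]*$`)

// Encode turns a modifier sequence into one digit per modifier ("0".."7").
func Encode(mods []uint8) string {
	var sb strings.Builder
	for _, m := range mods {
		sb.WriteByte('0' + (m & 7))
	}
	return sb.String()
}

// MatchesClause reports whether the whole sequence fits the grammar.
func MatchesClause(mods []uint8) bool {
	return clauseRE.MatchString(Encode(mods))
}

func main() {
	for _, seq := range [][]uint8{{0, 1, 0}, {0, 0, 6, 1}, {0, 4, 4, 0}} {
		fmt.Println(Encode(seq), MatchesClause(seq))
	}
}
```

This prints `010 true`, `0061 true`, and `0440 false`. The last case shows that the grammar as written admits only one specialist per clause, so a mod4→mod4 chain (11.8% of transitions out of mod4) would have to span more than one clause.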

Files


Tools: tools/trigram_cluster.go, tools/trigram_kmeans.go, tools/modifier_matrix.go
Raw aux6 stats: scratch/aux_field_corpus_stats.json
aux6 analysis: scratch/aux6_corpus_analysis.md