Split of torch-2.10.0+rocm7.1-cp313-cp313-manylinux_2_28_x86_64.whl using rocm_kpack.tools.split_python_wheels.
| Metric | Value |
|---|---|
| Input wheel (.whl) | 5.1 GB |
| Host wheel (.whl) | 431 MB |
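To put the table above in perspective, the host wheel retains only a small fraction of the original payload. A quick sanity check of the numbers (treating 1 GB as 1024 MB, which is an assumption about how the sizes were reported):

```python
# Size reduction from splitting the ROCm torch wheel (figures from the table above).
input_mb = 5.1 * 1024   # 5.1 GB input wheel, in MB (assuming binary GB)
host_mb = 431           # 431 MB host wheel

reduction = 1 - host_mb / input_mb
print(f"Host wheel is {reduction:.1%} smaller than the input wheel")
```

Either way the GB is interpreted (1000 or 1024 MB), the host wheel is roughly 92% smaller, which is the point of the split.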
Source: CI ASAN Run #21463906609
Artifacts Index: https://therock-ci-artifacts.s3.amazonaws.com/21463906609-linux/index-gfx94X-dcgpu-asan.html
From TheRock repo root:
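The repo-root command itself is not captured here. Independent of it, the artifacts index link above follows a run-id/target URL pattern; a small helper reconstructing it (the function name and pattern are inferred from the single link above, not an official API):

```python
# Hypothetical helper: rebuild the CI artifacts index URL from a run id and
# target name. Pattern inferred from the link above; not an official API.
BASE = "https://therock-ci-artifacts.s3.amazonaws.com"

def artifact_index_url(run_id: int, target: str) -> str:
    return f"{BASE}/{run_id}-linux/index-{target}.html"

print(artifact_index_url(21463906609, "gfx94X-dcgpu-asan"))
```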
```mlir
#map = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
#map1 = affine_map<(d0, d1) -> (d1, d0)>
#map2 = affine_map<(d0, d1) -> (d0, d1)>
#map3 = affine_map<(d0, d1) -> (d1)>
#map4 = affine_map<(d0, d1, d2) -> (d2, d0)>
#map5 = affine_map<(d0, d1, d2) -> (d1, d2)>
#map6 = affine_map<(d0, d1, d2) -> (d0, d2)>
#map7 = affine_map<(d0, d1, d2) -> (d2, d0, d1)>
#map8 = affine_map<(d0, d1, d2) -> (d1, d2, d0)>
#map9 = affine_map<(d0, d1, d2) -> (d2)>
```
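As rough intuition for these indexing maps: the pure permutation maps act like axis transposes on a dense array, while projection maps drop a dimension. A NumPy sketch (illustrative only, not how linalg evaluates them):

```python
import numpy as np

# Affine maps like #map1 (d0, d1) -> (d1, d0) and #map7 (d0, d1, d2) -> (d2, d0, d1)
# are pure dimension permutations; on a dense array they act like transposes.
a = np.arange(6).reshape(2, 3)

# #map1: (d0, d1) -> (d1, d0) — swap the two axes.
map1 = a.transpose(1, 0)

b = np.arange(24).reshape(2, 3, 4)

# #map7: (d0, d1, d2) -> (d2, d0, d1) — a cyclic permutation of three axes.
map7 = b.transpose(2, 0, 1)

# Projection maps like #map3 (d0, d1) -> (d1) correspond to a 1-D operand that
# is broadcast along the dropped dimension (or reduced along it for outputs).
print(map1.shape, map7.shape)
```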
This document provides architectural guidance for "Quartz", a PyTorch HUD-like system for ROCm downstream CI/CD orchestration. The junior engineer's instinct to start with status.json is understandable but insufficient for the stated requirements; a database-first approach is correct.
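A minimal sketch of what "database-first" buys over a status.json snapshot, assuming a simple runs table (the schema, table, and column names here are illustrative, not a Quartz spec):

```python
import sqlite3

# Illustrative schema for a HUD-like CI dashboard: persist every run so the UI
# can query history, rather than overwriting a single status.json snapshot.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ci_runs (
    run_id     INTEGER PRIMARY KEY,
    commit_sha TEXT NOT NULL,
    target     TEXT NOT NULL,      -- e.g. gfx94X-dcgpu-asan
    status     TEXT NOT NULL,      -- queued | running | passed | failed
    started_at TEXT NOT NULL
);
CREATE INDEX idx_runs_commit ON ci_runs(commit_sha);
""")

# Example row; the commit sha and timestamp are placeholders.
conn.execute(
    "INSERT INTO ci_runs VALUES (?, ?, ?, ?, ?)",
    (21463906609, "deadbeef", "gfx94X-dcgpu-asan", "passed", "2025-01-01T00:00:00Z"),
)

# status.json can only answer "what is the state now"; a table also answers
# "what happened to this commit across targets over time".
rows = conn.execute(
    "SELECT target, status FROM ci_runs WHERE commit_sha = ?", ("deadbeef",)
).fetchall()
print(rows)
```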
Extract MoE primitives from /develop/ai-no-fluff/kb/ben/moe_f32_parameterized.mlir:

- `mul_mat_id` - Expert-selected matrix multiplication (gather + batch_matmul)
- `moe_ffn_block` - Full MoE FFN block composing routing, expert compute, weighted sum

Key challenge: `moe_ffn_block` depends on `mul_mat_id` and `swiglu`. Need systematic composition without manual inlining.
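A NumPy sketch of the gather + batch_matmul decomposition behind `mul_mat_id` (shapes, names, and the router output are illustrative, not the file's actual signature):

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_in, d_out, n_tokens = 4, 8, 16, 5

expert_weights = rng.standard_normal((n_experts, d_in, d_out)).astype(np.float32)
tokens = rng.standard_normal((n_tokens, d_in)).astype(np.float32)
expert_ids = np.array([2, 0, 3, 1, 2])  # router's chosen expert per token

# mul_mat_id = gather (select each token's expert matrix) + batch_matmul.
gathered = expert_weights[expert_ids]            # (n_tokens, d_in, d_out)
out = np.einsum("ti,tio->to", tokens, gathered)  # (n_tokens, d_out)

# Reference: loop over tokens with an ordinary matmul.
ref = np.stack([tokens[t] @ expert_weights[expert_ids[t]] for t in range(n_tokens)])
assert np.allclose(out, ref, atol=1e-5)
```

With top-k routing, `moe_ffn_block` would run this per selected expert and combine the results with the router's weights; that weighted sum is the part that also pulls in `swiglu`, hence the composition problem noted above.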
| Merge Commit | Individual Commits on Main |
|---|---|
| One atomic integration point | 500 commits sprawled on main |
| `git revert -m1 <merge>` undoes everything | Good luck reverting |
| `git bisect` can skip the whole merge | Bisect walks through 500 commits |
| Main history is readable | Main history is chaos |
```mlir
module @aqt_matmul {
  iree_input.global private @_params$0 = dense<[[0.000000e+00, 5.003000e+02, 1.000600e+03], [1500.8999, 2.001200e+03, 2.501500e+03], [3001.7998, 3502.09985, 4.002400e+03], [4502.69971, 5.003000e+03, 5.503300e+03], [6003.59961, 6503.8999, 7004.1997], [7.504500e+03, 8004.7998, 8.505100e+03]]> : tensor<6x3xf32>
  iree_input.global private @_params$1 = dense<5.000000e+00> : tensor<f32>
  func @compute_native(%arg0: tensor<5x6xf32>) -> tensor<5x3xf32> {
    %0 = iree_input.global.load @_params$0 : tensor<6x3xf32>
    %1 = iree_input.global.load @_params$1 : tensor<f32>
    %2 = call @main(%0, %1, %arg0) : (tensor<6x3xf32>, tensor<f32>, tensor<5x6xf32>) -> tensor<5x3xf32>
    return %2 : tensor<5x3xf32>
  }
  func private @main(%arg0: tensor<6x3xf32>, %arg1: tensor<f32>, %arg2: tensor<5x6xf32>) -> tensor<5x3xf32> {
```
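The body of `@main` is truncated here, so its exact computation is not shown. Shape-wise it consumes a 6x3 weight, a scalar parameter, and a 5x6 input and returns 5x3, which is consistent with a plain matmul; treating the scalar (likely an AQT quantization scale) as unused below is purely an assumption:

```python
import numpy as np

# @main's body is truncated above; a (5,6) @ (6,3) -> (5,3) matmul matches its
# declared types. The scalar @_params$1's role is an ASSUMPTION and is left
# unused here. params0 is a stand-in, not the dense literal from the module.
params0 = np.arange(18, dtype=np.float32).reshape(6, 3)  # stand-in for @_params$0
params1 = np.float32(5.0)                                # @_params$1
x = np.ones((5, 6), dtype=np.float32)

y = x @ params0  # matches @compute_native's tensor<5x3xf32> result type
print(y.shape)
```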