
Tobias Grieger tbg

@tbg
tbg / output.md
Created March 11, 2026 12:35
investigate workflow runs

/investigate workflow runs

113 non-skipped runs from 2026-02-19 to 2026-03-11.

| Date | Who | Result | Issue | Title | Run |
| --- | --- | --- | --- | --- | --- |
| 2026-03-11 | rafiss | success | #163431 | roachtest: ruby-pg failed [liveness session expired before transaction] | run |
| 2026-03-10 | williamchoe3 | success | #165212 | pkg/sql/opt/opbench/opbench_test_/opbench_test: pkg failed | run |
| 2026-03-10 | dt | success | #164906 | roachtest: backup-restore/online-restore failed | run |
| 2026-03-10 | dt | success | #165013 | backup: TestBackupRestoreCrossTab | |
@tbg
tbg / heapscan.md
Created March 10, 2026 11:55
heapScan overestimate under GOGC=off + GOMEMLIMIT


Setup

Single-node CockroachDB (n2-standard-16, 64GB RAM) running a KV workload at ~20% CPU with GOGC=off and GOMEMLIMIT=51GiB. The live heap is ~480MB, but with GOGC disabled, the heap grows to ~50GB before GC triggers (driven entirely by the memory limit). GC runs roughly every 24 seconds.

The log output

@tbg
tbg / gcpacer-tail.md
Created March 9, 2026 13:09
GC + pacer trace tail from tobias-gcassist (2026-03-09)

GC + Pacer trace tail (tobias-gcassist, 2026-03-09 ~12:45 UTC)

Single-node CockroachDB (n2-standard-16), KV workload at ~20-25% CPU. GODEBUG=gctrace=1,gcpacertrace=1.

```
pacer: assist ratio=+1.966144e+000 (scan 226 MB in 1660->1736 MB) workers=4++0.000000e+000
pacer: 27% CPU (25 exp.) for 151835216+1501680+2831530 B work (155682658 B exp.) in 1741434408 B -> 1766226960 B (∆goal -54389166, cons/mark +1.702709e-001)
gc 20311 @11135.890s 0%: 0.099+11+0.098 ms clock, 1.5+5.1/45/49+1.5 ms cpu, 1660->1684->434 MB, 1736 MB goal, 1 MB stacks, 2 MB globals, 16 P
pacer: sweep done at heap size 458MB; allocated 23MB during sweep; swept 218348 pages at +1.681737e-004 pages/byte
```
@tbg
tbg / gc-assist-metric-issue.md
Created March 9, 2026 12:28
OTel Datadog exporter inflates counter metric rates by ~3x


Summary

The Datadog cockroachdb.sys.gc.assist.ns metric (and likely all Prometheus counter-type metrics) reports a rate ~3x higher than the actual rate when using .as_rate(). The root cause appears to be a mismatch between the OTel Prometheus scrape interval (30s) and the interval metadata submitted to Datadog by the OTel Datadog exporter (suspected 10s, matching the batch processor timeout).

@tbg
tbg / experiment.md
Last active March 9, 2026 13:14
Single-Node KV Workload Experiment with GC tracing analysis

Single-Node KV Workload Experiment

2026-03-09T09:23:05Z by Showboat dev

Create a single-node CockroachDB cluster with OpenTelemetry and fluent-bit for Datadog observability, then run a KV workload targeting ~20-25% CPU.

Cluster Creation

@tbg
tbg / review-pr-164900.md
Created March 5, 2026 09:54
Review of cockroachdb/cockroach PR #164900: mmaintegration: introduce physical capacity model

Review: PR #164900 — mmaintegration: introduce physical capacity model

Author: wenyihu6 | Branch: oldmodel2 | Epic: CRDB-55052

Blocking Issues (must fix)

  1. [correctness] highDiskSpaceUtilization comment is now stale (capacity_model.go:703-724): The comment explains that fractionUsed = load/capacity = LogicalBytes / (LogicalBytes / diskUtil) = diskUtil. Under the new model, load=Used, capacity=Used+Available — the math still recovers actual disk utilization, but the comment references the old LogicalBytes-based derivation and is now misleading.

  2. [correctness] minCapacity floor is dramatically lower than the old floor (physical_model.go): The old model had cpuCapacityFloorPerStore = 0.1 * 1e9 (0.1 cores). The new minCapacity = 1.0 means 1 ns/s — effectively zero CPU capacity. The old floor existed to prevent utilization from going to infinity on overloaded nodes (its comment explains this in detail). If a store has non-zero load and capacity=1 ns/s, utilization

@tbg
tbg / review.md
Created March 4, 2026 10:15
review-crdb skill example: PR #161454 (engine separation ReadWriter)

Review: PR #161454 — kvserver: thread in correct engine when destroying and subsuming replicas

Summary

This PR replaces two uses of kvstorage.TODOReadWriter(b.batch) in replicaAppBatch.runPostAddTriggersReplicaOnly with a new b.ReadWriter() helper that correctly separates the state engine batch (b.batch) from the raft engine batch (b.RaftBatch()). This is part of the broader effort to logically separate the state and raft engines in the apply stack (issue #161059). The change is correct, small, and follows the pattern

@tbg
tbg / review.md
Created March 4, 2026 10:15
review-crdb skill example: PR #79134 (SKIP LOCKED implementation)

Review: PR #79134 — kv: support FOR {UPDATE,SHARE} SKIP LOCKED

Summary

This PR implements the KV portion of SKIP LOCKED support for SELECT ... FOR UPDATE SKIP LOCKED and SELECT ... FOR SHARE SKIP LOCKED. The change spans the MVCC scanner, KV concurrency control, optimistic evaluation, timestamp cache, refresh spans, and the lock table. The SQL optimizer still rejects SKIP LOCKED (the SQL portion was extracted into a separate PR, #83627), so this is plumbing-only from the KV side.

@tbg
tbg / review.md
Created March 4, 2026 10:15
review-crdb skill example: PR #164677 (connection retry roachtest)

Review: PR #164677 — changefeedccl: add roachtest for CDC rolling restarts with KV workload

Summary

This PR adds a roachtest that exercises changefeeds during rolling node drain+restart cycles and introduces a COCKROACH_CHANGEFEED_TESTING_SLOW_RETRY env var for reaching max backoff behavior quickly. The test is well-structured and the motivation is clear. There are a few structural and correctness issues worth addressing.

@tbg
tbg / review.md
Created March 4, 2026 10:15
review-crdb skill example: PR #164792 (physical modeling in simulator)

Review: PR #164792 — mmaintegration: introduce physical capacity model

Summary

This PR introduces a physical capacity model for MMA that expresses store loads and capacities in physical resource units (CPU ns/s, disk bytes) and threads amplification factors through all range-load callsites. It is a well-structured, well-documented change with excellent commit messages. The core algebraic claim (load/capacity ratio preservation) is correct. There are a few issues worth addressing, the most important being a missing capacity floor that changes