Single-node CockroachDB (n2-standard-16, 64GB RAM) running a KV workload at
~20% CPU with GOGC=off and GOMEMLIMIT=51GiB. The live heap is ~480MB, but
with GOGC disabled, the heap grows to ~50GB before GC triggers (driven entirely
by the memory limit). GC runs roughly every 24 seconds.
With GODEBUG=gctrace=1,gcpacertrace=1, each GC cycle produces pacer and gc
lines. Here's a representative cycle:
```
pacer: assist ratio=+1.671161e+000 (scan 3453 MB in 49839->49941 MB) workers=4+...
pacer: 32% CPU (25 exp.) for 172672272+1760600+2835722 B work (178060922 B exp.) ...
gc 11 @310.221s 0%: 0.20+17+0.14 ms clock, 3.2+20/69/56+2.2 ms cpu, 49839->49874->482 MB, ...
```
The relevant fields:
- `scan 3453 MB`: the pacer's estimate of scannable heap bytes (`gcController.heapScan`)
- `172672272+1760600+2835722 B work`: actual scan work performed: 165MB heap + 1.7MB stack + 2.7MB globals ≈ 170MB total
- `49839->49874->482 MB`: heap at GC start → heap at GC end → live (marked) heap
| Value | Amount |
|---|---|
| Live heap (post-mark) | ~482 MB |
| Actual scan work | ~170 MB |
| `heapScan` estimate | ~3,453 MB |
The pacer thinks it needs to scan 3.5GB, but the actual work is only 170MB —
a 20x overestimate. Over the first 12 cycles, heapScan slowly decreased from
5GB to 3.5GB, but never converged to the true value.
`heapScan` is maintained by two mechanisms (mgcpacer.go):

- At GC end (`resetLive`, line 861): `heapScan` is set to the actual scan work from the just-completed cycle. At this point it's accurate (~170MB).
- Between GC cycles (`update`, line 909): every heap allocation adds the new object's scannable bytes to `heapScan` via `heapScan.Add(dHeapScan)`. This happens outside of GC, when `gcBlackenEnabled == 0`.
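The interaction of these two mechanisms can be illustrated with a toy model; this is a simplification using the figures above, not the runtime's actual code:

```go
package main

import "fmt"

// pacer models the two heapScan updates described above (a simplified
// stand-in for gcController in mgcpacer.go).
type pacer struct {
	heapScan int64 // estimated scannable bytes in the heap
}

// allocate models the between-cycle path (update): every allocation's
// scannable bytes are added, whether or not the object survives.
func (p *pacer) allocate(scannableBytes int64) {
	p.heapScan += scannableBytes
}

// gcEnd models resetLive: heapScan is reset to the scan work actually
// performed by the just-completed mark phase.
func (p *pacer) gcEnd(actualScanWork int64) {
	p.heapScan = actualScanWork
}

func main() {
	p := &pacer{}
	p.gcEnd(170 << 20) // accurate right after a cycle: ~170MB

	// ~49GB of short-lived allocations before the next cycle; assume
	// (arbitrarily, for illustration) ~10% of each 1MB allocation is
	// scannable pointer-bearing data.
	for i := 0; i < 49_000; i++ {
		p.allocate((1 << 20) / 10)
	}
	// heapScan is now dominated by dead objects' scannable bytes,
	// gigabytes above the ~170MB that will actually need scanning.
	fmt.Printf("heapScan before next GC: %d MB\n", p.heapScan>>20)
}
```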
With GOGC=off and a 51GB memory limit, the heap grows from ~480MB to ~50GB
between GC cycles. Every object allocated during that window — regardless of
whether it's still alive — increases heapScan. Since the workload churns
through ~49GB of short-lived allocations between cycles, heapScan accumulates
the scannable portion of all of them.
By the time GC triggers, heapScan reflects all objects ever allocated since
the last cycle, not just the ones still alive. Most are garbage.
The pacer uses heapScan to compute the assist ratio (in revise):
```
assistWorkPerByte = scanWorkRemaining / heapRemaining
```
where scanWorkRemaining = heapScan - scanWorkComplete. An inflated heapScan
makes the pacer think there's far more scan work remaining than there actually
is, producing a higher assist ratio than necessary. This causes more frequent
and larger GC assists — goroutines are forced to do scan work during allocation
even though the dedicated mark workers could easily handle the actual ~170MB of
real work.
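Plugging the traced figures into that relationship shows the scale of the distortion. This is back-of-envelope arithmetic, not the runtime's exact computation (which also credits background-worker progress), but the relative inflation carries through:

```go
package main

import "fmt"

func main() {
	// Figures from the trace above, in MB.
	inflatedHeapScan := 3453.0 // pacer's heapScan estimate
	trueScanWork := 170.0      // scan work actually performed

	// At cycle start, scanWorkComplete is 0, so scanWorkRemaining equals
	// heapScan. Whatever heapRemaining happens to be, the inflated
	// estimate scales assistWorkPerByte by this same factor.
	fmt.Printf("overestimate factor: %.1fx\n", inflatedHeapScan/trueScanWork)
}
```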
The gctrace output confirms this: despite the pacer budgeting for 3.5GB of scan work, the cycle completes after only 170MB of actual scanning, and the mark phase finishes in ~17ms wall-clock time. The assists that did occur were unnecessary overhead.
It's a known limitation of the pacer design. The heapScan estimate is
accurate under normal GOGC operation because the heap doesn't grow far beyond
the live set — the ratio of allocated-since-last-GC to live-heap is bounded by
GOGC (default 100%, so at most 2x). Under GOGC=off with GOMEMLIMIT, the heap
can grow to an arbitrary multiple of the live set (here ~100x), making the
between-cycle accumulation of heapScan wildly inaccurate.
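The difference in those bounds is simple arithmetic, using this workload's live-heap figure:

```go
package main

import "fmt"

func main() {
	const liveMB = 480.0 // live heap from the trace above

	// Under GOGC=100, the next GC triggers at live*(1+GOGC/100), so the
	// heap peaks at 2x the live set and at most one live-set's worth of
	// new bytes can accumulate in heapScan between cycles.
	gogcPeak := liveMB * (1 + 100.0/100.0)
	fmt.Printf("GOGC=100 peak heap: %.0f MB (%.0fx live)\n", gogcPeak, gogcPeak/liveMB)

	// Under GOGC=off, the trigger is the 51GiB memory limit itself.
	limitMB := 51.0 * 1024
	fmt.Printf("GOMEMLIMIT peak heap: %.0f MB (%.0fx live)\n", limitMB, limitMB/liveMB)
}
```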
The pacer does self-correct — resetLive sets heapScan to actual work at each
cycle end — but the correction is immediately undone by the next 49GB of
inter-cycle allocations.