Gusto's GraphQL resolvers extensively use a pattern called loadAndDispatch() that defeats DataLoader batching, causing every chained DataLoader call to dispatch with a batch size of 1. This was identified as the primary driver of DataLoaderHelper being the #1 allocation source in production (48,910 samples / 11.6% of all allocations).
An async-profiler allocation capture (-e alloc, 30 seconds, 463K samples) on a production gusto instance revealed that DataLoaderHelper dominated all allocation sources:
| Source | Samples | % Total |
|---|---|---|
| DataLoaderHelper | 48,910 | 11.6% |
| PacsAuthzInstrumentation | 36,108 | 8.6% |
| ContextDataFetcherDecorator | 28,550 | 6.8% |
| HashMap.resize | 25,021 | 5.4% |
A codebase-wide search found 119 call sites across 42 files using the loadAndDispatch pattern:
```java
// From ImageDataLoader.java: the pattern used everywhere
public static CompletableFuture<List<ImageResult>> loadAndDispatch(
        DataFetchingEnvironment env, ImageDataLoaderKey key) {
    var loader = get(env);
    var future = loader.load(key);
    loader.dispatch(); // <-- immediately dispatches with batch size of 1
    return future;
}
```

Every DataLoader that supports batching (VideoCore, Images, ECL, Collections, Persons, etc.) is called with a batch size of 1 when invoked through this pattern, meaning the downstream gRPC services receive N individual requests instead of 1 batched request with N keys.
In graphql-java (pre-v25), DataLoader dispatch happens at the end of each field resolution level. When a DataFetcher chains into another DataLoader via .thenCompose(), the second load() call happens after the framework has already dispatched that level. Without manual dispatch, the request hangs forever:
```java
// ShowDataFetcher.currentEpisode: real example from gusto
return EvidenceLoader.loadCurrentEpisode(env, unifiedEntityId)
    // This callback runs AFTER the framework dispatch, and the framework won't dispatch again,
    // so loadAndDispatch() dispatches manually (forcing a batch of 1) to avoid hanging.
    .thenCompose(episodeId -> VideoCoreDataLoader.loadAndDispatch(env, episodeId));
```

The comment in the codebase explains this directly:
"Unfortunately, we currently must manually dispatch a dataloader when it is composed after another Future. Otherwise, the thread will hang indefinitely waiting for something to manually dispatch it."
Each loadAndDispatch call triggers a full DataLoaderHelper.dispatch() cycle including:
- Creating new CompletableFuture chains per dispatch
- Invoking the MappedBatchLoader.load() with a single key
- gRPC call setup, serialization, and response handling per individual request
With 119 call sites and many executing multiple times per GraphQL request (e.g., EvidenceLoader has 26 references), this multiplies to dozens of dispatch cycles per request, most with batch size 1.
DataLoaders that support batching (e.g., ECLFetchEvidenceDataLoader, VideoCoreDataLoader, ImageDataLoader) are designed to batch N keys into a single gRPC call. The loadAndDispatch pattern defeats this — a page with 40 videos generates 40 individual gRPC calls instead of 1 batched call.
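The hang and the batch-of-1 behavior can be reproduced with a toy model. The sketch below is not the java-dataloader or graphql-java API: `ToyLoader`, `Strategy`, and `Outcome` are hypothetical names, and the "framework dispatch" is simulated by hand. It assumes that the framework dispatches each loader once per level and that the chained callback only runs once the first load completes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class ChainedDispatchDemo {
    enum Strategy { PLAIN_LOAD, LOAD_AND_DISPATCH, DISPATCH_AFTER_CALLBACKS }

    /** Toy stand-in for a batching DataLoader (hypothetical, for illustration only). */
    static class ToyLoader {
        private final Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        final List<Integer> batchSizes = new ArrayList<>(); // one entry per simulated gRPC call

        CompletableFuture<String> load(String key) {
            return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
        }

        void dispatch() {
            if (pending.isEmpty()) return;
            Map<String, CompletableFuture<String>> batch = new LinkedHashMap<>(pending);
            pending.clear();
            batchSizes.add(batch.size());
            batch.forEach((key, future) -> future.complete("id-for-" + key));
        }
    }

    static final class Outcome {
        final long completed;
        final List<Integer> videoBatchSizes;
        Outcome(long completed, List<Integer> videoBatchSizes) {
            this.completed = completed;
            this.videoBatchSizes = videoBatchSizes;
        }
    }

    static Outcome run(int shows, Strategy strategy) {
        ToyLoader episodes = new ToyLoader();
        ToyLoader videos = new ToyLoader();
        List<CompletableFuture<String>> results = new ArrayList<>();
        for (int i = 0; i < shows; i++) {
            results.add(episodes.load("show-" + i).thenCompose(episodeId -> {
                CompletableFuture<String> video = videos.load(episodeId);
                if (strategy == Strategy.LOAD_AND_DISPATCH) videos.dispatch(); // batch of 1
                return video;
            }));
        }
        // Simulated framework dispatch for this level. The video loader is dispatched first
        // because it is still empty here: no chained callback has run yet.
        videos.dispatch();
        episodes.dispatch(); // completes the episode futures, which runs the callbacks
        // Models a dispatch that happens after the chained loads have been queued.
        if (strategy == Strategy.DISPATCH_AFTER_CALLBACKS) videos.dispatch();
        long completed = results.stream().filter(CompletableFuture::isDone).count();
        return new Outcome(completed, List.copyOf(videos.batchSizes));
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            Outcome o = run(40, s);
            System.out.println(s + ": completed=" + o.completed
                + ", video batches=" + o.videoBatchSizes.size());
        }
    }
}
```

In this model, `PLAIN_LOAD` leaves every chained future incomplete, `LOAD_AND_DISPATCH` completes them with 40 batches of size 1, and `DISPATCH_AFTER_CALLBACKS` completes them with a single batch of 40.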
A dev canary using graphql-java 25 beta with chained DataLoaders (applied only to live event prefetching) showed:
- Statistically significant p50 latency improvement, even with chaining applied to just one small pocket of the app
- Clear reduction in downstream gRPC RPS to EvidenceControlLayerService due to better batching
- Flat or slightly improved CPU: fewer, larger batch calls are more efficient than many individual calls
graphql-java 25 introduces native support for chained DataLoaders that eliminates the hanging problem entirely.
How it works: The engine detects DataLoader load() calls that arrive after the current level has already been dispatched, such as those made inside a .thenCompose() callback, and schedules a follow-up dispatch, so multiple chained loads issued close together can batch together.
```java
// With graphql-java 25: no manual dispatch needed, batching works.
// Plain load(): the framework handles dispatch and batching.
return EvidenceLoader.loadCurrentEpisode(env, unifiedEntityId)
    .thenCompose(episodeId -> VideoCoreDataLoader.load(env, episodeId));
```

Enabling:

```java
// Chaining is opt-in and is configured on the GraphQLContext; the call does not return a GraphQL instance.
GraphQL.unusualConfiguration(graphqlContext)
    .dataloaderConfig()
    .enableDataLoaderChaining(true);
```

Status: PR #4023 validated this approach on a dev canary with positive results. Currently applied only to live event prefetching.
Reference: graphql-java Chained DataLoaders docs
Dependency: graphql-java 25 is obtained through the DGS framework, which is tied to SBN. The official path is SBN4 (targeting early 2026). Forcing the dependency independently via resolutionStrategy.force is possible, but the platform team will not support it until SBN4 GA.
Slack thread: #dna-gusto-dev discussion
DGS provides a ScheduledDataLoaderRegistry that polls on a configurable interval and dispatches any DataLoaders with pending keys.
```yaml
dgs:
  graphql:
    dataloader:
      ticker-mode-enabled: true
      schedule-duration: 10ms # default
```

Pros:
- Simple 1-line config change
- Works with current graphql-java version (no dependency upgrade)
- Chained loads get auto-dispatched within ~10ms
- Multiple loads queued around the same time batch together
Cons:
- Adds up to 10ms latency per chained load (the polling interval)
- Less optimal batching than graphql-java 25's native chaining (timer-based vs graph-aware)
- Existing loadAndDispatch calls still fire manual dispatches before the ticker
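The timer-based behavior can be sketched with a toy model. This is not the DGS or java-dataloader implementation: `ToyLoader` and `chainedBatchSizes` are hypothetical names, and a plain `ScheduledExecutorService` stands in for the ticker. It assumes the ticker simply dispatches whatever keys are pending on each tick.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class TickerDispatchDemo {
    /** Toy stand-in for a DataLoader; synchronized because the ticker runs on its own thread. */
    static class ToyLoader {
        private final Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        final List<Integer> batchSizes = new CopyOnWriteArrayList<>();

        synchronized CompletableFuture<String> load(String key) {
            return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
        }

        void dispatch() {
            Map<String, CompletableFuture<String>> batch;
            synchronized (this) {
                if (pending.isEmpty()) return;
                batch = new LinkedHashMap<>(pending);
                pending.clear();
            }
            batchSizes.add(batch.size()); // recorded before the futures complete
            batch.forEach((key, future) -> future.complete("value-for-" + key));
        }
    }

    /** Runs `loads` chained loads and returns the batch sizes the ticker dispatched. */
    static List<Integer> chainedBatchSizes(int loads, long tickMillis) {
        ToyLoader first = new ToyLoader();
        ToyLoader second = new ToyLoader();
        ScheduledExecutorService ticker = Executors.newSingleThreadScheduledExecutor();
        try {
            // The ticker only watches the chained loader in this sketch.
            ticker.scheduleAtFixedRate(second::dispatch, tickMillis, tickMillis, TimeUnit.MILLISECONDS);

            List<CompletableFuture<String>> results = new ArrayList<>();
            for (int i = 0; i < loads; i++) {
                // Plain load() in the callback: nothing dispatches it except the ticker.
                results.add(first.load("show-" + i).thenCompose(second::load));
            }
            first.dispatch(); // simulated framework dispatch for the first level

            CompletableFuture.allOf(results.toArray(new CompletableFuture[0]))
                .orTimeout(5, TimeUnit.SECONDS)
                .join();
            return List.copyOf(second.batchSizes);
        } finally {
            ticker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("batch sizes dispatched by ticker: " + chainedBatchSizes(40, 10));
    }
}
```

Chained loads that are queued between two ticks land in the same batch, while the wait for the next tick is the added latency listed under Cons. A tick that fires while loads are still being queued can split them across two batches, which is one reason timer-based batching is less optimal than graph-aware dispatch.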
Restructure individual hot-path resolvers to avoid chaining where possible — e.g., pre-fetch all needed data at the same field level so loads batch naturally.
Pros: No dependency changes, targeted fixes
Cons: Labor-intensive, doesn't fix the systemic problem, may not be possible for all call patterns
| Phase | Action | Dependency | Risk |
|---|---|---|---|
| 1 | Sign up as early tester for SBN4 | Platform team | None |
| 2 | Expand PR #4023 to chain DataLoaders across the entire app (not just live prefetching) | graphql-java 25 beta | Medium (unsupported until SBN4 GA) |
| 3 | Dev canary the full change; validate batching improvements via Atlas gRPC metrics | Canary infrastructure | Low |
| 4 | Remove loadAndDispatch methods from all DataLoader classes | Phase 3 validation | Low (mechanical refactor) |
| 5 | Adopt graphql-java 25 GA when SBN4 ships | SBN4 GA | None |
| DataLoader | loadAndDispatch References | Downstream Service |
|---|---|---|
| EvidenceLoader | 26 | ECL (evidence control layer) |
| VideoCoreDataLoader | 10+ | oasis VideoCore gRPC |
| ImageDataLoader | 4 | Images gRPC |
| ECLHydratePageDataLoader | 3 | ECL hydration gRPC |
| GameDataLoader | 5 | Games gRPC |
| CollectionDataLoader | 4 | Collections gRPC |
| LiveEventsDataLoader | 3 | Live events gRPC |
| PersonDataLoader | 2 | Person gRPC |
| All others | ~60 combined | Various |
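Per-loader counts like these could be re-checked during the refactor with a simple text scan. The sketch below is an assumption about how such a scan might look, not the search that produced the table: `LoadAndDispatchScan` is a hypothetical name, and it only counts qualified calls of the form `SomeLoader.loadAndDispatch(`, so static imports, differently named dispatching helpers, and matches inside comments would skew the numbers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class LoadAndDispatchScan {
    // Matches calls such as VideoCoreDataLoader.loadAndDispatch( and captures the class name.
    private static final Pattern CALL =
        Pattern.compile("\\b([A-Z]\\w*)\\.loadAndDispatch\\s*\\(");

    /** Counts qualified loadAndDispatch calls per class in all .java files under root. */
    static Map<String, Integer> countByLoader(Path root) {
        Map<String, Integer> counts = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(p -> p.toString().endsWith(".java")).forEach(p -> {
                try {
                    Matcher m = CALL.matcher(Files.readString(p));
                    while (m.find()) {
                        counts.merge(m.group(1), 1, Integer::sum);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        countByLoader(root).forEach((loader, n) -> System.out.println(loader + "\t" + n));
    }
}
```

Running it against the source root before and after Phase 4 would give a rough signal of how many call sites remain.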