@KyleAMathews
Last active February 24, 2026 22:22
skill-domain-discovery review: TanStack DB domain map and skill spec

…messages themselves are descriptive enough to reconstruct the failure mechanism.

Changelog as failure mode source

The skill's instruction to extract "old pattern / new pattern / what changed" from migration guides applied well to changelogs too. Each changelog entry in TanStack DB describes a bug fix with enough detail to derive the wrong code pattern. Example: "Fix gcTime: Infinity causing immediate garbage collection instead of disabling GC" directly becomes a failure mode.


What Could Be Improved

1. Phase 1 reading volume is enormous for large libraries

TanStack DB has ~445 markdown docs and ~491 TypeScript source files. The skill says "read every narrative guide" and "scan API reference" — but for a library this size, that's a multi-hour autonomous phase even with parallelized reads.

Suggestion: Add a triage step between reading the README/quickstart and reading everything else. After the initial read, the agent should identify which packages/docs are core vs. peripheral and prioritize accordingly. For TanStack DB, the core is @tanstack/db + @tanstack/react-db + @tanstack/query-db-collection. The other 4 adapters are variations on the same pattern — reading one deeply and skimming the others would have been sufficient.

Suggested addition to Phase 1:

After reading README and quickstart, identify the core package(s) vs. adapter/integration packages. Read core packages exhaustively. For adapter packages, read one representative adapter deeply, then scan others for deviations from the pattern.

2. "One question per message" is too strict for confirming factual items

The skill mandates "ask exactly one question per message" during the interview. This works well for open-ended exploration questions, but it's unnecessarily slow for confirming factual items. When I had 3 gaps that were simple yes/no confirmations (e.g., "is the ready-state issue fixed now?"), sending them one at a time felt like wasted maintainer time.

Suggestion: Allow batching of 2-3 confirmation questions (yes/no, still relevant?, which is current?) while keeping open-ended exploration questions to one per message. The distinction: confirmations narrow down; explorations expand.

3. No guidance on AI-agent-specific failure modes

The skill focuses on developer failure modes (what a human gets wrong), but several of the highest-value findings were AI-agent-specific failure modes — mistakes that agents make but humans rarely would:

  • Hallucinating API signatures
  • Defaulting to JS filtering instead of query operators
  • Not knowing which adapter to use
  • Using object-spread instead of draft proxy

These are distinct from "developer confusion" patterns. The skill should explicitly prompt for AI-agent-specific failure modes during Phase 3.

Suggested addition to Phase 3c:

"If an AI coding agent were generating code for your library, what mistakes would it make that a human developer wouldn't? Think about: API hallucination, defaulting to language primitives instead of library features, missing the correct abstraction layer."

4. Composition discovery needs more structure

Phase 3d asks about composition with other libraries, but the questions are generic. For TanStack DB, the most important composition (Router integration) only came up because I asked a broad question and the maintainer volunteered it. The skill should push harder on composition discovery.

Suggestion: Add to Phase 2 — scan package.json peer dependencies and import statements across examples to identify which other libraries appear most frequently. Then ask targeted questions about each in Phase 3d.
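That scan can be sketched as a small helper (the tallying heuristic and the sample package contents are illustrative, not part of the skill):

```typescript
// Tally peerDependencies across parsed package.json objects to surface
// the libraries most worth asking targeted composition questions about.
type PackageJson = { name?: string; peerDependencies?: Record<string, string> };

function tallyPeerDeps(pkgs: PackageJson[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const pkg of pkgs) {
    for (const dep of Object.keys(pkg.peerDependencies ?? {})) {
      counts[dep] = (counts[dep] ?? 0) + 1;
    }
  }
  return counts;
}

// Hypothetical sample: two packages that both peer-depend on react.
const sample: PackageJson[] = [
  { name: "@tanstack/react-db", peerDependencies: { react: ">=18" } },
  {
    name: "@tanstack/query-db-collection",
    peerDependencies: { react: ">=18", "@tanstack/query-core": ">=5" },
  },
];

// Rank descending: the most frequent co-occurring libraries come first.
const ranked = Object.entries(tallyPeerDeps(sample)).sort((a, b) => b[1] - a[1]);
console.log(ranked);
```

In a real repo the `sample` array would be built by globbing `packages/*/package.json`; the ranking then seeds the Phase 3d question list.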

5. The "validated" field is binary — needs a confidence scale

Every failure mode gets validated: true/false. But there's a meaningful difference between:

  • "Maintainer explicitly confirmed this is a real problem" (e.g., Immer-style update confusion)
  • "Maintainer said docs are comprehensive and didn't contradict this" (e.g., most source-extracted error patterns)
  • "I extracted this from source but never discussed it" (didn't come up)

Suggestion: Replace boolean validated with a confidence field: confidence: confirmed | inferred | unverified. "Confirmed" means the maintainer explicitly discussed it. "Inferred" means it was presented to the maintainer and not contradicted. "Unverified" means it was never discussed.
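Concretely, a domain_map.yaml entry would then carry the scale instead of the boolean (the entry itself is illustrative):

```yaml
# Illustrative domain_map.yaml entry using the proposed confidence scale
- mistake: "Passing object to update() instead of mutating the draft"
  priority: CRITICAL
  source: maintainer
  confidence: confirmed   # confirmed = explicitly discussed;
                          # inferred = presented and not contradicted;
                          # unverified = never discussed
```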

6. No guidance on handling "docs are comprehensive" responses

When I asked about failure modes the maintainer might know about beyond docs, the response was "the docs should be pretty comprehensive here." The skill doesn't have guidance for this — should you take it at face value, or probe further? In this case, probing with specific AI-agent-focused questions (Q9-Q11) produced the most valuable findings. The skill should note that "docs are comprehensive" is often true for human developers but not for AI agents.

7. Missing: version-specific failure mode decay

The skill extracts failure modes from changelogs (old bugs that were fixed), but doesn't clearly distinguish between "this was fixed and agents should NOT warn about it" vs. "this was fixed but agents trained on old code might still generate the old pattern." For TanStack DB, several changelog items (gcTime: Infinity, ready-state race conditions) are fixed — but the skill doesn't provide guidance on whether to include or exclude them.

Suggestion: Add a status field to failure modes: active | fixed-but-legacy-risk | fixed. "Active" means it's still a problem. "Fixed-but-legacy-risk" means it was fixed but agents trained on older code might still hit it. "Fixed" means it can be dropped.
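An illustrative changelog-derived entry with the proposed field:

```yaml
# Illustrative entry for a fixed bug that agents trained on older code may still emit
- mistake: "gcTime: Infinity causing immediate garbage collection instead of disabling GC"
  priority: HIGH
  source: changelog
  status: fixed-but-legacy-risk   # fixed upstream; old training data can still regenerate it
```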


Metrics

| Metric | Value |
| --- | --- |
| Domains produced | 5 |
| Failure modes (total) | 33 |
| Failure modes (CRITICAL) | 11 |
| Failure modes from docs/source | 26 |
| Failure modes from interview | 7 (4 CRITICAL) |
| Gaps identified | 10 |
| Gaps resolved in interview | 3 |
| Gaps remaining | 6 (+ 1 new from interview: Router integration) |
| Interview questions asked | 12 |
| Maintainer corrections to draft | 0 (domain grouping confirmed as-is) |
| Composition opportunities | 9 |

Verdict

The skill produces a genuinely useful artifact. The domain_map.yaml is structured enough to feed directly into skill generation, and the failure mode inventory — especially the maintainer-sourced items — captures knowledge that doesn't exist in any other form. The 4-phase structure (read → draft → interview → finalize) is well-designed: the autonomous phases build enough context that the interview is efficient and targeted rather than exploratory.

The biggest improvement opportunity is adding explicit AI-agent-specific failure mode discovery. For library skill generation, the #1 consumer of these artifacts is AI agents, and the mistakes agents make are systematically different from human developer mistakes. The skill should acknowledge this throughout.

Rating: 8/10 — Produces high-quality output with clear structure. The interview phase is the star. Main gaps: reading triage for large codebases, AI-agent-specific failure mode prompts, and confidence gradation for validated items.

TanStack DB — Skill Specification (Reviewed)

Library Overview

TanStack DB is a reactive client-side data store that provides normalized collections, sub-millisecond live queries via differential dataflow (d2ts), and instant optimistic mutations with automatic rollback. It supports multiple data sources (REST APIs via TanStack Query, sync engines like ElectricSQL/PowerSync/RxDB/TrailBase, and local storage) through a unified collection API with framework adapters for React, Vue, Svelte, Solid, and Angular.
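To ground the domain table that follows, here is roughly what the core surface looks like (a sketch based on the published docs; option shapes and type parameters may differ across versions):

```typescript
import { createCollection, useLiveQuery, eq } from "@tanstack/react-db";
import { queryCollectionOptions } from "@tanstack/query-db-collection";
import { QueryClient } from "@tanstack/query-core";

type Todo = { id: string; text: string; completed: boolean };

// Collection Setup & Schema: a collection backed by TanStack Query
const todoCollection = createCollection(
  queryCollectionOptions({
    queryKey: ["todos"],
    queryFn: async (): Promise<Todo[]> => (await fetch("/api/todos")).json(),
    queryClient: new QueryClient(),
    getKey: (todo: Todo) => todo.id,
  })
);

// Live Query Construction + Framework Integration: a reactive React hook
function useActiveTodos() {
  return useLiveQuery((q) =>
    q.from({ todo: todoCollection }).where(({ todo }) => eq(todo.completed, false))
  );
}

// Mutations & Optimistic State: optimistic insert with automatic rollback
todoCollection.insert({ id: "1", text: "review skill spec", completed: false });
```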

Domain Table

| Domain | Skill name | What it covers | Failure modes | Tier |
| --- | --- | --- | --- | --- |
| Collection Setup & Schema | tanstack-db/collection-setup | createCollection, all 7 adapter option creators, CollectionConfig, StandardSchema integration, type transformations, collection lifecycle, localOnly→backend upgrade path | 7 | 1 |
| Live Query Construction | tanstack-db/live-queries | Query builder API (.from/.where/.join/.select/.groupBy/.having/.orderBy/.limit/.offset/.distinct/.findOne), all operators (comparison, logical, aggregate, string, math), derived collections, createLiveQueryCollection, $selected namespace, predicate push-down, incremental view maintenance | 8 | 1 |
| Framework Integration | tanstack-db/framework-integration | React (useLiveQuery, useLiveSuspenseQuery, useLiveInfiniteQuery, usePacedMutations), Vue (useLiveQuery composable), Svelte (useLiveQuery with runes), Solid (useLiveQuery with signals), Angular (injectLiveQuery), dependency arrays, Suspense | 4 | 1 |
| Mutations & Optimistic State | tanstack-db/mutations-optimistic | collection.insert/update/delete, Immer-style draft proxy, createOptimisticAction, createPacedMutations, createTransaction, transaction stacking, mutation merging, change tracking proxy, rollback, TanStack Pacer integration | 8 | 1 |
| Sync & Connectivity | tanstack-db/sync-connectivity | Sync modes (eager/on-demand/progressive), SyncConfig, Electric txid tracking, Query direct writes, PowerSync persistence, RxDB observables, TrailBase events, @tanstack/offline-transactions, leader election, subscribeChanges, loadSubsetOptions, collection options creator pattern | 6 | 1 |

Failure Mode Inventory

Collection Setup & Schema (7 modes)

| # | Mistake | Priority | Source | Validated |
| --- | --- | --- | --- | --- |
| 1 | queryFn returning empty array deletes all collection data | CRITICAL | docs | Yes |
| 2 | Not knowing which collection type to use for a given backend | CRITICAL | maintainer | Yes |
| 3 | Using async schema validation | HIGH | source | Yes |
| 4 | getKey returning undefined for some items | HIGH | source | Yes |
| 5 | TInput not superset of TOutput with schema transforms | HIGH | docs | Yes |
| 6 | Providing both explicit type parameter and schema | MEDIUM | docs | Yes |
| 7 | React Native missing crypto.randomUUID polyfill | HIGH | docs | Yes |

Live Query Construction (8 modes)

| # | Mistake | Priority | Source | Validated |
| --- | --- | --- | --- | --- |
| 1 | Using === instead of eq() in where clauses | CRITICAL | source | Yes |
| 2 | Filtering/transforming data in JS instead of using query operators | CRITICAL | maintainer | Yes |
| 3 | Not using the full set of available query operators | HIGH | maintainer | Yes |
| 4 | .distinct() without .select() | HIGH | source | Yes |
| 5 | .having() without .groupBy() | HIGH | source | Yes |
| 6 | .limit()/.offset() without .orderBy() | HIGH | source | Yes |
| 7 | Join condition using operator other than eq() | HIGH | source | Yes |
| 8 | Passing source directly instead of {alias: collection} | MEDIUM | source | Yes |

Framework Integration (4 modes)

| # | Mistake | Priority | Source | Validated |
| --- | --- | --- | --- | --- |
| 1 | Missing external values in useLiveQuery deps array | CRITICAL | docs | Yes |
| 2 | Reading Solid signals outside the query function | HIGH | docs | Yes |
| 3 | useLiveSuspenseQuery without Error Boundary | HIGH | docs | Yes |
| 4 | Svelte props not wrapped in getter functions for deps | MEDIUM | docs | Yes |

Mutations & Optimistic State (8 modes)

| # | Mistake | Priority | Source | Validated |
| --- | --- | --- | --- | --- |
| 1 | Passing object to update() instead of mutating the draft | CRITICAL | maintainer | Yes |
| 2 | Hallucinating mutation API signatures | CRITICAL | maintainer | Yes |
| 3 | onMutate callback returning a Promise | CRITICAL | source | Yes |
| 4 | insert/update/delete without handler or ambient transaction | CRITICAL | source | Yes |
| 5 | .mutate() after transaction no longer pending | HIGH | source | Yes |
| 6 | Attempting to change primary key via update | HIGH | source | Yes |
| 7 | Inserting item with duplicate key | HIGH | source | Yes |
| 8 | Not awaiting refetch after mutation in query collection handler | HIGH | docs | Yes |

Sync & Connectivity (6 modes)

| # | Mistake | Priority | Source | Validated |
| --- | --- | --- | --- | --- |
| 1 | Electric txid queried outside mutation transaction | CRITICAL | docs | Yes |
| 2 | Not calling markReady() in custom sync implementation | CRITICAL | docs | Yes |
| 3 | queryFn returning partial data without merging existing | CRITICAL | docs | Yes |
| 4 | Race condition: subscribing after initial fetch loses changes | HIGH | docs | Yes |
| 5 | write() called without begin() in sync implementation | HIGH | source | Yes |
| 6 | Direct writes overridden by next query sync | MEDIUM | docs | Yes |

Total: 33 failure modes (11 CRITICAL, 17 HIGH, 5 MEDIUM) — all validated

Key Maintainer Insights (not in docs)

  1. Always prefer query operators over JS — Live queries are incrementally maintained via D2 differential dataflow. A .where(eq(...)) only recomputes the delta on data changes, while .filter() in JS re-runs from scratch. This applies even for trivial transformations. Every operator in the library is faster than the JS equivalent.

  2. The update API is Immer-style — collection.update(id, (draft) => { draft.title = "new" }), not collection.update(id, { ...item, title: "new" }). This is the single most common mutation API mistake AI agents make.

  3. Agents hallucinate mutation APIs — The mutation surface has nuance (handler signatures like { transaction } with transaction.mutations[0].changes, ambient transaction context, createOptimisticAction vs createTransaction). Agents generate plausible-looking but wrong code.

  4. Collection type selection matters — Don't default to bare createCollection or localOnlyCollectionOptions. Each backend has a dedicated adapter (queryCollectionOptions, electricCollectionOptions, etc.) that handles sync, handlers, and utilities correctly.

  5. localOnly is a valid prototyping strategy — localOnlyCollectionOptions → real backend adapter is a clean upgrade path. The collection API is uniform enough to swap without changing query or component code.

  6. Offline is hard — but @tanstack/offline-transactions is the answer — Don't steer users toward offline unless they need it. When they do, @tanstack/offline-transactions is the recommended package (integrated into the transaction model). PowerSync/RxDB handle their own local persistence, which is a different concern from offline transaction queuing.

  7. TanStack Router composition is a pain point — The loading/prefetching pattern with collections is the most common integration struggle. This needs a dedicated composition skill.

  8. Transactions stack — Concurrent transactions build optimistic state on top of each other. Use TanStack Pacer for sequential execution when ordering matters.
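Insights 1 and 2 can be sketched in code (API shape as described above; a sketch, not verified against a specific release):

```typescript
import { useLiveQuery, eq } from "@tanstack/react-db";
import { todoCollection } from "./collections"; // hypothetical module exporting the collection

// Insight 1: keep predicates in the query so D2 recomputes only the delta.
function useActiveTodos() {
  return useLiveQuery((q) =>
    q.from({ todo: todoCollection }).where(({ todo }) => eq(todo.completed, false))
  );
  // Anti-pattern: query everything, then data.filter((t) => !t.completed) in JS,
  // which re-runs the filter from scratch on every change.
}

// Insight 2: update() takes a draft mutator, Immer-style.
todoCollection.update("todo-1", (draft) => {
  draft.completed = true;
});
// Anti-pattern: todoCollection.update("todo-1", { ...todo, completed: true })
```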

Remaining Gaps (6 open)

  1. SSR behavior — How do collections behave in server-side rendering?
  2. Collection GC & disposal — When are collections garbage collected in route-based SPAs?
  3. Query performance limits — At what complexity/data size do live queries degrade?
  4. Shared query computation — Do multiple live queries on the same collection share D2 operators?
  5. Temporary ID mapping — Recommended pattern for mapping client IDs → server IDs during sync?
  6. TanStack Router integration — Specific patterns for collection loading/prefetching with Router

Recommended Skill File Structure

All 5 domains are Tier 1 skills — each covers substantial developer work with multiple distinct tasks and failure modes.

tanstack-db/
├── collection-setup/
│   ├── skill.md                # Core collection creation and configuration
│   └── references/
│       ├── adapter-configs.md  # All 7 adapter option shapes with examples
│       ├── schema-patterns.md  # StandardSchema integration patterns
│       └── lifecycle.md        # Collection status transitions
├── live-queries/
│   ├── skill.md                # Query builder API and expressions
│   └── references/
│       ├── operators.md        # All comparison, logical, aggregate, string, math operators
│       ├── join-patterns.md    # Join types and conditions
│       └── performance.md      # D2 internals, why queries > JS loops
├── framework-integration/
│   ├── skill.md                # Cross-framework hook patterns
│   └── references/
│       ├── react.md            # React hooks, Suspense, infinite queries
│       ├── vue.md              # Vue composable patterns
│       ├── svelte.md           # Svelte 5 runes patterns
│       ├── solid.md            # Solid signal patterns
│       └── angular.md          # Angular inject/signal patterns
├── mutations-optimistic/
│   ├── skill.md                # Mutation patterns, draft proxy, transaction lifecycle
│   └── references/
│       ├── transaction-api.md  # Transaction class, states, merging rules
│       ├── paced-mutations.md  # Debounce/throttle strategies, Pacer integration
│       └── draft-proxy.md      # Immer-style update API (critical for agents)
└── sync-connectivity/
    ├── skill.md                # Sync modes and adapter patterns
    └── references/
        ├── electric-sync.md    # ElectricSQL txid and shape patterns
        ├── query-sync.md       # TanStack Query direct writes and refetch
        ├── offline.md          # @tanstack/offline-transactions
        └── custom-adapter.md   # Collection options creator guide

Composition Opportunities

| Other Library | Interaction | Composition Skill Needed | Priority |
| --- | --- | --- | --- |
| TanStack Query | Primary data fetching layer; queryCollectionOptions wraps QueryObserver | tanstack-db/query-integration | HIGH (built-in) |
| TanStack Router | Route-based collection loading and prefetching | tanstack-db/router-integration | HIGH (major pain point) |
| ElectricSQL | Real-time sync via ShapeStream; txid-based mutation tracking | tanstack-db/electric-sync | HIGH (built-in) |
| PowerSync | SQLite offline persistence; diff-trigger change tracking | tanstack-db/powersync-sync | MEDIUM (built-in) |
| RxDB | Observable-driven sync; RxJS subscription management | tanstack-db/rxdb-sync | MEDIUM (built-in) |
| TrailBase | Event stream sync; cursor-based pagination | tanstack-db/trailbase-sync | MEDIUM (built-in) |
| Zod / Valibot / ArkType / Effect | Schema validation via StandardSchema spec | tanstack-db/schema-validation | MEDIUM |
| TanStack Table | Virtual table rendering of live query results | Needs investigation | LOW |
| TanStack Pacer | Sequential transaction execution, debounced mutations | tanstack-db/pacer-integration | MEDIUM |