@so0k
Created March 2, 2026 05:55

Decision Log: Advanced Memory Creation System (Tier 2)

Date: 2026-03-02
Participants: @vincentdesmet + Claude Opus 4.6 analysis
Baseline commit: fe8519f on main
Context: Extracted from #49 analysis — Tier 2 (LLM-required features)
Referenced spec: specledger/598-sdd-workflow-streamline/spec.md (4-layer CLI design)


D-MEM-1: Adoption Strategy — Layer Responsibility Mapping

Question: Which adoption strategy for the Tier 2 memory system best fits specledger's 4-layer architecture?

Options Presented:

| # | Strategy | Description |
|---|----------|-------------|
| A | Strict Layer Separation | L1 stores/retrieves pre-computed data only. L2 AI commands do ALL LLM work. L0 hooks trigger capture. L3 skills inject knowledge context. L1 stays "no AI needed". |
| B | L1 Launcher to L2 | `sl session summarize` / `sl memory synthesize` are launcher commands (like `sl revise`) that invoke AI agent sessions. Data in L1, LLM in L2. |
| C | Edge Function Processing | L0 hook triggers capture → Supabase Edge Function does LLM processing server-side → L1 reads pre-computed results. |
| D | Hybrid (L2 explicit + L0 auto) | First run: L2 AI command bootstraps the knowledge bank. Thereafter: L0 hook auto-updates incrementally. |

Decision: Option A — Strict Layer Separation

Reasoning: Specledger's hosting layer remains LLM-free and fully leverages the customer's BYO LLM / agent shell. The memory-backed features are pure cloud storage for cross-team workflow improvements. This means:

  • L1 (sl CLI) never calls an LLM — stays a pure Go binary
  • L2 (AI commands invoked via agent shell) owns all LLM processing
  • L3 (skills) passively inject pre-computed knowledge into agent context
  • Customer brings their own LLM through their agent shell (Claude Code, etc.)

Layer Assignment:

| Feature | L0 Hook | L1 CLI | L2 AI Command | L3 Skill |
|---------|---------|--------|---------------|----------|
| Session summaries | | `sl session list` displays | `/specledger.memory` (summarize mode) | |
| Auto-tagging | | `sl session list --tag` | `/specledger.memory` (tag mode) | |
| Recurring issues | | `sl memory show` | `/specledger.memory` (patterns mode) | `sl-memory` injects |
| Knowledge bank synthesis | | `sl memory show` | `/specledger.memory` (synthesize mode) | `sl-memory` injects |
| Context injection | SessionStart hook | | | `sl-memory` auto-loads |

D-MEM-2: Knowledge Bank Storage Format

Question: What storage format for the memory/knowledge bank?

Options Presented:

| # | Format | Description |
|---|--------|-------------|
| A | Markdown files | Memory bank as markdown files in `.specledger/memory/cache/`. Gitignored, sits next to the existing constitution.md. L3 skills can inject directly. |
| B | Structured JSON | JSON documents in Supabase Storage. Machine-queryable, supports cross-project aggregation. |
| C | Both — Markdown local + JSON cloud | Markdown committed to the repo for local injection, JSON mirror in Supabase for cross-team queries. |

Decision: Option A — Markdown files (as local rendering format)

Reasoning: Placed at .specledger/memory/cache/ — next to the existing .specledger/memory/constitution.md. The .specledger/ directory is the project config/tooling root; specledger/<spec>/ is reserved for feature-scoped design artifacts (spec.md, plan.md, tasks.md). A cloud-materialized cache is project-level tooling, not a feature artifact. See D-MEM-4 for the full picture — markdown files are a gitignored local cache, not the source of truth.


D-MEM-3: AI Command Shape

Question: Should /specledger.memory be a single command with subactions, or separate commands?

Options Presented:

| # | Shape | Description |
|---|-------|-------------|
| A | Single command with modes | `/specledger.memory` handles summarize, tag, patterns, synthesize as workflow stages within one invocation. |
| B | Separate AI commands | `/specledger.memory-summarize`, `/specledger.memory-patterns`, `/specledger.memory-synthesize` as distinct commands. |
| C | Single + explicit subcommand arg | `/specledger.memory summarize`, `/specledger.memory patterns`, etc. One skill file, explicit subaction argument. |

Decision: Option A — Single command with modes

Reasoning: One entry point for discovery. The command determines which mode(s) to run based on context (e.g., run summarize first if no summaries exist, then patterns, then synthesize). Follows the pattern of /specledger.implement which handles multiple stages internally.


D-MEM-4: Cross-Branch Sync Model

Question: How should the memory bank stay consistent across branches?

Options Presented:

| # | Model | Description |
|---|-------|-------------|
| A | Branch-scoped, merge-forward | Memory files live in git per feature branch. On merge to main, memory merges with code. Git-native. |
| B | Cloud-indexed, local-rendered | Supabase is the source of truth. Local markdown files are a gitignored materialized cache rendered from cloud data. |
| C | Local-first, cloud-promoted | Feature memory is local-only. On promotion, pushed to cloud. Two-tier model. |

Decision: Option B — Cloud-indexed, local-rendered

Reasoning: Memory cache lives at .specledger/memory/cache/ — next to the existing .specledger/memory/constitution.md, following the convention that .specledger/ is the project config/tooling root. This is a gitignored materialized cache — like node_modules/ from a registry. Cloud (Supabase memory_entries table) is the single source of truth. This enables:

  • Cross-branch visibility (sl memory show --branch feat-x)
  • Cross-team access (shared cloud store)
  • Offline L3 injection from local cache
  • No git merge conflicts on memory files
  • Cache refreshed by L2 command or sl memory pull (L1)
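As an illustration of the cross-branch query, an `sl memory show --branch feat-x` call could build a PostgREST-style filter URL against the `memory_entries` table (Supabase exposes tables via PostgREST's `column=eq.value` filter syntax; the function name and parameters here are hypothetical):

```go
package main

import (
	"fmt"
	"net/url"
)

// buildMemoryQuery assembles the REST filter URL a CLI query could use.
// The branch filter is what makes the query cross-branch capable: any
// branch's entries can be read, not just the checked-out one.
func buildMemoryQuery(baseURL, projectID, branch string) string {
	v := url.Values{}
	v.Set("project_id", "eq."+projectID)
	if branch != "" {
		v.Set("branch", "eq."+branch)
	}
	v.Set("order", "score.desc") // highest-scored learnings first
	return baseURL + "/rest/v1/memory_entries?" + v.Encode()
}

func main() {
	fmt.Println(buildMemoryQuery("https://example.supabase.co", "p1", "feat-x"))
}
```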

Architecture:

.specledger/                         ← project config/tooling root
├── memory/
│   ├── constitution.md              ← existing, git-tracked
│   └── cache/                       ← NEW, .gitignore'd
│       ├── knowledge.md             ← rendered from cloud (promoted entries)
│       └── patterns.md              ← rendered from cloud (observed entries)
├── templates/                       ← existing
└── scripts/                         ← existing

Supabase (source of truth)
├── memory_entries table
│   ├── id, project_id, branch, scope (feature|project)
│   ├── content, score, status (observed|promoted)
│   ├── recurrence_count, impact_score, specificity_score
│   └── created_at, updated_at

Data flow:
  L2 /specledger.memory ──► writes to Supabase ──► renders .specledger/memory/cache/
  L3 sl-memory skill ──► reads .specledger/memory/cache/ ──► injects into agent context
  L1 sl memory show ──► queries Supabase directly (cross-branch capable)
  L1 sl memory pull ──► refreshes .specledger/memory/cache/ from cloud
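The rendering step of the data flow above — cloud entries materialized into the two cache files — could look like this sketch (the struct and function are illustrative; the real `memory_entries` schema has more columns):

```go
package main

import (
	"fmt"
	"strings"
)

// memoryEntry mirrors the subset of memory_entries columns the renderer
// needs (types are illustrative, not the actual Supabase schema).
type memoryEntry struct {
	Content string
	Score   float64
	Status  string // "observed" | "promoted"
}

// renderCache sketches the `sl memory pull` rendering step: promoted
// entries land in knowledge.md, observed entries in patterns.md.
// It returns path -> file contents instead of writing to disk.
func renderCache(entries []memoryEntry) map[string]string {
	var knowledge, patterns strings.Builder
	knowledge.WriteString("# Knowledge (promoted)\n\n")
	patterns.WriteString("# Patterns (observed)\n\n")
	for _, e := range entries {
		line := fmt.Sprintf("- [%.1f] %s\n", e.Score, e.Content)
		if e.Status == "promoted" {
			knowledge.WriteString(line)
		} else {
			patterns.WriteString(line)
		}
	}
	return map[string]string{
		".specledger/memory/cache/knowledge.md": knowledge.String(),
		".specledger/memory/cache/patterns.md":  patterns.String(),
	}
}

func main() {
	files := renderCache([]memoryEntry{
		{Content: "set pool_mode=transaction in pgbouncer.ini", Score: 8.3, Status: "promoted"},
		{Content: "API returns paginated results", Score: 5.0, Status: "observed"},
	})
	fmt.Print(files[".specledger/memory/cache/knowledge.md"])
}
```

Because the files are a pure function of the cloud rows, `sl memory pull` can regenerate them at any time — the same property that makes gitignoring them safe.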

D-MEM-5: Agent Scoring Criteria for Learning Entries

Question: What scoring dimensions should the agent use to evaluate learning entries?

Options Presented:

| # | Model | Dimensions |
|---|-------|------------|
| A | 3-axis | Recurrence + Impact + Specificity |
| B | 4-axis + freshness | Recurrence + Impact + Specificity + time-weighted freshness decay |
| C | 4-axis + transferability | Recurrence + Impact + Specificity + cross-project transferability |
| D | Full 5-axis | All five dimensions |

Decision: Option A — 3-axis scoring (Recurrence + Impact + Specificity)

Scoring Rubric (each axis 1-10, composite = average):

Recurrence (R)

How often does this pattern/learning appear across sessions?

| Score | Criteria |
|-------|----------|
| 1-3 | Appeared once, may be situational |
| 4-6 | Appeared in 2-3 sessions, emerging pattern |
| 7-10 | Appeared in 4+ sessions, confirmed pattern |

Impact (I)

How much time, effort, or debugging does this learning save when applied?

| Score | Criteria |
|-------|----------|
| 1-3 | Minor convenience, saves seconds |
| 4-6 | Moderate — saves minutes, avoids a known pitfall |
| 7-10 | Critical — prevents hours of debugging, blocks progress without it |

Specificity (S)

How actionable and concrete is the learning? (filters out vague platitudes)

| Score | Criteria |
|-------|----------|
| 1-3 | Vague ("be careful with async code") |
| 4-6 | Directional ("this API returns paginated results, check for next_page") |
| 7-10 | Precise and actionable ("set pool_mode=transaction in pgbouncer.ini when using prepared statements with Supabase") |

Composite Score

composite = (R + I + S) / 3

Promotion Threshold

  • Score >= 7.0 → auto-promoted to knowledge.md (project-level)
  • Score 4.0-6.9 → stays in patterns.md (observed, candidate for promotion)
  • Score < 4.0 → discarded (too vague, too situational, or low impact)

Reasoning: Three axes are sufficient to filter signal from noise without over-engineering the prompt. Freshness and transferability can be added later if needed. The threshold-based auto-promote model keeps the workflow automated within L2 — no human approval gate.
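The rubric and thresholds above reduce to a few lines of arithmetic — a sketch of the classification logic (function names are illustrative, not the actual L2 implementation):

```go
package main

import "fmt"

// compositeScore averages the three axes (each scored 1-10 per the rubric).
func compositeScore(r, i, s float64) float64 {
	return (r + i + s) / 3
}

// classify applies the promotion thresholds from the decision:
// >= 7.0 auto-promoted, 4.0-6.9 retained as observed, < 4.0 discarded.
func classify(composite float64) string {
	switch {
	case composite >= 7.0:
		return "promoted" // goes to knowledge.md (project-level)
	case composite >= 4.0:
		return "observed" // stays in patterns.md, candidate for promotion
	default:
		return "discarded" // too vague, too situational, or low impact
	}
}

func main() {
	// R=8, I=7, S=9 -> composite 8.0, above the promotion threshold.
	fmt.Println(classify(compositeScore(8, 7, 9))) // prints promoted
}
```

Since the composite is a plain average, a single low axis (e.g. Specificity 2) drags an otherwise strong entry below the promotion threshold — which is exactly the "filters out vague platitudes" behavior the Specificity axis is meant to enforce.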


D-MEM-6: Promotion Model

Question: What promotion model for moving learnings from session-level to project knowledge bank?

Options Presented:

| # | Model | Description |
|---|-------|-------------|
| A | Threshold-based auto-promote | Learning with composite score >= 7.0 is auto-promoted to knowledge.md. Fully automated within L2. |
| B | N-strike promotion | Learning must appear in N separate sessions (e.g., 3) before promotion. Proves recurrence through repetition. |
| C | Tiered: observe → candidate → promoted | 3-tier lifecycle with explicit promotion and justification. Most auditable. |
| D | Human-in-the-loop | Agent proposes, human approves via `sl memory promote --approve`. Maximum control. |

Decision: Option A — Threshold-based auto-promote

Reasoning: Already present in the original #49 design. Keeps the workflow fully automated within the L2 AI command — no human bottleneck. The 3-axis scoring provides sufficient quality filtering. Entries below threshold are retained in patterns.md for future re-evaluation (score may increase as recurrence grows across sessions).


Summary

| ID | Decision | Choice |
|----|----------|--------|
| D-MEM-1 | Adoption strategy | Strict Layer Separation — L1 LLM-free, L2 owns all LLM processing |
| D-MEM-2 | Storage format | Markdown files (as local rendering format) |
| D-MEM-3 | Command shape | Single `/specledger.memory` command with internal modes |
| D-MEM-4 | Sync model | Cloud-indexed (Supabase), local-rendered (gitignored .md cache) |
| D-MEM-5 | Scoring criteria | 3-axis: Recurrence + Impact + Specificity (composite avg, threshold >= 7.0) |
| D-MEM-6 | Promotion model | Threshold-based auto-promote (score >= 7.0 → knowledge.md) |