Date: 2026-03-14
Status: Proposal
The single highest-leverage addition to this project is an executable architecture guardrail system: a repo-native layer that turns the codebase's architectural rules, incident learnings, and review heuristics into machine-enforced checks that run on every change.
Not another product feature. Not another agent. Not another prompt.
The compounding move is to make the project able to defend its own architecture.
This repo already has unusually strong ingredients:
- a clear target architecture in guides/ARCHITECTURE.md
- substantial AI infrastructure in apps/core/lib/core/infrastructure/ai/agent.ex and apps/core/lib/core/infrastructure/ai/baml.ex
- automated PR review prompts in .agents/prompts/architecture-review.md and .agents/prompts/test-review.md
- a deep internal research/planning habit, especially around failures, migrations, and AI behavior
But the current system is still too dependent on:
- humans remembering rules
- LLM reviewers inferring patterns from prose
- incidents being fixed one by one after they escape
That is good operational maturity. It is not yet self-reinforcing enough.
The repo has already shown the exact failure mode:
- a real memory incident from Dataloader.KV closure context capture required a research document and then a prompt update, rather than a permanent code-level defense
- AI infrastructure is growing across chat, meeting notes, attachments, tracing, and evals, which increases the number of cross-cutting invariants that are easy to violate accidentally
- the architecture review workflow is comprehensive, but it is still fundamentally a text-guided reviewer, not an executable policy engine
The result is a familiar pattern:
- We discover a subtle architectural or performance bug.
- We write a strong document about it.
- We update prompts and expectations.
- We remain vulnerable to the same class of mistake in a different form.
That loop is too lossy for a codebase of this size.
Build a new internal capability: Executable Architecture Guardrails.
This should be a first-class system, not a scattered set of scripts. Its job is to codify and enforce rules such as:
- no cross-context calls from Core.Contexts.*
- no new business logic in legacy apps
- no external usage of internal nested context modules
- dataloader closure safety rules
- required public API boundaries for contexts
- AI/agent-layer constraints and approved orchestration boundaries
- multi-tenant safety checks on selected query and resolver patterns
- “incident-derived rules” that graduate from research docs into permanent enforcement
This is accretive in a way most feature work is not:
- every future PR gets safer
- every future engineer gets faster
- every future agent gets more reliable
- every discovered incident can become durable institutional memory
It upgrades the repo from “well-documented architecture” to “architecture with active immune system.”
That is a much stronger compounding asset than one more isolated product capability.
The right mental model is:
Prompt review for judgment. Executable guardrails for invariants.
The system should have three layers.
The first layer is a rule registry: a structured source of truth for rules, with fields like:
- rule id
- category
- severity
- rationale
- source documents
- detection strategy
- autofix availability
This converts the current prose-only rule set into something that can be inspected, tested, and extended deliberately.
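As a sketch only, the registry could be a small data structure checked into the repo. This Python example is illustrative, not a committed schema: the field names, the "ARCH-007" rule id, and the research-doc path are all invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GuardrailRule:
    """One entry in the architecture rule registry (illustrative schema)."""
    rule_id: str             # stable identifier, e.g. "ARCH-007"
    category: str            # e.g. "boundaries", "performance", "multi-tenant"
    severity: str            # "advisory" or "blocking"
    rationale: str           # one-line "why", surfaced to reviewers and agents
    source_docs: tuple = ()  # incident/research docs this rule graduated from
    detection: str = "ast"   # detection strategy: "ast" or "grep"
    autofix: bool = False    # whether a suggested fix can be emitted

# Hypothetical entry derived from the Dataloader.KV memory incident
# (the doc path is invented for illustration):
KV_CLOSURE_RULE = GuardrailRule(
    rule_id="ARCH-007",
    category="performance",
    severity="advisory",
    rationale="Dataloader.KV closures must not capture the full Absinthe context",
    source_docs=("research/dataloader-kv-memory-incident.md",),
    detection="grep",
)
```

Because each rule carries its rationale and source documents, the registry doubles as documentation that can be rendered into review prompts.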
The second layer is an enforcement engine: a repo-local analyzer that runs against diffs and full code, using AST analysis where possible and grep-pattern checks only where acceptable.
Examples:
- flag Dataloader.KV.new(fn ... ctx ... end) when the full Absinthe context is captured
- flag references to Core.Contexts.*.<NestedModule> outside the parent context
- flag new production code under apps/timeline, apps/betafolio, or apps/integrations
- flag direct cross-context calls from context modules
- flag direct schema access from forbidden GraphQL layers
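To make the grep-pattern tier concrete, here is a minimal Python sketch of diff-scanning checks. The regexes, rule ids, and path filters are assumptions for illustration; the real engine would live in the repo's own tooling and lean on AST analysis where grep is too coarse.

```python
import re

# Illustrative grep-style checks: (rule id, pattern, path prefixes it applies to).
CHECKS = [
    # Legacy app freeze: any non-blank added line in a frozen app is flagged.
    ("ARCH-001", re.compile(r"\S"),
     ("apps/timeline/", "apps/betafolio/", "apps/integrations/")),
    # Dataloader.KV closure that mentions the Absinthe context variable.
    ("ARCH-007", re.compile(r"Dataloader\.KV\.new\(fn[^)]*\bctx\b"),
     ("apps/",)),
]

def scan_added_lines(path: str, added_lines: list[str]) -> list[tuple[str, int]]:
    """Return (rule_id, line_index) findings for one changed file's added lines."""
    findings = []
    for rule_id, pattern, prefixes in CHECKS:
        if not any(path.startswith(p) for p in prefixes):
            continue
        for i, line in enumerate(added_lines):
            if pattern.search(line):
                findings.append((rule_id, i))
    return findings
```

Running this over the added lines of each file in a PR diff yields findings keyed by rule id, which can then be rendered as advisory comments or CI failures.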
The third layer is an incident-to-rule pipeline. Any meaningful production bug, review miss, or architecture exception should have a standard path:
incident/research doc -> new guardrail rule -> regression fixture/test -> CI enforcement
That is the real payoff. The project gets better at not repeating its own mistakes.
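The regression-fixture step of that path can be as simple as checked-in known-bad snippets that the analyzer must keep flagging forever. A sketch, with the snippet, rule id, and stand-in pattern all invented for illustration:

```python
import re

# Hypothetical regression fixture: the known-bad snippet from a past incident,
# checked in permanently so the rule that caught it can never silently regress.
FIXTURES = {
    "ARCH-007": "Dataloader.KV.new(fn _batch, ids -> lookup(ctx, ids) end)",
}

# Minimal stand-in for the analyzer: rule id -> pattern (assumed, not real).
PATTERNS = {
    "ARCH-007": re.compile(r"Dataloader\.KV\.new\(fn[^)]*\bctx\b"),
}

def rules_still_catch_their_incidents() -> bool:
    """True only if every incident-derived rule still flags its original snippet."""
    return all(
        PATTERNS[rule_id].search(snippet) is not None
        for rule_id, snippet in FIXTURES.items()
    )
```

A test asserting this stays green turns each postmortem into a permanent, executable artifact rather than prose that can drift out of date.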
There are other strong candidates:
- deeper AI meeting intelligence
- broader agent tool surfaces
- richer retrieval across internal docs
- more observability around LLM calls
All of those can be valuable. None of them compounds across the whole repo as aggressively as executable guardrails.
This repo is already large, multi-app, partially transitional, and architecturally opinionated. In that environment, preventing drift is more valuable than adding one more isolated capability.
The first version should stay narrow and high-signal.
Start with rules that are:
- already explicit in docs
- repeatedly reviewed by humans
- objectively detectable
- expensive when missed
Recommended v1:
- Legacy app freeze enforcement.
- Internal nested context module access ban.
- Cross-context call ban from contexts.
- Dataloader.KV context-capture ban.
- GraphQL direct-schema access ban in forbidden layers.
- New AI agent modules must use approved namespaces and suffixes.
Phase 1 should be advisory only:
- run in CI
- report findings
- tune false positives
Phase 2 should block on a small subset of high-confidence violations.
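The advisory-to-blocking split can fall out of rule metadata: CI exits nonzero only when a finding belongs to a rule promoted to blocking. A minimal sketch, with the finding shape and rule ids assumed for illustration:

```python
def ci_exit_code(findings, blocking_rules):
    """Report all findings; fail the build only for blocking rules.

    findings: list of (rule_id, location) pairs (illustrative shape).
    blocking_rules: set of rule ids promoted to blocking in phase 2.
    """
    exit_code = 0
    for rule_id, location in findings:
        level = "BLOCK" if rule_id in blocking_rules else "ADVISE"
        print(f"{level} {rule_id} at {location}")
        if level == "BLOCK":
            exit_code = 1
    return exit_code
```

Promoting a rule from phase 1 to phase 2 is then a one-line metadata change, made only after its false-positive rate has been tuned down in advisory mode.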
Phase 3 can add:
- auto-linked source docs in findings
- suggested fixes
- “rule added because of incident X” traceability
- a lightweight dashboard showing most-triggered rules and architectural drift trends
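The "rule added because of incident X" traceability comes nearly for free if each rule already lists its source documents: findings can cite the incident that created them. A sketch, with the rule id and doc path invented for illustration:

```python
# rule id -> source docs it graduated from (doc paths invented for illustration)
RULE_SOURCES = {
    "ARCH-007": ["research/dataloader-kv-memory-incident.md"],
}

def annotate(rule_id: str, location: str) -> str:
    """Render a finding with its 'rule added because of incident X' trail."""
    docs = RULE_SOURCES.get(rule_id, [])
    trail = f" (see {', '.join(docs)})" if docs else ""
    return f"{rule_id} at {location}{trail}"
```

A finding that links back to its originating research doc teaches the violator the rationale at the moment it matters most.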
This addition is succeeding if, within a few weeks:
- repeated review comments turn into reusable rules
- known classes of mistakes are caught before merge
- architecture review comments become shorter and higher-value
- incident postmortems start producing enforcement artifacts, not just prose
- engineers trust the checks enough to move faster, not slower
What this is not:
- replacing human architecture review
- replacing tests
- turning the codebase into a rigid bureaucracy
- trying to statically prove everything
The goal is not maximal policing. The goal is high-leverage prevention of expensive, repeated mistakes.
If only one meaningful thing gets added next, it should be this.
The repo already has architecture, agents, prompts, evals, and research discipline. The missing layer is the one that turns all of that accumulated intelligence into enforced, compounding protection.
That is the smartest addition because it improves the quality of every other addition after it.