Date: 2026-03-14
Status: Proposal
The single highest-leverage addition to this project is an executable architecture guardrail system: a repo-native layer that turns the codebase's architectural rules, incident learnings, and review heuristics into machine-enforced checks that run on every change.
Not another product feature. Not another agent. Not another prompt.
The compounding move is to make the project able to defend its own architecture.
This repo already has unusually strong ingredients:
- a clear target architecture in guides/ARCHITECTURE.md
- substantial AI infrastructure in apps/core/lib/core/infrastructure/ai/agent.ex and apps/core/lib/core/infrastructure/ai/baml.ex
- automated PR review prompts in .agents/prompts/architecture-review.md and .agents/prompts/test-review.md
- a deep internal research/planning habit, especially around failures, migrations, and AI behavior
But the current system is still too dependent on:
- humans remembering rules
- LLM reviewers inferring patterns from prose
- incidents being fixed one by one after they escape
That is good operational maturity. It is not yet self-reinforcing enough.
The repo has already shown the exact failure mode:
- a real memory incident from Dataloader.KV closure context capture required a research document and then a prompt update, rather than a permanent code-level defense
- AI infrastructure is growing across chat, meeting notes, attachments, tracing, and evals, which increases the number of cross-cutting invariants that are easy to violate accidentally
- the architecture review workflow is comprehensive, but it is still fundamentally a text-guided reviewer, not an executable policy engine
The result is a familiar pattern:
- We discover a subtle architectural or performance bug.
- We write a strong document about it.
- We update prompts and expectations.
- We remain vulnerable to the same class of mistake in a different form.
That loop is too lossy for a codebase of this size.
Build a new internal capability: Executable Architecture Guardrails.
This should be a first-class system, not a scattered set of scripts. Its job is to codify and enforce rules such as:
- no cross-context calls from Core.Contexts.*
- no new business logic in legacy apps
- no external usage of internal nested context modules
- dataloader closure safety rules
- required public API boundaries for contexts
- AI/agent-layer constraints and approved orchestration boundaries
- multi-tenant safety checks on selected query and resolver patterns
- “incident-derived rules” that graduate from research docs into permanent enforcement
This is accretive in a way most feature work is not:
- every future PR gets safer
- every future engineer gets faster
- every future agent gets more reliable
- every discovered incident can become durable institutional memory
It upgrades the repo from “well-documented architecture” to “architecture with active immune system.”
That is a much stronger compounding asset than one more isolated product capability.
The right mental model is:
Prompt review for judgment. Executable guardrails for invariants.
The system should have three layers.
The first layer is a rule registry: a structured source of truth for rules, with fields like:
- rule id
- category
- severity
- rationale
- source documents
- detection strategy
- autofix availability
This converts the current prose-only rule set into something that can be inspected, tested, and extended deliberately.
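As a sketch only, the registry could be a small data structure checked into the repo. This Python example is illustrative, not a committed schema: the field names, the "ARCH-007" rule id, and the research-doc path are all invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GuardrailRule:
    """One entry in the architecture rule registry (illustrative schema)."""
    rule_id: str             # stable identifier, e.g. "ARCH-007"
    category: str            # e.g. "boundaries", "performance", "multi-tenant"
    severity: str            # "advisory" or "blocking"
    rationale: str           # one-line "why", surfaced to reviewers and agents
    source_docs: tuple = ()  # incident/research docs this rule graduated from
    detection: str = "ast"   # detection strategy: "ast" or "grep"
    autofix: bool = False    # whether a suggested fix can be emitted

# Hypothetical entry derived from the Dataloader.KV memory incident
# (the doc path is invented for illustration):
KV_CLOSURE_RULE = GuardrailRule(
    rule_id="ARCH-007",
    category="performance",
    severity="advisory",
    rationale="Dataloader.KV closures must not capture the full Absinthe context",
    source_docs=("research/dataloader-kv-memory-incident.md",),
    detection="grep",
)
```

Because each rule carries its rationale and source documents, the registry doubles as documentation that can be rendered into review prompts.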
The second layer is an enforcement engine: a repo-local analyzer that runs against diffs and full code, using AST analysis where possible and grep-pattern checks only where acceptable.
Examples:
- flag Dataloader.KV.new(fn ... ctx ... end) when the full Absinthe context is captured
- flag references to Core.Contexts.*.<NestedModule> outside the parent context
- flag new production code under apps/timeline, apps/betafolio, or apps/integrations
- flag direct cross-context calls from context modules
- flag direct schema access from forbidden GraphQL layers
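To make the grep-pattern tier concrete, here is a minimal Python sketch of diff-scanning checks. The regexes, rule ids, and path filters are assumptions for illustration; the real engine would live in the repo's own tooling and lean on AST analysis where grep is too coarse.

```python
import re

# Illustrative grep-style checks: (rule id, pattern, path prefixes it applies to).
CHECKS = [
    # Legacy app freeze: any non-blank added line in a frozen app is flagged.
    ("ARCH-001", re.compile(r"\S"),
     ("apps/timeline/", "apps/betafolio/", "apps/integrations/")),
    # Dataloader.KV closure that mentions the Absinthe context variable.
    ("ARCH-007", re.compile(r"Dataloader\.KV\.new\(fn[^)]*\bctx\b"),
     ("apps/",)),
]

def scan_added_lines(path: str, added_lines: list[str]) -> list[tuple[str, int]]:
    """Return (rule_id, line_index) findings for one changed file's added lines."""
    findings = []
    for rule_id, pattern, prefixes in CHECKS:
        if not any(path.startswith(p) for p in prefixes):
            continue
        for i, line in enumerate(added_lines):
            if pattern.search(line):
                findings.append((rule_id, i))
    return findings
```

Running this over the added lines of each file in a PR diff yields findings keyed by rule id, which can then be rendered as advisory comments or CI failures.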
The third layer is an incident-to-rule pipeline. Any meaningful production bug, review miss, or architecture exception should have a standard path:
incident/research doc -> new guardrail rule -> regression fixture/test -> CI enforcement
That is the real payoff. The project gets better at not repeating its own mistakes.
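The regression-fixture step of that path can be as simple as checked-in known-bad snippets that the analyzer must keep flagging forever. A sketch, with the snippet, rule id, and stand-in pattern all invented for illustration:

```python
import re

# Hypothetical regression fixture: the known-bad snippet from a past incident,
# checked in permanently so the rule that caught it can never silently regress.
FIXTURES = {
    "ARCH-007": "Dataloader.KV.new(fn _batch, ids -> lookup(ctx, ids) end)",
}

# Minimal stand-in for the analyzer: rule id -> pattern (assumed, not real).
PATTERNS = {
    "ARCH-007": re.compile(r"Dataloader\.KV\.new\(fn[^)]*\bctx\b"),
}

def rules_still_catch_their_incidents() -> bool:
    """True only if every incident-derived rule still flags its original snippet."""
    return all(
        PATTERNS[rule_id].search(snippet) is not None
        for rule_id, snippet in FIXTURES.items()
    )
```

A test asserting this stays green turns each postmortem into a permanent, executable artifact rather than prose that can drift out of date.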
There are other strong candidates:
- deeper AI meeting intelligence
- broader agent tool surfaces
- richer retrieval across internal docs
- more observability around LLM calls
All of those can be valuable. None of them compounds across the whole repo as aggressively as executable guardrails.
This repo is already large, multi-app, partially transitional, and architecturally opinionated. In that environment, preventing drift is more valuable than adding one more isolated capability.
The first version should stay narrow and high-signal.
Start with rules that are:
- already explicit in docs
- repeatedly reviewed by humans
- objectively detectable
- expensive when missed
Recommended v1:
- Legacy app freeze enforcement.
- Internal nested context module access ban.
- Cross-context call ban from contexts.
- Dataloader.KV context-capture ban.
- GraphQL direct-schema access ban in forbidden layers.
- New AI agent modules must use approved namespaces and suffixes.
Phase 1 should be advisory only:
- run in CI
- report findings
- tune false positives
Phase 2 should block on a small subset of high-confidence violations.
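The advisory-to-blocking split can fall out of rule metadata: CI exits nonzero only when a finding belongs to a rule promoted to blocking. A minimal sketch, with the finding shape and rule ids assumed for illustration:

```python
def ci_exit_code(findings, blocking_rules):
    """Report all findings; fail the build only for blocking rules.

    findings: list of (rule_id, location) pairs (illustrative shape).
    blocking_rules: set of rule ids promoted to blocking in phase 2.
    """
    exit_code = 0
    for rule_id, location in findings:
        level = "BLOCK" if rule_id in blocking_rules else "ADVISE"
        print(f"{level} {rule_id} at {location}")
        if level == "BLOCK":
            exit_code = 1
    return exit_code
```

Promoting a rule from phase 1 to phase 2 is then a one-line metadata change, made only after its false-positive rate has been tuned down in advisory mode.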
Phase 3 can add:
- auto-linked source docs in findings
- suggested fixes
- “rule added because of incident X” traceability
- a lightweight dashboard showing most-triggered rules and architectural drift trends
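The "rule added because of incident X" traceability comes nearly for free if each rule already lists its source documents: findings can cite the incident that created them. A sketch, with the rule id and doc path invented for illustration:

```python
# rule id -> source docs it graduated from (doc paths invented for illustration)
RULE_SOURCES = {
    "ARCH-007": ["research/dataloader-kv-memory-incident.md"],
}

def annotate(rule_id: str, location: str) -> str:
    """Render a finding with its 'rule added because of incident X' trail."""
    docs = RULE_SOURCES.get(rule_id, [])
    trail = f" (see {', '.join(docs)})" if docs else ""
    return f"{rule_id} at {location}{trail}"
```

A finding that links back to its originating research doc teaches the violator the rationale at the moment it matters most.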
This addition is succeeding if, within a few weeks:
- repeated review comments turn into reusable rules
- known classes of mistakes are caught before merge
- architecture review comments become shorter and higher-value
- incident postmortems start producing enforcement artifacts, not just prose
- engineers trust the checks enough to move faster, not slower
What this is not:
- replacing human architecture review
- replacing tests
- turning the codebase into a rigid bureaucracy
- trying to statically prove everything
The goal is not maximal policing. The goal is high-leverage prevention of expensive, repeated mistakes.
If only one meaningful thing gets added next, it should be this.
The repo already has architecture, agents, prompts, evals, and research discipline. The missing layer is the one that turns all of that accumulated intelligence into enforced, compounding protection.
That is the smartest addition because it improves the quality of every other addition after it.