| # | Where | Why it’s a problem | Suggested fix |
|---|---|---|---|
| R-1 | Federated Decision Authority → Team-Level vs. Cross-Team | “Product teams have full authority over architecture decisions that affect only their services.” Five lines later you require that any cross-team dependency triggers AAG involvement. In micro-service environments, almost every change creates at least incidental integration risk (shared observability, IAM scope, cost envelopes, etc.). You’ve implied a bright line that rarely exists. | Spell out minimal-impact criteria (e.g., “interface change limited to additive, backward-compatible API paths”; “no net-new infra cost”), or treat “default AAG light-review” as the safe path. |
| R-2 | Mandatory ADRs → “Teams cannot proceed … until ADR is published and reviewed.” | This turns ADRs from lightweight records into a gating approval workflow (contradicting your later “advisory” language). It also conflicts with the emergency-decision escape hatch (§Escalation). | Decide which you really want: (a) documentation first but non-blocking, or (b) approvals first. State it unambiguously and align the escalation language. |
| R-3 | AAG capacity assumptions | 2–3 h/wk × ~6 people is roughly 12–18 h/week, under half of one engineer’s capacity. Yet the group must review cross-team and org-level decisions, run office hours, maintain metrics, and run retros. The implied review volume and targets (e.g., 90% ADR compliance) cannot be delivered with that capacity. | Either increase staffing (or fund a rotation), narrow the scope (e.g., review only non-trivial decisions), or automate triage so the AAG focuses on <20% of ADRs — see the triage sketch below the table. |
| R-4 | Success criteria → “< 5 % decision escalation rate.” | A low escalation rate is not inherently healthy: it can mean decisions are being rubber-stamped, or that teams self-censor controversial work. | Track outcome quality alongside the rate (e.g., rework, post-integration defects). Consider a target band (“10–20% escalations with <N days cycle-time”) instead of “as low as possible”. |
| R-5 | Measurement & Review → leading vs. lagging | “Architecture Office Hours attendance” is treated as a leading indicator of success, but high attendance can equally indicate confusion or friction. | Pair it with sentiment or resolution metrics (e.g., repeat attendance on the same topic, average wait for answers) to avoid the “vanity metric” trap; see the metrics sketch below the table. |
| R-6 | Root-cause analysis vs. chosen remedies | You diagnose “decision authority ambiguity” and “technical-standard inconsistency”, but most remedies are additional process or governance layers; none of them clarify how authority plays out in everyday code/PR/release loops. | Show a direct trace: root cause → policy knob → expected behavioral change. If the knob is “publish & auto-test shared libraries”, bake that into the policy alongside ADR governance. |
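
To make the R-1 minimal-impact criteria and the R-3 triage suggestion concrete, here is a minimal sketch of automated ADR routing. Everything in it is hypothetical: the `ADR` fields, the tier names, and the thresholds are placeholders, not anything the policy defines.

```python
from dataclasses import dataclass

@dataclass
class ADR:
    """Minimal ADR metadata; field names are illustrative placeholders."""
    title: str
    api_change: str           # "none", "additive", or "breaking"
    net_new_infra_cost: bool  # introduces spend beyond the team's cost envelope
    teams_affected: int       # teams other than the author's own

def triage(adr: ADR) -> str:
    """Route an ADR to one of three review tiers.

    Thresholds are placeholders; the point is that most ADRs should
    resolve without synchronous AAG time.
    """
    if adr.api_change == "breaking" or adr.teams_affected >= 2:
        return "full-aag-review"   # scarce reviewer hours go here
    if adr.net_new_infra_cost or adr.teams_affected == 1:
        return "light-review"      # async comment window, no meeting
    return "auto-approve"          # publish and proceed; review only on appeal

# An additive, single-team, cost-neutral change never enters the AAG queue.
print(triage(ADR("Add /v2/orders endpoint", "additive", False, 0)))  # auto-approve
```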
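
Similarly for R-5, a small sketch of pairing raw attendance with friction and resolution signals; the log schema and the numbers are invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical office-hours log: (attendee, topic, asked_at, answered_at).
log = [
    ("ana", "service mesh", datetime(2025, 5, 1, 10, 0), datetime(2025, 5, 1, 10, 20)),
    ("ana", "service mesh", datetime(2025, 5, 8, 10, 0), datetime(2025, 5, 8, 10, 45)),
    ("ben", "ADR template", datetime(2025, 5, 8, 10, 5), datetime(2025, 5, 8, 10, 15)),
]

# Repeat attendance on the same topic is a friction signal, not a success signal.
visits = Counter((who, topic) for who, topic, _, _ in log)
repeat_rate = sum(1 for n in visits.values() if n > 1) / len(visits)

# Average wait for an answer, in minutes.
waits = [(done - asked).total_seconds() / 60 for _, _, asked, done in log]
avg_wait = sum(waits) / len(waits)

print(f"repeat-topic rate: {repeat_rate:.0%}, avg wait: {avg_wait:.0f} min")
# -> repeat-topic rate: 50%, avg wait: 25 min
# Raw attendance alone would score all three rows as equal "engagement".
```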