jeremylongshore/059-AA-AUDT-appaudit-devops-playbook.md

## 059-AA-AUDT-appaudit-devops-playbook.md

      
    Raw
  

              059-AA-AUDT-appaudit-devops-playbook.md
            
          
    IntentCAD (cad-dxf-agent): Operator-Grade System Analysis

For: DevOps Engineer
Generated: 2026-03-09
Version: v0.9.0 (30 epics shipped)
1. Executive Summary

Business Purpose

IntentCAD is a Drawing Intelligence Platform for AEC professionals. Users upload architectural drawings (DXF or PDF), describe what they need in plain English — an edit, a compliance check, a quantity takeoff, a health report — and the platform classifies intent on two axes, selects the right processing pipeline, and delivers structured results. Original files are never modified — every save produces a new file.
The system has shipped 30 epics across 9 phases, evolving from a local-first DXF editor into a multi-capability platform with compliance validation, health reports, quantity takeoff, drawing summaries, RFI generation, zone detection, revision comparison, agent mode, and user accounts with persistent workspaces. It ships as both a PySide6 desktop app (Windows/Linux) and a React + FastAPI web app deployed on Google Cloud (Firebase Hosting + Cloud Run). The LLM backend is Gemini via Vertex AI, with a mock provider for CI determinism.
The core architectural invariant: the LLM never touches DXF directly. It returns structured JSON operations (13 op types) which are validated against protected-layer rules before a deterministic edit engine applies them. This design eliminates a class of LLM hallucination risks at the architecture level.
Current risk profile: the system is well-tested (4,494 tests across 10 tiers, 65% coverage threshold) with green CI, automated deploys via WIF, and comprehensive security scanning. Primary operational risks are single-region deployment and the external ODA File Converter dependency for DWG support.
Operational Status Matrix


Environment
Status
Uptime Target
Release Cadence


Production (Web)
Active
Best-effort (Cloud Run default SLA)
Merge-to-main auto-deploy


Desktop
Builds available
N/A (local)
Tag-triggered (v* tags)


CI
Green (main)
N/A
Every push/PR


Staging
None (direct-to-prod)
N/A
N/A


Technology Stack


Category
Technology
Version
Purpose


Language
Python
3.11 / 3.12
Core pipeline, backend


DXF Engine
ezdxf
>=1.3.0
DXF read/write/entity manipulation


Data Models
Pydantic
>=2.0
Schema validation, serialization


LLM
Gemini (Vertex AI)
gemini-2.5-flash
Edit planning, vision, agent tool-use


Backend
FastAPI + Uvicorn
>=0.115.0
REST API for web frontend


Frontend
React 18 + Vite
18.3.1 / 6.0.5
SPA interface


DXF Viewer
dxf-viewer + Three.js
1.0.46 / 0.183.2
WebGL drawing preview


Auth
Firebase Authentication
11.0.0
Google Sign-In


Hosting
Firebase Hosting
—
Static SPA delivery


Compute
Cloud Run
—
Containerized backend (8Gi/4CPU)


Storage
GCS + Firestore
—
Document persistence, user profiles, tenants


Registry
Artifact Registry
—
Docker images (us-central1)


Tracing
OpenTelemetry → Cloud Trace
>=1.21
Pipeline span instrumentation


Desktop UI
PySide6
>=6.6
Qt-based desktop shell


CI/CD
GitHub Actions
—
Lint, test, deploy (WIF auth)


Linting
Ruff
>=0.5
Lint + format


Type Check
Mypy
>=1.10
Static type analysis


Security
Bandit + pip-audit
—
SAST + dependency audit


Build
Hatchling + PyInstaller
—
Package + desktop executable


2. System Architecture

Architecture Diagram

                         ┌─────────────────────────┐
                         │   Firebase Hosting       │
                         │   (React SPA)            │
                         │   cad-dxf-agent.web.app  │
                         └────────┬────────────────┘
                                  │ /api/* rewrite
                                  ▼
                         ┌─────────────────────────┐
                         │   Cloud Run              │
                         │   cad-dxf-web            │
                         │   FastAPI (8Gi/4CPU)     │
                         │   us-central1            │
                         └────────┬────────────────┘
                                  │
          ┌───────────────────────┼───────────────────────┐
          ▼                       ▼                       ▼
┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│  Pipeline Core   │  │   Vertex AI      │  │  Firebase/GCP    │
│  (ezdxf, 41      │  │   Gemini API     │  │  Auth, Firestore │
│   core modules,  │  │   (WIF auth)     │  │  GCS (documents) │
│   validators,    │  │                  │  │  Cloud Trace     │
│   edit engine)   │  │                  │  │                  │
└──────────────────┘  └──────────────────┘  └──────────────────┘

Pipeline Flow (Two-Axis Classification):
  User Prompt → ObjectiveClassifier (RequestClass × ObjectiveTag)
              → StrategyRegistry (maps to StagePipelineDefinition)
              → StageExecutor (ordered stages: deterministic + LLM)
              → ResponseBuilder (PlatformResponse envelope)

  Edit Pipeline:
    Planner(Gemini) → ChangeSet → Validator → Preview → EditEngine → Save-As DXF
                                                                         │
                                                               RevisionNotes (deterministic)

  Analysis Pipeline (compliance, health, takeoff, summary, RFI, zones):
    Deterministic extractors → structured results (no edit flow)

  Agent Mode (complex requests):
    Prompt + context + tools → Gemini → tool calls → ToolExecutor
      → results fed back → next iteration (max 10 turns)
      → final ChangeSet from accumulated tool calls

Desktop variant:
  PySide6 UI → same pipeline → local file I/O

Failure Domains


Domain
Impact
Mitigation


Gemini API down
No edit planning (read/compare/analysis still work)
Mock provider fallback in CI; timeout + retry with backoff


Cloud Run cold start
5-15s latency spike
8Gi memory, min-instances=0 (cost tradeoff)


Firebase Hosting
Frontend unavailable
CDN-backed, rarely fails


ODA binary missing
DWG uploads return 422
DXF and PDF uploads unaffected; size-validated download in CI


Firestore down
User profiles/tenant creation fails
Cached in-process (5min/10min TTL); existing sessions unaffected


GCS down
Document persistence fails
Session-local copies still work; uploads/downloads degrade


3. Directory Analysis

Project Structure

cad-dxf-agent/
├── src/cad_dxf_agent/          # Core Python package (124 .py files, 23k LOC)
│   ├── models/                 # 30 Pydantic schemas (cad, ops, config, zone, compliance, etc.)
│   ├── core/                   # 41 modules — DXF I/O, validation, editing, analysis
│   │   └── comparison/         # Revision diff engine (alignment, matching, changelog, bundle)
│   ├── llm/                    # 22 modules — intent classification, planning, agent loop
│   │   └── stage_handlers/     # Stage pipeline handlers (analyze, comply, health, summarize, etc.)
│   ├── cli/                    # cad-revision CLI (diff/align/bundle/explain)
│   ├── ui/                     # PySide6 desktop GUI
│   ├── settings.py             # Env-based configuration (all CAD_* prefixed)
│   ├── otel.py                 # OpenTelemetry bootstrap (off by default)
│   └── app.py                  # Desktop entry point
├── web/
│   ├── backend/                # FastAPI on Cloud Run
│   │   ├── main.py             # ~2850 lines — all API routes (20+ endpoints)
│   │   ├── api_v1.py           # /api/v1 router
│   │   ├── auth.py             # Firebase token validation
│   │   ├── session.py          # SessionManager for ephemeral work
│   │   ├── Dockerfile          # Production image (Python 3.12-slim + ODA)
│   │   └── requirements.txt    # Backend-specific deps
│   ├── frontend/               # React 18 + Vite SPA
│   │   ├── src/pages/          # Upload, Editor, Compare, RevisionWizard, Documents
│   │   ├── src/components/     # Reusable UI components
│   │   └── package.json        # Frontend deps
│   ├── firebase.json           # Hosting config + /api/** → Cloud Run rewrite
│   ├── firestore.rules         # Client reads/writes denied (server-side only)
│   └── .firebaserc             # Project: cad-dxf-agent
├── proxy/                      # Cloud Run proxy for desktop licensing
│   ├── main.py                 # FastAPI, rate-limited Gemini forwarder
│   └── Dockerfile              # Minimal image
├── tests/                      # 224 test modules, 4,494 tests across 10 tiers
│   ├── unit/                   # ~3,618 tests — schemas, validators, reader, writer, engine, etc.
│   ├── integration/            # ~102 tests — full pipeline, agent loop (ScriptedAgentProvider)
│   ├── web/                    # ~418 tests — FastAPI TestClient endpoint tests
│   ├── eval/                   # ~238 tests — intent classification scorecard
│   ├── live/                   # ~42 tests — real Gemini API tests (WIF in CI)
│   ├── e2e/                    # ~33 tests — end-to-end with real DXF files
│   ├── benchmark/              # ~19 tests — pytest-benchmark micro-benchmarks
│   ├── gui/                    # ~10 tests — PySide6 tests (QT_QPA_PLATFORM=offscreen)
│   ├── property/               # ~7 tests — fuzz/property tests (randomized, bounded)
│   ├── smoke/                  # ~7 tests — end-to-end mock pipeline
│   ├── fixtures/               # DXF zoo, revision cases, trajectories, prompt bank
│   └── helpers/                # DXF factory, changeset factory, scripted provider
├── scripts/                    # Build, smoke test, eval runner, fixture downloads
├── 000-docs/                   # 64 architectural/planning documents
├── .github/workflows/          # 8 CI/CD workflows
├── Makefile                    # 67-line task runner
├── pyproject.toml              # Build config, tool settings, dep groups
└── .pre-commit-config.yaml     # Ruff, trailing whitespace, .env block, main protection

Codebase Metrics


Language
Files
Code Lines
Comments
Blanks


Python
360
77,795
3,897
16,383


JSON
44
9,726
0
8


JSX
24
4,190
220
320


JavaScript
24
3,429
650
717


CSS
6
1,807
62
301


Other
95
1,730
11,533
3,991


Total
553
98,677
16,362
21,720


4. Operational Reference

Deployment Workflows

Local Development


Prerequisites: Python 3.11+, Node.js 22+, gcloud CLI
Setup:
# Python backend
pip install -e ".[dev]"
pre-commit install
gcloud auth application-default login  # One-time GCP auth

# Create .env (gitignored)
echo 'CAD_LLM_PROVIDER=gemini' > .env
echo 'CAD_GCP_PROJECT=cad-dxf-agent' >> .env

# Frontend
cd web/frontend && npm ci

Run:
# Backend on :8322
CAD_WEB_DEV_MODE=1 uvicorn web.backend.main:app --port 8322

# Frontend on :3000
cd web/frontend && npm run dev

Verification: make check (lint → format → typecheck → test → smoke)

Production Deployment

Normal path (automated): Merge PR to main touching web/** or src/** → GitHub Actions deploy-web.yml fires → builds Docker image → pushes to Artifact Registry → deploys Cloud Run → deploys Firebase Hosting. No manual steps.
Pre-flight checklist:

 All CI checks green on PR
 make check passes locally
 PR reviewed and approved
 No secrets in diff

Manual deploy (emergency only):
# ALWAYS specify --project (local gcloud may point elsewhere)
cd web/frontend && npm run build
firebase deploy --only hosting --project cad-dxf-agent

gcloud run deploy cad-dxf-web \
  --source . --dockerfile web/backend/Dockerfile \
  --region us-central1 --project cad-dxf-agent \
  --allow-unauthenticated --memory 8Gi --cpu 4 --timeout 600 \
  --service-account cad-dxf-web-run@cad-dxf-agent.iam.gserviceaccount.com \
  --set-env-vars CAD_LLM_PROVIDER=gemini,CAD_GCP_PROJECT=cad-dxf-agent,OTEL_ENABLED=1,OTEL_EXPORTER=gcp-trace
Do NOT use: gcloud builds submit --config cloudbuild.yaml — $SHORT_SHA is only set by triggers, not manual submits.
Rollback protocol:
# List recent revisions
gcloud run revisions list --service cad-dxf-web --region us-central1 --project cad-dxf-agent

# Route traffic to previous revision
gcloud run services update-traffic cad-dxf-web \
  --to-revisions=PREVIOUS_REVISION=100 \
  --region us-central1 --project cad-dxf-agent
Monitoring & Alerting


Cloud Trace: All pipeline stages emit OTel spans (cad.load_dxf, cad.run_planner, cad.validate, cad.build_context, etc.). Enabled via OTEL_ENABLED=1 + OTEL_EXPORTER=gcp-trace.
Cloud Run Logs: gcloud run services logs read cad-dxf-web --region us-central1 --project cad-dxf-agent
CI Status: gh run list --workflow=ci.yml and gh run list --workflow=deploy-web.yml
SLIs: No formal SLOs defined yet. Cloud Run provides built-in request latency, error rate, and instance count metrics.
Dashboards: GCP Console → Cloud Run → cad-dxf-web service page (built-in metrics)
On-call: No rotation — single-developer project.

Incident Response


Severity
Definition
Response
Playbook


P0
Web app completely down
Immediate — check Cloud Run status, rollback if deploy broke it
gcloud run revisions list → route traffic to last-good


P1
Gemini API failures (edit planning broken)
15 min — check Vertex AI status page, verify ADC credentials
Read-only features still work; users see "planning unavailable"


P2
ODA converter missing (DWG uploads 422)
Next business day — rebuild with ODA .deb from GCS
DXF and PDF uploads unaffected


P3
Test failures on main
Same day — fix or revert the breaking commit
gh run list --workflow=ci.yml → investigate


5. Security & Access

IAM


Role
Purpose
Permissions
Where


cad-dxf-web-run SA
Cloud Run runtime
Vertex AI API, Cloud Trace, GCS, Firestore
GCP IAM


WIF (GitHub Actions)
CI/CD deploy
Cloud Run deploy, Artifact Registry push, Firebase deploy, GCS read
Federated via WIF_PROVIDER / WIF_SERVICE_ACCOUNT (GitHub vars, not secrets)


Firebase Admin SDK
Token validation + Firestore
Firebase Auth read, Firestore read/write
Initialized in backend startup


Secrets Management


No stored secrets: WIF provides tokenless authentication from GitHub Actions to GCP. No API keys, service account JSON files, or secrets in GitHub.
Firebase API keys: Public-safe client config (hardcoded in deploy-web.yml). These are designed to be public per Firebase documentation.
Local dev: gcloud auth application-default login provides ADC credentials. .env file is gitignored.
Break-glass: If WIF breaks, manual deploy uses developer's own gcloud auth credentials with --project cad-dxf-agent.

Pre-commit Security Gates


detect-private-key: Blocks commits containing private keys
forbid-env-files: Blocks .env file commits
no-commit-to-branch: Prevents direct commits to main
check-added-large-files: Blocks files >1MB (catches accidental binary commits)
bandit: Python SAST on every CI run
pip-audit: Dependency vulnerability scan on every CI run

6. Cost & Performance

Monthly Costs (estimated)


Cloud Run: ~$5-20/mo (low traffic, scale-to-zero, 8Gi/4CPU per request)
Vertex AI (Gemini): ~$10-50/mo (depends on request volume; gemini-2.5-flash pricing)
Firebase Hosting: Free tier (SPA CDN)
Firebase Auth: Free tier (<50k MAU)
Firestore: Free tier (user profiles, tenants, allowlist)
GCS: ~$1-5/mo (document storage)
Artifact Registry: ~$1/mo (container storage)
Cloud Trace: Free tier (first 5M spans/mo)
Total: ~$20-80/mo at current usage

Performance Baseline


DXF load: <100ms for 200-entity drawings, ~500ms for 1000-entity (benchmarked in tests/benchmark/)
Gemini planning: 2-8s per edit prompt (network + inference)
Agent mode: 5-30s (multi-turn, up to 10 iterations)
Validation: <1ms per operation
Edit engine: <10ms per changeset application
Cloud Run cold start: 5-15s (Python image + ODA libraries)
Web API P95: ~3-10s end-to-end (dominated by Gemini latency)

7. Current State Assessment

What's Working


Comprehensive CI: Lint (ruff), format, typecheck (mypy), 4,494 tests across 10 tiers, security scans — all automated on push/PR
Automated deploys: Merge to main → GitHub Actions deploys both frontend + backend via WIF. Zero manual steps.
Multi-capability platform: Edit, compliance, health, takeoff, summary, RFI, zone detection, revision comparison, agent mode — all production-ready
Two-axis intent classification: Every prompt classified by RequestClass (what) × ObjectiveTag (why), routed to the right pipeline
User accounts: Firebase Auth (Google Sign-In), Firestore tenants/profiles, GCS document persistence with work progress
Safety architecture: LLM never touches DXF directly; protected layers enforced at validator + ToolExecutor; deterministic revision notes; save-as workflow
Agent mode: Iterative tool-use loop (20+ tools, max 10 turns) for complex multi-step requests
Modern tooling: 30 Pydantic schemas, Ruff linting, syrupy snapshots, pytest-benchmark, OpenTelemetry tracing
Strong documentation: 64 docs covering architecture decisions, specs, audit reports, and epic AARs
WIF authentication: No secrets stored anywhere — tokenless GCP access from CI

Areas Needing Attention


No staging environment: Production deploys go direct-to-prod. A staging Cloud Run service would catch deploy issues before users see them.
Single-region: Cloud Run only in us-central1. No multi-region failover.
No CODEOWNERS: No automated review assignment for critical paths.
Proxy service: Not deployed via CI — ad-hoc manual deploys. No monitoring.
ODA dependency: External binary downloaded from GCS. If the bucket or file is lost, DWG support breaks.
Desktop build: Windows-only PyInstaller builds. Linux desktop builds not automated.
No dependency pinning: pyproject.toml uses >= ranges. No lock file for reproducible builds.

8. Quick Reference

Command Map


Capability
Command
Notes


Install + setup
pip install -e ".[dev]" && pre-commit install
Editable install with all dev deps


All quality checks
make check
lint → format → typecheck → test → smoke


Lint only
make lint
ruff check src/ tests/


Format only
make format
ruff format src/ tests/


Type check
make typecheck
mypy src/


All tests
.venv/bin/python -m pytest -v
System pytest may lack ezdxf


Unit tests
make test-unit
~3,618 tests


Integration tests
make test-integration
~102 tests


Web API tests
make test-web
~418 tests, FastAPI TestClient


E2E tests
make test-e2e
~33 tests, real DXF files


Live Gemini tests
make test-live
~42 tests, requires ADC


Eval scorecard
make scorecard
~238 tests, mock mode


Coverage report
make test-cov
Threshold: 65%


Security scan
make security
bandit -r src/ -ll && pip-audit


Smoke test
make smoke
Full pipeline with mock provider


Local backend
CAD_WEB_DEV_MODE=1 uvicorn web.backend.main:app --port 8322
Skips Firebase auth


Local frontend
cd web/frontend && npm run dev
Vite on :3000


Desktop app
make run
Requires pip install -e ".[gui]"


Build executable
make build
PyInstaller → dist/cad-dxf-agent/


Revision CLI
cad-revision diff master.dxf rev.dxf --output-dir ./out
Compare two DXFs


Deploy status
gh run list --workflow=deploy-web.yml
Latest deploy results


Cloud Run logs
gcloud run services logs read cad-dxf-web --region us-central1 --project cad-dxf-agent
Recent request logs


Rollback
See Section 4 rollback protocol
Traffic splitting to previous revision


Critical URLs


Production: https://cad-dxf-agent.web.app
Cloud Run service: gcloud run services describe cad-dxf-web --region us-central1 --project cad-dxf-agent
CI/CD: https://github.com/jeremylongshore/cad-dxf-agent/actions
Artifact Registry: us-central1-docker.pkg.dev/cad-dxf-agent/cad-dxf-agent/web-backend
Firebase Console: https://console.firebase.google.com/project/cad-dxf-agent
GCP Console: https://console.cloud.google.com/run?project=cad-dxf-agent
Cloud Trace: https://console.cloud.google.com/traces?project=cad-dxf-agent

First-Week Checklist


 GCP access granted (gcloud auth login with project cad-dxf-agent)
 GitHub repo access (push to branches, not main)
 gcloud auth application-default login for local Gemini access
 pip install -e ".[dev]" + pre-commit install
 make check passes locally (all green)
 Understood two-axis classification: RequestClass × ObjectiveTag → pipeline
 Run make smoke to see full pipeline execute with mock provider
 Reviewed CLAUDE.md (project conventions, commit format, PR template)
 Read 000-docs/000-INDEX.md for doc inventory
 Completed a local web dev session (upload DXF → prompt → preview → apply)
 Reviewed deploy-web.yml to understand the auto-deploy pipeline
 Understood WIF authentication (no secrets — vars in GitHub repo settings)

Appendices

A. Environment Variables Reference


Variable
Default
Purpose


CAD_LLM_PROVIDER
mock
gemini for prod/dev, mock for CI


CAD_GCP_PROJECT
(none)
GCP project ID (required for Vertex AI)


CAD_GCP_LOCATION
us-central1
Vertex AI region


CAD_GEMINI_MODEL
gemini-2.5-flash
Gemini model for planning


CAD_VISION_MODEL
gemini-2.5-flash
Gemini model for vision description


CAD_PROTECTED_LAYERS
TITLE,TITLEBLOCK,SEAL,REVISION
Layers the LLM cannot edit


CAD_REVISION_NOTES_ENABLED
true
Insert deterministic revision notes


CAD_REVISION_NOTES_LAYER
AI_REV_NOTES
Layer for revision notes


CAD_LLM_TEMPERATURE
0.0
Gemini temperature (0 = deterministic)


CAD_LLM_MAX_OUTPUT_TOKENS
4096
Max response tokens


CAD_PLANNER_TIMEOUT
60
Planner timeout (seconds)


CAD_PLANNER_MAX_RETRIES
2
Retry count on planner failure


CAD_RENDER_DPI
150
PNG render resolution


CAD_MAX_UNDO_SNAPSHOTS
50
Edit history depth


CAD_VISION_ENABLED
true
Enable DXF → image → description pipeline


CAD_ODA_PATH
(auto)
ODA File Converter path (DWG support)


CAD_WEB_DEV_MODE
(unset)
Skip Firebase auth for local dev (1)


CAD_WEB_CORS_ORIGIN
(unset)
Additional CORS origin


CAD_ALLOWED_EMAILS
(unset)
Semicolon-separated emails for auto-provisioning


CAD_PROXY_URL
(unset)
Cloud Run proxy for desktop


CAD_LICENSE_KEY
(unset)
Proxy authentication key


OTEL_ENABLED
(unset)
Enable tracing (1, true, yes)


OTEL_EXPORTER
console
console, otlp, or gcp-trace


OTEL_EXPORTER_OTLP_ENDPOINT
(unset)
OTLP collector URL


B. CI/CD Workflows


Workflow
Trigger
Jobs
Duration


ci.yml
Push to main, all PRs
lint, typecheck, test (matrix 3.11+3.12), benchmark (main only), live-test (main only)
~3-5 min


deploy-web.yml
Push to main (web/src changes), manual
deploy-backend (Docker → Cloud Run), deploy-frontend (npm → Firebase)
~5-8 min


security.yml
Push to main, all PRs
bandit, pip-audit
~2 min


build-windows.yml
Tag push (v*), manual
PyInstaller build, Inno Setup installer, upload artifacts
~10 min


gemini-review.yml
PRs
AI code review
~2 min


canary-monitoring.yml
Scheduled/manual
Production canary checks
~2 min


publish-pypi.yml
Manual
PyPI publish
~2 min


release-dryrun.yml
Manual
Validate release artifacts
~3 min


C. Test Tiers


Tier
Location
Count
Runner
Notes


Unit
tests/unit/
~3,618
make test-unit
Fast, mocked, all CI runs


Integration
tests/integration/
~102
make test-integration
Full pipeline, ScriptedAgentProvider


Web API
tests/web/
~418
make test-web
FastAPI TestClient


Eval
tests/eval/
~238
make scorecard
Intent classification scorecard


Live API
tests/live/
~42
CI (main only)
Real Gemini via WIF


E2E
tests/e2e/
~33
make test-e2e
End-to-end with real DXF files


Benchmark
tests/benchmark/
~19
CI (main only)
pytest-benchmark, JSON artifacts


GUI
tests/gui/
~10
Manual
Requires QT_QPA_PLATFORM=offscreen


Property
tests/property/
~7
CI
Randomized, bounded runtime


Smoke
tests/smoke/
~7
make smoke
End-to-end mock pipeline


Total

~4,494


D. Glossary


Term
Meaning


DrawingContext
Normalized Pydantic model of a loaded DXF (entities, layers, blocks, metadata)


EntityRef
Single DXF entity reference (handle, type, layer, position, text, block)


ChangeSet
Batch of EditOperations from a single user prompt


OpType
Edit operation type enum (13 values: move, edit_text, delete, add_block, rotate, copy, scale, mirror, add_line, add_polyline, add_circle, add_arc, add_text)


RequestClass
Classification axis 1 — what: edit, analyze, compare, query, generate


ObjectiveTag
Classification axis 2 — why: compliance, coordination, documentation, estimation, quality, general


StagePipelineDefinition
Ordered list of StageHandlers selected by StrategyRegistry for a (RequestClass, ObjectiveTag) pair


PlatformResponse
Response envelope with TaskFamily, ResponseType, RiskLevel, AuditMetadata


AgentProvider
Iterative tool-use loop (max 10 turns) for complex multi-step requests


ToolExecutor
Dispatches 20+ query and edit tools with protected-layer enforcement


Protected layer
Layer that cannot be edited (TITLE, TITLEBLOCK, SEAL, REVISION)


TaskFamily
Intent category (QNA, EDIT_PLAN, COMPARE, SUMMARY, COMPLIANCE, HEALTH, TAKEOFF, RFI, etc.)


WIF
Workload Identity Federation — GCP's secretless auth for CI/CD


ADC
Application Default Credentials — local GCP auth via gcloud auth application-default login


ODA
Open Design Alliance File Converter — DWG → DXF conversion tool


Save-as
Architectural invariant: original files are never modified; edits produce new files


E. Troubleshooting Playbooks

Tests fail with ModuleNotFoundError: No module named 'ezdxf':
System pytest doesn't have project deps. Use .venv/bin/python -m pytest -v instead of bare pytest.
Cloud Run deploy fails:

Check gh run list --workflow=deploy-web.yml for the failing step
Verify WIF vars are set: gh variable list (should show WIF_PROVIDER, WIF_SERVICE_ACCOUNT, GCP_PROJECT_ID)
Check Artifact Registry permissions: the WIF service account needs roles/artifactregistry.writer

ODA .deb download fails in CI:

Check GCS bucket: gsutil ls gs://cad-dxf-agent-deps/oda/
If missing, DWG support is unavailable but DXF/PDF uploads work fine
Size validation catches corrupt downloads (<1MB = skip install)

User login fails:

Check Firebase Auth console for the user's email
Verify CAD_ALLOWED_EMAILS env var or Firestore allowlist collection includes the email
Check Cloud Run logs for auth validation errors

Document persistence fails:

Check GCS bucket access: gsutil ls gs://cad-dxf-agent-documents/
Verify Cloud Run SA has roles/storage.objectAdmin on the bucket
Check Firestore for tenant/user records


## cad-dxf-agent-full-audit.md

      
    Raw
  

              cad-dxf-agent-full-audit.md
            
          
    IntentCAD App Audit

Version: v0.9.0 | Date: 2026-03-09 | Status: Beta (30 epics shipped)

What It Does

IntentCAD is a Drawing Intelligence Platform for AEC professionals. Upload a DXF drawing, describe what you need in plain English — an edit, a compliance check, a quantity takeoff, a health report — and the platform classifies your intent, selects the right processing pipeline, and delivers structured results. The original file is never modified.
Supported inputs: DXF (native), DWG/PDF (via conversion pipeline).
Architecture

IntentCAD uses a two-axis intent classification system with composable stage pipelines:
Prompt → ObjectiveClassifier (RequestClass × ObjectiveTag)
       → StrategyRegistry (maps to StagePipelineDefinition)
       → StageExecutor (runs ordered stages: deterministic + LLM)
       → ResponseBuilder (PlatformResponse envelope)

For edit requests, the stage pipeline includes:
Planner → ChangeSet → Validator → Preview → EditEngine → Save-As DXF + RevisionNotes

For analysis requests (compliance, health, takeoff, summary, RFI, zones), the pipeline runs deterministic extractors without the edit flow.
For complex requests, an Agent Mode runs an iterative tool-use loop (up to 10 turns) with 20+ query and edit tools.
The LLM returns structured JSON operations — never raw DXF. Every operation is validated against safety rules before anything touches the drawing. If validation fails, the entire changeset is rejected.
Tech Stack


Layer
Technology


Backend
Python 3.11+, FastAPI, ezdxf


Frontend
React + Vite (TypeScript)


Auth
Firebase Authentication (Google Sign-In)


Hosting
Firebase Hosting (frontend), Cloud Run (backend)


LLM
Vertex AI — Gemini (tool-use with vision)


Storage
GCS (documents), Firestore (user profiles, tenants)


Observability
OpenTelemetry (console, OTLP, GCP Cloud Trace)


CI/CD
GitHub Actions (auto-deploy via WIF), pre-commit hooks, ruff, mypy


Current Metrics


Metric
Value


Epics completed
30


API endpoints
20+


Task families
11 (8 enabled by default)


Automated tests
4,494


Test tiers
10 (unit, integration, web, eval, live, e2e, benchmark, gui, property, smoke)


Coverage threshold
65%


Entity types supported
7 (LINE, LWPOLYLINE, TEXT, MTEXT, INSERT, CIRCLE, ARC)


Edit operation types
13


Pydantic schemas
30


Core modules
41


LLM modules
22


Capabilities


Capability
Description


Edit
Move, rotate, copy, scale, mirror, delete entities; add lines, polylines, circles, arcs, text, blocks


Compliance
ADA/IBC/custom rule validation with findings and remediation guidance


Health Report
Drawing quality metrics — layer hygiene, entity stats, potential issues


Quantity Takeoff
Automated extraction of counts, lengths, areas from drawing entities


Summary
Plain-English structured narrative of drawing contents


RFI Generation
Automated Request For Information based on detected ambiguities


Zone Detection
Closed-loop room/area detection with area calculation


Revision Comparison
Diff two DXF versions, review changes, apply approved edits


Agent Mode
Iterative multi-turn tool-use loop for complex requests (max 10 turns)


Two-Axis Intent Classification

Every prompt is classified on two independent axes:

RequestClass — what the user wants done: edit, analyze, compare, query, generate
ObjectiveTag — why they want it: compliance, coordination, documentation, estimation, quality, general

The StrategyRegistry maps each (RequestClass, ObjectiveTag) pair to a StagePipelineDefinition.
Security Model


Firebase Auth with server-side token validation and allowlist gating
Protected layers — TITLE, TITLEBLOCK, SEAL, REVISION are immutable; enforced at both validator and ToolExecutor levels
LLM isolation — the model never reads or writes DXF directly; it produces structured operations that are validated before execution
Save-as workflow — original files are never modified
Revision notes are generated deterministically from operation metadata, never from freeform LLM output
Tenant isolation — GCS paths scoped to documents/{tenant_id}/{user_id}/{doc_id}/
Session isolation — uploads scoped to per-session temp directories with 2-hour expiry

Limitations (Beta)

These are real constraints, not roadmap items dressed up as features:

7 entity types — LINE, LWPOLYLINE, TEXT, MTEXT, INSERT, CIRCLE, ARC. Complex entities (SPLINE, HATCH, DIMENSION, SOLID, etc.) are skipped during load. This covers common structural/architectural drawings but not all DXF content.
LLM planning quality varies — simple operations (move, delete, edit text) are reliable. Complex multi-step edits on dense drawings can produce incorrect targeting.
Cloud-dependent for web — requires backend API for planning and execution. Desktop app runs locally.
Text extraction has a trust hierarchy — OCR-derived text is treated as lower confidence than native DXF text.

Development Philosophy


Truth over momentum — metrics are real, not aspirational. If something doesn't work, we say so.
Deterministic where possible — the LLM proposes, deterministic code disposes. Validation, editing, revision notes, and file I/O are all non-stochastic.
Structured outputs over freeform generation — the LLM returns typed operations via tool-use, not prose that gets parsed with regex.
Test-driven — 4,494 automated tests across 10 tiers, capability scorecard, golden trajectory regression suite.
Pipeline with agent escape hatch — explicit stages with clear contracts between them. Agent mode available for complex multi-step requests that need iterative tool use.


## SKILL.md

      
    Raw
  

              SKILL.md
            
          
  name
  description
  allowed-tools
  model
  
  
  appaudit
  Operator-grade system analysis for DevOps onboarding. Use when auditing a new codebase or creating operations playbooks.
  Read,Bash,Glob,Grep,Write
  claude-opus-4-5-20251101
  
  
Universal Operator-Grade System Analysis

Produce the definitive operational guide for incoming DevOps engineers. Lead with an operator-first mindset: practical, privacy-conscious, and optimized for hands-on maintainers.
Core Objective

Deliver a complete, verifiable analysis that equips DevOps engineers to:

Understand architecture, data flows, and business value end-to-end
Deploy, monitor, and troubleshoot services without supervision
Prioritize improvements using quantified impact and risk
Communicate system status credibly to stakeholders

Instructions

Phase 1: Initial Survey (30 min)


Read README.md, CHANGELOG.md, project root files
Check for CLAUDE.md, AGENTS.md, operational docs
Identify package manifests (package.json, requirements.txt, go.mod)
Scan infrastructure directories (terraform/, infrastructure/, k8s/)
Find CI/CD configs (.github/workflows/, .gitlab-ci.yml)
Review configuration files (.env.example, docker-compose.yml)

Phase 2: Deep Dive (60 min)


Analyze source code structure and architectural patterns
Review test coverage and quality metrics
Examine deployment workflows and environments
Study monitoring, logging, and alerting setup
Assess security controls and compliance posture
Document dependencies and third-party integrations

Phase 3: Synthesis (30 min)


Identify gaps, risks, and immediate priorities
Create actionable recommendations with timelines
Generate comprehensive operations playbook

Document Numbering

Before creating output:
LAST_NUM=$(ls -1 000-docs/ 2>/dev/null | grep -E "^[0-9]{3}-" | tail -1 | cut -d'-' -f1)
NEXT_NUM=$(printf "%03d" $((10#${LAST_NUM:-0} + 1)))
# Output: 000-docs/${NEXT_NUM}-AA-AUDT-appaudit-devops-playbook.md
Output Template

Create 000-docs/NNN-AA-AUDT-appaudit-devops-playbook.md:
# [PROJECT_NAME]: Operator-Grade System Analysis
*For: DevOps Engineer*
*Generated: [Date]*
*Version: [git tag/commit]*

## 1. Executive Summary
### Business Purpose
[3-4 paragraphs: capabilities, status, tech foundation, risks]

### Operational Status Matrix
| Environment | Status | Uptime Target | Release Cadence |
|-------------|--------|---------------|-----------------|
| Production  |        |               |                 |
| Staging     |        |               |                 |

### Technology Stack
| Category | Technology | Version | Purpose |
|----------|------------|---------|---------|

## 2. System Architecture
### Technology Stack (Detailed)
| Layer | Technology | Version | Purpose | Owner |
|-------|------------|---------|---------|-------|
| Frontend | | | | |
| Backend | | | | |
| Database | | | | |
| Infrastructure | | | | |

### Architecture Diagram
[ASCII diagram showing services, data flows, failure domains]

## 3. Directory Analysis
### Project Structure
[Repository layout with purpose annotations]

### Key Directories
- **src/**: [patterns, entry points, integrations]
- **tests/**: [framework, coverage %, gaps]
- **infrastructure/**: [IaC, networking, secrets]

## 4. Operational Reference
### Deployment Workflows
#### Local Development
1. Prerequisites: [tools, versions]
2. Setup: [commands]
3. Verification: [smoke tests]

#### Production Deployment
- Pre-flight checklist
- Execution steps
- Rollback protocol

### Monitoring & Alerting
- Dashboards: [URLs]
- SLIs/SLOs: [targets]
- On-call: [rotation, escalation]

### Incident Response
| Severity | Definition | Response Time | Playbook |
|----------|------------|---------------|----------|
| P0 | System outage | Immediate | |
| P1 | Critical degradation | 15 min | |
| P2 | Partial impact | 1 hour | |

## 5. Security & Access
### IAM
| Role | Purpose | Permissions | MFA |
|------|---------|-------------|-----|

### Secrets Management
- Storage: [mechanism]
- Rotation: [policy]
- Break-glass: [procedure]

## 6. Cost & Performance
### Monthly Costs
- Compute: $X
- Storage: $X
- Databases: $X
- Total: $X

### Performance Baseline
- Latency: P50/P95/P99
- Throughput: [req/sec]
- Error budget: [%]

## 7. Current State Assessment
### What's Working
✅ [List with evidence]

### Areas Needing Attention
⚠️ [Tech debt, gaps, risks]

### Immediate Priorities
1. **[High]** – [Issue] • Impact: [X] • Owner: [Y]
2. **[Medium]** – [Issue] • Impact: [X] • Owner: [Y]

## 8. Quick Reference
### Command Map
| Capability | Command | Notes |
|------------|---------|-------|
| Local env | | |
| Run tests | | |
| Deploy staging | | |
| Deploy prod | | |
| View logs | | |
| Rollback | | |

### Critical URLs
- Production: [URL]
- Staging: [URL]
- Monitoring: [URL]
- CI/CD: [URL]

### First-Week Checklist
- [ ] Access granted (repos, cloud, secrets)
- [ ] Local environment working
- [ ] Completed staging deploy
- [ ] Reviewed runbooks
- [ ] Understood on-call rotation

## 9. Recommendations Roadmap
### Week 1 – Stabilization
[Goals with measurable outcomes]

### Month 1 – Foundation
[Goals with measurable outcomes]

### Quarter 1 – Strategic
[Goals with measurable outcomes]

## Appendices
- Glossary
- Reference Links
- Troubleshooting Playbooks
- Open Questions
Examples

Input: /appaudit
Output: Creates 000-docs/NNN-AA-AUDT-appaudit-devops-playbook.md with full analysis.
Input: Running on a Python FastAPI project
Output:
Document Created: 000-docs/042-AA-AUDT-appaudit-devops-playbook.md

Critical Findings:
1. [HIGH] No database backup automation
2. [MEDIUM] Test coverage at 34%
3. [LOW] Outdated dependencies

System Health Score: 67/100
- Architecture: 8/10
- Operations: 6/10
- Security: 7/10
- Documentation: 5/10

Immediate Actions:
1. Implement automated backups → Owner: DevOps
2. Add CI test coverage gate → Owner: Platform

Error Handling

If 000-docs/ doesn't exist:
mkdir -p 000-docs
If no package manifests found:

Note as gap in assessment
Check for alternative build systems (Makefile, scripts/)

If infrastructure directory missing:

Document as "Infrastructure as Code: Not implemented"
Add to recommendations

Writing Guidelines

Tone


Direct: Trust reader's expertise
Honest: Call out gaps and unknowns
Specific: Include paths, configs, metrics
Actionable: Pair findings with next steps

Avoid


Generic best practices without context
Tutorial-style explanations
Vague risk statements
Unverified assumptions
Invented commands or paths

Quality Standards


Validate all paths/commands against actual codebase
Note sources for all claims
Structure for easy updates
Target: 10,000-20,000 words
Success: DevOps engineer operates independently after reading
Environment	Status	Uptime Target	Release Cadence
Production (Web)	Active	Best-effort (Cloud Run default SLA)	Merge-to-main auto-deploy
Desktop	Builds available	N/A (local)	Tag-triggered (v* tags)
CI	Green (main)	N/A	Every push/PR
Staging	None (direct-to-prod)	N/A	N/A
Category	Technology	Version	Purpose
Language	Python	3.11 / 3.12	Core pipeline, backend
DXF Engine	ezdxf	>=1.3.0	DXF read/write/entity manipulation
Data Models	Pydantic	>=2.0	Schema validation, serialization
LLM	Gemini (Vertex AI)	gemini-2.5-flash	Edit planning, vision, agent tool-use
Backend	FastAPI + Uvicorn	>=0.115.0	REST API for web frontend
Frontend	React 18 + Vite	18.3.1 / 6.0.5	SPA interface
DXF Viewer	dxf-viewer + Three.js	1.0.46 / 0.183.2	WebGL drawing preview
Auth	Firebase Authentication	11.0.0	Google Sign-In
Hosting	Firebase Hosting	—	Static SPA delivery
Compute	Cloud Run	—	Containerized backend (8Gi/4CPU)
Storage	GCS + Firestore	—	Document persistence, user profiles, tenants
Registry	Artifact Registry	—	Docker images (us-central1)
Tracing	OpenTelemetry → Cloud Trace	>=1.21	Pipeline span instrumentation
Desktop UI	PySide6	>=6.6	Qt-based desktop shell
CI/CD	GitHub Actions	—	Lint, test, deploy (WIF auth)
Linting	Ruff	>=0.5	Lint + format
Type Check	Mypy	>=1.10	Static type analysis
Security	Bandit + pip-audit	—	SAST + dependency audit
Build	Hatchling + PyInstaller	—	Package + desktop executable
Domain	Impact	Mitigation
Gemini API down	No edit planning (read/compare/analysis still work)	Mock provider fallback in CI; timeout + retry with backoff
Cloud Run cold start	5-15s latency spike	8Gi memory, min-instances=0 (cost tradeoff)
Firebase Hosting	Frontend unavailable	CDN-backed, rarely fails
ODA binary missing	DWG uploads return 422	DXF and PDF uploads unaffected; size-validated download in CI
Firestore down	User profiles/tenant creation fails	Cached in-process (5min/10min TTL); existing sessions unaffected
GCS down	Document persistence fails	Session-local copies still work; uploads/downloads degrade
Language	Files	Code Lines	Comments	Blanks
Python	360	77,795	3,897	16,383
JSON	44	9,726	0	8
JSX	24	4,190	220	320
JavaScript	24	3,429	650	717
CSS	6	1,807	62	301
Other	95	1,730	11,533	3,991
Total	553	98,677	16,362	21,720
Severity	Definition	Response	Playbook
P0	Web app completely down	Immediate — check Cloud Run status, rollback if deploy broke it	`gcloud run revisions list` → route traffic to last-good
P1	Gemini API failures (edit planning broken)	15 min — check Vertex AI status page, verify ADC credentials	Read-only features still work; users see "planning unavailable"
P2	ODA converter missing (DWG uploads 422)	Next business day — rebuild with ODA .deb from GCS	DXF and PDF uploads unaffected
P3	Test failures on main	Same day — fix or revert the breaking commit	`gh run list --workflow=ci.yml` → investigate
Role	Purpose	Permissions	Where
`cad-dxf-web-run` SA	Cloud Run runtime	Vertex AI API, Cloud Trace, GCS, Firestore	GCP IAM
WIF (GitHub Actions)	CI/CD deploy	Cloud Run deploy, Artifact Registry push, Firebase deploy, GCS read	Federated via `WIF_PROVIDER` / `WIF_SERVICE_ACCOUNT` (GitHub vars, not secrets)
Firebase Admin SDK	Token validation + Firestore	Firebase Auth read, Firestore read/write	Initialized in backend startup
Capability	Command	Notes
Install + setup	`pip install -e ".[dev]" && pre-commit install`	Editable install with all dev deps
All quality checks	`make check`	lint → format → typecheck → test → smoke
Lint only	`make lint`	`ruff check src/ tests/`
Format only	`make format`	`ruff format src/ tests/`
Type check	`make typecheck`	`mypy src/`
All tests	`.venv/bin/python -m pytest -v`	System pytest may lack ezdxf
Unit tests	`make test-unit`	~3,618 tests
Integration tests	`make test-integration`	~102 tests
Web API tests	`make test-web`	~418 tests, FastAPI TestClient
E2E tests	`make test-e2e`	~33 tests, real DXF files
Live Gemini tests	`make test-live`	~42 tests, requires ADC
Eval scorecard	`make scorecard`	~238 tests, mock mode
Coverage report	`make test-cov`	Threshold: 65%
Security scan	`make security`	`bandit -r src/ -ll && pip-audit`
Smoke test	`make smoke`	Full pipeline with mock provider
Local backend	`CAD_WEB_DEV_MODE=1 uvicorn web.backend.main:app --port 8322`	Skips Firebase auth
Local frontend	`cd web/frontend && npm run dev`	Vite on :3000
Desktop app	`make run`	Requires `pip install -e ".[gui]"`
Build executable	`make build`	PyInstaller → `dist/cad-dxf-agent/`
Revision CLI	`cad-revision diff master.dxf rev.dxf --output-dir ./out`	Compare two DXFs
Deploy status	`gh run list --workflow=deploy-web.yml`	Latest deploy results
Cloud Run logs	`gcloud run services logs read cad-dxf-web --region us-central1 --project cad-dxf-agent`	Recent request logs
Rollback	See Section 4 rollback protocol	Traffic splitting to previous revision
Variable	Default	Purpose
`CAD_LLM_PROVIDER`	`mock`	`gemini` for prod/dev, `mock` for CI
`CAD_GCP_PROJECT`	(none)	GCP project ID (required for Vertex AI)
`CAD_GCP_LOCATION`	`us-central1`	Vertex AI region
`CAD_GEMINI_MODEL`	`gemini-2.5-flash`	Gemini model for planning
`CAD_VISION_MODEL`	`gemini-2.5-flash`	Gemini model for vision description
`CAD_PROTECTED_LAYERS`	`TITLE,TITLEBLOCK,SEAL,REVISION`	Layers the LLM cannot edit
`CAD_REVISION_NOTES_ENABLED`	`true`	Insert deterministic revision notes
`CAD_REVISION_NOTES_LAYER`	`AI_REV_NOTES`	Layer for revision notes
`CAD_LLM_TEMPERATURE`	`0.0`	Gemini temperature (0 = deterministic)
`CAD_LLM_MAX_OUTPUT_TOKENS`	`4096`	Max response tokens
`CAD_PLANNER_TIMEOUT`	`60`	Planner timeout (seconds)
`CAD_PLANNER_MAX_RETRIES`	`2`	Retry count on planner failure
`CAD_RENDER_DPI`	`150`	PNG render resolution
`CAD_MAX_UNDO_SNAPSHOTS`	`50`	Edit history depth
`CAD_VISION_ENABLED`	`true`	Enable DXF → image → description pipeline
`CAD_ODA_PATH`	(auto)	ODA File Converter path (DWG support)
`CAD_WEB_DEV_MODE`	(unset)	Skip Firebase auth for local dev (`1`)
`CAD_WEB_CORS_ORIGIN`	(unset)	Additional CORS origin
`CAD_ALLOWED_EMAILS`	(unset)	Semicolon-separated emails for auto-provisioning
`CAD_PROXY_URL`	(unset)	Cloud Run proxy for desktop
`CAD_LICENSE_KEY`	(unset)	Proxy authentication key
`OTEL_ENABLED`	(unset)	Enable tracing (`1`, `true`, `yes`)
`OTEL_EXPORTER`	`console`	`console`, `otlp`, or `gcp-trace`
`OTEL_EXPORTER_OTLP_ENDPOINT`	(unset)	OTLP collector URL
Workflow	Trigger	Jobs	Duration
`ci.yml`	Push to main, all PRs	lint, typecheck, test (matrix 3.11+3.12), benchmark (main only), live-test (main only)	~3-5 min
`deploy-web.yml`	Push to main (web/src changes), manual	deploy-backend (Docker → Cloud Run), deploy-frontend (npm → Firebase)	~5-8 min
`security.yml`	Push to main, all PRs	bandit, pip-audit	~2 min
`build-windows.yml`	Tag push (v*), manual	PyInstaller build, Inno Setup installer, upload artifacts	~10 min
`gemini-review.yml`	PRs	AI code review	~2 min
`canary-monitoring.yml`	Scheduled/manual	Production canary checks	~2 min
`publish-pypi.yml`	Manual	PyPI publish	~2 min
`release-dryrun.yml`	Manual	Validate release artifacts	~3 min
Tier	Location	Count	Runner	Notes
Unit	`tests/unit/`	~3,618	`make test-unit`	Fast, mocked, all CI runs
Integration	`tests/integration/`	~102	`make test-integration`	Full pipeline, ScriptedAgentProvider
Web API	`tests/web/`	~418	`make test-web`	FastAPI TestClient
Eval	`tests/eval/`	~238	`make scorecard`	Intent classification scorecard
Live API	`tests/live/`	~42	CI (main only)	Real Gemini via WIF
E2E	`tests/e2e/`	~33	`make test-e2e`	End-to-end with real DXF files
Benchmark	`tests/benchmark/`	~19	CI (main only)	pytest-benchmark, JSON artifacts
GUI	`tests/gui/`	~10	Manual	Requires `QT_QPA_PLATFORM=offscreen`
Property	`tests/property/`	~7	CI	Randomized, bounded runtime
Smoke	`tests/smoke/`	~7	`make smoke`	End-to-end mock pipeline
Total		~4,494