IntentCAD — App Audit (v0.9.0, Beta, 30 epics shipped)

IntentCAD (cad-dxf-agent): Operator-Grade System Analysis

For: DevOps Engineer | Generated: 2026-03-09 | Version: v0.9.0 (30 epics shipped)

1. Executive Summary

Business Purpose

IntentCAD is a Drawing Intelligence Platform for AEC professionals. Users upload architectural drawings (DXF or PDF), describe what they need in plain English — an edit, a compliance check, a quantity takeoff, a health report — and the platform classifies intent on two axes, selects the right processing pipeline, and delivers structured results. Original files are never modified — every save produces a new file.

The system has shipped 30 epics across 9 phases, evolving from a local-first DXF editor into a multi-capability platform with compliance validation, health reports, quantity takeoff, drawing summaries, RFI generation, zone detection, revision comparison, agent mode, and user accounts with persistent workspaces. It ships as both a PySide6 desktop app (Windows/Linux) and a React + FastAPI web app deployed on Google Cloud (Firebase Hosting + Cloud Run). The LLM backend is Gemini via Vertex AI, with a mock provider for CI determinism.

The core architectural invariant: the LLM never touches DXF directly. It returns structured JSON operations (13 op types) which are validated against protected-layer rules before a deterministic edit engine applies them. This design eliminates a class of LLM hallucination risks at the architecture level.
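A minimal sketch of that invariant, assuming a simplified operation shape. The 13 op types and the protected layer names come from this document; `EditOperation`, `parse_operation`, and `validate_changeset` are hypothetical names, and the real schemas are Pydantic models rather than dataclasses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional

# Layer names as listed under CAD_PROTECTED_LAYERS in this document.
PROTECTED_LAYERS = frozenset({"TITLE", "TITLEBLOCK", "SEAL", "REVISION"})


class OpType(str, Enum):
    MOVE = "move"
    EDIT_TEXT = "edit_text"
    DELETE = "delete"
    ADD_BLOCK = "add_block"
    ROTATE = "rotate"
    COPY = "copy"
    SCALE = "scale"
    MIRROR = "mirror"
    ADD_LINE = "add_line"
    ADD_POLYLINE = "add_polyline"
    ADD_CIRCLE = "add_circle"
    ADD_ARC = "add_arc"
    ADD_TEXT = "add_text"


@dataclass(frozen=True)
class EditOperation:
    """Hypothetical, simplified stand-in for one planner operation."""
    op: OpType
    layer: str                     # layer of the entity edited or created
    handle: Optional[str] = None   # DXF handle; None for add_* operations


def parse_operation(raw: Dict[str, object]) -> EditOperation:
    # OpType(...) raises ValueError for an op type outside the 13 known values.
    return EditOperation(op=OpType(raw["op"]), layer=str(raw["layer"]), handle=raw.get("handle"))


def validate_changeset(raw_ops: List[Dict[str, object]]) -> List[EditOperation]:
    """All-or-nothing: one bad operation rejects the whole changeset."""
    ops = [parse_operation(raw) for raw in raw_ops]
    blocked = [op for op in ops if op.layer.upper() in PROTECTED_LAYERS]
    if blocked:
        raise ValueError(f"{len(blocked)} operation(s) target protected layers")
    return ops
```

Because the edit engine would only ever receive the list returned by `validate_changeset`, model output that names an unknown op type or a protected layer fails before any drawing is touched.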

Current risk profile: the system is well-tested (4,494 tests across 10 tiers, 65% coverage threshold) with green CI, automated deploys via WIF, and comprehensive security scanning. Primary operational risks are single-region deployment and the external ODA File Converter dependency for DWG support.

Operational Status Matrix

| Environment | Status | Uptime Target | Release Cadence |
|-------------|--------|---------------|-----------------|
| Production (Web) | Active | Best-effort (Cloud Run default SLA) | Merge-to-main auto-deploy |
| Desktop | Builds available | N/A (local) | Tag-triggered (v* tags) |
| CI | Green (main) | N/A | Every push/PR |
| Staging | None (direct-to-prod) | N/A | N/A |

Technology Stack

| Category | Technology | Version | Purpose |
|----------|------------|---------|---------|
| Language | Python | 3.11 / 3.12 | Core pipeline, backend |
| DXF Engine | ezdxf | >=1.3.0 | DXF read/write/entity manipulation |
| Data Models | Pydantic | >=2.0 | Schema validation, serialization |
| LLM | Gemini (Vertex AI) | gemini-2.5-flash | Edit planning, vision, agent tool-use |
| Backend | FastAPI + Uvicorn | >=0.115.0 | REST API for web frontend |
| Frontend | React 18 + Vite | 18.3.1 / 6.0.5 | SPA interface |
| DXF Viewer | dxf-viewer + Three.js | 1.0.46 / 0.183.2 | WebGL drawing preview |
| Auth | Firebase Authentication | 11.0.0 | Google Sign-In |
| Hosting | Firebase Hosting | | Static SPA delivery |
| Compute | Cloud Run | | Containerized backend (8Gi/4CPU) |
| Storage | GCS + Firestore | | Document persistence, user profiles, tenants |
| Registry | Artifact Registry | | Docker images (us-central1) |
| Tracing | OpenTelemetry → Cloud Trace | >=1.21 | Pipeline span instrumentation |
| Desktop UI | PySide6 | >=6.6 | Qt-based desktop shell |
| CI/CD | GitHub Actions | | Lint, test, deploy (WIF auth) |
| Linting | Ruff | >=0.5 | Lint + format |
| Type Check | Mypy | >=1.10 | Static type analysis |
| Security | Bandit + pip-audit | | SAST + dependency audit |
| Build | Hatchling + PyInstaller | | Package + desktop executable |

2. System Architecture

Architecture Diagram

                         ┌─────────────────────────┐
                         │   Firebase Hosting       │
                         │   (React SPA)            │
                         │   cad-dxf-agent.web.app  │
                         └────────┬────────────────┘
                                  │ /api/* rewrite
                                  ▼
                         ┌─────────────────────────┐
                         │   Cloud Run              │
                         │   cad-dxf-web            │
                         │   FastAPI (8Gi/4CPU)     │
                         │   us-central1            │
                         └────────┬────────────────┘
                                  │
          ┌───────────────────────┼───────────────────────┐
          ▼                       ▼                       ▼
┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│  Pipeline Core   │  │   Vertex AI      │  │  Firebase/GCP    │
│  (ezdxf, 41      │  │   Gemini API     │  │  Auth, Firestore │
│   core modules,  │  │   (WIF auth)     │  │  GCS (documents) │
│   validators,    │  │                  │  │  Cloud Trace     │
│   edit engine)   │  │                  │  │                  │
└──────────────────┘  └──────────────────┘  └──────────────────┘

Pipeline Flow (Two-Axis Classification):
  User Prompt → ObjectiveClassifier (RequestClass × ObjectiveTag)
              → StrategyRegistry (maps to StagePipelineDefinition)
              → StageExecutor (ordered stages: deterministic + LLM)
              → ResponseBuilder (PlatformResponse envelope)

  Edit Pipeline:
    Planner(Gemini) → ChangeSet → Validator → Preview → EditEngine → Save-As DXF
                                                                         │
                                                               RevisionNotes (deterministic)

  Analysis Pipeline (compliance, health, takeoff, summary, RFI, zones):
    Deterministic extractors → structured results (no edit flow)

  Agent Mode (complex requests):
    Prompt + context + tools → Gemini → tool calls → ToolExecutor
      → results fed back → next iteration (max 10 turns)
      → final ChangeSet from accumulated tool calls

Desktop variant:
  PySide6 UI → same pipeline → local file I/O

Failure Domains

| Domain | Impact | Mitigation |
|--------|--------|------------|
| Gemini API down | No edit planning (read/compare/analysis still work) | Mock provider fallback in CI; timeout + retry with backoff |
| Cloud Run cold start | 5-15s latency spike | 8Gi memory, min-instances=0 (cost tradeoff) |
| Firebase Hosting | Frontend unavailable | CDN-backed, rarely fails |
| ODA binary missing | DWG uploads return 422 | DXF and PDF uploads unaffected; size-validated download in CI |
| Firestore down | User profiles/tenant creation fails | Cached in-process (5min/10min TTL); existing sessions unaffected |
| GCS down | Document persistence fails | Session-local copies still work; uploads/downloads degrade |
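The in-process caching mentioned for Firestore can be pictured with a small TTL cache. This is a sketch under assumptions: `TTLCache` and `get_or_load` are hypothetical names, and this document does not say which record type gets the 5-minute TTL and which the 10-minute one.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Minimal in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]            # still fresh: no backend call
        value = loader()             # may raise if the backend is down
        self._entries[key] = (now + self._ttl, value)
        return value


# Assumed usage: a 5-minute cache in front of a profile lookup.
profile_cache = TTLCache(ttl_seconds=300)
```

A cache like this only shields lookups that were already loaded within the TTL window, which matches the table's note that existing sessions are unaffected while new profile and tenant creation fails.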

3. Directory Analysis

Project Structure

cad-dxf-agent/
├── src/cad_dxf_agent/          # Core Python package (124 .py files, 23k LOC)
│   ├── models/                 # 30 Pydantic schemas (cad, ops, config, zone, compliance, etc.)
│   ├── core/                   # 41 modules — DXF I/O, validation, editing, analysis
│   │   └── comparison/         # Revision diff engine (alignment, matching, changelog, bundle)
│   ├── llm/                    # 22 modules — intent classification, planning, agent loop
│   │   └── stage_handlers/     # Stage pipeline handlers (analyze, comply, health, summarize, etc.)
│   ├── cli/                    # cad-revision CLI (diff/align/bundle/explain)
│   ├── ui/                     # PySide6 desktop GUI
│   ├── settings.py             # Env-based configuration (all CAD_* prefixed)
│   ├── otel.py                 # OpenTelemetry bootstrap (off by default)
│   └── app.py                  # Desktop entry point
├── web/
│   ├── backend/                # FastAPI on Cloud Run
│   │   ├── main.py             # ~2850 lines — all API routes (20+ endpoints)
│   │   ├── api_v1.py           # /api/v1 router
│   │   ├── auth.py             # Firebase token validation
│   │   ├── session.py          # SessionManager for ephemeral work
│   │   ├── Dockerfile          # Production image (Python 3.12-slim + ODA)
│   │   └── requirements.txt    # Backend-specific deps
│   ├── frontend/               # React 18 + Vite SPA
│   │   ├── src/pages/          # Upload, Editor, Compare, RevisionWizard, Documents
│   │   ├── src/components/     # Reusable UI components
│   │   └── package.json        # Frontend deps
│   ├── firebase.json           # Hosting config + /api/** → Cloud Run rewrite
│   ├── firestore.rules         # Client reads/writes denied (server-side only)
│   └── .firebaserc             # Project: cad-dxf-agent
├── proxy/                      # Cloud Run proxy for desktop licensing
│   ├── main.py                 # FastAPI, rate-limited Gemini forwarder
│   └── Dockerfile              # Minimal image
├── tests/                      # 224 test modules, 4,494 tests across 10 tiers
│   ├── unit/                   # ~3,618 tests — schemas, validators, reader, writer, engine, etc.
│   ├── integration/            # ~102 tests — full pipeline, agent loop (ScriptedAgentProvider)
│   ├── web/                    # ~418 tests — FastAPI TestClient endpoint tests
│   ├── eval/                   # ~238 tests — intent classification scorecard
│   ├── live/                   # ~42 tests — real Gemini API tests (WIF in CI)
│   ├── e2e/                    # ~33 tests — end-to-end with real DXF files
│   ├── benchmark/              # ~19 tests — pytest-benchmark micro-benchmarks
│   ├── gui/                    # ~10 tests — PySide6 tests (QT_QPA_PLATFORM=offscreen)
│   ├── property/               # ~7 tests — fuzz/property tests (randomized, bounded)
│   ├── smoke/                  # ~7 tests — end-to-end mock pipeline
│   ├── fixtures/               # DXF zoo, revision cases, trajectories, prompt bank
│   └── helpers/                # DXF factory, changeset factory, scripted provider
├── scripts/                    # Build, smoke test, eval runner, fixture downloads
├── 000-docs/                   # 64 architectural/planning documents
├── .github/workflows/          # 8 CI/CD workflows
├── Makefile                    # 67-line task runner
├── pyproject.toml              # Build config, tool settings, dep groups
└── .pre-commit-config.yaml     # Ruff, trailing whitespace, .env block, main protection

Codebase Metrics

Language Files Code Lines Comments Blanks
Python 360 77,795 3,897 16,383
JSON 44 9,726 0 8
JSX 24 4,190 220 320
JavaScript 24 3,429 650 717
CSS 6 1,807 62 301
Other 95 1,730 11,533 3,991
Total 553 98,677 16,362 21,720

4. Operational Reference

Deployment Workflows

Local Development

  1. Prerequisites: Python 3.11+, Node.js 22+, gcloud CLI
  2. Setup:
    # Python backend
    pip install -e ".[dev]"
    pre-commit install
    gcloud auth application-default login  # One-time GCP auth
    
    # Create .env (gitignored)
    echo 'CAD_LLM_PROVIDER=gemini' > .env
    echo 'CAD_GCP_PROJECT=cad-dxf-agent' >> .env
    
    # Frontend
    cd web/frontend && npm ci
  3. Run:
    # Backend on :8322
    CAD_WEB_DEV_MODE=1 uvicorn web.backend.main:app --port 8322
    
    # Frontend on :3000
    cd web/frontend && npm run dev
  4. Verification: make check (lint → format → typecheck → test → smoke)

Production Deployment

Normal path (automated): Merge PR to main touching web/** or src/** → GitHub Actions deploy-web.yml fires → builds Docker image → pushes to Artifact Registry → deploys Cloud Run → deploys Firebase Hosting. No manual steps.

Pre-flight checklist:

  • All CI checks green on PR
  • make check passes locally
  • PR reviewed and approved
  • No secrets in diff

Manual deploy (emergency only):

# ALWAYS specify --project (local gcloud may point elsewhere)
cd web/frontend && npm run build
firebase deploy --only hosting --project cad-dxf-agent

gcloud run deploy cad-dxf-web \
  --source . --dockerfile web/backend/Dockerfile \
  --region us-central1 --project cad-dxf-agent \
  --allow-unauthenticated --memory 8Gi --cpu 4 --timeout 600 \
  --service-account cad-dxf-web-run@cad-dxf-agent.iam.gserviceaccount.com \
  --set-env-vars CAD_LLM_PROVIDER=gemini,CAD_GCP_PROJECT=cad-dxf-agent,OTEL_ENABLED=1,OTEL_EXPORTER=gcp-trace

Do NOT use gcloud builds submit --config cloudbuild.yaml for manual deploys: $SHORT_SHA is only set by build triggers, not by manual submits.

Rollback protocol:

# List recent revisions
gcloud run revisions list --service cad-dxf-web --region us-central1 --project cad-dxf-agent

# Route traffic to previous revision
gcloud run services update-traffic cad-dxf-web \
  --to-revisions=PREVIOUS_REVISION=100 \
  --region us-central1 --project cad-dxf-agent

Monitoring & Alerting

  • Cloud Trace: All pipeline stages emit OTel spans (cad.load_dxf, cad.run_planner, cad.validate, cad.build_context, etc.). Enabled via OTEL_ENABLED=1 + OTEL_EXPORTER=gcp-trace.
  • Cloud Run Logs: gcloud run services logs read cad-dxf-web --region us-central1 --project cad-dxf-agent
  • CI Status: gh run list --workflow=ci.yml and gh run list --workflow=deploy-web.yml
  • SLIs: No formal SLOs defined yet. Cloud Run provides built-in request latency, error rate, and instance count metrics.
  • Dashboards: GCP Console → Cloud Run → cad-dxf-web service page (built-in metrics)
  • On-call: No rotation — single-developer project.

Incident Response

| Severity | Definition | Response | Playbook |
|----------|------------|----------|----------|
| P0 | Web app completely down | Immediate: check Cloud Run status, rollback if deploy broke it | gcloud run revisions list → route traffic to last-good |
| P1 | Gemini API failures (edit planning broken) | 15 min: check Vertex AI status page, verify ADC credentials | Read-only features still work; users see "planning unavailable" |
| P2 | ODA converter missing (DWG uploads 422) | Next business day: rebuild with ODA .deb from GCS | DXF and PDF uploads unaffected |
| P3 | Test failures on main | Same day: fix or revert the breaking commit | gh run list --workflow=ci.yml → investigate |

5. Security & Access

IAM

| Role | Purpose | Permissions | Where |
|------|---------|-------------|-------|
| cad-dxf-web-run SA | Cloud Run runtime | Vertex AI API, Cloud Trace, GCS, Firestore | GCP IAM |
| WIF (GitHub Actions) | CI/CD deploy | Cloud Run deploy, Artifact Registry push, Firebase deploy, GCS read | Federated via WIF_PROVIDER / WIF_SERVICE_ACCOUNT (GitHub vars, not secrets) |
| Firebase Admin SDK | Token validation + Firestore | Firebase Auth read, Firestore read/write | Initialized in backend startup |

Secrets Management

  • No stored secrets: WIF provides tokenless authentication from GitHub Actions to GCP. No API keys, service account JSON files, or secrets in GitHub.
  • Firebase API keys: Public-safe client config (hardcoded in deploy-web.yml). These are designed to be public per Firebase documentation.
  • Local dev: gcloud auth application-default login provides ADC credentials. .env file is gitignored.
  • Break-glass: If WIF breaks, manual deploy uses developer's own gcloud auth credentials with --project cad-dxf-agent.

Pre-commit Security Gates

  • detect-private-key: Blocks commits containing private keys
  • forbid-env-files: Blocks .env file commits
  • no-commit-to-branch: Prevents direct commits to main
  • check-added-large-files: Blocks files >1MB (catches accidental binary commits)
  • bandit: Python SAST on every CI run
  • pip-audit: Dependency vulnerability scan on every CI run

6. Cost & Performance

Monthly Costs (estimated)

  • Cloud Run: ~$5-20/mo (low traffic, scale-to-zero, 8Gi/4CPU per instance)
  • Vertex AI (Gemini): ~$10-50/mo (depends on request volume; gemini-2.5-flash pricing)
  • Firebase Hosting: Free tier (SPA CDN)
  • Firebase Auth: Free tier (<50k MAU)
  • Firestore: Free tier (user profiles, tenants, allowlist)
  • GCS: ~$1-5/mo (document storage)
  • Artifact Registry: ~$1/mo (container storage)
  • Cloud Trace: Free tier (first 5M spans/mo)
  • Total: ~$20-80/mo at current usage
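As a quick cross-check of the total, the itemized ranges above can be summed directly. Free-tier items are counted as $0, which is an assumption about how the total was derived.

```python
# (low, high) monthly estimates in USD, copied from the list above.
ESTIMATES = {
    "Cloud Run": (5, 20),
    "Vertex AI (Gemini)": (10, 50),
    "Firebase Hosting": (0, 0),
    "Firebase Auth": (0, 0),
    "Firestore": (0, 0),
    "GCS": (1, 5),
    "Artifact Registry": (1, 1),
    "Cloud Trace": (0, 0),
}

low = sum(lo for lo, _ in ESTIMATES.values())
high = sum(hi for _, hi in ESTIMATES.values())
print(f"${low}-{high}/mo")  # prints "$17-76/mo"
```

The itemized sum of $17-76 is consistent with the rounded ~$20-80/mo figure, and Vertex AI accounts for most of the spread.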

Performance Baseline

  • DXF load: <100ms for 200-entity drawings, ~500ms for 1000-entity (benchmarked in tests/benchmark/)
  • Gemini planning: 2-8s per edit prompt (network + inference)
  • Agent mode: 5-30s (multi-turn, up to 10 iterations)
  • Validation: <1ms per operation
  • Edit engine: <10ms per changeset application
  • Cloud Run cold start: 5-15s (Python image + ODA libraries)
  • Web API P95: ~3-10s end-to-end (dominated by Gemini latency)

7. Current State Assessment

What's Working

  • Comprehensive CI: Lint (ruff), format, typecheck (mypy), 4,494 tests across 10 tiers, security scans — all automated on push/PR
  • Automated deploys: Merge to main → GitHub Actions deploys both frontend + backend via WIF. Zero manual steps.
  • Multi-capability platform: Edit, compliance, health, takeoff, summary, RFI, zone detection, revision comparison, agent mode — all production-ready
  • Two-axis intent classification: Every prompt classified by RequestClass (what) × ObjectiveTag (why), routed to the right pipeline
  • User accounts: Firebase Auth (Google Sign-In), Firestore tenants/profiles, GCS document persistence with work progress
  • Safety architecture: LLM never touches DXF directly; protected layers enforced at validator + ToolExecutor; deterministic revision notes; save-as workflow
  • Agent mode: Iterative tool-use loop (20+ tools, max 10 turns) for complex multi-step requests
  • Modern tooling: 30 Pydantic schemas, Ruff linting, syrupy snapshots, pytest-benchmark, OpenTelemetry tracing
  • Strong documentation: 64 docs covering architecture decisions, specs, audit reports, and epic AARs
  • WIF authentication: No secrets stored anywhere — tokenless GCP access from CI

Areas Needing Attention

  • No staging environment: Production deploys go direct-to-prod. A staging Cloud Run service would catch deploy issues before users see them.
  • Single-region: Cloud Run only in us-central1. No multi-region failover.
  • No CODEOWNERS: No automated review assignment for critical paths.
  • Proxy service: Not deployed via CI — ad-hoc manual deploys. No monitoring.
  • ODA dependency: External binary downloaded from GCS. If the bucket or file is lost, DWG support breaks.
  • Desktop build: Windows-only PyInstaller builds. Linux desktop builds not automated.
  • No dependency pinning: pyproject.toml uses >= ranges. No lock file for reproducible builds.

8. Quick Reference

Command Map

| Capability | Command | Notes |
|------------|---------|-------|
| Install + setup | pip install -e ".[dev]" && pre-commit install | Editable install with all dev deps |
| All quality checks | make check | lint → format → typecheck → test → smoke |
| Lint only | make lint | ruff check src/ tests/ |
| Format only | make format | ruff format src/ tests/ |
| Type check | make typecheck | mypy src/ |
| All tests | .venv/bin/python -m pytest -v | System pytest may lack ezdxf |
| Unit tests | make test-unit | ~3,618 tests |
| Integration tests | make test-integration | ~102 tests |
| Web API tests | make test-web | ~418 tests, FastAPI TestClient |
| E2E tests | make test-e2e | ~33 tests, real DXF files |
| Live Gemini tests | make test-live | ~42 tests, requires ADC |
| Eval scorecard | make scorecard | ~238 tests, mock mode |
| Coverage report | make test-cov | Threshold: 65% |
| Security scan | make security | bandit -r src/ -ll && pip-audit |
| Smoke test | make smoke | Full pipeline with mock provider |
| Local backend | CAD_WEB_DEV_MODE=1 uvicorn web.backend.main:app --port 8322 | Skips Firebase auth |
| Local frontend | cd web/frontend && npm run dev | Vite on :3000 |
| Desktop app | make run | Requires pip install -e ".[gui]" |
| Build executable | make build | PyInstaller → dist/cad-dxf-agent/ |
| Revision CLI | cad-revision diff master.dxf rev.dxf --output-dir ./out | Compare two DXFs |
| Deploy status | gh run list --workflow=deploy-web.yml | Latest deploy results |
| Cloud Run logs | gcloud run services logs read cad-dxf-web --region us-central1 --project cad-dxf-agent | Recent request logs |
| Rollback | See Section 4 rollback protocol | Traffic splitting to previous revision |

Critical URLs

First-Week Checklist

  • GCP access granted (gcloud auth login with project cad-dxf-agent)
  • GitHub repo access (push to branches, not main)
  • gcloud auth application-default login for local Gemini access
  • pip install -e ".[dev]" + pre-commit install
  • make check passes locally (all green)
  • Understood two-axis classification: RequestClass × ObjectiveTag → pipeline
  • Run make smoke to see full pipeline execute with mock provider
  • Reviewed CLAUDE.md (project conventions, commit format, PR template)
  • Read 000-docs/000-INDEX.md for doc inventory
  • Completed a local web dev session (upload DXF → prompt → preview → apply)
  • Reviewed deploy-web.yml to understand the auto-deploy pipeline
  • Understood WIF authentication (no secrets — vars in GitHub repo settings)

Appendices

A. Environment Variables Reference

| Variable | Default | Purpose |
|----------|---------|---------|
| CAD_LLM_PROVIDER | mock | gemini for prod/dev, mock for CI |
| CAD_GCP_PROJECT | (none) | GCP project ID (required for Vertex AI) |
| CAD_GCP_LOCATION | us-central1 | Vertex AI region |
| CAD_GEMINI_MODEL | gemini-2.5-flash | Gemini model for planning |
| CAD_VISION_MODEL | gemini-2.5-flash | Gemini model for vision description |
| CAD_PROTECTED_LAYERS | TITLE,TITLEBLOCK,SEAL,REVISION | Layers the LLM cannot edit |
| CAD_REVISION_NOTES_ENABLED | true | Insert deterministic revision notes |
| CAD_REVISION_NOTES_LAYER | AI_REV_NOTES | Layer for revision notes |
| CAD_LLM_TEMPERATURE | 0.0 | Gemini temperature (0 = deterministic) |
| CAD_LLM_MAX_OUTPUT_TOKENS | 4096 | Max response tokens |
| CAD_PLANNER_TIMEOUT | 60 | Planner timeout (seconds) |
| CAD_PLANNER_MAX_RETRIES | 2 | Retry count on planner failure |
| CAD_RENDER_DPI | 150 | PNG render resolution |
| CAD_MAX_UNDO_SNAPSHOTS | 50 | Edit history depth |
| CAD_VISION_ENABLED | true | Enable DXF → image → description pipeline |
| CAD_ODA_PATH | (auto) | ODA File Converter path (DWG support) |
| CAD_WEB_DEV_MODE | (unset) | Skip Firebase auth for local dev (1) |
| CAD_WEB_CORS_ORIGIN | (unset) | Additional CORS origin |
| CAD_ALLOWED_EMAILS | (unset) | Semicolon-separated emails for auto-provisioning |
| CAD_PROXY_URL | (unset) | Cloud Run proxy for desktop |
| CAD_LICENSE_KEY | (unset) | Proxy authentication key |
| OTEL_ENABLED | (unset) | Enable tracing (1, true, yes) |
| OTEL_EXPORTER | console | console, otlp, or gcp-trace |
| OTEL_EXPORTER_OTLP_ENDPOINT | (unset) | OTLP collector URL |
B. CI/CD Workflows

| Workflow | Trigger | Jobs | Duration |
|----------|---------|------|----------|
| ci.yml | Push to main, all PRs | lint, typecheck, test (matrix 3.11+3.12), benchmark (main only), live-test (main only) | ~3-5 min |
| deploy-web.yml | Push to main (web/src changes), manual | deploy-backend (Docker → Cloud Run), deploy-frontend (npm → Firebase) | ~5-8 min |
| security.yml | Push to main, all PRs | bandit, pip-audit | ~2 min |
| build-windows.yml | Tag push (v*), manual | PyInstaller build, Inno Setup installer, upload artifacts | ~10 min |
| gemini-review.yml | PRs | AI code review | ~2 min |
| canary-monitoring.yml | Scheduled/manual | Production canary checks | ~2 min |
| publish-pypi.yml | Manual | PyPI publish | ~2 min |
| release-dryrun.yml | Manual | Validate release artifacts | ~3 min |
C. Test Tiers

| Tier | Location | Count | Runner | Notes |
|------|----------|-------|--------|-------|
| Unit | tests/unit/ | ~3,618 | make test-unit | Fast, mocked, all CI runs |
| Integration | tests/integration/ | ~102 | make test-integration | Full pipeline, ScriptedAgentProvider |
| Web API | tests/web/ | ~418 | make test-web | FastAPI TestClient |
| Eval | tests/eval/ | ~238 | make scorecard | Intent classification scorecard |
| Live API | tests/live/ | ~42 | CI (main only) | Real Gemini via WIF |
| E2E | tests/e2e/ | ~33 | make test-e2e | End-to-end with real DXF files |
| Benchmark | tests/benchmark/ | ~19 | CI (main only) | pytest-benchmark, JSON artifacts |
| GUI | tests/gui/ | ~10 | Manual | Requires QT_QPA_PLATFORM=offscreen |
| Property | tests/property/ | ~7 | CI | Randomized, bounded runtime |
| Smoke | tests/smoke/ | ~7 | make smoke | End-to-end mock pipeline |
| Total | | ~4,494 | | |

D. Glossary

Term Meaning
DrawingContext Normalized Pydantic model of a loaded DXF (entities, layers, blocks, metadata)
EntityRef Single DXF entity reference (handle, type, layer, position, text, block)
ChangeSet Batch of EditOperations from a single user prompt
OpType Edit operation type enum (13 values: move, edit_text, delete, add_block, rotate, copy, scale, mirror, add_line, add_polyline, add_circle, add_arc, add_text)
RequestClass Classification axis 1 — what: edit, analyze, compare, query, generate
ObjectiveTag Classification axis 2 — why: compliance, coordination, documentation, estimation, quality, general
StagePipelineDefinition Ordered list of StageHandlers selected by StrategyRegistry for a (RequestClass, ObjectiveTag) pair
PlatformResponse Response envelope with TaskFamily, ResponseType, RiskLevel, AuditMetadata
AgentProvider Iterative tool-use loop (max 10 turns) for complex multi-step requests
ToolExecutor Dispatches 20+ query and edit tools with protected-layer enforcement
Protected layer Layer that cannot be edited (TITLE, TITLEBLOCK, SEAL, REVISION)
TaskFamily Intent category (QNA, EDIT_PLAN, COMPARE, SUMMARY, COMPLIANCE, HEALTH, TAKEOFF, RFI, etc.)
WIF Workload Identity Federation — GCP's secretless auth for CI/CD
ADC Application Default Credentials — local GCP auth via gcloud auth application-default login
ODA Open Design Alliance File Converter — DWG → DXF conversion tool
Save-as Architectural invariant: original files are never modified; edits produce new files

E. Troubleshooting Playbooks

Tests fail with ModuleNotFoundError: No module named 'ezdxf': System pytest doesn't have project deps. Use .venv/bin/python -m pytest -v instead of bare pytest.

Cloud Run deploy fails:

  1. Check gh run list --workflow=deploy-web.yml for the failing step
  2. Verify WIF vars are set: gh variable list (should show WIF_PROVIDER, WIF_SERVICE_ACCOUNT, GCP_PROJECT_ID)
  3. Check Artifact Registry permissions: the WIF service account needs roles/artifactregistry.writer

ODA .deb download fails in CI:

  1. Check GCS bucket: gsutil ls gs://cad-dxf-agent-deps/oda/
  2. If missing, DWG support is unavailable but DXF/PDF uploads work fine
  3. Size validation catches corrupt downloads (<1MB = skip install)
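The size check in step 3 amounts to a guard like the one below. This is a sketch: `oda_package_usable` is a hypothetical name, and reading "<1MB" as 1,000,000 bytes is an assumption, since the exact threshold used in CI is not stated here.

```python
from pathlib import Path

MIN_DEB_BYTES = 1_000_000  # assumed reading of "<1MB = skip install"


def oda_package_usable(deb_path: Path, min_bytes: int = MIN_DEB_BYTES) -> bool:
    """Return True only if the downloaded file exists and is at least min_bytes."""
    try:
        return deb_path.is_file() and deb_path.stat().st_size >= min_bytes
    except OSError:
        # Unreadable path: treat as a failed download rather than crash the build.
        return False
```

A truncated download or an error page saved under the .deb name is far smaller than the real package, so a plain size floor catches both without needing a checksum.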

User login fails:

  1. Check Firebase Auth console for the user's email
  2. Verify CAD_ALLOWED_EMAILS env var or Firestore allowlist collection includes the email
  3. Check Cloud Run logs for auth validation errors

Document persistence fails:

  1. Check GCS bucket access: gsutil ls gs://cad-dxf-agent-documents/
  2. Verify Cloud Run SA has roles/storage.objectAdmin on the bucket
  3. Check Firestore for tenant/user records

IntentCAD App Audit

Version: v0.9.0 | Date: 2026-03-09 | Status: Beta (30 epics shipped)


What It Does

IntentCAD is a Drawing Intelligence Platform for AEC professionals. Upload a DXF drawing, describe what you need in plain English — an edit, a compliance check, a quantity takeoff, a health report — and the platform classifies your intent, selects the right processing pipeline, and delivers structured results. The original file is never modified.

Supported inputs: DXF (native), DWG/PDF (via conversion pipeline).

Architecture

IntentCAD uses a two-axis intent classification system with composable stage pipelines:

Prompt → ObjectiveClassifier (RequestClass × ObjectiveTag)
       → StrategyRegistry (maps to StagePipelineDefinition)
       → StageExecutor (runs ordered stages: deterministic + LLM)
       → ResponseBuilder (PlatformResponse envelope)

For edit requests, the stage pipeline includes:

Planner → ChangeSet → Validator → Preview → EditEngine → Save-As DXF + RevisionNotes

For analysis requests (compliance, health, takeoff, summary, RFI, zones), the pipeline runs deterministic extractors without the edit flow.

For complex requests, an Agent Mode runs an iterative tool-use loop (up to 10 turns) with 20+ query and edit tools.
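The shape of such a bounded loop can be sketched as follows. Only the 10-turn cap comes from this document; `ModelTurn`, `run_agent`, and the `next_turn` callback (standing in for a Gemini call) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

MAX_TURNS = 10  # turn cap stated in this document


@dataclass
class ModelTurn:
    """One model response: tool calls to run, or an empty list meaning done."""
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


def run_agent(
    next_turn: Callable[[List[Dict[str, Any]]], ModelTurn],
    tools: Dict[str, Callable[..., Any]],
    max_turns: int = MAX_TURNS,
) -> List[Dict[str, Any]]:
    """Run the loop and return every tool call made, in order, with its result."""
    history: List[Dict[str, Any]] = []
    for _ in range(max_turns):
        turn = next_turn(history)      # the model sees prior calls and results
        if not turn.tool_calls:
            break                      # the model signalled it is finished
        for call in turn.tool_calls:
            result = tools[call["name"]](**call.get("args", {}))
            history.append({**call, "result": result})
    return history
```

The hard cap means a model that never stops requesting tools still terminates, and the accumulated history is what a final ChangeSet would be assembled from.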

The LLM returns structured JSON operations — never raw DXF. Every operation is validated against safety rules before anything touches the drawing. If validation fails, the entire changeset is rejected.

Tech Stack

Layer Technology
Backend Python 3.11+, FastAPI, ezdxf
Frontend React + Vite (JavaScript/JSX)
Auth Firebase Authentication (Google Sign-In)
Hosting Firebase Hosting (frontend), Cloud Run (backend)
LLM Vertex AI — Gemini (tool-use with vision)
Storage GCS (documents), Firestore (user profiles, tenants)
Observability OpenTelemetry (console, OTLP, GCP Cloud Trace)
CI/CD GitHub Actions (auto-deploy via WIF), pre-commit hooks, ruff, mypy

Current Metrics

Metric Value
Epics completed 30
API endpoints 20+
Task families 11 (8 enabled by default)
Automated tests 4,494
Test tiers 10 (unit, integration, web, eval, live, e2e, benchmark, gui, property, smoke)
Coverage threshold 65%
Entity types supported 7 (LINE, LWPOLYLINE, TEXT, MTEXT, INSERT, CIRCLE, ARC)
Edit operation types 13
Pydantic schemas 30
Core modules 41
LLM modules 22

Capabilities

Capability Description
Edit Move, rotate, copy, scale, mirror, delete entities; add lines, polylines, circles, arcs, text, blocks
Compliance ADA/IBC/custom rule validation with findings and remediation guidance
Health Report Drawing quality metrics — layer hygiene, entity stats, potential issues
Quantity Takeoff Automated extraction of counts, lengths, areas from drawing entities
Summary Plain-English structured narrative of drawing contents
RFI Generation Automated Request For Information based on detected ambiguities
Zone Detection Closed-loop room/area detection with area calculation
Revision Comparison Diff two DXF versions, review changes, apply approved edits
Agent Mode Iterative multi-turn tool-use loop for complex requests (max 10 turns)
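For zone detection, the area of a closed loop with straight segments can be computed with the shoelace formula. This is an illustrative sketch, not a description of the shipped algorithm: `polygon_area` is a hypothetical helper, and it ignores LWPOLYLINE bulge values, so arc segments would need separate handling.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Area of a simple closed polygon via the shoelace formula.

    Vertices are taken in order; the last is implicitly joined to the first.
    The result is in drawing units squared, regardless of winding direction.
    """
    if len(vertices) < 3:
        return 0.0
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, list(vertices[1:]) + [vertices[0]]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


room = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(polygon_area(room))  # prints 12.0
```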

Two-Axis Intent Classification

Every prompt is classified on two independent axes:

  1. RequestClass: what the user wants done (edit, analyze, compare, query, generate)
  2. ObjectiveTag: why they want it (compliance, coordination, documentation, estimation, quality, general)

The StrategyRegistry maps each (RequestClass, ObjectiveTag) pair to a StagePipelineDefinition.
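One way to picture that mapping is a dictionary keyed by the pair. The enum values below come from this document; the stage names, the three registry entries, and the fallback to the GENERAL objective are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, Tuple


class RequestClass(str, Enum):
    EDIT = "edit"
    ANALYZE = "analyze"
    COMPARE = "compare"
    QUERY = "query"
    GENERATE = "generate"


class ObjectiveTag(str, Enum):
    COMPLIANCE = "compliance"
    COORDINATION = "coordination"
    DOCUMENTATION = "documentation"
    ESTIMATION = "estimation"
    QUALITY = "quality"
    GENERAL = "general"


# Ordered stage names; a stand-in for StagePipelineDefinition.
Pipeline = Tuple[str, ...]

# Illustrative entries only; the real mapping is not given in this document.
REGISTRY: Dict[Tuple[RequestClass, ObjectiveTag], Pipeline] = {
    (RequestClass.EDIT, ObjectiveTag.GENERAL): ("plan", "validate", "preview", "apply", "save_as"),
    (RequestClass.ANALYZE, ObjectiveTag.COMPLIANCE): ("extract", "comply"),
    (RequestClass.ANALYZE, ObjectiveTag.ESTIMATION): ("extract", "takeoff"),
}


def select_pipeline(request: RequestClass, objective: ObjectiveTag) -> Pipeline:
    """Try the exact pair, then the request class's GENERAL entry, else fail loudly."""
    for key in ((request, objective), (request, ObjectiveTag.GENERAL)):
        if key in REGISTRY:
            return REGISTRY[key]
    raise LookupError(f"no pipeline for {request.value} x {objective.value}")
```

Keeping the two axes independent means a new objective can reuse an existing request class's stages by adding a single registry entry.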

Security Model

  • Firebase Auth with server-side token validation and allowlist gating
  • Protected layers — TITLE, TITLEBLOCK, SEAL, REVISION are immutable; enforced at both validator and ToolExecutor levels
  • LLM isolation — the model never reads or writes DXF directly; it produces structured operations that are validated before execution
  • Save-as workflow — original files are never modified
  • Revision notes are generated deterministically from operation metadata, never from freeform LLM output
  • Tenant isolation — GCS paths scoped to documents/{tenant_id}/{user_id}/{doc_id}/
  • Session isolation — uploads scoped to per-session temp directories with 2-hour expiry
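Tenant isolation by path only holds if the IDs cannot contain path separators. A minimal sketch of a prefix builder, assuming a conservative ID character set; `document_prefix` and the allowed-character pattern are hypothetical, and only the path layout comes from this document.

```python
import re

# Assumed ID format: letters, digits, underscore, hyphen; 1 to 128 characters.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]{1,128}")


def document_prefix(tenant_id: str, user_id: str, doc_id: str) -> str:
    """Build the GCS object prefix documents/{tenant_id}/{user_id}/{doc_id}/.

    Rejecting anything outside the safe character set keeps values such as
    '..' or 'a/b' from escaping the tenant's prefix.
    """
    for name, value in (("tenant_id", tenant_id), ("user_id", user_id), ("doc_id", doc_id)):
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"invalid {name}: {value!r}")
    return f"documents/{tenant_id}/{user_id}/{doc_id}/"
```

Validation uses `fullmatch` rather than `match` with `$`, because `$` also matches just before a trailing newline.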

Limitations (Beta)

These are real constraints, not roadmap items dressed up as features:

  • 7 entity types — LINE, LWPOLYLINE, TEXT, MTEXT, INSERT, CIRCLE, ARC. Complex entities (SPLINE, HATCH, DIMENSION, SOLID, etc.) are skipped during load. This covers common structural/architectural drawings but not all DXF content.
  • LLM planning quality varies — simple operations (move, delete, edit text) are reliable. Complex multi-step edits on dense drawings can produce incorrect targeting.
  • Cloud-dependent for web — requires backend API for planning and execution. Desktop app runs locally.
  • Text extraction has a trust hierarchy — OCR-derived text is treated as lower confidence than native DXF text.

Development Philosophy

  • Truth over momentum — metrics are real, not aspirational. If something doesn't work, we say so.
  • Deterministic where possible — the LLM proposes, deterministic code disposes. Validation, editing, revision notes, and file I/O are all non-stochastic.
  • Structured outputs over freeform generation — the LLM returns typed operations via tool-use, not prose that gets parsed with regex.
  • Test-driven — 4,494 automated tests across 10 tiers, capability scorecard, golden trajectory regression suite.
  • Pipeline with agent escape hatch — explicit stages with clear contracts between them. Agent mode available for complex multi-step requests that need iterative tool use.
name: appaudit
description: Operator-grade system analysis for DevOps onboarding. Use when auditing a new codebase or creating operations playbooks.
allowed-tools: Read,Bash,Glob,Grep,Write
model: claude-opus-4-5-20251101

Universal Operator-Grade System Analysis

Produce the definitive operational guide for incoming DevOps engineers. Lead with an operator-first mindset: practical, privacy-conscious, and optimized for hands-on maintainers.

Core Objective

Deliver a complete, verifiable analysis that equips DevOps engineers to:

  • Understand architecture, data flows, and business value end-to-end
  • Deploy, monitor, and troubleshoot services without supervision
  • Prioritize improvements using quantified impact and risk
  • Communicate system status credibly to stakeholders

Instructions

Phase 1: Initial Survey (30 min)

  1. Read README.md, CHANGELOG.md, project root files
  2. Check for CLAUDE.md, AGENTS.md, operational docs
  3. Identify package manifests (package.json, requirements.txt, go.mod)
  4. Scan infrastructure directories (terraform/, infrastructure/, k8s/)
  5. Find CI/CD configs (.github/workflows/, .gitlab-ci.yml)
  6. Review configuration files (.env.example, docker-compose.yml)
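
The Phase 1 checklist lends itself to a quick existence scan before any reading starts. The sketch below is one possible way to do that, not part of the skill itself. `SURVEY_TARGETS`, `survey`, and `gaps` are hypothetical names, and the path list simply mirrors the examples in the steps above.

```python
from pathlib import Path

# Paths taken from the Phase 1 checklist; extend per project as needed.
SURVEY_TARGETS = {
    "docs": ["README.md", "CHANGELOG.md", "CLAUDE.md", "AGENTS.md"],
    "manifests": ["package.json", "requirements.txt", "go.mod"],
    "infrastructure": ["terraform", "infrastructure", "k8s"],
    "ci": [".github/workflows", ".gitlab-ci.yml"],
    "config": [".env.example", "docker-compose.yml"],
}


def survey(repo_root):
    """Map each category to the checklist paths that exist under repo_root."""
    root = Path(repo_root)
    return {
        category: [p for p in paths if (root / p).exists()]
        for category, paths in SURVEY_TARGETS.items()
    }


def gaps(found):
    """Return categories with no matching path, sorted by name."""
    return sorted(category for category, hits in found.items() if not hits)
```

Calling `gaps(survey("."))` from the repository root lists the categories that came up empty, which are the ones to record as gaps in the assessment.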

Phase 2: Deep Dive (60 min)

  1. Analyze source code structure and architectural patterns
  2. Review test coverage and quality metrics
  3. Examine deployment workflows and environments
  4. Study monitoring, logging, and alerting setup
  5. Assess security controls and compliance posture
  6. Document dependencies and third-party integrations

Phase 3: Synthesis (30 min)

  1. Identify gaps, risks, and immediate priorities
  2. Create actionable recommendations with timelines
  3. Generate comprehensive operations playbook

Document Numbering

Before creating output:

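# Highest existing three-digit prefix in 000-docs/ (empty if the directory is missing or holds no numbered docs)
LAST_NUM=$(ls -1 000-docs/ 2>/dev/null | grep -E "^[0-9]{3}-" | tail -1 | cut -d'-' -f1)
# 10# forces base 10 so prefixes such as 008 or 009 are not rejected as invalid octal; default to 0 when nothing was found
NEXT_NUM=$(printf "%03d" $((10#${LAST_NUM:-0} + 1)))
# Output: 000-docs/${NEXT_NUM}-AA-AUDT-appaudit-devops-playbook.md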

Output Template

Create 000-docs/NNN-AA-AUDT-appaudit-devops-playbook.md:

# [PROJECT_NAME]: Operator-Grade System Analysis
*For: DevOps Engineer*
*Generated: [Date]*
*Version: [git tag/commit]*

## 1. Executive Summary
### Business Purpose
[3-4 paragraphs: capabilities, status, tech foundation, risks]

### Operational Status Matrix
| Environment | Status | Uptime Target | Release Cadence |
|-------------|--------|---------------|-----------------|
| Production  |        |               |                 |
| Staging     |        |               |                 |

### Technology Stack
| Category | Technology | Version | Purpose |
|----------|------------|---------|---------|

## 2. System Architecture
### Technology Stack (Detailed)
| Layer | Technology | Version | Purpose | Owner |
|-------|------------|---------|---------|-------|
| Frontend | | | | |
| Backend | | | | |
| Database | | | | |
| Infrastructure | | | | |

### Architecture Diagram
[ASCII diagram showing services, data flows, failure domains]

## 3. Directory Analysis
### Project Structure
[Repository layout with purpose annotations]

### Key Directories
- **src/**: [patterns, entry points, integrations]
- **tests/**: [framework, coverage %, gaps]
- **infrastructure/**: [IaC, networking, secrets]

## 4. Operational Reference
### Deployment Workflows
#### Local Development
1. Prerequisites: [tools, versions]
2. Setup: [commands]
3. Verification: [smoke tests]

#### Production Deployment
- Pre-flight checklist
- Execution steps
- Rollback protocol

### Monitoring & Alerting
- Dashboards: [URLs]
- SLIs/SLOs: [targets]
- On-call: [rotation, escalation]

### Incident Response
| Severity | Definition | Response Time | Playbook |
|----------|------------|---------------|----------|
| P0 | System outage | Immediate | |
| P1 | Critical degradation | 15 min | |
| P2 | Partial impact | 1 hour | |

## 5. Security & Access
### IAM
| Role | Purpose | Permissions | MFA |
|------|---------|-------------|-----|

### Secrets Management
- Storage: [mechanism]
- Rotation: [policy]
- Break-glass: [procedure]

## 6. Cost & Performance
### Monthly Costs
- Compute: $X
- Storage: $X
- Databases: $X
- Total: $X

### Performance Baseline
- Latency: P50/P95/P99
- Throughput: [req/sec]
- Error budget: [%]

## 7. Current State Assessment
### What's Working
[List with evidence]

### Areas Needing Attention
⚠️ [Tech debt, gaps, risks]

### Immediate Priorities
1. **[High]** [Issue] • Impact: [X] • Owner: [Y]
2. **[Medium]** [Issue] • Impact: [X] • Owner: [Y]

## 8. Quick Reference
### Command Map
| Capability | Command | Notes |
|------------|---------|-------|
| Local env | | |
| Run tests | | |
| Deploy staging | | |
| Deploy prod | | |
| View logs | | |
| Rollback | | |

### Critical URLs
- Production: [URL]
- Staging: [URL]
- Monitoring: [URL]
- CI/CD: [URL]

### First-Week Checklist
- [ ] Access granted (repos, cloud, secrets)
- [ ] Local environment working
- [ ] Completed staging deploy
- [ ] Reviewed runbooks
- [ ] Understood on-call rotation

## 9. Recommendations Roadmap
### Week 1 – Stabilization
[Goals with measurable outcomes]

### Month 1 – Foundation
[Goals with measurable outcomes]

### Quarter 1 – Strategic
[Goals with measurable outcomes]

## Appendices
- Glossary
- Reference Links
- Troubleshooting Playbooks
- Open Questions

Examples

Input: /appaudit
Output: Creates 000-docs/NNN-AA-AUDT-appaudit-devops-playbook.md with full analysis.

Input: Running on a Python FastAPI project
Output:

Document Created: 000-docs/042-AA-AUDT-appaudit-devops-playbook.md

Critical Findings:
1. [HIGH] No database backup automation
2. [MEDIUM] Test coverage at 34%
3. [LOW] Outdated dependencies

System Health Score: 67/100
- Architecture: 8/10
- Operations: 6/10
- Security: 7/10
- Documentation: 5/10

Immediate Actions:
1. Implement automated backups → Owner: DevOps
2. Add CI test coverage gate → Owner: Platform

Error Handling

If 000-docs/ doesn't exist:

mkdir -p 000-docs

If no package manifests found:

  • Note as gap in assessment
  • Check for alternative build systems (Makefile, scripts/)

If infrastructure directory missing:

  • Document as "Infrastructure as Code: Not implemented"
  • Add to recommendations

Writing Guidelines

Tone

  • Direct: Trust reader's expertise
  • Honest: Call out gaps and unknowns
  • Specific: Include paths, configs, metrics
  • Actionable: Pair findings with next steps

Avoid

  • Generic best practices without context
  • Tutorial-style explanations
  • Vague risk statements
  • Unverified assumptions
  • Invented commands or paths

Quality Standards

  • Validate all paths/commands against actual codebase
  • Note sources for all claims
  • Structure for easy updates
  • Target: 10,000-20,000 words
  • Success: DevOps engineer operates independently after reading