Assessing the quality/velocity of work on a repository
# Engineering Output & Quality Audit

You are auditing a codebase to answer: **What was actually built, how complex was it really, how long did it take, and how stable is it?**

Ignore commit counts: they measure activity, not output. Focus on deliverables, covering the last 90 days.
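The git commands below use a `YYYY-MM-DD` placeholder for the start of the window; one way to pin it to the last 90 days is the sketch below (assumes GNU `date`; the macOS variant is noted in the comment):

```bash
# Start of the 90-day audit window (GNU date; on macOS use: date -v-90d +%Y-%m-%d)
SINCE=$(date -d "90 days ago" +%Y-%m-%d)
echo "Auditing work since $SINCE"
```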
## PHASE 1: Identify What Was Actually Built

### 1.1 Discover Distinct Deliverables

```bash
# Find feature areas by looking at what directories changed
git log --since="YYYY-MM-DD" --name-only --pretty=format: | grep -E "^[a-z]" | cut -d'/' -f1-3 | sort | uniq -c | sort -rn | head -30

# Find ticket/feature references in commit messages
git log --since="YYYY-MM-DD" --pretty=format:"%s" | grep -oE "[A-Z]+-[0-9]+" | sort | uniq -c | sort -rn

# List merge commits (each represents a shipped feature/PR)
git log --merges --since="YYYY-MM-DD" --pretty=format:"%ad | %s" --date=short
```
### 1.2 For Each Deliverable, Document:

- What it does (read the code, not just commit messages)
- First work date → last work date (total calendar time)
- Files touched (scope)
- Current state (working? reverted? still being fixed?)

Create a deliverables inventory (a sketch for pulling the dates and scope from git follows the table):

| Deliverable | Description | Started | Shipped | Still Working? |
|-------------|-------------|---------|---------|----------------|
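To fill in the Started/Shipped columns and the scope, one option is to key on a ticket ID or feature keyword in commit messages. A minimal sketch, where `PROJ-123` is a hypothetical placeholder:

```bash
# First and last commit dates for one deliverable (PROJ-123 is a placeholder ticket/keyword)
TICKET="PROJ-123"
git log --all --date=short --pretty=format:"%ad" --grep="$TICKET" | sort | sed -n '1p;$p'

# Files touched by commits that reference the deliverable (its scope)
git log --all --pretty=format: --name-only --grep="$TICKET" | sort -u | grep -v '^$'
```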
## PHASE 2: Assess True Complexity

### 2.1 Read The Actual Code

For each deliverable, open the files and answer:

**Architecture Questions:**
- Is this a new system or a modification to an existing one?
- How many services/components does it touch?
- Does it introduce new dependencies? (see the sketch after these lists)
- Does it require understanding external systems (APIs, databases)?

**Algorithm Questions:**
- Is there any non-trivial logic (not just CRUD/glue code)?
- Could you explain the core logic in two sentences?
- Is there anything here you'd need to think hard about?

**Integration Questions:**
- How many external services does it call?
- Are there race conditions or distributed-system concerns?
- Is there complex error handling or retry logic?
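For the dependency question, one concrete check is to diff the dependency manifests over the audit window. A minimal sketch, assuming the repository uses one of the common manifest files listed below (adjust to your stack):

```bash
# Dependency-manifest lines added in the audit window (adjust file names to your stack)
SINCE="YYYY-MM-DD"                                 # same placeholder date as in Phase 1
BASE=$(git rev-list -1 --before="$SINCE" HEAD)     # last commit before the window
for manifest in package.json requirements.txt pyproject.toml go.mod Gemfile pom.xml; do
  git ls-files --error-unmatch "$manifest" >/dev/null 2>&1 || continue
  echo "=== $manifest: lines added since $SINCE ==="
  git diff "$BASE"..HEAD -- "$manifest" | grep -E '^\+[^+]'
done
```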
### 2.2 Complexity Classification

**TRIVIAL** (Senior: 1-4 hours)
- Adding a field to a model/serializer
- New CRUD endpoint following existing patterns
- Configuration changes
- Copy-paste with minor modifications
- Feature flag additions

**SIMPLE** (Senior: 1-2 days)
- New API endpoint with basic business logic
- Simple external API integration
- Database migration with data backfill
- Refactoring/renaming across files

**MODERATE** (Senior: 3-5 days)
- New service class with multiple methods
- Multi-step workflow implementation
- Integration with a new external service
- Permission system changes
- Real-time feature (webhooks, events)

**COMPLEX** (Senior: 1-2 weeks)
- New subsystem/module from scratch
- Cross-service data migration
- Complex state machine
- Performance optimization requiring profiling
- Security-critical feature

**HIGHLY COMPLEX** (Senior: 2+ weeks)
- Distributed system coordination
- Custom algorithms (not found in libraries)
- Real-time collaboration features
- Large-scale data pipeline
- Novel architecture patterns
### 2.3 Complexity Evidence Template

For each deliverable (a sketch for the business-logic line count follows the template):

```
Deliverable: [Name]
Claimed/Apparent Complexity: [What it might look like]
Actual Complexity: [What it really is]
Evidence:
- Core logic is: [describe in 1-2 sentences]
- Lines of actual business logic: [X lines, excluding boilerplate]
- External dependencies: [list]
- Similar to: [existing pattern in codebase or well-known pattern]
Verdict: [TRIVIAL/SIMPLE/MODERATE/COMPLEX/HIGHLY COMPLEX]
Expected time for senior engineer: [X days]
```
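One rough proxy for the business-logic line count is lines added by the deliverable's commits, excluding tests, migrations, and lockfiles. A minimal sketch, again keyed on a hypothetical `PROJ-123` ticket ID (tune the exclude pattern to your repo layout):

```bash
# Rough line count for a deliverable: lines added, excluding obvious boilerplate paths
TICKET="PROJ-123"
git log --all --grep="$TICKET" --numstat --pretty=format: \
  | grep -vE '(^$|test|spec|migration|\.lock$|package-lock\.json)' \
  | awk '{added += $1} END {print added " lines added (excluding boilerplate paths)"}'
```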
## PHASE 3: Measure Time Actually Spent

### 3.1 Track Feature Timeline

```bash
# For a specific feature/ticket, find first and last commits
git log --all --oneline | grep -i "FEATURE-NAME-OR-TICKET"

# Get dates
git log --all --pretty=format:"%ad | %s" --date=short | grep -i "FEATURE-NAME"
```
### 3.2 Calculate Effective Time

Calendar Time = Last commit date - First commit date
Working Days = Calendar Time - weekends - holidays

Compare Working Days against the expected time for that complexity level (a sketch of the weekday count follows).
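A minimal sketch of the working-days count between two dates, assuming GNU `date` (holidays are not subtracted):

```bash
# Count weekdays between two dates, inclusive (GNU date; holidays not excluded)
working_days() {
  local current="$1" end="$2" count=0
  while [[ "$current" < "$end" || "$current" == "$end" ]]; do
    # %u prints 1-7 (Mon-Sun); count Monday through Friday only
    if (( $(date -d "$current" +%u) < 6 )); then
      count=$((count + 1))
    fi
    current=$(date -d "$current + 1 day" +%Y-%m-%d)
  done
  echo "$count"
}

working_days "2026-01-05" "2026-01-16"   # prints 10
```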
### 3.3 Time Analysis Template

| Deliverable | Complexity | Expected | Calendar Time | Working Days | Ratio |
|-------------|------------|----------|---------------|--------------|-------|
| Feature A   | SIMPLE     | 1-2 days | 3 weeks       | 15 days      | 7-15x |

A ratio above 3x for non-complex work indicates process problems.
## PHASE 4: Quality & Regression Assessment

### 4.1 Find Post-Ship Fixes

For each deliverable, search for subsequent fixes:

```bash
# Find fix commits related to a feature
git log --since="SHIP-DATE" --oneline | grep -i "fix.*FEATURE-NAME"

# Find reverts
git log --oneline | grep -i "revert.*FEATURE-NAME"
```
### 4.2 Identify Regressions

Look for the patterns below; a sketch for measuring time-to-first-fix follows the command.

- Same-day fixes: bug found within hours of merge
- Same-week fixes: bug found within days
- Production incidents: reverts, hotfixes, emergency PRs
- Silent breakage: feature broken for an extended period before discovery

```bash
# Find potential regression patterns
git log --pretty=format:"%ad | %s" --date=short | grep -B5 -A5 "revert\|hotfix\|emergency\|broken\|fix.*fix"
```
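To put numbers on "same-day" versus "same-week", one option is to compare a feature's merge date with the date of its first follow-up fix. A minimal sketch, assuming both the merge and the fix commits mention the feature name (`FEATURE-NAME` is a placeholder) and that GNU `date` is available:

```bash
# Days from ship (merge) to first follow-up fix (FEATURE-NAME is a placeholder keyword)
FEATURE="FEATURE-NAME"
SHIP=$(git log --all --merges --date=short --pretty=format:"%ad %s" | grep -i "$FEATURE" | awk '{print $1}' | sort | head -1)
FIRST_FIX=$(git log --all --since="$SHIP" --date=short --pretty=format:"%ad %s" | grep -iE "fix.*$FEATURE" | awk '{print $1}' | sort | head -1)
echo "Shipped: $SHIP  First fix: $FIRST_FIX"
# Gap in days (only meaningful if both dates were found)
echo "$(( ($(date -d "$FIRST_FIX" +%s) - $(date -d "$SHIP" +%s)) / 86400 )) days to first fix"
```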
### 4.3 Stability Classification

- **STABLE**: shipped and no fixes needed
- **MINOR ISSUES**: 1-2 small fixes within a week
- **UNSTABLE**: multiple fixes, or fixes weeks later
- **BROKEN**: reverted or still not working
- **REGRESSION**: broke something else
### 4.4 Quality Evidence Template

```
Deliverable: [Name]
Ship Date: [Date]
Current Status: [STABLE/MINOR ISSUES/UNSTABLE/BROKEN/REGRESSION]
Post-Ship Issues:
- [Date]: [Issue description] - discovered [how: user report/monitoring/dev testing]
- [Date]: [Issue description]
Time from Ship to Stable: [X days, or "still unstable"]
Root Cause of Issues:
- [ ] Not tested before ship
- [ ] Edge case missed
- [ ] Integration not verified
- [ ] Requirements changed
- [ ] Dependency broke it
```
## PHASE 5: Breadth vs Depth Analysis

### 5.1 Categorize Work Types

Sort all deliverables into the buckets below (a rough first-pass bucketing sketch follows the list):

**New Capabilities** (things users couldn't do before)
- List each with complexity rating

**Improvements** (existing features made better)
- List each with scope

**Maintenance** (keeping things working)
- Bug fixes, dependency updates, refactoring

**Infrastructure** (developer/ops improvements)
- CI/CD, monitoring, tooling
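Final categorization should come from reading the deliverables, but if the team writes conventional-commit-style subjects (`feat:`, `fix:`, `chore:`, ...), a rough first pass can come straight from git. This assumes that commit style; it is not something the audit depends on:

```bash
# Rough first-pass bucketing by commit-subject prefix (assumes feat:/fix:/chore:/refactor: style)
git log --since="YYYY-MM-DD" --no-merges --pretty=format:"%s" \
  | grep -oE '^(feat|fix|chore|refactor|docs|ci|perf|test)' \
  | sort | uniq -c | sort -rn
```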
### 5.2 Calculate Distribution

Total Working Days Available = Engineers × Days in Period

New Capabilities: X days (Y%)
Improvements: X days (Y%)
Maintenance: X days (Y%)
Infrastructure: X days (Y%)
Rework/Fixes: X days (Y%)
Unknown/Overhead: X days (Y%)
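A minimal sketch of the percentage arithmetic, with placeholder day counts:

```bash
# Distribution percentages from tallied day counts (all figures are placeholders)
TOTAL_DAYS=180   # e.g. Engineers x Days in Period
printf '%s\n' "New Capabilities 45" "Improvements 30" "Maintenance 25" \
              "Infrastructure 20" "Rework/Fixes 40" "Unknown/Overhead 20" \
  | awk -v total="$TOTAL_DAYS" '{days=$NF; $NF=""; printf "%-18s %3d days (%.0f%%)\n", $0, days, 100*days/total}'
```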
Healthy distribution for a product team:

- New Capabilities: 40-60%
- Improvements: 20-30%
- Maintenance: 10-20%
- Rework: < 15%
## PHASE 6: The Honest Assessment

### 6.1 Output Summary Table

| Deliverable | True Complexity | Expected Time | Actual Time | Quality  | Value |
|-------------|-----------------|---------------|-------------|----------|-------|
| Feature A   | SIMPLE          | 2 days        | 3 weeks     | UNSTABLE | LOW   |
| Feature B   | MODERATE        | 4 days        | 1 week      | STABLE   | HIGH  |
### 6.2 Calculate Real Output

**Method 1: Sum of Expected Times**
- Add up "Expected Time for Complexity" for all STABLE deliverables
- This is your "engineer-weeks of shippable output"

**Method 2: Value-Weighted Output** (a worked sketch follows this list)
- HIGH value × complexity time
- MEDIUM value × complexity time × 0.5
- LOW value × complexity time × 0.25
- BROKEN/REVERTED × 0
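A minimal sketch of the value weighting, assuming a small placeholder table of deliverables with expected days and a value rating:

```bash
# Value-weighted output in engineer-days (rows are placeholders: name, expected days, value)
printf '%s\n' "FeatureA 2 LOW" "FeatureB 4 HIGH" "FeatureC 10 BROKEN" \
  | awk '{
      weight = ($3 == "HIGH") ? 1 : ($3 == "MEDIUM") ? 0.5 : ($3 == "LOW") ? 0.25 : 0
      total += $2 * weight
    } END {printf "Value-weighted output: %.1f engineer-days\n", total}'
```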
### 6.3 Efficiency Formula

Available Time = Engineers × Weeks × 5 days/week
Delivered Output = Sum of (Expected Time for each STABLE deliverable)
Rework Cost = Sum of (time spent on fixes + reverts + re-implementations)

Efficiency = Delivered Output / Available Time
Rework Rate = Rework Cost / (Delivered Output + Rework Cost)
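A minimal sketch of both ratios with placeholder figures:

```bash
# Efficiency and rework rate (all figures are placeholders)
ENGINEERS=2; WEEKS=13                    # roughly a 90-day window
AVAILABLE=$((ENGINEERS * WEEKS * 5))     # engineer-days available
DELIVERED=38                             # expected days summed over STABLE deliverables
REWORK=22                                # days spent on fixes, reverts, re-implementations

awk -v a="$AVAILABLE" -v d="$DELIVERED" -v r="$REWORK" 'BEGIN {
  printf "Efficiency:  %.0f%%\n", 100 * d / a
  printf "Rework rate: %.0f%%\n", 100 * r / (d + r)
}'
```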
## KEY QUESTIONS TO ANSWER

1. **What can users do now that they couldn't before?**
   List concrete new capabilities.
2. **How complex was the hardest thing built?**
   Describe it. Could a senior engineer do it in a week?
3. **What broke after shipping?**
   List regressions with time-to-discovery.
4. **What's the ratio of building vs. fixing?**
   Estimate the percentage of time spent on new work vs. rework.
5. **Is there anything here that required real engineering?**
   Not just gluing APIs together, but actual problem-solving.
6. **What would a senior engineer have delivered in the same time?**
   Be specific about what's missing.
## RED FLAGS (Quality/Regression Focus)

- Features that shipped and broke within 24 hours
- Features broken for weeks before anyone noticed (no monitoring/tests)
- The same feature being "re-fixed" multiple times
- Reverts of reverts
- TRIVIAL/SIMPLE work taking weeks
- No new capabilities shipped, only fixes and small improvements
- Core business logic with no test coverage
- "Refactors" that introduced bugs
- Features that technically work but are unusable/incomplete
## OUTPUT FORMAT

1. **Deliverables Inventory**
   List everything that was built, with an honest complexity assessment.
2. **Timeline Analysis**
   Show expected vs. actual time for each deliverable.
3. **Stability Report**
   For each deliverable: did it ship clean, or did it require fixes?
4. **The Math**
   - Available: [X] engineer-weeks
   - Delivered (stable): [Y] engineer-weeks equivalent
   - Rework spent: [Z] engineer-weeks
   - Efficiency: [Y/X]%
   - Rework rate: [Z/(Y+Z)]%
5. **Honest Verdict**
   "In [X] engineer-weeks, the team delivered [Y] weeks of stable output at [complexity level]. The hardest thing built was [description], which is [TRIVIAL/SIMPLE/MODERATE/COMPLEX]. [Z]% of time was spent on rework. A well-functioning team would deliver [N]x more."
6. **Evidence Appendix**
   For each claim, provide:
   - Specific code files/functions
   - Timeline of changes
   - Before/after states
   - Concrete examples of issues found
---

This prompt focuses on:

- **Actual deliverables** (not commits)
- **True complexity** (read the code, not messages)
- **Time efficiency** (expected vs. actual)
- **Quality** (did it ship stable?)
- **Regressions** (what broke, and when it was discovered)