ikouchiha47/coding_agent.md

## coding_agent.md

      
AI Agent Development & Debugging Guide

1. Development Cycle

Test-Driven Development (TDD) Workflow

Phase 1: Write Tests First

Write a failing test that defines the expected behavior before implementing any code.
# Python example
def test_create_user_with_valid_email():
    user = user_service.create(email='test@example.com')
    assert user.email == 'test@example.com'
    assert user.id is not None
// Go example
func TestCreateUserWithValidEmail(t *testing.T) {
    user, err := userService.Create("test@example.com")
    assert.NoError(t, err)
    assert.Equal(t, "test@example.com", user.Email)
    assert.NotEmpty(t, user.ID)
}
Phase 2: Define Interfaces/Contracts

Define clear contracts before implementation. Use your language's interface/protocol/trait system.
# Python - Protocol/ABC
from typing import Protocol

class UserService(Protocol):
    def create(self, email: str) -> User: ...
    def find_by_id(self, user_id: str) -> User | None: ...
    def update(self, user_id: str, data: dict) -> User: ...
    def delete(self, user_id: str) -> None: ...
// Go - Interface
type UserService interface {
    Create(email string) (*User, error)
    FindByID(id string) (*User, error)
    Update(id string, data map[string]interface{}) (*User, error)
    Delete(id string) error
}
// Java - Interface
public interface UserService {
    User create(String email) throws ServiceException;
    Optional<User> findById(String id);
    User update(String id, Map<String, Object> data);
    void delete(String id);
}
Phase 3: Create Mock Implementation for Testing

Build a simple in-memory mock that satisfies the interface for fast testing.
class MockUserService:
    def __init__(self):
        self.users = {}
    
    def create(self, email: str) -> User:
        user = User(id=generate_id(), email=email)
        self.users[user.id] = user
        return user
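type MockUserService struct {
    users map[string]*User
}

// NewMockUserService initializes the map; writing to a nil map panics in Go.
func NewMockUserService() *MockUserService {
    return &MockUserService{users: make(map[string]*User)}
}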

func (m *MockUserService) Create(email string) (*User, error) {
    user := &User{ID: generateID(), Email: email}
    m.users[user.ID] = user
    return user, nil
}
Phase 4: Implement Real Implementation

Now implement the actual service with real dependencies (database, APIs, etc.).
class DatabaseUserService:
    def __init__(self, db: Database):
        self.db = db
    
    def create(self, email: str) -> User:
        return self.db.insert('users', {'email': email})
type DatabaseUserService struct {
    db *Database
}

func (s *DatabaseUserService) Create(email string) (*User, error) {
    return s.db.Insert("users", map[string]interface{}{"email": email})
}
Phase 5: Verify Tests Pass

Run your test suite to verify implementation correctness.
# Python
pytest tests/test_user_service.py -v
pytest --cov=user_service

# Go
go test ./... -v
go test -cover ./...

# Java
mvn test
gradle test

# Rust
cargo test
cargo test --verbose
Development Best Practices

1. Research Before Implementing

Evaluate Multiple Approaches:
Before writing any code, explore different solutions:
# Research existing patterns in codebase
git grep -n "similar_pattern" 
rg "authentication" --type py  # Find existing auth implementations

# Check what approaches exist
# Example: "How to handle file uploads in large projects?"
# - Streaming vs buffering
# - Direct storage vs queue-based
# - Library vs custom implementation
Questions to Answer:

How is similar functionality already implemented in this codebase?
What are 3 different approaches to solve this?
Which approach fits the existing architecture?
Which is most maintainable long-term?
Which is most extensible for future changes?

Check External Resources:
# GitHub Issues for similar problems
# Search: "<library-name> <your-problem>" on GitHub Issues
# Example: "fastapi file upload large files"

# StackOverflow for real-world solutions
# Look for answers with multiple upvotes and recent activity
# Check comments for gotchas and edge cases

# Documentation examples
# Always check official docs first - they show intended patterns
2. Library Evaluation Checklist

Before adding any dependency, evaluate:
Health Indicators:
# Check GitHub repository
- Last commit date (< 6 months ago is good)
- Open issues count
- Issues closed vs open ratio (should be > 2:1)
- Pull request merge time (< 1 month is good)
- Number of contributors (more = better)
- Stars and forks (popularity indicator)

# Check package registry
npm info <package>        # npm
pip show <package>        # PyPI (shows metadata only for packages already installed locally)
go list -m <package>      # Go modules
cargo search <package>    # crates.io
Evaluation Template:


| Criteria | Threshold | Notes |
| --- | --- | --- |
| Last commit | < 6 months | Is it actively maintained? |
| Open issues | < 100 for small libs, < 500 for large | Are issues being addressed? |
| Issue close rate | > 60% | Do maintainers respond? |
| Dependencies | < 10 direct deps | Fewer dependencies = less risk |
| Bundle size | Check for your use case | Will it bloat the build? |
| Breaking changes | Check changelog | How often do they break APIs? |
| Community feedback | Search " vs " | What do real users say? |

Red Flags:

No commits in > 1 year
More open issues than closed
Many unresolved security advisories
Abandoned by maintainer (check for notices)
Heavy dependencies for simple tasks
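
The thresholds and red flags above can be turned into a small, repeatable check. The following is a minimal sketch, not a definitive scoring method: the `LibraryStats` fields and the `red_flags` function are hypothetical names, the numbers are entered by hand from the repository and registry pages, and "6 months" is approximated as 183 days.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LibraryStats:
    """Numbers collected by hand from the repository and package registry."""
    name: str
    last_commit: date
    open_issues: int
    closed_issues: int
    direct_deps: int


def close_rate(stats: LibraryStats) -> float:
    """Fraction of all issues that have been closed (0.0 when there are none)."""
    total = stats.open_issues + stats.closed_issues
    return stats.closed_issues / total if total else 0.0


def red_flags(stats: LibraryStats, today: date) -> list[str]:
    """Return the red flags and missed thresholds that apply to this library."""
    flags = []
    age_days = (today - stats.last_commit).days
    if age_days > 365:
        flags.append("no commits in over a year")
    elif age_days > 183:
        flags.append("last commit older than 6 months")
    if stats.open_issues > stats.closed_issues:
        flags.append("more open issues than closed")
    if close_rate(stats) <= 0.60:
        flags.append("issue close rate not above 60%")
    if stats.direct_deps >= 10:
        flags.append("10 or more direct dependencies")
    return flags


# Figures taken from the papaparse example in this guide.
papaparse = LibraryStats("papaparse", date(2025, 8, 15), 180, 520, 0)
print(round(close_rate(papaparse), 2))                  # 0.74
print(red_flags(papaparse, today=date(2025, 10, 24)))   # []
```

An empty list only means none of the mechanical checks fired; bundle size, breaking changes, and community feedback still need a human look.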

Example Evaluation:
# Evaluating a CSV parsing library
npm info papaparse
# - Last publish: 2 months ago ✓
# - Downloads: 5M/week ✓
# - Dependencies: 0 ✓
# - Issues: 180 open, 520 closed (74% close rate) ✓

# Compare alternatives
npm info csv-parser
# - Last publish: 3 years ago ✗
# - Downloads: 1M/week
# - Issues: 45 open, 30 closed (40% close rate) ✗

# Decision: Use papaparse (active maintenance, proven track record)
3. Decision Tracking with SQLite

IMPORTANT: Store all architectural and library decisions in a local SQLite database for future reference and pattern matching.
Setup (one-time):
# Install required tools
pip install sqlite-utils sqlite-utils-sqlite-vec

# Create decisions database (the .dev directory must exist first)
mkdir -p .dev
sqlite-utils create-database .dev/decisions.db

# Create tables
sqlite-utils create-table .dev/decisions.db decisions \
  id integer \
  timestamp text \
  category text \
  decision text \
  reasoning text \
  alternatives text \
  outcome text \
  --pk id

sqlite-utils create-table .dev/decisions.db library_evaluations \
  id integer \
  library_name text \
  version text \
  purpose text \
  health_score integer \
  last_commit text \
  issues_ratio real \
  decision text \
  notes text \
  --pk id

# Enable vector search for similarity
sqlite-utils install sqlite-vec
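
If `sqlite-utils` is not available, the same two tables can be created with Python's standard `sqlite3` module. This is a sketch that mirrors the column names and types from the commands above; the function names `init_decisions_db` and `table_names` are illustrative, and the demo uses an in-memory database rather than `.dev/decisions.db`.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS decisions (
    id INTEGER PRIMARY KEY,
    timestamp TEXT,
    category TEXT,
    decision TEXT,
    reasoning TEXT,
    alternatives TEXT,
    outcome TEXT
);
CREATE TABLE IF NOT EXISTS library_evaluations (
    id INTEGER PRIMARY KEY,
    library_name TEXT,
    version TEXT,
    purpose TEXT,
    health_score INTEGER,
    last_commit TEXT,
    issues_ratio REAL,
    decision TEXT,
    notes TEXT
);
"""


def init_decisions_db(path: str) -> sqlite3.Connection:
    """Open (or create) the database at `path` and ensure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List user tables in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


conn = init_decisions_db(":memory:")  # pass ".dev/decisions.db" in a real project
print(table_names(conn))  # ['decisions', 'library_evaluations']
```

Because the statements use `IF NOT EXISTS`, calling `init_decisions_db` again on an existing database is harmless.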
Recording Decisions:
# Record architectural decision
sqlite-utils insert .dev/decisions.db decisions - \
  --csv << EOF
timestamp,category,decision,reasoning,alternatives,outcome
2025-10-24T10:30:00,architecture,Use event-driven architecture for order processing,"Decouples services, enables async processing, easier to scale","Monolithic approach (too coupled), Microservices (too complex)",Implemented successfully
EOF

# Record library evaluation
sqlite-utils insert .dev/decisions.db library_evaluations - \
  --csv << EOF
library_name,version,purpose,health_score,last_commit,issues_ratio,decision,notes
papaparse,5.4.1,CSV parsing,9,2025-08-15,0.74,approved,Zero dependencies and active maintenance
csv-parser,3.0.0,CSV parsing,4,2022-05-10,0.40,rejected,Abandoned by maintainer
EOF
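
When decisions are recorded from a script rather than the shell, parameterized queries avoid CSV quoting problems and make it easy to fill in the outcome later. The sketch below assumes the `decisions` columns defined during setup; `record_decision` and `update_outcome` are hypothetical helper names, and the demo runs against a throwaway in-memory database.

```python
import sqlite3


def record_decision(conn, timestamp, category, decision, reasoning, alternatives):
    """Insert one decision with an empty outcome and return its generated id."""
    cur = conn.execute(
        "INSERT INTO decisions "
        "(timestamp, category, decision, reasoning, alternatives, outcome) "
        "VALUES (?, ?, ?, ?, ?, '')",
        (timestamp, category, decision, reasoning, alternatives),
    )
    conn.commit()
    return cur.lastrowid


def update_outcome(conn, decision_id, outcome):
    """Fill in the outcome once the decision has played out."""
    conn.execute("UPDATE decisions SET outcome = ? WHERE id = ?", (outcome, decision_id))
    conn.commit()


# Demo against a throwaway in-memory database with the same columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE decisions (id INTEGER PRIMARY KEY, timestamp TEXT, category TEXT, "
    "decision TEXT, reasoning TEXT, alternatives TEXT, outcome TEXT)"
)
decision_id = record_decision(
    conn, "2025-10-24T10:30:00", "architecture",
    "Use event-driven architecture for order processing",
    "Decouples services, enables async processing, easier to scale",
    "Monolithic approach (too coupled), Microservices (too complex)",
)
update_outcome(conn, decision_id, "Implemented successfully")
print(conn.execute("SELECT id, outcome FROM decisions").fetchall())
# [(1, 'Implemented successfully')]
```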
Querying Past Decisions:
# Find similar past decisions
sqlite-utils query .dev/decisions.db \
  "SELECT * FROM decisions WHERE category = 'authentication' ORDER BY timestamp DESC LIMIT 5"

# Check if we've evaluated a library before
sqlite-utils query .dev/decisions.db \
  "SELECT * FROM library_evaluations WHERE library_name LIKE '%upload%'"

# Find decisions that didn't work out
sqlite-utils query .dev/decisions.db \
  "SELECT * FROM decisions WHERE outcome LIKE '%reverted%' OR outcome LIKE '%failed%'"

# Get all libraries we rejected and why
sqlite-utils query .dev/decisions.db \
  "SELECT library_name, decision, notes FROM library_evaluations WHERE decision = 'rejected'"
Using Vector Search for Similar Problems:
# Script: .dev/find_similar_decisions.py
import sqlite3
import sqlite_vec
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

def find_similar_decisions(query: str, limit: int = 5):
    """Find past decisions similar to current problem"""
    db = sqlite3.connect('.dev/decisions.db')
    db.enable_load_extension(True)
    sqlite_vec.load(db)
    
    # Generate embedding for query
    query_embedding = model.encode(query)
    
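    # Search for similar decisions.
    # NOTE: this assumes `decisions` has an `embedding` BLOB column populated
    # with float32 vectors from the same model; the setup above does not create it.
    # vec_distance_cosine returns a distance, so smaller means more similar.
    cursor = db.execute("""
        SELECT decision, reasoning, outcome,
               vec_distance_cosine(embedding, ?) AS distance
        FROM decisions
        ORDER BY distance ASC
        LIMIT ?
    """, (query_embedding.astype('float32').tobytes(), limit))
    
    return cursor.fetchall()

# Usage
similar = find_similar_decisions("How to handle large file uploads?")
for decision, reasoning, outcome, distance in similar:
    print(f"Distance: {distance:.2f}")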
    print(f"Decision: {decision}")
    print(f"Outcome: {outcome}\n")
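
The search only works if each decision row carries an embedding. The sketch below shows one way the storage and ranking could fit together using only the standard library: vectors are packed as little-endian float32 bytes, and a pure-Python cosine distance stands in for the extension's distance function. The `embedding` column, the helper names, and the toy 3-dimensional vectors (used in place of real model output) are all assumptions made for illustration.

```python
import math
import sqlite3
import struct


def pack_vector(values):
    """Serialize floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Inverse of pack_vector: 4 bytes per float32 value."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way.

    Assumes neither vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest_decisions(conn, query_vector, limit=5):
    """Rank stored decisions by cosine distance to query_vector, closest first."""
    rows = conn.execute(
        "SELECT decision, embedding FROM decisions WHERE embedding IS NOT NULL"
    ).fetchall()
    scored = [(cosine_distance(unpack_vector(blob), query_vector), decision)
              for decision, blob in rows]
    return sorted(scored)[:limit]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE decisions (id INTEGER PRIMARY KEY, decision TEXT, embedding BLOB)")
# Toy 3-dimensional vectors stand in for real model embeddings.
conn.executemany(
    "INSERT INTO decisions (decision, embedding) VALUES (?, ?)",
    [("Stream uploads to object storage", pack_vector([1.0, 0.0, 0.0])),
     ("Use event-driven order processing", pack_vector([0.0, 1.0, 0.0]))],
)
best = nearest_decisions(conn, [0.9, 0.1, 0.0], limit=1)
print(best[0][1])  # Stream uploads to object storage
```

Scanning every row in Python is fine for a few hundred decisions; a vector extension becomes worthwhile only when the table grows well beyond that.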
Decision Template (save as .dev/DECISION_TEMPLATE.md):
# Decision: [Title]

**Date:** YYYY-MM-DD
**Category:** [architecture|library|pattern|tooling]
**Status:** [proposed|accepted|rejected|superseded]

## Context
What problem are we trying to solve?

## Decision
What did we decide to do?

## Alternatives Considered
1. **Alternative 1:** Brief description
   - Pros: ...
   - Cons: ...
   
2. **Alternative 2:** Brief description
   - Pros: ...
   - Cons: ...

## Reasoning
Why did we choose this approach?
- Fits existing architecture because...
- Most maintainable because...
- Most extensible because...

## Consequences
What are the implications?
- Positive: ...
- Negative: ...
- Risks: ...

## References
- GitHub Issue: #123
- StackOverflow: [link]
- Documentation: [link]

## Outcome (update later)
How did this decision work out?
4. Contract-First Design


Define interfaces/protocols/traits before implementation
Use dependency injection with contracts
Mock contracts for fast unit tests
Keep contracts focused and cohesive
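
As a minimal sketch of how these four points combine, the example below injects a `UserService` contract into a consumer and tests the consumer against an in-memory mock. `SignupHandler`, `InMemoryUserService`, and the email normalization rule are hypothetical and exist only to show the wiring.

```python
from dataclasses import dataclass
from itertools import count
from typing import Protocol


@dataclass
class User:
    id: str
    email: str


class UserService(Protocol):
    """The contract: the only thing consumers are allowed to depend on."""
    def create(self, email: str) -> User: ...


class InMemoryUserService:
    """Mock that satisfies UserService without touching a database."""

    def __init__(self) -> None:
        self.users: dict[str, User] = {}
        self._ids = count(1)

    def create(self, email: str) -> User:
        user = User(id=f"user_{next(self._ids)}", email=email)
        self.users[user.id] = user
        return user


class SignupHandler:
    """Depends on the contract only; the concrete service is injected."""

    def __init__(self, users: UserService) -> None:
        self.users = users

    def signup(self, email: str) -> User:
        if "@" not in email:
            raise ValueError(f"invalid email: {email!r}")
        return self.users.create(email.strip().lower())


def test_signup_normalizes_email() -> None:
    service = InMemoryUserService()
    user = SignupHandler(service).signup("Test@Example.com")
    assert user.email == "test@example.com"
    assert service.users[user.id] is user


test_signup_normalizes_email()
```

Swapping in a database-backed service later requires no change to `SignupHandler` or its test, which is the practical payoff of keeping the contract small.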

5. Incremental Implementation Pattern


1. Research - Find 3 approaches, evaluate against codebase
2. Decide - Record decision in decisions.db
3. Define the contract - What operations does this component expose?
4. Build a mock - Simple in-memory implementation for testing
5. Write tests - Against the contract, using the mock
6. Implement real version - Satisfies same contract
7. Review outcome - Update decisions.db with results

6. Test Coverage Goals


Unit tests: 80%+ coverage of business logic
Integration tests: Critical paths and data flows
E2E tests: Key user journeys
Run tests before commits (pre-commit hooks)


2. Debugging Cycle

The Core Principle

Logs reveal reality. Assumptions hide bugs. Always verify with actual data.
Simple Debugging Process

Step 1: Reproduce the Bug

# Document exact steps to trigger the bug
# Create a simple script or command that fails consistently

# Example: API endpoint fails
curl -X POST http://localhost:3000/api/orders \
  -H "Content-Type: application/json" \
  -d '{"userId": "123", "items": [{"id": "abc"}]}'

# Example: CLI command fails
npm run process-video -- --id=video_123

# If it's flaky, note the failure rate
# Try 10 times: 3 failures = 30% failure rate
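
Measuring the failure rate of a flaky bug can be scripted instead of counted by hand. The sketch below assumes the reproduction can be wrapped in a callable that returns `True` on success; `failure_rate` and `command_succeeds` are illustrative names, and the demo uses a fixed sequence of outcomes in place of a real command.

```python
import subprocess
from typing import Callable


def failure_rate(run_once: Callable[[], bool], attempts: int = 10) -> float:
    """Call run_once `attempts` times; return the fraction of runs that failed."""
    failures = sum(1 for _ in range(attempts) if not run_once())
    return failures / attempts


def command_succeeds(argv: list[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(argv, capture_output=True).returncode == 0


# Simulated flaky check standing in for the real command: fails on 3 of 10 runs.
outcomes = iter([True, False, True, True, False, True, True, True, False, True])
print(f"failure rate: {failure_rate(lambda: next(outcomes)):.0%}")  # failure rate: 30%
```

For a real reproduction, pass something like `lambda: command_succeeds(["npm", "run", "process-video", "--", "--id=video_123"])` and raise `attempts` until the rate stabilizes.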
Step 2: Add Logging to See What's Actually Happening

// Add logs at key points to see the ACTUAL data
async function processOrder(orderId: string) {
  console.log('[ORDER] Starting processOrder:', { orderId, timestamp: new Date() });
  
  const order = await getOrder(orderId);
  console.log('[ORDER] Fetched order:', { order });  // SEE ACTUAL ORDER DATA
  
  const items = await getOrderItems(order.id);
  console.log('[ORDER] Fetched items:', { itemCount: items.length, items });
  
  if (items.length === 0) {
    console.error('[ORDER] ERROR: No items found for order:', { orderId, order });
    throw new Error('Order has no items');
  }
  
  const validated = validateItems(items);
  console.log('[ORDER] Validation result:', { validated, itemCount: items.length });
  
  return validated;
}
Logging Best Practices:

Use consistent prefixes like [MODULE] for easy filtering
Log input parameters at function entry
Log intermediate results at each step
Log the actual values, not just "processing..."
Include context: IDs, counts, timestamps
Log before throwing errors
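
The same practices carry over to other languages. Here is a sketch using Python's standard `logging` module, where the logger name plays the role of the `[MODULE]` prefix; the `total_quantity` function and its `quantity` field are made up for the example.

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(name)s] %(levelname)s %(message)s",
)
log = logging.getLogger("ORDER")  # logger name becomes the [ORDER] prefix


def total_quantity(order_id: str, items: list[dict]) -> int:
    """Sum item quantities, logging inputs, intermediate values, and failures."""
    log.debug("start total_quantity order_id=%r item_count=%d", order_id, len(items))
    if not items:
        # Log the context before raising so the failure is visible in the log.
        log.error("no items found order_id=%r", order_id)
        raise ValueError("Order has no items")
    quantities = [item["quantity"] for item in items]
    log.debug("quantities order_id=%r quantities=%r", order_id, quantities)
    total = sum(quantities)
    log.debug("done order_id=%r total=%d", order_id, total)
    return total


print(total_quantity("order_123", [{"id": "abc", "quantity": 2}, {"id": "def", "quantity": 1}]))
```

Passing values as arguments (`%r`, `%d`) rather than pre-formatting the string keeps the actual data in the log and lets the logging module skip formatting when the level is disabled.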

Step 3: Run the Code and Read the Logs

# Run and capture all output
npm run dev 2>&1 | tee debug.log

# Filter to your module
grep "\[ORDER\]" debug.log

# Find errors
grep -i "error\|exception\|failed" debug.log

# Get context around an error (10 lines before and after)
grep -B 10 -A 10 "No items found" debug.log
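
Where `grep` is unavailable, or the filtering needs to live inside a script, the context search can be approximated in a few lines. This is a rough equivalent of `grep -B/-A`, not a reimplementation: it matches plain substrings only, and the sample log lines are invented for the demo.

```python
def grep_with_context(lines, needle, before=10, after=10):
    """Return lines within `before`/`after` lines of any line containing needle."""
    keep = set()
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - before)
            stop = min(len(lines), index + after + 1)
            keep.update(range(start, stop))
    return [lines[i] for i in sorted(keep)]


log_lines = [
    "[ORDER] Starting processOrder",
    "[DB] Query issued",
    "[ORDER] Fetched items: itemCount 0",
    "[ORDER] ERROR: No items found for order",
    "[HTTP] 500 POST /api/orders",
]
for line in grep_with_context(log_lines, "No items found", before=1, after=1):
    print(line)
```

To run it on a captured log, read the file first, for example `open("debug.log").read().splitlines()`.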
Step 4: Analyze What the Logs Show

Look for:

Unexpected values: Is order actually null? Is items.length really 0?
Wrong data types: Is orderId a string when it should be a number?
Missing data: Are fields undefined that you expected?
Timing issues: Does one thing happen before another when it should be after?

# Example log analysis showing the bug
[ORDER] Starting processOrder: { orderId: 'order_123' }
[ORDER] Fetched order: { order: { id: 'order_123', userId: 'user_456' } }
[ORDER] Fetched items: { itemCount: 0, items: [] }  # ← BUG: Why 0 items?
[ORDER] ERROR: No items found for order: { orderId: 'order_123' }
Step 5: Check Your Assumptions with Direct Queries

# Don't trust the code - verify directly in the database
sqlite3 database.db "SELECT * FROM order_items WHERE order_id = 'order_123';"

# Result shows 3 items exist!
# So the data is fine and the code must be running the wrong query.

# Check what the code is actually querying
# Add logging to the getOrderItems function
console.log('[DB] Query:', { sql, params });
// Example bug found: querying with wrong field
async function getOrderItems(orderId: string) {
  const sql = 'SELECT * FROM order_items WHERE id = ?';  // ← BUG: should be "order_id"
  console.log('[DB] Query:', { sql, params: [orderId] });
  return db.all(sql, [orderId]);
}
Step 6: Fix Based on What You Found

// Fix the actual problem found in logs
async function getOrderItems(orderId: string) {
  const sql = 'SELECT * FROM order_items WHERE order_id = ?';  // ✓ Fixed
  console.log('[DB] Query:', { sql, params: [orderId] });
  return db.all(sql, [orderId]);
}
Step 7: Verify the Fix Works

# Run the reproduction steps again, capturing fresh output to debug.log
npm run process-order -- --id=order_123 2>&1 | tee debug.log

# Check logs show correct behavior
grep "\[ORDER\]" debug.log
# Should now show: itemCount: 3, items: [...]

# Run tests
npm test
Step 8: Add a Test to Prevent Regression

// Write test that would have caught this bug
describe('getOrderItems', () => {
  it('should fetch items by order_id not id', async () => {
    await db.insert('order_items', { id: 'item_1', order_id: 'order_123' });
    
    const items = await getOrderItems('order_123');
    
    expect(items).toHaveLength(1);
    expect(items[0].id).toBe('item_1');
  });
});
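
The same regression test can be sketched in Python against an in-memory SQLite database, which keeps it fast and independent of any real data. The two-column table is a simplified assumption, and `get_order_items` here is a Python stand-in for the TypeScript function above.

```python
import sqlite3


def get_order_items(conn: sqlite3.Connection, order_id: str) -> list[tuple]:
    """Fetch items by the order_id foreign key, not the item's own id."""
    return conn.execute(
        "SELECT id, order_id FROM order_items WHERE order_id = ? ORDER BY id",
        (order_id,),
    ).fetchall()


def test_fetches_items_by_order_id() -> None:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE order_items (id TEXT PRIMARY KEY, order_id TEXT)")
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?)",
        [("item_1", "order_123"), ("item_2", "order_123"), ("order_123", "order_999")],
    )
    # The third row's id equals the order id, so a query on `id` returns the wrong row.
    assert get_order_items(conn, "order_123") == [
        ("item_1", "order_123"),
        ("item_2", "order_123"),
    ]


test_fetches_items_by_order_id()
```

Including a decoy row whose `id` collides with the order ID makes the test fail loudly if the query ever regresses to the wrong column, rather than merely returning an empty list.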
When to Use Git History

Use git when you need to understand when and why code changed:
Find When Code Last Changed

# See recent commits affecting this file
git log --oneline -10 -- src/services/order.ts

# See actual changes
git log -p -- src/services/order.ts | less

# Find who changed a specific line
git blame src/services/order.ts | grep "getOrderItems"

# See commits from last week
git log --since="1 week ago" --oneline
Compare With Working Version

# Compare current code with last release
git diff v1.2.0 -- src/services/order.ts

# Compare with specific commit
git diff abc123 -- src/services/order.ts

# Show file contents from previous commit
git show HEAD~1:src/services/order.ts
Find the Commit That Broke Things (Binary Search)

# Only use this if you know when it worked before
git bisect start
git bisect bad HEAD              # Current version is broken
git bisect good v1.2.0           # This version worked

# Git will checkout commits for you to test
# Run your test at each commit
npm test

# Tell git if this commit is good or bad
git bisect good   # Test passed
git bisect bad    # Test failed

# Git finds the breaking commit
# "abc123 is the first bad commit"

git bisect reset  # Return to original state
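
The idea behind bisecting is an ordinary binary search over the commit history. The sketch below illustrates that idea only and is not how git implements it: it assumes a linear history ordered oldest to newest, a known-good first commit, a known-bad last commit, and that every commit after the first bad one is also bad. The commit names are placeholders.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the break happened at mid or earlier
        else:
            good = mid  # the break happened after mid
    return commits[bad]


history = ["v1.2.0", "c1", "c2", "c3", "c4", "c5", "HEAD"]
broken = {"c3", "c4", "c5", "HEAD"}
tested = []


def is_bad(commit: str) -> bool:
    """Stand-in for checking out the commit and running the test suite."""
    tested.append(commit)
    return commit in broken


print(first_bad_commit(history, is_bad), "after testing", tested)
# c3 after testing ['c3', 'c1', 'c2']
```

Three test runs locate the break among five untested commits; the saving grows with history length because each run halves the remaining range. When the check is a single command, `git bisect run <command>` drives the same loop automatically from the command's exit status.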
Common Bug Patterns

Pattern 1: Wrong Parameter Passed

// Function expects video ID
async function getVideoSegments(videoId: string) {
  return db.query('SELECT * FROM segments WHERE video_id = ?', [videoId]);
}

// But called with job ID instead
const segments = await getVideoSegments(job.id);  // ❌ Wrong ID type

// Fix: Pass correct ID
const segments = await getVideoSegments(job.videoId);  // ✓
Pattern 2: Assuming Data Exists

// Code assumes user always exists
const userName = user.name.toUpperCase();  // ❌ Crashes if user is null

// Fix: Check first
const userName = user?.name?.toUpperCase() ?? 'Unknown';  // ✓
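
Python has no optional-chaining operator, so the equivalent guard is an explicit `None` check. A small sketch, with a made-up `User` type and `display_name` helper:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: Optional[str] = None


def display_name(user: Optional[User]) -> str:
    """Return the upper-cased name, or 'Unknown' when user or name is missing."""
    if user is None or user.name is None:
        return "Unknown"
    return user.name.upper()


print(display_name(User(name="ada")))  # ADA
print(display_name(User()))            # Unknown
print(display_name(None))              # Unknown
```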
Pattern 3: Wrong Query Field

// Querying wrong column
db.query('SELECT * FROM users WHERE id = ?', [email]);  // ❌ id vs email

// Fix: Use correct column
db.query('SELECT * FROM users WHERE email = ?', [email]);  // ✓
Debugging Checklist

When stuck on a bug:

- [ ] Can I reproduce it consistently?
- [ ] Have I added logs to see actual data?
- [ ] Have I run the code and read the logs?
- [ ] Do the logs show unexpected values?
- [ ] Have I verified my assumptions with direct queries?
- [ ] Have I checked if the data actually exists?
- [ ] Have I checked the function parameters match what's expected?
- [ ] Have I looked at recent git changes to this code?
- [ ] Have I written a test that reproduces the bug?

Remember

The Debugging Mantra:

1. Add logs - You can't debug what you can't see
2. Run code - See what actually happens
3. Read logs - Find the unexpected values
4. Verify assumptions - Check the database/files directly
5. Fix the real problem - Not what you think the problem is
6. Add a test - Prevent it from happening again

Never trust assumptions. Always verify with actual data.
Instructions are language agnostic