@ikouchiha47
Created October 24, 2025 10:41
coding agent

AI Agent Development & Debugging Guide

1. Development Cycle

Test-Driven Development (TDD) Workflow

Phase 1: Write Tests First

Write a failing test that defines the expected behavior before implementing any code.

# Python example
def test_create_user_with_valid_email():
    user = user_service.create(email='test@example.com')
    assert user.email == 'test@example.com'
    assert user.id is not None

// Go example
func TestCreateUserWithValidEmail(t *testing.T) {
    user, err := userService.Create("test@example.com")
    assert.NoError(t, err)
    assert.Equal(t, "test@example.com", user.Email)
    assert.NotEmpty(t, user.ID)
}

Phase 2: Define Interfaces/Contracts

Define clear contracts before implementation. Use your language's interface/protocol/trait system.

# Python - Protocol/ABC
from typing import Protocol

class UserService(Protocol):
    def create(self, email: str) -> User: ...
    def find_by_id(self, user_id: str) -> User | None: ...
    def update(self, user_id: str, data: dict) -> User: ...
    def delete(self, user_id: str) -> None: ...

// Go - Interface
type UserService interface {
    Create(email string) (*User, error)
    FindByID(id string) (*User, error)
    Update(id string, data map[string]interface{}) (*User, error)
    Delete(id string) error
}

// Java - Interface
public interface UserService {
    User create(String email) throws ServiceException;
    Optional<User> findById(String id);
    User update(String id, Map<String, Object> data);
    void delete(String id);
}
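
As a minimal sketch of how the Python contract can be checked, the `Protocol` can be marked `@runtime_checkable` so that `isinstance` confirms a class exposes the required methods. The `User` dataclass, `InMemoryUsers`, and `NotAService` are hypothetical stand-ins that do not appear in the guide, and the contract is trimmed to two methods to keep the example short. The runtime check only looks at method names, not signatures, so a static type checker is still needed for the full guarantee.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@dataclass
class User:
    # Hypothetical model; the guide does not define User's fields.
    id: str
    email: str


@runtime_checkable
class UserService(Protocol):
    def create(self, email: str) -> User: ...
    def find_by_id(self, user_id: str) -> Optional[User]: ...


class InMemoryUsers:
    """Satisfies UserService structurally; no inheritance is needed."""

    def __init__(self) -> None:
        self._users = {}

    def create(self, email: str) -> User:
        user = User(id=str(len(self._users) + 1), email=email)
        self._users[user.id] = user
        return user

    def find_by_id(self, user_id: str) -> Optional[User]:
        return self._users.get(user_id)


class NotAService:
    """Missing find_by_id, so it does not satisfy the contract."""

    def create(self, email: str) -> User:
        return User(id="1", email=email)
```

Because the check is structural, any class with matching methods can be injected wherever a `UserService` is expected, which is what makes swapping a mock for the real implementation cheap.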

Phase 3: Create Mock Implementation for Testing

Build a simple in-memory mock that satisfies the interface for fast testing.

class MockUserService:
    def __init__(self):
        self.users = {}
    
    def create(self, email: str) -> User:
        user = User(id=generate_id(), email=email)
        self.users[user.id] = user
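        return user

type MockUserService struct {
    users map[string]*User
}

// NewMockUserService initializes the map; writing to a nil map panics.
func NewMockUserService() *MockUserService {
    return &MockUserService{users: make(map[string]*User)}
}

func (m *MockUserService) Create(email string) (*User, error) {
    user := &User{ID: generateID(), Email: email}
    m.users[user.ID] = user
    return user, nil
}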
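
The mocks above only show `create`. A fuller sketch that covers all four operations of the contract might look like the following. The `User` fields, the use of `uuid` in place of the guide's `generate_id()`, and the choice to raise `KeyError` on unknown IDs are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class User:
    # Hypothetical model; the guide does not define User's fields.
    id: str
    email: str


class MockUserService:
    """In-memory stand-in covering every method of the contract."""

    def __init__(self) -> None:
        self.users: Dict[str, User] = {}

    def create(self, email: str) -> User:
        # uuid4 stands in for the guide's generate_id() helper.
        user = User(id=uuid.uuid4().hex, email=email)
        self.users[user.id] = user
        return user

    def find_by_id(self, user_id: str) -> Optional[User]:
        return self.users.get(user_id)

    def update(self, user_id: str, data: dict) -> User:
        # Raises KeyError for unknown IDs; a real service may behave differently.
        updated = replace(self.users[user_id], **data)
        self.users[user_id] = updated
        return updated

    def delete(self, user_id: str) -> None:
        # Deleting a missing user is treated as a no-op.
        self.users.pop(user_id, None)
```

Keeping the mock this small means tests written against it run without any database or network setup.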

Phase 4: Implement Real Implementation

Now implement the actual service with real dependencies (database, APIs, etc).

class DatabaseUserService:
    def __init__(self, db: Database):
        self.db = db
    
    def create(self, email: str) -> User:
        return self.db.insert('users', {'email': email})

type DatabaseUserService struct {
    db *Database
}

func (s *DatabaseUserService) Create(email string) (*User, error) {
    return s.db.Insert("users", map[string]interface{}{"email": email})
}
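
The `Database` type above is left abstract. As one possible concrete version, here is a sketch backed by Python's standard `sqlite3` module. The table layout, the `SqliteUserService` name, and the `uuid`-based IDs are assumptions, not something the guide prescribes.

```python
import sqlite3
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    # Hypothetical model; the guide does not define User's fields.
    id: str
    email: str


class SqliteUserService:
    """Real storage behind the same create/find_by_id contract."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, email TEXT NOT NULL)"
        )

    def create(self, email: str) -> User:
        user = User(id=uuid.uuid4().hex, email=email)
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO users (id, email) VALUES (?, ?)", (user.id, user.email)
            )
        return user

    def find_by_id(self, user_id: str) -> Optional[User]:
        row = self.conn.execute(
            "SELECT id, email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return User(*row) if row else None
```

Passing the connection in through the constructor keeps the service testable: tests can hand it `sqlite3.connect(":memory:")` while production code passes a file-backed connection.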

Phase 5: Verify Tests Pass

Run your test suite to verify implementation correctness.

# Python
pytest tests/test_user_service.py -v
pytest --cov=user_service

# Go
go test ./... -v
go test -cover ./...

# Java
mvn test
gradle test

# Rust
cargo test
cargo test --verbose

Development Best Practices

1. Research Before Implementing

Evaluate Multiple Approaches: Before writing any code, explore different solutions:

# Research existing patterns in codebase
git grep -n "similar_pattern" 
rg "authentication" --type py  # Find existing auth implementations

# Check what approaches exist
# Example: "How to handle file uploads in large projects?"
# - Streaming vs buffering
# - Direct storage vs queue-based
# - Library vs custom implementation

Questions to Answer:

  • How is similar functionality already implemented in this codebase?
  • What are 3 different approaches to solve this?
  • Which approach fits the existing architecture?
  • Which is most maintainable long-term?
  • Which is most extensible for future changes?

Check External Resources:

# GitHub Issues for similar problems
# Search: "<library-name> <your-problem>" on GitHub Issues
# Example: "fastapi file upload large files"

# StackOverflow for real-world solutions
# Look for answers with multiple upvotes and recent activity
# Check comments for gotchas and edge cases

# Documentation examples
# Always check official docs first - they show intended patterns

2. Library Evaluation Checklist

Before adding any dependency, evaluate:

Health Indicators:

# Check GitHub repository
- Last commit date (< 6 months ago is good)
- Open issues count
- Issues closed vs open ratio (should be > 2:1)
- Pull request merge time (< 1 month is good)
- Number of contributors (more = better)
- Stars and forks (popularity indicator)

# Check package registry
npm info <package>        # npm
pip show <package>        # PyPI (only works for packages already installed)
go list -m <package>      # Go modules
cargo search <package>    # crates.io

Evaluation Template:

| Criteria           | Threshold                                   | Notes                          |
| ------------------ | ------------------------------------------- | ------------------------------ |
| Last commit        | < 6 months                                  | Is it actively maintained?     |
| Open issues        | < 100 for small libs, < 500 for large       | Are issues being addressed?    |
| Issue close rate   | > 60%                                       | Do maintainers respond?        |
| Dependencies       | < 10 direct deps                            | Fewer dependencies = less risk |
| Bundle size        | Check for your use case                     | Will it bloat the build?       |
| Breaking changes   | Check changelog                             | How often do they break APIs?  |
| Community feedback | Search "<library> vs <alternative>"         | What do real users say?        |

Red Flags:

  • No commits in > 1 year
  • More open issues than closed
  • Many unresolved security advisories
  • Abandoned by maintainer (check for notices)
  • Heavy dependencies for simple tasks

Example Evaluation:

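# Evaluating a CSV parsing library
# (illustrative figures; npm info does not report download or issue counts,
# so gather those from the registry page and the GitHub repository)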
npm info papaparse
# - Last publish: 2 months ago ✓
# - Downloads: 5M/week ✓
# - Dependencies: 0 ✓
# - Issues: 180 open, 520 closed (74% close rate) ✓

# Compare alternatives
npm info csv-parser
# - Last publish: 3 years ago ✗
# - Downloads: 1M/week
# - Issues: 45 open, 30 closed (40% close rate) ✗

# Decision: Use papaparse (active maintenance, proven track record)
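
The thresholds in the evaluation template can be applied mechanically. The following is a rough sketch, not a definitive scoring method: the `LibraryStats` fields and the `failed_checks` helper are hypothetical names, and only three of the template's criteria are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LibraryStats:
    # Hypothetical record; fill it in by hand from the repository and registry.
    name: str
    months_since_last_commit: int
    open_issues: int
    closed_issues: int
    direct_dependencies: int


def issue_close_rate(stats: LibraryStats) -> float:
    """Fraction of all issues that have been closed (0.0 when there are none)."""
    total = stats.open_issues + stats.closed_issues
    return stats.closed_issues / total if total else 0.0


def failed_checks(stats: LibraryStats) -> List[str]:
    """Return the names of the template thresholds that the library misses."""
    failures = []
    if stats.months_since_last_commit >= 6:      # template: last commit < 6 months
        failures.append("last commit")
    if issue_close_rate(stats) <= 0.60:          # template: close rate > 60%
        failures.append("issue close rate")
    if stats.direct_dependencies >= 10:          # template: < 10 direct deps
        failures.append("dependencies")
    return failures
```

With the issue counts from the example (180 open, 520 closed), the close rate is 520 / 700, or about 74%, which clears the 60% threshold. The 45 open and 30 closed issues of the alternative give 30 / 75, or 40%, which does not.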

3. Decision Tracking with SQLite

IMPORTANT: Store all architectural and library decisions in a local SQLite database for future reference and pattern matching.

Setup (one-time):

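# Install required tools
# (sqlite-vec and sentence-transformers are only needed for the vector search script)
pip install sqlite-utils sqlite-vec sentence-transformers

# Create decisions database
mkdir -p .dev
sqlite-utils create-database .dev/decisions.db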

# Create tables
sqlite-utils create-table .dev/decisions.db decisions \
  id integer \
  timestamp text \
  category text \
  decision text \
  reasoning text \
  alternatives text \
  outcome text \
  --pk id

sqlite-utils create-table .dev/decisions.db library_evaluations \
  id integer \
  library_name text \
  version text \
  purpose text \
  health_score integer \
  last_commit text \
  issues_ratio real \
  decision text \
  notes text \
  --pk id

# Enable vector search for similarity (installs the sqlite-vec plugin for sqlite-utils)
sqlite-utils install sqlite-utils-sqlite-vec

Recording Decisions:

# Record architectural decision ("-" tells sqlite-utils to read the CSV from stdin)
sqlite-utils insert .dev/decisions.db decisions - \
  --csv << 'EOF'
timestamp,category,decision,reasoning,alternatives,outcome
2025-10-24T10:30:00,architecture,Use event-driven architecture for order processing,"Decouples services, enables async processing, easier to scale","Monolithic approach (too coupled), Microservices (too complex)",Implemented successfully
EOF

# Record library evaluation ("-" tells sqlite-utils to read the CSV from stdin)
sqlite-utils insert .dev/decisions.db library_evaluations - \
  --csv << 'EOF'
library_name,version,purpose,health_score,last_commit,issues_ratio,decision,notes
papaparse,5.4.1,CSV parsing,9,2025-08-15,0.74,approved,Zero dependencies and active maintenance
csv-parser,3.0.0,CSV parsing,4,2022-05-10,0.40,rejected,Abandoned by maintainer
EOF
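
For agents or scripts that prefer not to shell out, the same record can be written with Python's standard `sqlite3` module. This is a sketch under the assumption that the `decisions` table has the columns created during setup; the helper names `open_decisions_db`, `record_decision`, and `decisions_by_category` are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def open_decisions_db(path: str) -> sqlite3.Connection:
    """Open the database and create the decisions table if it is missing."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    conn.execute(
        """CREATE TABLE IF NOT EXISTS decisions (
               id INTEGER PRIMARY KEY,
               timestamp TEXT, category TEXT, decision TEXT,
               reasoning TEXT, alternatives TEXT, outcome TEXT)"""
    )
    return conn


def record_decision(conn, category, decision, reasoning, alternatives="", outcome=""):
    """Insert one decision row and return its id."""
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO decisions "
            "(timestamp, category, decision, reasoning, alternatives, outcome) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (timestamp, category, decision, reasoning, alternatives, outcome),
        )
    return cur.lastrowid


def decisions_by_category(conn, category, limit=5):
    """Most recent decisions in a category, newest first."""
    return conn.execute(
        "SELECT * FROM decisions WHERE category = ? ORDER BY id DESC LIMIT ?",
        (category, limit),
    ).fetchall()
```

Parameterized queries avoid the quoting problems that free-text fields such as `reasoning` can cause in a CSV heredoc.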

Querying Past Decisions:

# Find similar past decisions
sqlite-utils query .dev/decisions.db \
  "SELECT * FROM decisions WHERE category = 'authentication' ORDER BY timestamp DESC LIMIT 5"

# Check if we've evaluated a library before
sqlite-utils query .dev/decisions.db \
  "SELECT * FROM library_evaluations WHERE library_name LIKE '%upload%'"

# Find decisions that didn't work out
sqlite-utils query .dev/decisions.db \
  "SELECT * FROM decisions WHERE outcome LIKE '%reverted%' OR outcome LIKE '%failed%'"

# Get all libraries we rejected and why
sqlite-utils query .dev/decisions.db \
  "SELECT library_name, decision, notes FROM library_evaluations WHERE decision = 'rejected'"

Using Vector Search for Similar Problems:

# Script: .dev/find_similar_decisions.py
# Requires: pip install sqlite-vec sentence-transformers
# Assumes the decisions table has an `embedding` BLOB column holding a float32
# vector for each row; the setup above does not create or populate it.
import sqlite3
import sqlite_vec
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

def find_similar_decisions(query: str, limit: int = 5):
    """Find past decisions similar to the current problem."""
    db = sqlite3.connect('.dev/decisions.db')
    db.enable_load_extension(True)
    sqlite_vec.load(db)
    db.enable_load_extension(False)

    # Generate a float32 embedding for the query
    query_embedding = model.encode(query).astype('float32')

    # Cosine distance: smaller values mean more similar
    cursor = db.execute("""
        SELECT decision, reasoning, outcome,
               vec_distance_cosine(embedding, ?) AS distance
        FROM decisions
        WHERE embedding IS NOT NULL
        ORDER BY distance ASC
        LIMIT ?
    """, (query_embedding.tobytes(), limit))

    rows = cursor.fetchall()
    db.close()
    return rows

# Usage
similar = find_similar_decisions("How to handle large file uploads?")
for decision, reasoning, outcome, distance in similar:
    print(f"Distance: {distance:.2f}")
    print(f"Decision: {decision}")
    print(f"Outcome: {outcome}\n")
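
To make the ordering in the query easier to reason about, here is a pure-Python sketch of cosine distance, which is what `vec_distance_cosine` is understood to compute (one minus the cosine similarity). The function names are hypothetical, and the toy two-dimensional vectors stand in for real embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: 0 for same direction, 1 for orthogonal, 2 for opposite."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (label, distance) pairs, closest first."""
    scored = [(label, cosine_distance(query, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])
```

Because this is a distance, the most relevant past decisions are the ones with the smallest values, which is why results are sorted in ascending order.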

Decision Template (save as .dev/DECISION_TEMPLATE.md):

# Decision: [Title]

**Date:** YYYY-MM-DD
**Category:** [architecture|library|pattern|tooling]
**Status:** [proposed|accepted|rejected|superseded]

## Context
What problem are we trying to solve?

## Decision
What did we decide to do?

## Alternatives Considered
1. **Alternative 1:** Brief description
   - Pros: ...
   - Cons: ...
   
2. **Alternative 2:** Brief description
   - Pros: ...
   - Cons: ...

## Reasoning
Why did we choose this approach?
- Fits existing architecture because...
- Most maintainable because...
- Most extensible because...

## Consequences
What are the implications?
- Positive: ...
- Negative: ...
- Risks: ...

## References
- GitHub Issue: #123
- StackOverflow: [link]
- Documentation: [link]

## Outcome (update later)
How did this decision work out?

4. Contract-First Design

  • Define interfaces/protocols/traits before implementation
  • Use dependency injection with contracts
  • Mock contracts for fast unit tests
  • Keep contracts focused and cohesive

5. Incremental Implementation Pattern

  1. Research - Find 3 approaches, evaluate against codebase
  2. Decide - Record decision in decisions.db
  3. Define the contract - What operations does this component expose?
  4. Build a mock - Simple in-memory implementation for testing
  5. Write tests - Against the contract, using the mock
  6. Implement real version - Satisfies same contract
  7. Review outcome - Update decisions.db with results

6. Test Coverage Goals

  • Unit tests: 80%+ coverage of business logic
  • Integration tests: Critical paths and data flows
  • E2E tests: Key user journeys
  • Run tests before commits (pre-commit hooks)

2. Debugging Cycle

The Core Principle

Logs reveal reality. Assumptions hide bugs. Always verify with actual data.

Simple Debugging Process

Step 1: Reproduce the Bug

# Document exact steps to trigger the bug
# Create a simple script or command that fails consistently

# Example: API endpoint fails
curl -X POST http://localhost:3000/api/orders \
  -H "Content-Type: application/json" \
  -d '{"userId": "123", "items": [{"id": "abc"}]}'

# Example: CLI command fails
npm run process-video -- --id=video_123

# If it's flaky, note the failure rate
# Try 10 times: 3 failures = 30% failure rate
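
Measuring the failure rate of a flaky command can be automated. The sketch below assumes the command signals failure through a non-zero exit code; `failure_rate` is a hypothetical helper, and the `always_fails` command exists only to demonstrate it.

```python
import subprocess
import sys
from typing import List


def failure_rate(command: List[str], attempts: int = 10) -> float:
    """Run `command` repeatedly; return the fraction of runs with a non-zero exit code."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    failures = 0
    for _ in range(attempts):
        result = subprocess.run(command, capture_output=True)
        if result.returncode != 0:
            failures += 1
    return failures / attempts


# Example command that always fails, using the current interpreter.
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]

# For the CLI example above, the call would be:
# failure_rate(["npm", "run", "process-video", "--", "--id=video_123"])
```

Recording the rate before a fix gives a baseline: a change that takes a 30% failure rate to 0% over the same number of attempts is much stronger evidence than a single passing run.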

Step 2: Add Logging to See What's Actually Happening

// Add logs at key points to see the ACTUAL data
async function processOrder(orderId: string) {
  console.log('[ORDER] Starting processOrder:', { orderId, timestamp: new Date() });
  
  const order = await getOrder(orderId);
  console.log('[ORDER] Fetched order:', { order });  // SEE ACTUAL ORDER DATA
  
  const items = await getOrderItems(order.id);
  console.log('[ORDER] Fetched items:', { itemCount: items.length, items });
  
  if (items.length === 0) {
    console.error('[ORDER] ERROR: No items found for order:', { orderId, order });
    throw new Error('Order has no items');
  }
  
  const validated = validateItems(items);
  console.log('[ORDER] Validation result:', { validated, itemCount: items.length });
  
  return validated;
}

Logging Best Practices:

  • Use consistent prefixes like [MODULE] for easy filtering
  • Log input parameters at function entry
  • Log intermediate results at each step
  • Log the actual values, not just "processing..."
  • Include context: IDs, counts, timestamps
  • Log before throwing errors
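
The practices above carry over to other languages. As a sketch in Python, the standard `logging` module can supply the `[MODULE]` prefix and timestamps automatically. The `get_module_logger` helper and the simplified `process_order` function are hypothetical and only mirror the shape of the TypeScript example.

```python
import logging


def get_module_logger(module: str) -> logging.Logger:
    """Return a logger whose lines carry a [MODULE] prefix for easy filtering."""
    logger = logging.getLogger(module)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(name)s] %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
        logger.propagate = False
    return logger


log = get_module_logger("ORDER")


def process_order(order_id: str, items: list) -> int:
    # Log inputs at function entry, with actual values.
    log.debug("Starting process_order: order_id=%r", order_id)
    log.debug("Fetched items: count=%d items=%r", len(items), items)
    if not items:
        # Log the context before raising so the failure is visible in the log.
        log.error("No items found: order_id=%r", order_id)
        raise ValueError("Order has no items")
    return len(items)
```

Using `%r` in the format string shows quotes around strings, which makes it easier to spot an ID that is empty or has stray whitespace.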

Step 3: Run the Code and Read the Logs

# Run and capture all output
npm run dev 2>&1 | tee debug.log

# Filter to your module
grep "\[ORDER\]" debug.log

# Find errors
grep -iE "error|exception|failed" debug.log

# Get context around an error (10 lines before and after)
grep -B 10 -A 10 "No items found" debug.log

Step 4: Analyze What the Logs Show

Look for:

  • Unexpected values: Is order actually null? Is items.length really 0?
  • Wrong data types: Is orderId a string when it should be a number?
  • Missing data: Are fields undefined that you expected?
  • Timing issues: Does one thing happen before another when it should be after?

# Example log analysis showing the bug
[ORDER] Starting processOrder: { orderId: 'order_123' }
[ORDER] Fetched order: { order: { id: 'order_123', userId: 'user_456' } }
[ORDER] Fetched items: { itemCount: 0, items: [] }  # ← BUG: Why 0 items?
[ORDER] ERROR: No items found for order: { orderId: 'order_123' }

Step 5: Check Your Assumptions with Direct Queries

# Don't trust the code - verify directly in the database
sqlite3 database.db "SELECT * FROM order_items WHERE order_id = 'order_123';"

# Result shows 3 items exist!
# So the code is querying wrong...

# Check what the code is actually querying:
# add logging to the getOrderItems function, for example
#   console.log('[DB] Query:', { sql, params });

// Example bug found: querying with wrong field
async function getOrderItems(orderId: string) {
  const sql = 'SELECT * FROM order_items WHERE id = ?';  // ← BUG: should be "order_id"
  console.log('[DB] Query:', { sql, params: [orderId] });
  return db.all(sql, [orderId]);
}
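
The wrong-column bug is easy to reproduce in isolation. The sketch below uses an in-memory SQLite database with a hypothetical two-column `order_items` table and runs the same lookup against both columns, logging each query as it goes.

```python
import sqlite3

# Hypothetical schema with three items that belong to order_123.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (id TEXT PRIMARY KEY, order_id TEXT)")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [("item_1", "order_123"), ("item_2", "order_123"), ("item_3", "order_123")],
)


def count_rows(sql: str, params: tuple) -> int:
    """Log the exact query and parameters, then return the number of rows."""
    print("[DB] Query:", {"sql": sql, "params": params})
    return len(conn.execute(sql, params).fetchall())


# Filtering on the item's own id finds nothing for an order ID.
buggy = count_rows("SELECT * FROM order_items WHERE id = ?", ("order_123",))

# Filtering on the foreign key finds the order's items.
fixed = count_rows("SELECT * FROM order_items WHERE order_id = ?", ("order_123",))
```

Both queries are valid SQL and neither raises an error, which is why this class of bug only shows up when the logged query is compared against the data.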

Step 6: Fix Based on What You Found

// Fix the actual problem found in logs
async function getOrderItems(orderId: string) {
  const sql = 'SELECT * FROM order_items WHERE order_id = ?';  // ✓ Fixed
  console.log('[DB] Query:', { sql, params: [orderId] });
  return db.all(sql, [orderId]);
}

Step 7: Verify the Fix Works

# Run the reproduction steps again, capturing fresh output
npm run process-order -- --id=order_123 2>&1 | tee debug.log

# Check logs show correct behavior
grep "\[ORDER\]" debug.log
# Should now show: itemCount: 3, items: [...]

# Run tests
npm test

Step 8: Add a Test to Prevent Regression

// Write test that would have caught this bug
describe('getOrderItems', () => {
  it('should fetch items by order_id not id', async () => {
    await db.insert('order_items', { id: 'item_1', order_id: 'order_123' });
    
    const items = await getOrderItems('order_123');
    
    expect(items).toHaveLength(1);
    expect(items[0].id).toBe('item_1');
  });
});

When to Use Git History

Use git when you need to understand when and why code changed:

Find When Code Last Changed

# See recent commits affecting this file
git log --oneline -10 -- src/services/order.ts

# See actual changes
git log -p -- src/services/order.ts | less

# Find who changed a specific line
git blame src/services/order.ts | grep "getOrderItems"

# See commits from last week
git log --since="1 week ago" --oneline

Compare With Working Version

# Compare current code with last release
git diff v1.2.0 -- src/services/order.ts

# Compare with specific commit
git diff abc123 -- src/services/order.ts

# Show file contents from previous commit
git show HEAD~1:src/services/order.ts

Find the Commit That Broke Things (Binary Search)

# Only use this if you know when it worked before
git bisect start
git bisect bad HEAD              # Current version is broken
git bisect good v1.2.0           # This version worked

# Git will checkout commits for you to test
# Run your test at each commit
npm test

# Tell git if this commit is good or bad
git bisect good   # Test passed
git bisect bad    # Test failed

# Git finds the breaking commit
# "abc123 is the first bad commit"

git bisect reset  # Return to original state
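
The search that `git bisect` performs can be sketched as an ordinary binary search over a linear list of commits. This is a simplified model (real history can branch and merge), and `first_bad_commit` and `is_good` are hypothetical names; `is_good` stands for whatever test is run at each step.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Find the first bad commit.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad, matching `git bisect good` / `git bisect bad`.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(commits[mid]):
            good = mid   # the breakage happened after mid
        else:
            bad = mid    # mid is already broken
    return commits[bad]
```

Each test halves the remaining range, so about log2(n) test runs are enough to search n commits. That is why bisecting is practical even across hundreds of commits.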

Common Bug Patterns

Pattern 1: Wrong Parameter Passed

// Function expects video ID
async function getVideoSegments(videoId: string) {
  return db.query('SELECT * FROM segments WHERE video_id = ?', [videoId]);
}

// But called with job ID instead
const segments = await getVideoSegments(job.id);  // ❌ Wrong ID type

// Fix: Pass correct ID
const segments = await getVideoSegments(job.videoId);  // ✓
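
One way to catch this class of mistake before runtime is to give each kind of ID its own type. The sketch below uses Python's `typing.NewType`; the `VideoId`, `JobId`, and `segments_query` names are hypothetical. `NewType` is not enforced at runtime, so the protection comes from running a static type checker such as mypy.

```python
from typing import NewType, Tuple

# Distinct ID types that are plain strings at runtime.
VideoId = NewType("VideoId", str)
JobId = NewType("JobId", str)


def segments_query(video_id: VideoId) -> Tuple[str, Tuple[str, ...]]:
    """Build the parameterized query; type checkers accept only a VideoId here."""
    return ("SELECT * FROM segments WHERE video_id = ?", (video_id,))


job_id = JobId("job_42")
video_id = VideoId("video_123")

sql, params = segments_query(video_id)   # accepted
# segments_query(job_id)                 # a type checker flags this as an incompatible argument type
```

The same idea applies in TypeScript through branded types, at the cost of wrapping IDs where they enter the system.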

Pattern 2: Assuming Data Exists

// Code assumes user always exists
const userName = user.name.toUpperCase();  // ❌ Crashes if user is null

// Fix: Check first
const userName = user?.name?.toUpperCase() ?? 'Unknown';  // ✓

Pattern 3: Wrong Query Field

// Querying wrong column
db.query('SELECT * FROM users WHERE id = ?', [email]);  // ❌ id vs email

// Fix: Use correct column
db.query('SELECT * FROM users WHERE email = ?', [email]);  // ✓

Debugging Checklist

When stuck on a bug:

  • Can I reproduce it consistently?
  • Have I added logs to see actual data?
  • Have I run the code and read the logs?
  • Do the logs show unexpected values?
  • Have I verified my assumptions with direct queries?
  • Have I checked if the data actually exists?
  • Have I checked the function parameters match what's expected?
  • Have I looked at recent git changes to this code?
  • Have I written a test that reproduces the bug?

Remember

The Debugging Mantra:

  1. Add logs - You can't debug what you can't see
  2. Run code - See what actually happens
  3. Read logs - Find the unexpected values
  4. Verify assumptions - Check the database/files directly
  5. Fix the real problem - Not what you think the problem is
  6. Add a test - Prevent it from happening again

Never trust assumptions. Always verify with actual data. These instructions are language-agnostic.
