```mermaid
graph TD
    subgraph "Phase 1: Specification & Design"
        A[Start: Project Idea] --> B{Spec Development w/ LLM};
        B -- "Iterative Refinement" --> B;
        B --> C[architecture.md];
        C --> D{Task Decomposition w/ LLM};
        D --> E[tasks.md];
    end
    subgraph "Phase 2: Implementation & Iteration"
        E --> F[Select Task];
        F --> G[Execute with AI Coding Assistant];
        G --> H{Code Review?};
        H -- "Yes (Every 3-4 Tasks)" --> I[AI Code Review];
        I --> J[Incorporate Feedback];
        J --> G;
        H -- "No" --> K[Commit to Version Control];
        K --> L{All Tasks Done?};
        L -- "No" --> F;
    end
    subgraph "Phase 3: Finalization & Review"
        L -- "Yes" --> M[Final AI Code Review];
        M --> N[Final Human Review & Linting];
        N --> O[Deployment];
    end
    G -- "Design Change Needed" --> C;
```
This document outlines a refined, spec-driven development process for building production-quality software in collaboration with AI agents. The methodology is founded on iterative loops, emphasizing detailed upfront planning and continuous feedback to guide AI agents effectively. This approach transforms the development lifecycle into a more efficient, predictable, and collaborative partnership between human developers and AI.
- Specification First: A well-defined architecture is the bedrock of the project. A significant portion of time should be dedicated to creating a clear and comprehensive specification before any implementation begins.
- Explicitness is Key: Communicate with AI agents as you would with a junior developer—with extreme clarity and no unstated assumptions. Define requirements, tech stacks, and expected behaviors precisely.
- Iterative Refinement: Treat all documents and code as living artifacts. Continuously improve specifications based on feedback discovered during implementation and review.
- Human-in-the-Loop: While AI automates repetitive tasks, human oversight is crucial for strategic decisions, architectural guidance, and final quality assurance.
- Frequent Integration: Regular commits and intermediate reviews act as safety nets, preventing divergence and ensuring the project stays on track.
Objective: To produce a comprehensive architectural specification (architecture.md) and a granular task list (tasks.md) that will guide the entire development process.
Workflow:
- Initial Specification Generation:
  - Partner with a high-level LLM (e.g., Qwen Coder, GLM 4.5) to generate the initial `architecture.md`.
  - This document should explicitly detail:
    - Project Goals & Features: High-level objectives and specific user-facing functionalities.
    - Technology Stack: Specific versions of frameworks, libraries, and databases.
    - Data Models: Database schema definitions (e.g., DDL statements).
    - API Contracts: Endpoints, request/response formats, and authentication methods.
    - Project Structure: A clear file and directory layout.
    - Non-Functional Requirements: Guidelines on logging, error handling, and security.
- Iterative Specification Refinement:
  - Engage in a "conversation" with the LLM to refine `architecture.md`. The goal is to eliminate ambiguity.
  - A good rule of thumb is to allocate a significant portion of the total project time to this phase to prevent costly changes later.
- Task Decomposition:
  - Once the architecture is stable, use an LLM to break it down into a sequential list of small, manageable, and testable tasks in `tasks.md`.
  - Review and adjust this list, ensuring that foundational tasks (e.g., setting up the database) come first.
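To make the decomposition step concrete, here is an excerpt of what `tasks.md` might contain; the task names and ordering are illustrative, not prescriptive:

```markdown
# tasks.md

1. Initialize the repository, project structure, and dependency manifest.
2. Create the database schema from the DDL statements in architecture.md.
3. Implement the data-access layer for the User model, with unit tests.
4. Implement the login endpoint per the API contract, with tests.
5. Implement the user-lookup endpoint per the API contract, with tests.
```

Each task should be small enough for a single assistant session and end in a verifiable state (tests passing), which is why foundational work such as the schema comes before anything that depends on it.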
Objective: To implement the solution by systematically working through the decomposed tasks, incorporating continuous feedback and integration.
Workflow:
- Task Implementation:
  - For each task, provide an AI coding assistant (e.g., Roo Code or Cline in an IDE, or a CLI tool such as Qwen Code or OpenCode) with the full context of `architecture.md` and the specific task from `tasks.md`.
  - A standardized prompt is highly effective: "Please review the `architecture.md` document. Your task is to implement Task X from `tasks.md`. Propose a plan first. After I approve, write clean, minimal, and thoroughly tested code. Ensure all tests pass before considering the task complete."
- Plan Review:
  - Always ask the agent for a plan before it writes code. This is a crucial step to catch potential over-engineering or misunderstandings early.
- Version Control:
  - After each task is successfully implemented and tested, commit the changes to Git. Frequent commits provide a safety net and a clear history of progress.
- Intermediate AI Code Reviews:
  - Every 3-4 tasks, conduct an AI-powered code review. This helps catch issues early and maintains code quality throughout the process.
  - Prompt: "You are a principal software engineer with a critical eye for detail. Review the code implemented for tasks 1-4. Ground your review in `architecture.md` and focus on code quality, test coverage, and adherence to the specification."
- Specification Upkeep:
  - If implementation reveals a necessary change in design, immediately update `architecture.md` before proceeding. This ensures the specification remains the single source of truth.
Objective: To ensure the completed project is robust, compliant with the specification, and ready for deployment.
Workflow:
- Final AI Review:
  - Once all tasks are complete, perform a comprehensive AI review of the entire codebase. This review should focus on holistic aspects like security vulnerabilities, overall architectural integrity, and final adherence to REST practices or other established standards.
- Final Human Review:
  - A human developer should conduct the final review. This is the last opportunity to catch subtle logic errors, address linting issues, and ensure the project meets the highest quality standards.
- Deployment:
  - With all reviews complete and issues addressed, the project is ready to be deployed through your standard CI/CD pipeline.
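The earlier phases each include an example prompt; for the final AI review step above, a prompt in the same style might read as follows (the wording is illustrative, not part of the original workflow):

```
You are a principal software engineer conducting a pre-release review. Review
the entire codebase against architecture.md. Focus on security vulnerabilities,
overall architectural integrity, and adherence to the API contracts and
non-functional requirements defined in the specification. List your findings
by severity, with file references.
```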
- High-Level LLMs: Qwen Coder, GLM 4.5, and so on.
- AI Coding Assistants: Cursor, an IDE with integrated AI capabilities.
- Knowledge Management: Obsidian, for its excellent linking and organization of markdown files.
- Version Control: Git.
- Terminal Management: tmux, for managing multiple processes like development servers and tests.
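Keeping the two living documents in the repository itself means every commit snapshots the specification alongside the code it governs. One possible layout (directory names are illustrative):

```
project/
├── architecture.md
├── tasks.md
├── src/
└── tests/
```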
The ultimate goal is to evolve this framework into a fully automated system orchestrated by a primary agent:
- A "manager" agent oversees the `tasks.md` file.
- It dispatches individual tasks to specialized "developer" agents running in sandboxed environments (e.g., Docker containers).
- Upon task completion, the developer agent submits a pull request.
- A team of "reviewer" agents (specializing in security, architecture, and best practices) automatically reviews the PR.
- Based on feedback, the PR is either merged or sent back for automated revisions.
- A final human review gives the ultimate approval before triggering the CI/CD pipeline for deployment.
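Following the document's own diagram convention, the envisioned automated pipeline can be sketched as below; the agent roles come from the steps above, while the exact topology is an assumption:

```mermaid
graph LR
    M[Manager Agent] -->|dispatch task| D[Developer Agent in Sandbox];
    D -->|pull request| R{Reviewer Agents};
    R -->|revisions needed| D;
    R -->|approved| H[Final Human Review];
    H -->|approve| P[CI/CD Deployment];
```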