Analyze the BillHawk project repository at /github/teo-mateo/ai-email-invoice-scraper and build a comprehensive knowledge graph following these instructions:
IMPORTANT: Execute each step independently. For each step, read the relevant project files to extract accurate information for nodes, observations, and relationships. Do not rely on assumptions or generic knowledge - base everything on actual code analysis.
Step 1: Create the Project Node
- Read README.md, Application.md, and CLAUDE.md files
- Create a Project node named "BillHawk"
- Add observations extracted directly from these files about purpose, architecture, tech stack, and key features
- Include file paths and specific details found in the documentation
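As a rough illustration of what this step produces, the sketch below models a node as a name, a type, and a list of observations that each carry their source file. The `Node` class, its field names, and the placeholder observation text are assumptions made for illustration; the real schema depends on the knowledge graph tool in use, and real observations must be taken from the files themselves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical knowledge-graph node: a name, a type, and evidence-backed observations."""
    name: str
    node_type: str
    observations: List[str] = field(default_factory=list)

    def observe(self, text: str, source: str) -> None:
        """Record an observation together with the file it was read from."""
        self.observations.append(f"{text} (source: {source})")


project = Node(name="BillHawk", node_type="Project")
# Placeholder text: replace with statements actually found in README.md.
project.observe("<purpose as stated in the documentation>", "README.md")
```

Keeping the source path inside each observation makes it possible to check later that every statement traces back to a file that was actually read.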
Step 2: Create Architecture Layer Nodes
For each category below, read the actual source files in the specified directories:
- API Controllers (type: APIController) - Read all files in backend/src/Api/Controllers/
  - Extract endpoints, HTTP methods, routes, and auth requirements from attributes and methods
  - Create relationships: controller->invokes->command/query based on actual constructor injections and method calls
- Commands (type: Command) - Read all files in backend/src/Domain/Commands/
  - Extract command names, parameters, and return types from class definitions
  - Identify which repositories are injected in handlers
  - Create relationships: command->uses->repository, command->calls->repository_method
- Queries (type: Query) - Read all files in backend/src/Domain/Queries/
  - Extract query parameters and response types
  - Identify repository dependencies from constructors
  - Create relationships: query->uses->repository, query->calls->repository_method
- Repositories (type: Repository) - Read all files in backend/src/Data/Repositories/
  - Extract interface names and implementation classes
  - Create relationships: repository->has_method->method
- Repository Methods (type: RepositoryMethod) - From repository files
  - Extract all public method signatures with parameters and return types
  - Add observations about what each method does
- Domain Services (type: DomainService) - Read files in backend/src/Domain/Services/
  - Extract service classes and their purposes
  - Identify dependencies and integrations
- Domain Models (type: DomainModel) - Read files in backend/src/Data/Models/ and backend/src/Common/Models/
  - Extract model classes with key properties
  - Note any special attributes or configurations
- DTOs (type: DTO) - Read response/request models throughout the codebase
  - Extract DTO definitions with properties
  - Create relationships: controller->returns->DTO, command->operates_on->model
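The controller extraction described above can be sketched as a small regex pass over a controller file's text, assuming the controllers use standard ASP.NET Core routing attributes such as `[Route("...")]` and `[HttpGet("...")]`. The sample controller below is invented for illustration and is not code from the repository. The regex is only a heuristic: it ignores route tokens like `[controller]` and named attribute arguments, so the file itself remains the authority.

```python
import re

# Matches ASP.NET Core routing attributes such as [HttpGet] or [HttpPost("{id}/archive")].
HTTP_ATTR = re.compile(r'\[Http(Get|Post|Put|Delete|Patch)(?:\("([^"]*)"\))?\]')
# Matches a class-level [Route("...")] attribute.
ROUTE_ATTR = re.compile(r'\[Route\("([^"]*)"\)\]')


def extract_endpoints(source: str) -> list:
    """Return (HTTP method, route) pairs found in one controller file's text."""
    base = ROUTE_ATTR.search(source)
    prefix = base.group(1) if base else ""
    endpoints = []
    for verb, template in HTTP_ATTR.findall(source):
        # Join the class-level prefix and the action template, skipping empty parts.
        route = "/".join(part for part in (prefix, template) if part)
        endpoints.append((verb.upper(), route))
    return endpoints


# Invented sample, NOT code from the BillHawk repository.
SAMPLE = '''
[Authorize]
[Route("api/suppliers")]
public class SuppliersController : ControllerBase
{
    [HttpGet]
    public Task<IActionResult> List() => ...;

    [HttpPost("{id}/archive")]
    public Task<IActionResult> Archive(int id) => ...;
}
'''

endpoints = extract_endpoints(SAMPLE)
```

Each resulting pair can become one observation on the controller node, alongside the file path it was read from.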
Step 3: Create Infrastructure Nodes
Read configuration files, project files, and imports to identify:
- Technologies (type: Technology) - From *.csproj files and package.json
  - Extract exact versions and configurations
- Libraries (type: Library) - From Directory.Packages.props and package references
  - Note specific versions and purposes based on usage
- External Services (type: ExternalService) - From configuration and service implementations
  - Extract URLs, endpoints, and integration details
- Databases (type: Database) - From appsettings.json and connection strings
  - Note specific SQL Server configuration and features used
- Database Tables (type: DatabaseTable) - Read all files in backend/src/Data/Database/Migrations/Scripts/
  - Extract table names, key columns, relationships
- Protocols (type: Protocol) - From authentication and API implementations
  - Document how OAuth, JWT, and JMAP are specifically used
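For the Libraries item, exact versions can be read mechanically from Directory.Packages.props. The sketch below assumes the file follows the usual NuGet Central Package Management layout (`<PackageVersion Include="..." Version="..." />` elements, no XML namespace). The package names and versions in the sample are placeholders, not values from the repository.

```python
import xml.etree.ElementTree as ET


def read_package_versions(props_xml: str) -> dict:
    """Map package name -> version from Directory.Packages.props content."""
    root = ET.fromstring(props_xml.strip())
    versions = {}
    # Central Package Management declares versions as <PackageVersion Include=... Version=... />.
    for item in root.iter("PackageVersion"):
        name = item.get("Include")
        version = item.get("Version")
        if name and version:
            versions[name] = version
    return versions


# Invented sample content; package names and versions are placeholders.
SAMPLE_PROPS = """
<Project>
  <ItemGroup>
    <PackageVersion Include="Example.Package" Version="1.2.3" />
    <PackageVersion Include="Another.Package" Version="4.5.6" />
  </ItemGroup>
</Project>
"""

packages = read_package_versions(SAMPLE_PROPS)
```

The purpose of each library still has to come from reading where it is used in the code; the props file only supplies the name and version.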
Step 4: Create Feature Nodes
Read Process_flow.md and analyze the codebase to identify:
- Features (type: Feature) - Major functional areas of the application
  - Create relationships: feature->powered_by->service
- Workflows (type: Workflow) - From Process_flow.md and command implementations
  - Document step-by-step processes
- Background Jobs (type: BackgroundJob) - From BackgroundTaskProcessor and related files
  - Extract job types and scheduling
Step 5: Create Frontend Nodes
Read all files in frontend/src/:
- Pages (type: FrontendPage) - From frontend/src/pages/
  - Extract page components and their routes
- Components (type: FrontendComponent) - From frontend/src/components/
  - Note key reusable components
  - Create relationships: page->uses->component
- Contexts (type: FrontendContext) - From frontend/src/contexts/
  - Document React contexts and their purposes
- API Services (type: FrontendService) - From frontend/src/services/
  - Extract service functions and API calls
  - Create relationships: component->calls->api_service
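The page->uses->component relationships above can be derived from import statements. The following sketch assumes the frontend uses ES module imports and that shared components live under a `components/` directory, as the paths in this step suggest. The page name, component names, and sample source are hypothetical.

```python
import re

# Matches ES module imports such as: import { Foo } from "../components/Foo";
IMPORT_RE = re.compile(r'''import\s+(?:[\w*\s{},]+\s+from\s+)?["']([^"']+)["']''')


def component_imports(source: str) -> list:
    """Return the module names imported from a components directory."""
    names = []
    for path in IMPORT_RE.findall(source):
        if "/components/" in path:
            # Use the last path segment as the component name.
            names.append(path.rsplit("/", 1)[-1])
    return names


# Hypothetical page source, NOT taken from the repository.
SAMPLE_PAGE = '''
import React from "react";
import { InvoiceTable } from "../components/InvoiceTable";
import Spinner from '../components/common/Spinner';
import { fetchInvoices } from "../services/invoiceService";
'''

relationships = [("InvoicesPage", "uses", name) for name in component_imports(SAMPLE_PAGE)]
```

An import only shows that a component is referenced; confirming that it is rendered still requires reading the page's JSX.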
Step 6: Create Configuration Nodes
Read all configuration files:
- Configurations (type: Configuration) - From appsettings.json files
  - Extract key configuration sections with actual values (excluding secrets)
- Environment Variables (type: EnvironmentVariable) - From launchSettings.json and code
  - Document required environment variables
- Connection Strings (type: ConnectionString) - From configuration files
  - Note database and service connection configurations
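To record actual values while excluding secrets, one option is to flatten each appsettings.json file into colon-separated keys (the form ASP.NET Core configuration uses) and redact anything that looks sensitive. The keyword list below is an assumed heuristic and the sample settings are invented; whole connection strings are redacted here for safety, though server and database names could be extracted separately if needed.

```python
import json

# Key fragments treated as sensitive; an assumed heuristic, extend as needed.
SENSITIVE = ("password", "secret", "key", "token", "connectionstring")


def flatten_config(node: dict, prefix: str = "") -> dict:
    """Flatten nested JSON into 'Section:Key' -> value, redacting likely secrets."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}:{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_config(value, path))
        elif any(word in path.lower() for word in SENSITIVE):
            flat[path] = "<redacted>"
        else:
            flat[path] = value
    return flat


# Invented sample, NOT the repository's appsettings.json.
SAMPLE_SETTINGS = json.loads("""
{
  "Logging": {"LogLevel": {"Default": "Information"}},
  "ConnectionStrings": {"Default": "Server=example;Password=placeholder"},
  "Jwt": {"Issuer": "example-issuer", "SigningKey": "placeholder"}
}
""")

settings = flatten_config(SAMPLE_SETTINGS)
```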
Step 7: Additional Intelligence
While reading code files, also capture:
- Security implementations and considerations
- Performance optimizations or concerns noted in comments
- TODO comments and technical debt markers
- Test coverage from test files
- Error handling patterns
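TODO comments and similar debt markers are easy to collect with a line scan that keeps the file path and line number for each hit. The sketch below assumes `//`, `#`, or `/*` comment prefixes and the markers TODO, FIXME, and HACK; the file path and sample source are hypothetical.

```python
import re

# A comment prefix, then a marker word, then the rest of the line as the message.
MARKER_RE = re.compile(r"(?://|#|/\*)\s*(TODO|FIXME|HACK)\b[:\s]*(.*)")


def find_debt_markers(path: str, text: str) -> list:
    """Return 'path:line MARKER message' strings for each marker comment in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = MARKER_RE.search(line)
        if match:
            hits.append(f"{path}:{number} {match.group(1)} {match.group(2).strip()}")
    return hits


# Hypothetical source text, NOT taken from the repository.
SAMPLE_SOURCE = (
    "public void Run()\n"
    "{\n"
    "    // TODO: retry on timeout\n"
    "    DoWork(); // FIXME handle null\n"
    "}\n"
)

markers = find_debt_markers("backend/src/Example.cs", SAMPLE_SOURCE)
```

Each hit can be attached as an observation to the node for the class or file in which it appears.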
Step 8: Create Relationships
Based on actual code analysis, create relationships:
- Project->has_controller/command/query/repository/feature
- Controller->invokes->command/query (from action methods)
- Command/Query->uses->repository (from constructor injection)
- Repository->has_method->method
- Method->calls->method (from implementations)
- Service->integrates_with->external_service
- Frontend relationships based on imports and usage
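Before writing relationships to the graph, it can help to drop duplicates and any relation whose endpoints were never created as nodes. The sketch below shows one way to do that; the `Relation` tuple and all node names are hypothetical and stand in for whatever the code analysis actually finds.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Hypothetical relationship triple: source node, relation name, target node."""
    source: str
    relation: str
    target: str


def build_relations(candidates, known_nodes):
    """Keep each relation once, and only when both endpoints are existing nodes."""
    seen = set()
    kept = []
    for rel in candidates:
        if rel.source in known_nodes and rel.target in known_nodes and rel not in seen:
            seen.add(rel)
            kept.append(rel)
    return kept


# Hypothetical node names for illustration only.
nodes = {"SuppliersController", "CreateSupplierCommand", "ISupplierRepository"}
candidates = [
    Relation("SuppliersController", "invokes", "CreateSupplierCommand"),
    Relation("CreateSupplierCommand", "uses", "ISupplierRepository"),
    Relation("SuppliersController", "invokes", "CreateSupplierCommand"),  # duplicate
    Relation("CreateSupplierCommand", "uses", "IMissingRepository"),  # unknown target
]

relations = build_relations(candidates, nodes)
```

A relation dropped for an unknown endpoint usually means a node was missed in an earlier step, so it is worth revisiting that step and then re-running the check.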
Execution Notes:
- Complete each step fully before moving to the next
- Always read source files rather than making assumptions
- Include file paths in observations (e.g., "Implemented in backend/src/Domain/Commands/CreateSupplier.cs:15")
- Extract actual values, versions, and configurations from files
- Create nodes and relationships based on evidence in code, not theoretical knowledge