BillHawk Knowledge Graph Build Instructions

Analyze the BillHawk project repository at /github/teo-mateo/ai-email-invoice-scraper and build a comprehensive knowledge graph following these instructions:

IMPORTANT: Execute each step independently. For each step, read the relevant project files to extract accurate information for nodes, observations, and relationships. Do not rely on assumptions or generic knowledge; base everything on actual code analysis.

Step 1: Create the Project Node

  • Read README.md, Application.md, and CLAUDE.md files
  • Create a Project node named "BillHawk"
  • Add observations extracted directly from these files about purpose, architecture, tech stack, and key features
  • Include file paths and specific details found in the documentation
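
As a rough illustration of what the Step 1 output might look like, the sketch below models a node as a name, a type, and a list of observations. The `Node` class, its field names, and the `observe` helper are assumptions made for illustration; the real schema depends on the knowledge-graph tool in use, and the placeholder observation must be replaced with text actually found in the files listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A knowledge-graph node (hypothetical schema, not a specific tool's API)."""
    name: str
    node_type: str
    observations: list[str] = field(default_factory=list)

    def observe(self, text: str, source: str) -> None:
        """Record an observation together with the file it was read from."""
        self.observations.append(f"{text} (source: {source})")


project = Node(name="BillHawk", node_type="Project")
# Placeholder only: replace with a fact actually found in README.md.
project.observe("<purpose statement copied from the README>", "README.md")
print(project)
```

Keeping the source file next to each observation makes it possible to re-check a claim later without re-reading the whole repository.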

Step 2: Create Architecture Layer Nodes

For each category below, read the actual source files in the specified directories:

  • API Controllers (type: APIController) - Read all files in backend/src/Api/Controllers/

    • Extract endpoints, HTTP methods, routes, auth requirements from attributes and methods
    • Create relationships: controller->invokes->command/query based on actual constructor injections and method calls
  • Commands (type: Command) - Read all files in backend/src/Domain/Commands/

    • Extract command names, parameters, return types from class definitions
    • Identify which repositories are injected in handlers
    • Create relationships: command->uses->repository, command->calls->repository_method
  • Queries (type: Query) - Read all files in backend/src/Domain/Queries/

    • Extract query parameters and response types
    • Identify repository dependencies from constructors
    • Create relationships: query->uses->repository, query->calls->repository_method
  • Repositories (type: Repository) - Read all files in backend/src/Data/Repositories/

    • Extract interface names and implementation classes
    • Create relationships: repository->has_method->method
  • Repository Methods (type: RepositoryMethod) - From repository files

    • Extract all public method signatures with parameters and return types
    • Add observations about what each method does
  • Domain Services (type: DomainService) - Read files in backend/src/Domain/Services/

    • Extract service classes and their purposes
    • Identify dependencies and integrations
  • Domain Models (type: DomainModel) - Read files in backend/src/Data/Models/ and backend/src/Common/Models/

    • Extract model classes with key properties
    • Note any special attributes or configurations
  • DTOs (type: DTO) - Read response/request models throughout the codebase

    • Extract DTO definitions with properties
    • Create relationships: controller->returns->DTO, command->operates_on->model
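
The "uses" relationships above are meant to come from constructor injection. One way this could be approximated is sketched below with a deliberately naive regular expression. The C# sample, the class name `CreateSupplierHandler`, and the type `ISupplierRepository` are hypothetical stand-ins and are not taken from the repository; a real analysis should read the actual handler files and would likely need a proper C# parser.

```python
import re

# Hypothetical C# handler used only as sample input; not from the repository.
SAMPLE = """
public class CreateSupplierHandler
{
    public CreateSupplierHandler(ISupplierRepository suppliers, ILogger<CreateSupplierHandler> logger)
    {
    }
}
"""


def injected_types(source: str, class_name: str) -> list[str]:
    """Return the parameter types of the first public constructor found.

    Naive on purpose: ignores primary constructors, parameter attributes,
    and generic types that contain commas.
    """
    pattern = re.compile(r"public\s+" + re.escape(class_name) + r"\s*\(([^)]*)\)")
    match = pattern.search(source)
    if match is None:
        return []
    types = []
    for param in match.group(1).split(","):
        parts = param.split()
        if len(parts) >= 2:
            types.append(parts[-2])  # the token just before the parameter name
    return types


def uses_relationships(source: str, class_name: str) -> list[tuple[str, str, str]]:
    """Build (command, "uses", repository) triples from injected repository types."""
    return [
        (class_name, "uses", type_name)
        for type_name in injected_types(source, class_name)
        if type_name.endswith("Repository")
    ]


print(uses_relationships(SAMPLE, "CreateSupplierHandler"))
```

Filtering on the `Repository` suffix is a naming-convention assumption; if the project names its repositories differently, the filter would need to change.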

Step 3: Create Infrastructure Nodes

Read configuration files, project files, and imports to identify:

  • Technologies (type: Technology) - From *.csproj files and package.json

    • Extract exact versions and configurations
  • Libraries (type: Library) - From Directory.Packages.props and package references

    • Note specific versions and purposes based on usage
  • External Services (type: ExternalService) - From configuration and service implementations

    • Extract URLs, endpoints, and integration details
  • Databases (type: Database) - From appsettings.json and connection strings

    • Note specific SQL Server configuration and features used
  • Database Tables (type: DatabaseTable) - Read all files in backend/src/Data/Database/Migrations/Scripts/

    • Extract table names, key columns, relationships
  • Protocols (type: Protocol) - From authentication and API implementations

    • Document how OAuth, JWT, JMAP are specifically used
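
For the Libraries bullet, exact versions can be read mechanically rather than transcribed by hand. The sketch below assumes that Directory.Packages.props follows the usual NuGet Central Package Management layout, with `PackageVersion` elements carrying `Include` and `Version` attributes and no XML namespace. The package name and version in the sample are placeholders, not BillHawk's actual dependencies.

```python
import xml.etree.ElementTree as ET

# Placeholder content; the real file lives in the repository.
SAMPLE_PROPS = """
<Project>
  <ItemGroup>
    <PackageVersion Include="Example.Package" Version="1.2.3" />
  </ItemGroup>
</Project>
"""


def package_versions(props_xml: str) -> dict[str, str]:
    """Map each package name to its pinned version."""
    root = ET.fromstring(props_xml)
    return {
        item.attrib["Include"]: item.attrib["Version"]
        for item in root.iter("PackageVersion")
        if "Include" in item.attrib and "Version" in item.attrib
    }


for name, version in package_versions(SAMPLE_PROPS).items():
    # One Library node per package, with the version as an observation.
    print(f"Library node: {name} (version {version}, from Directory.Packages.props)")
```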

Step 4: Create Feature Nodes

Read Process_flow.md and analyze the codebase to identify:

  • Features (type: Feature) - Major functional areas of the application

    • Create relationships: feature->powered_by->service
  • Workflows (type: Workflow) - From Process_flow.md and command implementations

    • Document step-by-step processes
  • Background Jobs (type: BackgroundJob) - From BackgroundTaskProcessor and related files

    • Extract job types and scheduling

Step 5: Create Frontend Nodes

Read all files in frontend/src/:

  • Pages (type: FrontendPage) - From frontend/src/pages/

    • Extract page components and their routes
  • Components (type: FrontendComponent) - From frontend/src/components/

    • Note key reusable components
    • Create relationships: page->uses->component
  • Contexts (type: FrontendContext) - From frontend/src/contexts/

    • Document React contexts and their purposes
  • API Services (type: FrontendService) - From frontend/src/services/

    • Extract service functions and API calls
    • Create relationships: component->calls->api_service
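
The page->uses->component relationships can be grounded in import statements. The sketch below shows one possible approach, assuming components are imported from a path containing `/components/`. The sample source and the names `InvoiceTable` and `api` are hypothetical; the regular expression handles only simple `import ... from "..."` statements.

```python
import re

# Hypothetical page source used only as sample input.
SAMPLE_PAGE = """
import React from "react";
import { InvoiceTable } from "../components/InvoiceTable";
import api from "../services/api";
"""

# Captures the module specifier of simple `import ... from "..."` statements.
IMPORT = re.compile(r"""import\s+[\w{}\s,*]+?\s+from\s+['"]([^'"]+)['"]""")


def component_imports(source: str) -> list[str]:
    """Return the last path segment of every import from a components folder."""
    return [
        specifier.rsplit("/", 1)[-1]
        for specifier in IMPORT.findall(source)
        if "/components/" in specifier
    ]


print([("ExamplePage", "uses", name) for name in component_imports(SAMPLE_PAGE)])
```

The same idea, with `/services/` in place of `/components/`, would give a starting point for the component->calls->api_service relationships.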

Step 6: Create Configuration Nodes

Read all configuration files:

  • Configurations (type: Configuration) - From appsettings.json files

    • Extract key configuration sections with actual values (excluding secrets)
  • Environment Variables (type: EnvironmentVariable) - From launchSettings.json and code

    • Document required environment variables
  • Connection Strings (type: ConnectionString) - From configuration files

    • Note database and service connection configurations
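
Since configuration values are to be recorded with secrets excluded, a redaction pass before creating observations may help. The sketch below flattens a settings document into colon-separated keys and masks values whose key path looks sensitive. The sample JSON, the `Jwt` section, and the list of sensitive hints are assumptions for illustration; the real sections must be read from the appsettings.json files, which may also contain comments that `json.loads` does not accept.

```python
import json

# Key fragments treated as sensitive (a heuristic, not an exhaustive list).
SENSITIVE_HINTS = ("secret", "password", "token", "apikey", "connectionstring")

# Hypothetical settings used only as sample input.
SAMPLE_SETTINGS = """
{
  "Logging": {"LogLevel": {"Default": "Information"}},
  "ConnectionStrings": {"Default": "placeholder"},
  "Jwt": {"Secret": "placeholder"}
}
"""


def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested sections into "Section:Key" paths, masking sensitive values."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}:{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        elif any(hint in path.lower() for hint in SENSITIVE_HINTS):
            flat[path] = "<redacted>"
        else:
            flat[path] = value
    return flat


for path, value in flatten(json.loads(SAMPLE_SETTINGS)).items():
    print(f"{path} = {value}")
```

Masking by key path rather than by value keeps the existence of a connection string visible in the graph without recording its contents.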

Step 7: Additional Intelligence

While reading code files, also capture:

  • Security implementations and considerations
  • Performance optimizations or concerns noted in comments
  • TODO comments and technical debt markers
  • Test coverage from test files
  • Error handling patterns
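
TODO comments and similar markers lend themselves to a mechanical scan. The sketch below is one possible way to turn `//` comment markers into observations with a file path and line number. The marker list (TODO, FIXME, HACK), the output format, and the sample path `backend/src/Example.cs` are assumptions for illustration.

```python
import re

# Matches `// TODO ...`, `// FIXME ...`, or `// HACK ...` line comments.
MARKER = re.compile(r"//\s*(TODO|FIXME|HACK)\b[:\s]*(.*)")


def markers_in(text: str, file_path: str) -> list[str]:
    """Return one observation string per marker comment found in the text."""
    observations = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            note = match.group(2).strip()
            observations.append(f"{match.group(1)} at {file_path}:{number}: {note}")
    return observations


# Hypothetical file content used only as sample input.
SAMPLE = "public class Example\n{\n    // TODO: handle retries\n}\n"
print(markers_in(SAMPLE, "backend/src/Example.cs"))
```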

Step 8: Create Relationships

Based on actual code analysis, create relationships:

  • Project->has_controller/command/query/repository/feature
  • Controller->invokes->command/query (from action methods)
  • Command/Query->uses->repository (from constructor injection)
  • Repository->has_method->method
  • Method->calls->method (from implementations)
  • Service->integrates_with->external_service
  • Frontend relationships based on imports and usage
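
Because relationships are created across several steps, it may be worth checking at the end that every relationship points at nodes that actually exist. The sketch below shows one such check. The `Relation` type and the `dangling` helper are hypothetical, as is the `ISupplierRepository` name in the sample data.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """A directed edge in the graph (hypothetical shape)."""
    source: str
    relation: str
    target: str


def dangling(nodes: set[str], relations: list[Relation]) -> list[Relation]:
    """Return relations whose source or target is not a known node."""
    return [r for r in relations if r.source not in nodes or r.target not in nodes]


# Sample data for illustration only.
nodes = {"BillHawk", "CreateSupplier"}
relations = [
    Relation("BillHawk", "has_command", "CreateSupplier"),
    Relation("CreateSupplier", "uses", "ISupplierRepository"),  # target node missing
]

for problem in dangling(nodes, relations):
    print(f"Missing node for: {problem.source} -{problem.relation}-> {problem.target}")
```

A dangling relation usually means either a node was skipped in an earlier step or the relationship was inferred rather than read from code; both cases are worth re-checking against the source files.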

Execution Notes:

  • Complete each step fully before moving to the next
  • Always read source files rather than making assumptions
  • Include file paths in observations (e.g., "Implemented in backend/src/Domain/Commands/CreateSupplier.cs:15")
  • Extract actual values, versions, and configurations from files
  • Create nodes and relationships based on evidence in code, not theoretical knowledge