@decagondev
Created January 22, 2026 16:35

RAG Agent Implementation Plan

Architecture Overview

The RAG Agent will follow the existing modular patterns in the codebase, providing both a dedicated agent routable via the supervisor and standalone tools that any agent can use.

graph TB
    subgraph rag_components [RAG Components]
        RAGAgent[RAGAgent]
        DocumentLoaders[Document Loaders]
        ChunkProcessor[Chunk Processor]
        VectorStore[VectorStore Manager]
        KeywordExtractor[Keyword Extractor]
    end

    subgraph tools [RAG Tools]
        IngestTool[rag_ingest]
        RetrieveTool[rag_retrieve]
        SearchTool[rag_search]
    end

    subgraph storage [In-Memory Storage]
        ChromaDB[ChromaDB]
        MetadataStore[Metadata Index]
    end

    RAGAgent --> tools
    tools --> DocumentLoaders
    tools --> VectorStore
    DocumentLoaders --> ChunkProcessor
    ChunkProcessor --> KeywordExtractor
    ChunkProcessor --> VectorStore
    VectorStore --> ChromaDB
    VectorStore --> MetadataStore

File Structure

New files to create:

  • app/rag/__init__.py - RAG module exports
  • app/rag/loaders.py - Document loaders for PDF, DOCX, TXT, RTF, HTML, JSON
  • app/rag/chunker.py - Intelligent text chunking with metadata
  • app/rag/vectorstore.py - ChromaDB wrapper with hybrid search
  • app/rag/keywords.py - Keyword extraction utilities
  • app/agents/implementations/rag.py - RAGAgent implementation
  • app/tools/rag_tools.py - RAG tools (ingest, retrieve, search)

Modified files:

Implementation Details

1. Document Loaders (app/rag/loaders.py)

Support for multiple file formats, using LangChain community loaders where available and library-specific fallbacks otherwise:

  • PDF: PyPDFLoader or pdfplumber
  • DOCX: python-docx library
  • Plain Text: Native Python
  • RTF: striprtf library
  • HTML: BeautifulSoup with html2text
  • JSON: Native Python with structured extraction

Each loader returns a standardized Document object with content and metadata (source, file type, creation date).
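As a rough illustration of the standardized return value, here is a minimal sketch covering only the two formats that need no third-party library (TXT and JSON). The names `Document`, `LOADERS`, `load_document`, and the JSON flattening format are assumptions made for this example, not part of the plan, and the file's modification time stands in for a creation date because a true creation time is not portable across platforms.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Document:
    """Standardized loader output: text content plus metadata (hypothetical shape)."""
    content: str
    metadata: dict = field(default_factory=dict)


def _flatten_json(value, prefix=""):
    """Yield one 'path: value' line for every leaf of a parsed JSON value."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from _flatten_json(item, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from _flatten_json(item, f"{prefix}[{index}]")
    else:
        yield f"{prefix}: {value}"


def load_txt(path: Path) -> str:
    """Read a plain-text file as UTF-8."""
    return path.read_text(encoding="utf-8")


def load_json(path: Path) -> str:
    """Parse a JSON file and flatten it into searchable 'path: value' lines."""
    return "\n".join(_flatten_json(json.loads(path.read_text(encoding="utf-8"))))


# Hypothetical registry keyed by file extension; the PDF, DOCX, RTF, and HTML
# loaders would register here in the same way.
LOADERS = {".txt": load_txt, ".json": load_json}


def load_document(file_path: str) -> Document:
    """Dispatch on file extension and wrap the text in a Document."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported file type: {suffix}")
    # Assumption: modification time is used as a stand-in for creation date.
    modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return Document(
        content=LOADERS[suffix](path),
        metadata={
            "source": str(path),
            "file_type": suffix.lstrip("."),
            "created_at": modified.isoformat(),
        },
    )
```

Keeping the dispatch in a single registry means the ingest tool only ever calls one function, and an optional `file_type` override could be handled by looking up the registry with that value instead of the file extension.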

2. Chunk Processor (app/rag/chunker.py)

Intelligent chunking strategy:

  • Use RecursiveCharacterTextSplitter to split on natural boundaries (paragraphs first, then lines, then words)
  • Configurable chunk size (default 1000 chars) and overlap (200 chars)
  • Preserve metadata through chunking
  • Generate chunk IDs for deduplication
  • Extract and attach keywords per chunk

Chunk schema:

from dataclasses import dataclass


@dataclass
class DocumentChunk:
    id: str
    content: str
    metadata: dict  # source, chunk_index, file_type, created_at
    keywords: list[str]
    embedding: list[float] | None

3. Vector Store Manager (app/rag/vectorstore.py)

ChromaDB-based storage with hybrid search:

  • In-memory ChromaDB collection
  • Sentence Transformers embeddings (all-MiniLM-L6-v2, a small, fast model that produces 384-dimensional vectors)
  • Cosine similarity search for semantic retrieval
  • Keyword search using ChromaDB's metadata filtering
  • Hybrid search combining both approaches with configurable weights

Key methods:

  • add_documents(chunks) - Embed and store chunks
  • similarity_search(query, k) - Pure vector search
  • keyword_search(keywords, k) - Metadata-based keyword search
  • hybrid_search(query, keywords, k) - Combined search with reranking
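The weighted combination behind hybrid_search can be sketched without ChromaDB at all. The example below assumes chunks are plain dicts with `id`, `embedding`, and `keywords` fields, scores keywords by the fraction of query keywords a chunk contains, and uses hypothetical default weights of 0.7 and 0.3; none of these choices come from the plan itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query_keywords, chunk_keywords):
    """Fraction of query keywords that appear in the chunk's keyword list."""
    wanted = {k.lower() for k in query_keywords}
    if not wanted:
        return 0.0
    have = {k.lower() for k in chunk_keywords}
    return len(wanted & have) / len(wanted)


def hybrid_rank(query_embedding, query_keywords, chunks, k=5,
                vector_weight=0.7, keyword_weight=0.3):
    """Return the top-k (score, chunk_id) pairs by weighted combined score."""
    scored = []
    for chunk in chunks:
        score = (
            vector_weight * cosine_similarity(query_embedding, chunk["embedding"])
            + keyword_weight * keyword_overlap(query_keywords, chunk["keywords"])
        )
        scored.append((score, chunk["id"]))
    # Highest combined score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Setting `keyword_weight` to 0 reduces this to pure similarity search and setting `vector_weight` to 0 reduces it to keyword search, so the three search methods could share one scoring path.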

4. Keyword Extractor (app/rag/keywords.py)

Extract meaningful keywords from text:

  • TF-IDF based extraction for important terms
  • Named entity recognition (optional, using spaCy if available)
  • Stop word filtering
  • Configurable keyword count per chunk
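For a sense of how TF-IDF ranking with stop word filtering picks keywords per chunk, here is a standard-library sketch. The plan lists scikit-learn for this step, so treat the code as a stand-in that shows the idea rather than the intended implementation; the tiny `STOP_WORDS` set, the tokenizer, and the name `extract_keywords` are all assumptions.

```python
import math
import re
from collections import Counter

# Deliberately tiny illustrative stop word list; a real one would be much larger.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "from", "are", "was"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stop words and very short tokens."""
    return [
        token
        for token in re.findall(r"[a-z0-9]+", text.lower())
        if token not in STOP_WORDS and len(token) > 2
    ]


def extract_keywords(chunks, top_n=5):
    """Return the top_n highest-scoring TF-IDF terms for each chunk of text."""
    tokenized = [tokenize(chunk) for chunk in chunks]
    # Document frequency: in how many chunks does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(chunks)
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores = {
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_n])
    return results
```

Terms that appear in every chunk receive the lowest IDF and therefore fall to the bottom of each list, which is the behavior that makes the extracted keywords useful as discriminating search filters.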

5. RAG Agent (app/agents/implementations/rag.py)

Extends BaseAgent with mode-aware behavior:

class RAGAgent(BaseAgent):
    """RAG agent with mode-aware behavior.

    Modes: "ingest" or "retrieve"
    Ingest: load documents into the vector store
    Retrieve: query the vector store for relevant context
    """

The agent will use the RAG tools and intelligently choose between ingestion and retrieval based on the user's request.

6. RAG Tools (app/tools/rag_tools.py)

Three LangChain tools registered with the ToolRegistry:

| Tool | Purpose | Parameters |
|------|---------|------------|
| rag_ingest | Load documents into vector store | file_path, file_type (optional) |
| rag_retrieve | Get relevant context via hybrid search | query, top_k, search_type |
| rag_search | Keyword-only search | keywords, top_k |

Dependencies to Add

chromadb = "^0.4.0"
sentence-transformers = "^2.2.0"
pypdf = "^4.0.0"
python-docx = "^1.0.0"
striprtf = "^0.0.26"
beautifulsoup4 = "^4.12.0"
html2text = "^2024.2.26"
scikit-learn = "^1.4.0"  # For TF-IDF keyword extraction

Configuration

Add to app/config/agents.yaml:

rag:
  name: "RAG"
  tools:
    - "rag_ingest"
    - "rag_retrieve"
    - "rag_search"
  system_prompt: >
    You are a RAG (Retrieval-Augmented Generation) agent specialized in
    document management and knowledge retrieval. You can:
    - Ingest documents (PDF, DOCX, TXT, RTF, HTML, JSON) into the knowledge base
    - Retrieve relevant information using semantic and keyword search
    - Provide context-aware responses based on stored documents

Data Flow

sequenceDiagram
    participant User
    participant RAGAgent
    participant IngestTool
    participant RetrieveTool
    participant Loaders
    participant Chunker
    participant VectorStore
    participant ChromaDB

    Note over User,ChromaDB: Ingest Mode
    User->>RAGAgent: Load document
    RAGAgent->>IngestTool: file_path
    IngestTool->>Loaders: Parse file
    Loaders-->>IngestTool: Raw content + metadata
    IngestTool->>Chunker: Split content
    Chunker-->>IngestTool: Chunks with keywords
    IngestTool->>VectorStore: Store chunks
    VectorStore->>ChromaDB: Embed and persist
    VectorStore-->>RAGAgent: Success + stats

    Note over User,ChromaDB: Retrieve Mode
    User->>RAGAgent: Query
    RAGAgent->>RetrieveTool: query, search_type
    RetrieveTool->>VectorStore: hybrid_search
    VectorStore->>ChromaDB: Vector + keyword search
    ChromaDB-->>VectorStore: Matched chunks
    VectorStore-->>RetrieveTool: Ranked results
    RetrieveTool-->>RAGAgent: Context chunks
    RAGAgent-->>User: Augmented response

Implementation Order

  1. Set up dependencies and module structure
  2. Implement document loaders
  3. Implement chunking with keyword extraction
  4. Implement ChromaDB vector store wrapper
  5. Create RAG tools