The RAG Agent will follow the existing modular patterns in the codebase, providing both a dedicated agent routable via the supervisor and standalone tools that any agent can use.
```mermaid
graph TB
subgraph rag_components [RAG Components]
RAGAgent[RAGAgent]
DocumentLoaders[Document Loaders]
ChunkProcessor[Chunk Processor]
VectorStore[VectorStore Manager]
KeywordExtractor[Keyword Extractor]
end
subgraph tools [RAG Tools]
IngestTool[rag_ingest]
RetrieveTool[rag_retrieve]
SearchTool[rag_search]
end
subgraph storage [In-Memory Storage]
ChromaDB[ChromaDB]
MetadataStore[Metadata Index]
end
RAGAgent --> tools
tools --> DocumentLoaders
tools --> VectorStore
DocumentLoaders --> ChunkProcessor
ChunkProcessor --> KeywordExtractor
ChunkProcessor --> VectorStore
VectorStore --> ChromaDB
VectorStore --> MetadataStore
```
New files to create:
- `app/rag/__init__.py` - RAG module exports
- `app/rag/loaders.py` - Document loaders for PDF, DOCX, TXT, RTF, HTML, JSON
- `app/rag/chunker.py` - Intelligent text chunking with metadata
- `app/rag/vectorstore.py` - ChromaDB wrapper with hybrid search
- `app/rag/keywords.py` - Keyword extraction utilities
- `app/agents/implementations/rag.py` - RAGAgent implementation
- `app/tools/rag_tools.py` - RAG tools (ingest, retrieve, search)
Modified files:
- `app/tools/registry.py` - Register RAG tools
- `app/agents/implementations/__init__.py` - Export RAGAgent
- `app/config/agents.yaml` - RAG agent configuration
- `pyproject.toml` - Add new dependencies
Support for multiple file formats using LangChain community loaders and fallbacks:
- PDF: `PyPDFLoader` or `pdfplumber`
- DOCX: `python-docx` library
- Plain Text: Native Python
- RTF: `striprtf` library
- HTML: `BeautifulSoup` with `html2text`
- JSON: Native Python with structured extraction
Each loader returns a standardized Document object with content and metadata (source, file type, creation date).
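A minimal dispatch sketch, assuming LangChain community loaders; the helper name `load_document` and the two formats shown are illustrative, while the real `loaders.py` would cover all six:

```python
# Sketch of loader dispatch; only PDF and TXT shown here for brevity.
from pathlib import Path

from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_core.documents import Document


def load_document(file_path: str) -> list[Document]:
    """Dispatch to a format-specific loader and return standardized Documents."""
    suffix = Path(file_path).suffix.lower()
    if suffix == ".pdf":
        docs = PyPDFLoader(file_path).load()
    elif suffix == ".txt":
        docs = TextLoader(file_path).load()
    else:
        raise ValueError(f"Unsupported file type: {suffix}")
    for doc in docs:
        # Attach the standardized metadata described above.
        doc.metadata.setdefault("source", file_path)
        doc.metadata["file_type"] = suffix.lstrip(".")
    return docs
```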
Intelligent chunking strategy:
- Use `RecursiveCharacterTextSplitter` for semantic boundaries
- Configurable chunk size (default 1000 chars) and overlap (200 chars)
- Preserve metadata through chunking
- Generate chunk IDs for deduplication
- Extract and attach keywords per chunk
Chunk schema:
```python
from dataclasses import dataclass


@dataclass
class DocumentChunk:
    id: str
    content: str
    metadata: dict  # source, chunk_index, file_type, created_at
    keywords: list[str]
    embedding: list[float] | None
```
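A chunking sketch tying the strategy to the schema above; `extract_keywords` is the `keywords.py` helper described later, and the hash-based ID scheme is an assumption, not the final implementation:

```python
# Chunking sketch; deterministic IDs support the deduplication goal above.
import hashlib

from langchain_text_splitters import RecursiveCharacterTextSplitter


def chunk_document(doc, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[DocumentChunk]:
    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    chunks = []
    for i, text in enumerate(splitter.split_text(doc.page_content)):
        # Hash of source + content gives a stable ID for deduplication.
        chunk_id = hashlib.sha256(f"{doc.metadata['source']}:{text}".encode()).hexdigest()[:16]
        chunks.append(DocumentChunk(
            id=chunk_id,
            content=text,
            metadata={**doc.metadata, "chunk_index": i},
            keywords=extract_keywords(text),
            embedding=None,  # computed by the vector store at add time
        ))
    return chunks
```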
ChromaDB-based storage with hybrid search:
- In-memory ChromaDB collection
- Sentence Transformers embeddings (`all-MiniLM-L6-v2`, fast and efficient)
- Cosine similarity search for semantic retrieval
- Keyword search using ChromaDB's metadata filtering
- Hybrid search combining both approaches with configurable weights
Key methods:
- `add_documents(chunks)` - Embed and store chunks
- `similarity_search(query, k)` - Pure vector search
- `keyword_search(keywords, k)` - Metadata-based keyword search
- `hybrid_search(query, keywords, k)` - Combined search with reranking
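A sketch of `hybrid_search` as a method on the vector store wrapper; the weighted reciprocal-rank fusion and default weights are illustrative choices, not a committed design:

```python
# Hybrid search sketch combining both result lists with configurable weights.
def hybrid_search(self, query: str, keywords: list[str], k: int = 5,
                  vector_weight: float = 0.7, keyword_weight: float = 0.3) -> list[DocumentChunk]:
    by_id: dict[str, DocumentChunk] = {}
    scores: dict[str, float] = {}
    for weight, hits in (
        (vector_weight, self.similarity_search(query, k=k * 2)),
        (keyword_weight, self.keyword_search(keywords, k=k * 2)),
    ):
        for rank, chunk in enumerate(hits):
            by_id[chunk.id] = chunk
            # Higher-ranked hits contribute more; weights balance the two modes.
            scores[chunk.id] = scores.get(chunk.id, 0.0) + weight / (rank + 1)
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return [by_id[cid] for cid in ranked]
```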
Extract meaningful keywords from text:
- TF-IDF based extraction for important terms
- Named entity recognition (optional, using spaCy if available)
- Stop word filtering
- Configurable keyword count per chunk
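A minimal extractor sketch using scikit-learn's `TfidfVectorizer`; the parameter values are illustrative, and fitting on a single chunk reduces TF-IDF to plain term frequency, so a corpus-level fit is a likely refinement:

```python
# TF-IDF keyword extraction sketch with built-in stop word filtering.
from sklearn.feature_extraction.text import TfidfVectorizer


def extract_keywords(text: str, max_keywords: int = 10) -> list[str]:
    vectorizer = TfidfVectorizer(stop_words="english", max_features=1000)
    try:
        tfidf = vectorizer.fit_transform([text])
    except ValueError:  # empty text or all stop words
        return []
    terms = vectorizer.get_feature_names_out()
    weights = tfidf.toarray()[0]
    top = weights.argsort()[::-1][:max_keywords]
    return [terms[i] for i in top if weights[i] > 0]
```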
Extends BaseAgent with mode-aware behavior:
```python
class RAGAgent(BaseAgent):
    # Modes: "ingest" or "retrieve"
    # Ingest: Load documents into vector store
    # Retrieve: Query vector store for relevant context
    ...
```

The agent will use the RAG tools and intelligently choose between ingestion and retrieval based on the user's request.
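In practice the LLM's tool selection will drive this choice, but a heuristic fallback could look like the following sketch (the helper name and hint list are assumptions):

```python
# Illustrative heuristic only; the real agent would rely on the LLM's tool
# selection rather than substring matching.
INGEST_HINTS = ("load", "ingest", "add", "index", "upload")


def detect_mode(user_request: str) -> str:
    text = user_request.lower()
    return "ingest" if any(hint in text for hint in INGEST_HINTS) else "retrieve"
```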
Three LangChain tools registered with the ToolRegistry:

| Tool | Purpose | Parameters |
|------|---------|------------|
| rag_ingest | Load documents into vector store | file_path, file_type (optional) |
| rag_retrieve | Get relevant context via hybrid search | query, top_k, search_type |
| rag_search | Keyword-only search | keywords, top_k |
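A sketch of one tool's shape using LangChain's `@tool` decorator; `get_vector_store` is a hypothetical accessor for the shared in-memory store, not an existing API:

```python
# Sketch of rag_retrieve; docstring and signature become the tool schema.
from langchain_core.tools import tool


@tool
def rag_retrieve(query: str, top_k: int = 5, search_type: str = "hybrid") -> str:
    """Get relevant context from the knowledge base via hybrid search."""
    store = get_vector_store()  # hypothetical accessor for the shared store
    if search_type == "vector":
        chunks = store.similarity_search(query, k=top_k)
    else:
        chunks = store.hybrid_search(query, extract_keywords(query), k=top_k)
    return "\n\n".join(chunk.content for chunk in chunks)
```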
Add to `pyproject.toml`:

```toml
chromadb = "^0.4.0"
sentence-transformers = "^2.2.0"
pypdf = "^4.0.0"
python-docx = "^1.0.0"
striprtf = "^0.0.26"
beautifulsoup4 = "^4.12.0"
html2text = "^2024.2.26"
scikit-learn = "^1.4.0"  # For TF-IDF keyword extraction
```

Add to `app/config/agents.yaml`:
```yaml
rag:
  name: "RAG"
  tools:
    - "rag_ingest"
    - "rag_retrieve"
    - "rag_search"
  system_prompt: >
    You are a RAG (Retrieval-Augmented Generation) agent specialized in
    document management and knowledge retrieval. You can:
    - Ingest documents (PDF, DOCX, TXT, RTF, HTML, JSON) into the knowledge base
    - Retrieve relevant information using semantic and keyword search
    - Provide context-aware responses based on stored documents
```

```mermaid
sequenceDiagram
participant User
participant RAGAgent
participant IngestTool
participant RetrieveTool
participant Loaders
participant Chunker
participant VectorStore
participant ChromaDB
Note over User,ChromaDB: Ingest Mode
User->>RAGAgent: Load document
RAGAgent->>IngestTool: file_path
IngestTool->>Loaders: Parse file
Loaders-->>IngestTool: Raw content + metadata
IngestTool->>Chunker: Split content
Chunker-->>IngestTool: Chunks with keywords
IngestTool->>VectorStore: Store chunks
VectorStore->>ChromaDB: Embed and persist
VectorStore-->>RAGAgent: Success + stats
Note over User,ChromaDB: Retrieve Mode
User->>RAGAgent: Query
RAGAgent->>RetrieveTool: query, search_type
RetrieveTool->>VectorStore: hybrid_search
VectorStore->>ChromaDB: Vector + keyword search
ChromaDB-->>VectorStore: Matched chunks
VectorStore-->>RetrieveTool: Ranked results
RetrieveTool-->>RAGAgent: Context chunks
RAGAgent-->>User: Augmented response
```
- Set up dependencies and module structure
- Implement document loaders
- Implement chunking with keyword extraction
- Implement ChromaDB vector store wrapper
- Create RAG tools