The indexing pipeline (chunker → embeddings → sqlite-vec) is built and working. Now we need the query side: hybrid search combining FTS5 keyword search with sqlite-vec vector KNN, fused via Reciprocal Rank Fusion (RRF). This replaces the fixed recent-sessions loading with relevance-based retrieval at conversation start, and gives the agent a memory_search tool for mid-conversation lookups.
FTS5 is built into SQLite. A standalone FTS5 table (chunks_fts) is populated alongside vec_chunks during indexing. Content-sync mode (content=vec_chunks) won't work because SQLite doesn't fire triggers on virtual table writes. Duplicating the text in FTS5 is fine — these are small personal memory files.
Add to init_db flow (after vec_chunks creation):

```sql
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
    content, tokenize='porter unicode61'
);
```

porter gives stemming ("running" → "run"). Rowids align with vec_chunks rowids.
_ensure_fts(conn): if chunks_fts doesn't exist, create it and backfill from vec_chunks. Called at end of init_db.
After each vec_chunks INSERT, capture cur.lastrowid and INSERT into chunks_fts with matching rowid:

```python
cur = conn.execute("INSERT INTO vec_chunks ...", (...))
conn.execute("INSERT INTO chunks_fts (rowid, content) VALUES (?, ?)",
             (cur.lastrowid, chunk.content))
```

Before deleting from vec_chunks, SELECT rowids, then DELETE matching rows from chunks_fts:

```python
rows = conn.execute("SELECT rowid FROM vec_chunks WHERE file_id = ?", (file_id,)).fetchall()
if rows:
    placeholders = ",".join("?" * len(rows))
    conn.execute(f"DELETE FROM chunks_fts WHERE rowid IN ({placeholders})",
                 [r["rowid"] for r in rows])
conn.execute("DELETE FROM vec_chunks WHERE file_id = ?", (file_id,))
```

```python
@dataclass(frozen=True, slots=True)
class SearchResult:
    content: str
    score: float                # RRF score, normalized 0-1
    file_path: str
    file_title: str | None
    memory_type: str | None     # "semantic" | "procedural" | "episodic"
    start_line: int
    end_line: int
    chunk_rowid: int            # for deduplication
```

search_vec(conn, query_embedding, *, limit=20) -> list[int]
sqlite-vec KNN query, returns chunk rowids in distance order:

```sql
SELECT rowid, distance FROM vec_chunks
WHERE embedding MATCH ? AND k = ?
```

Query vector serialized via _serialize_f32.
search_fts(conn, query, *, limit=20) -> list[int]
FTS5 BM25 query, returns chunk rowids in rank order:

```sql
SELECT rowid, rank FROM chunks_fts
WHERE chunks_fts MATCH ? ORDER BY rank LIMIT ?
```

Query sanitized by quoting each token to avoid FTS5 syntax errors ("each" "token" — implicit AND).
_sanitize_fts_query(query: str) -> str
Split on whitespace, quote each token, rejoin. Prevents colons/parens/operators from causing FTS5 parse errors.
_reciprocal_rank_fusion(*ranked_lists, k=60) -> list[tuple[int, float]]
- For each result at 1-indexed position rank in each list: score += 1/(k + rank)
- Normalize to 0-1 by dividing by n_lists / (k + 1) (theoretical max)
- Return [(rowid, score)] sorted descending by score
search(query, *, model="qwen3-embedding:0.6b", limit=10, min_score=0.0, mode="hybrid") -> list[SearchResult]
Flow:
- Open connection via _db_path() + _connect() — return [] if no DB
- If mode includes vec: embed(query, model=model)[0] → search_vec(conn, vec, limit=limit*2)
- If mode includes fts: search_fts(conn, query, limit=limit*2)
- Fuse with RRF (even for single-mode — gives normalized scores)
- Filter by min_score, trim to limit
- Hydrate results by joining vec_chunks ↔ files on file_id
- Close connection, return list[SearchResult]
Over-fetch limit*2 from each source so RRF fusion has enough candidates.
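The control flow above can be sketched with the project helpers injected as callables, so the fusion logic reads (and tests) in isolation; the parameter names embed_fn, vec_fn, fts_fn, fuse_fn, and hydrate_fn are stand-ins, not the real module functions:

```python
def hybrid_search_flow(query, *, limit=10, min_score=0.0, mode="hybrid",
                       embed_fn, vec_fn, fts_fn, fuse_fn, hydrate_fn):
    """Control-flow sketch of search(); helpers are injected stand-ins."""
    ranked_lists = []
    if mode in ("hybrid", "vec"):
        # Over-fetch limit*2 so fusion has enough candidates.
        ranked_lists.append(vec_fn(embed_fn(query), limit * 2))
    if mode in ("hybrid", "fts"):
        ranked_lists.append(fts_fn(query, limit * 2))
    fused = fuse_fn(*ranked_lists)      # [(rowid, score)], normalized 0-1
    kept = [(rid, s) for rid, s in fused if s >= min_score][:limit]
    return [hydrate_fn(rid, s) for rid, s in kept]
```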
_run_search_tool(name, args) -> str
JSON wrapper around search(). Returns {"results": [...]} or {"error": "..."}.
Add memory_search tool to ANTHROPIC_TOOLS (OLLAMA_TOOLS derives automatically):
- query (required string): what to search for
- limit (optional int): max results, default 5
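Assuming ANTHROPIC_TOOLS entries follow Anthropic's tool-use schema, the definition might look like this (the description strings are illustrative):

```python
# Hypothetical entry for ANTHROPIC_TOOLS; field layout follows the
# Anthropic tool-use input_schema (JSON Schema) convention.
MEMORY_SEARCH_TOOL = {
    "name": "memory_search",
    "description": "Search long-term memory for relevant notes.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "What to search for."},
            "limit": {"type": "integer", "description": "Max results.", "default": 5},
        },
        "required": ["query"],
    },
}
```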
Route in run_tool: if name == "memory_search": return _run_search_tool(name, args) (imported from tars.search).
_search_relevant_context(opening_message, limit=5) -> str
Calls search(opening_message, limit=limit, min_score=0.25), formats results as labeled blocks.
_build_system_prompt(*, search_context="")
Add search_context kwarg. Replace the _load_recent_sessions() call with the passed-in search context. <recent-sessions> tag becomes <relevant-context>. Memory.md stays always-loaded (it's small and semantic).
chat(), chat_anthropic(), chat_ollama()
Thread search_context: str = "" kwarg through to _build_system_prompt(search_context=search_context).
Single-shot mode: before calling chat(), run _search_relevant_context(message), pass result through.
REPL mode: search on the first user message only. Store the result and pass it to chat() on the first call. Subsequent turns use the memory_search tool if needed.
Both wrapped in try/except with warning to stderr on failure.
Mock setup: same pattern as test_indexer.py (module-level ollama mock, mock.patch.object(embeddings, "ollama"), real sqlite-vec + temp dirs).
Key cases:
- _sanitize_fts_query: handles special chars, empty input
- _reciprocal_rank_fusion: merges overlapping lists, scores normalized 0-1, single list works
- search_vec: returns rowids in distance order for known embeddings
- search_fts: returns rowids matching keyword query
- search hybrid end-to-end: mock embed(), insert chunks, verify results contain expected content
- search mode="vec" and mode="fts": single-source modes work
- search with min_score filter: low-scoring results excluded
- search empty/missing DB: returns []
- FTS sync on delete: after delete_chunks_for_file, FTS table is also cleaned
- FTS backfill: DB with vec_chunks but no chunks_fts gets FTS populated on _ensure_fts
- _run_search_tool: returns well-formed JSON
- Startup search: _search_relevant_context returns formatted string or empty
- db.py — FTS5 schema, _ensure_fts, modify insert_chunks/delete_chunks_for_file/delete_file. Run existing tests to confirm no regressions.
- search.py — SearchResult, _sanitize_fts_query, search_vec, search_fts, _reciprocal_rank_fusion, search(), _run_search_tool.
- test_search.py — all search tests.
- tools.py — memory_search tool definition + routing.
- core.py — _search_relevant_context, modify _build_system_prompt, thread through chat().
- cli.py — wire startup search for first message.
```shell
uv run python -m unittest discover -s tests -v
uv run tars index
uv run tars "what do you know about me?"
```
- No new dependencies. FTS5 built into SQLite, everything else exists.
- Distance metric: L2 (sqlite-vec default). Works fine for KNN ranking.
- RRF k=60: standard value from the original paper. No tuning needed.
- Graceful degradation: empty DB, missing dir, no index → empty results, no exceptions.
- _load_recent_sessions removed from startup path — replaced by search-based retrieval. The function stays in memory.py for now (session compaction still uses it).