dpaola2/personal-os-obsidian-claude-code-mcp.md

Building a Personal OS with Obsidian + Claude Code + Custom MCP

The Pitch

I built a "Personal Operating System" -- an Obsidian vault that serves as a unified life management system with Claude (via Claude Code CLI) as the primary interface. It covers work, career, family, personal growth, side ventures, and even recipes.
The interesting parts:

Claude is the primary interface, not Obsidian's GUI. The vault is designed to be read and written by an AI agent, with Obsidian as a secondary browsing layer.
A project-level CLAUDE.md file teaches Claude the system's full conventions, folder structure, and linking rules -- every session.
A custom MCP server provides Claude with knowledge graph tools over the vault: backlinks, entity resolution, mention scanning, broken link detection.
Cadenced rituals (daily, weekly, monthly, quarterly) define what Claude should surface and ask about at each time horizon.

This post walks through the architecture, the entity index (the most technically interesting part), and what I learned building it.

System Architecture

The Vault Structure

00-home/          Session state, current priorities, calendar look-aheads
01-rituals/       Cadenced session prompts (daily, weekly, monthly, quarterly)
02-playbooks/     Reusable procedures (hiring, 1:1s, incident response, career prep)
03-living-docs/   Evolving knowledge (management philosophy, book notes, patterns)
04-meetings/      Meeting notes, 1:1 running docs, session logs
05-projects/      Project tracking (work projects, side ventures)
06-people/        Unified people directory (work, family, contacts -- all in one place)
07-personal/      Personal life domains (family planning, health, finances)
08-recipes/       Recipe collection with meal planning

The numbered prefixes are a common convention among Obsidian users -- they pin the folders to a fixed, meaningful order in the sidebar. The key design choice: people are unified in one folder regardless of context. Your coworker and your spouse live next to each other, differentiated by tags. This reflects reality: life domains bleed into each other, and your AI assistant needs the full picture.

How Sessions Work

Every Claude Code session starts the same way:

Claude reads CLAUDE.md (loaded automatically as project instructions) which contains the full system spec: folder structure, conventions, linking rules, life context, and what files to read first.
Claude reads _current.md -- a lean session-state file with current priorities, open decisions, active to-dos, and a compressed log of recent sessions.
Claude checks the date and determines which rituals apply (Monday? Do weekly planning. First of the month? Generate the 30-day look-ahead).
From there, it's a conversation. Claude drives the ritual; I think and decide.

The _current.md file is deliberately kept lean. Full session narratives go to monthly archive files. Stable instructions live in CLAUDE.md. The principle: every file Claude reads at session start burns tokens before any real work begins. Context budget is a first-class design constraint.

MCP Integrations

The system connects to external tools via Model Context Protocol (MCP):


| Integration | Purpose | Access Level |
| --- | --- | --- |
| Slack (2 workspaces) | Community monitoring, user research | Read-only (hard rule in CLAUDE.md) |
| Google Calendar (via Zapier) | Daily briefings, weekly planning | Read |
| Google Drive (via Zapier) | Shared household docs, budgets | Read |
| Linear | Project/issue tracking | Read/Write |
| Custom Entity Index | Knowledge graph over the vault | Read (local SQLite) |


The MCP config lives in .mcp.json at the project root. Claude Code launches each server as a subprocess and exposes their tools alongside its built-in file operations.

The Entity Index: A Knowledge Graph Layer for Your Vault

This is the part I find most interesting. Without it, Claude can read files but has no structural awareness of the knowledge graph. It can't answer "who links to this file?" or "is this person mentioned anywhere besides their profile?" or "are there broken links?"

What It Is

A custom MCP server (~450 lines of TypeScript) that:

Parses every .md file in the vault
Builds a SQLite database with 4 tables
Exposes 7 tools that Claude can call during any session
Auto-rebuilds when file modification times change

The Schema

files     -- path, title, type, tags, last_modified
links     -- source_file_id, target_path, target_file_id, display_text, line_number
aliases   -- file_id, alias, alias_lower, alias_type, word_count
mentions  -- alias_id, file_id, line_number, context_snippet
The files and links tables are straightforward -- parse wiki-links out of every markdown file, resolve targets to actual file paths, track what links to what.
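
As a rough illustration of that parsing step, here is a minimal sketch that pulls wiki-links out of one file and resolves each target against the list of known vault paths. The names `LinkRow` and `extractLinks` are hypothetical, and the resolution rule (match on full path first, then on bare filename) is an assumption about how Obsidian-style links behave, not a copy of the actual server code.

```typescript
// Hypothetical row shape mirroring the `links` table columns described above.
interface LinkRow {
  targetPath: string;         // raw target as written inside [[...]]
  targetFile: string | null;  // resolved vault path, or null if the link is broken
  displayText: string | null; // text after the pipe, if any
  lineNumber: number;         // 1-based line where the link appears
}

// Matches [[target]] and [[target|display]]; a #heading anchor is ignored.
const WIKI_LINK = /\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|([^\]]*))?\]\]/g;

function extractLinks(markdown: string, knownPaths: string[]): LinkRow[] {
  // Index files by full path (without .md) and by bare filename stem,
  // because links may use either form. First file wins on stem collisions.
  const byKey = new Map<string, string>();
  for (const p of knownPaths) {
    const noExt = p.replace(/\.md$/, "");
    byKey.set(noExt.toLowerCase(), p);
    const stem = (noExt.split("/").pop() ?? noExt).toLowerCase();
    if (!byKey.has(stem)) byKey.set(stem, p);
  }

  const rows: LinkRow[] = [];
  markdown.split("\n").forEach((line, i) => {
    for (const m of line.matchAll(WIKI_LINK)) {
      const target = m[1].trim();
      const key = target.replace(/\.md$/, "").toLowerCase();
      rows.push({
        targetPath: target,
        targetFile: byKey.get(key) ?? null,
        displayText: m[2]?.trim() ?? null,
        lineNumber: i + 1,
      });
    }
  });
  return rows;
}
```

A row whose `targetFile` is `null` is a broken link, which is what a health check can later count.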
The aliases and mentions tables are where it gets interesting.

Entity Resolution via Alias Derivation

For every .md file, the parser derives aliases from 5 sources:

Filename -- dehyphenate the stem: Jane-Smith.md becomes "Jane Smith"
H1 heading -- the first # Heading in the document
Frontmatter aliases: -- explicit aliases in YAML: aliases: [JS, Jane]
Frontmatter full-name: -- formal name: full-name: Jane Elizabeth Smith
Email addresses -- any email found in the file body

This means Claude can call vault_resolve_entity("Jane Smith") or vault_resolve_entity("jane@example.com") and get back the canonical file path, all known aliases, and link counts.
Files with a _ prefix (like _index.md, _current.md) are skipped for alias derivation -- they're structural files, not entities.
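
The five sources can be sketched roughly as follows. This is an illustrative approximation, not the server's actual parser: `deriveAliases` and the `Alias` shape are hypothetical names, the frontmatter handling only understands the inline `aliases: [a, b]` form, the email pattern is deliberately simple, and the case-insensitive de-duplication is my own assumption.

```typescript
type AliasType = "filename" | "heading" | "frontmatter-alias" | "full-name" | "email";

interface Alias {
  alias: string;
  aliasLower: string;
  aliasType: AliasType;
  wordCount: number;
}

function deriveAliases(path: string, content: string): Alias[] {
  const fileName = path.split("/").pop() ?? path;
  if (fileName.startsWith("_")) return []; // structural files are not entities

  const found: Array<[string, AliasType]> = [];

  // 1. Filename: drop the extension and turn hyphens into spaces.
  found.push([fileName.replace(/\.md$/, "").replace(/-/g, " "), "filename"]);

  // Split off a leading YAML frontmatter block, if present.
  const fm = content.match(/^---\n([\s\S]*?)\n---\n?/);
  const frontmatter = fm ? fm[1] : "";
  const body = fm ? content.slice(fm[0].length) : content;

  // 2. First H1 heading in the body.
  const h1 = body.match(/^# +(.+)$/m);
  if (h1) found.push([h1[1].trim(), "heading"]);

  // 3. Explicit aliases, inline-list form only: `aliases: [JS, Jane]`.
  const list = frontmatter.match(/^aliases:\s*\[(.*)\]\s*$/m);
  if (list) {
    for (const raw of list[1].split(",")) {
      const cleaned = raw.trim().replace(/^["']|["']$/g, "");
      if (cleaned) found.push([cleaned, "frontmatter-alias"]);
    }
  }

  // 4. Formal name: `full-name: Jane Elizabeth Smith`.
  const full = frontmatter.match(/^full-name:\s*(.+)$/m);
  if (full) found.push([full[1].trim(), "full-name"]);

  // 5. Email addresses anywhere in the body.
  for (const m of body.matchAll(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g)) {
    found.push([m[0], "email"]);
  }

  // De-duplicate case-insensitively, keeping the first source for each alias.
  const seen = new Set<string>();
  const out: Alias[] = [];
  for (const [alias, aliasType] of found) {
    const aliasLower = alias.toLowerCase();
    if (seen.has(aliasLower)) continue;
    seen.add(aliasLower);
    out.push({ alias, aliasLower, aliasType, wordCount: alias.split(/\s+/).length });
  }
  return out;
}
```

Storing `wordCount` alongside each alias lets a later scanning step cheaply filter for multi-word names.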

Mention Scanning

The scanner finds plain-text references to entities that are NOT wiki-links. This catches the gap between "formally linked" and "mentioned in passing."
Key design decisions:

Only scans for multi-word aliases and emails. Single words like "Jane" would produce too many false positives across 800+ files.
Strips wiki-link regions before matching. So [[Jane Smith]] doesn't double-count as both a link AND a mention.
Enforces word boundaries. "Smithfield" doesn't match "Smith".
Captures ~80 characters of context around each match for quick triage.
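
Those four decisions fit in a short function. The sketch below is one way to implement them under stated assumptions: `scanMentions` and `Mention` are hypothetical names, matching is case-insensitive, and word boundaries are enforced with lookarounds rather than `\b` so that email aliases also work. The real scanner may differ on all of these points.

```typescript
interface Mention {
  alias: string;
  lineNumber: number; // 1-based
  context: string;    // roughly `contextWidth` characters around the match
}

// Escape regex metacharacters so an alias is matched literally.
const escapeRegExp = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function scanMentions(content: string, aliases: string[], contextWidth = 80): Mention[] {
  // Decision 1: only multi-word aliases and email addresses are scanned.
  const usable = aliases.filter((a) => a.includes("@") || a.trim().split(/\s+/).length > 1);
  const mentions: Mention[] = [];

  content.split("\n").forEach((line, i) => {
    // Decision 2: blank out wiki-links, padding with spaces so match
    // offsets still line up with the original line.
    const masked = line.replace(/\[\[[^\]]*\]\]/g, (m) => " ".repeat(m.length));

    for (const alias of usable) {
      // Decision 3: the match may not be glued to neighbouring word characters.
      const re = new RegExp(`(?<![\\w@.-])${escapeRegExp(alias)}(?![\\w@-])`, "gi");
      for (const m of masked.matchAll(re)) {
        // Decision 4: keep a short snippet of surrounding text for triage.
        const start = m.index ?? 0;
        const pad = Math.max(0, Math.floor((contextWidth - alias.length) / 2));
        const from = Math.max(0, start - pad);
        const to = Math.min(line.length, start + alias.length + pad);
        mentions.push({ alias, lineNumber: i + 1, context: line.slice(from, to).trim() });
      }
    }
  });
  return mentions;
}
```

Masking wiki-links instead of deleting them keeps every match index valid against the original line, which makes the context snippet trivial to cut.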

The 7 Tools


| Tool | Question It Answers |
| --- | --- |
| `vault_backlinks` | "Who links TO this file?" |
| `vault_forward_links` | "What does this file link TO?" |
| `vault_stats` | "How healthy is the vault? Any orphans or broken links?" |
| `vault_search_links` | "Find files matching this name fragment, ranked by link count" |
| `vault_resolve_entity` | "Which canonical file does this name/alias/email map to?" |
| `vault_mentions` | "Where is this entity mentioned in plain text (not wiki-linked)?" |
| `vault_rebuild` | "Force a full re-index" |


Every tool call first checks if any .md file has been modified since the last rebuild, and if so, does a full re-parse. The full rebuild runs in under 3 seconds for ~850 files.
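
The freshness check itself can be very small. A minimal sketch, assuming the index remembers the modification time it saw for each file at the last rebuild: `FileStamp` and `isIndexStale` are hypothetical names, and treating added or removed files as stale is my assumption on top of the mtime comparison described above.

```typescript
interface FileStamp {
  path: string;
  mtimeMs: number; // modification time in ms, e.g. from fs.statSync(path).mtimeMs
}

// `indexed` maps each path to the mtime recorded at the last rebuild.
function isIndexStale(current: FileStamp[], indexed: Map<string, number>): boolean {
  // A different file count means something was added or removed.
  if (current.length !== indexed.size) return true;
  // Otherwise the index is stale if any path is new or any mtime changed.
  return current.some((f) => indexed.get(f.path) !== f.mtimeMs);
}
```

Each tool handler would call a check like this first and trigger the full re-parse only when it returns `true`, so the common case costs one stat call per file.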

The Tech Stack

Minimal by design:

TypeScript with the official @modelcontextprotocol/sdk
better-sqlite3 for the index (WAL mode, foreign keys, CASCADE deletes)
zod for tool parameter validation
stdio transport -- Claude Code launches it as a subprocess, communicates over stdin/stdout

Total dependencies: 2 runtime, 3 dev. No framework, no ORM, no build system beyond tsc.

Knowledge Graph Conventions

The vault aspires to Zettelkasten principles. It is not strictly atomic, but it keeps the two defining properties: atomicity (one concept per note where practical) and copious bidirectional linking.

The Linking Protocol

Every file creation or modification must follow these rules:

Every non-trivial file gets a Cross-References section at the bottom with wiki-links to related content.
Bidirectional links are mandatory. If A links to B, B must link back to A.
People get wiki-linked by name. [[06-people/Jane|Jane]], not just "Jane".
Patterns get linked to their canonical location. If a concept is observed across multiple people or contexts, it becomes a standalone note.
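
The bidirectional rule is easy to check mechanically once links are resolved to file paths. A small sketch, assuming links are available as `[source, target]` pairs; the function name `missingBacklinks` is hypothetical and is not one of the seven tools.

```typescript
// Returns [file, shouldLinkTo] pairs: `file` is linked from `shouldLinkTo`
// but does not link back to it.
function missingBacklinks(links: Array<[string, string]>): Array<[string, string]> {
  // Paths cannot contain a newline, so it is a safe separator for pair keys.
  const have = new Set(links.map(([s, t]) => `${s}\n${t}`));
  const missing = new Map<string, [string, string]>();
  for (const [source, target] of links) {
    if (source === target) continue; // self-links need no reciprocal
    const reverse = `${target}\n${source}`;
    if (!have.has(reverse)) missing.set(reverse, [target, source]);
  }
  return Array.from(missing.values());
}
```

Each returned pair is a concrete to-do: open the first file and add a link to the second in its Cross-References section.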

Note Type Taxonomy


| Type | Purpose | Example |
| --- | --- | --- |
| Hub/Index | Navigation; aggregates links to a category | `books/_index.md` |
| Atomic note | One concept, idea, or framework | A book note, a pattern note |
| Profile | Person or entity; grows over time | `Jane-Smith.md` |
| Record | Point-in-time capture | Meeting notes, session logs |
| Playbook | Reusable procedure | `1-on-1-Playbook.md` |
| Transcript | Raw verbatim source (VTT) | Never read at session start; drill in when exact wording matters |


The Atomicity Heuristic


"If a section could be referenced independently from multiple other notes, it should probably be its own note. If two notes always get read together, consider merging them."

In practice: a concept observed across multiple people or contexts gets extracted into a standalone note. A concept specific to one person stays inline in their profile.

The Ritual System

Rituals are triggered by when you start a session, not by a schedule you follow. Claude checks the date and figures out what applies.
Daily: Pull calendar events, flag meetings needing prep, surface relevant context from people profiles. No document generated -- just orient for the day.
Weekly (Monday): Generate a 7-day look-ahead. Review open to-dos. Check waiting-on items. Identify meetings needing prep docs. Pull in shared household task list from Google Drive.
Weekly (Friday): Review what shipped vs. what was planned. Capture contributions worth documenting. Preview next week.
Monthly: Generate 30-day look-ahead. Archive session logs. Pattern check across recent sessions. Knowledge graph maintenance -- review recently-edited files for missing cross-references, check if new patterns deserve standalone notes, flag new docs lacking bidirectional links.
Quarterly: Synthesize the quarter's themes. Check alignment between time spent and stated priorities. Look ahead at major events.
Ad-hoc: Before any 1:1, pull the person's profile and last meeting notes. Before any important meeting, surface relevant patterns and context. Before any incident, pull the response playbook.
The key insight: Claude does the data-gathering; you do the thinking. The rituals replaced a set of manual templates that never got filled out. Having Claude drive the process means it actually happens.
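
In the real system Claude makes this call from the instructions in CLAUDE.md, but the date logic it follows can be written down as a plain function. The sketch below is illustrative only: `ritualsFor` and the ritual names are hypothetical, and the quarterly trigger (first day of January, April, July, or October) is an assumption, since the rules above do not say exactly when the quarterly ritual fires.

```typescript
type Ritual = "daily" | "weekly-planning" | "weekly-review" | "monthly" | "quarterly";

function ritualsFor(date: Date): Ritual[] {
  const rituals: Ritual[] = ["daily"]; // every session gets the daily orientation
  const day = date.getDay(); // 0 = Sunday ... 6 = Saturday (local time)

  if (day === 1) rituals.push("weekly-planning"); // Monday look-ahead
  if (day === 5) rituals.push("weekly-review");   // Friday review

  if (date.getDate() === 1) {
    rituals.push("monthly");
    // Assumption: quarters start in January, April, July, and October.
    if (date.getMonth() % 3 === 0) rituals.push("quarterly");
  }
  return rituals;
}
```

A strict date match like this would skip a monthly ritual whenever no session happens on the first, which is one reason to leave the final judgment to Claude rather than to code.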

What We Found: A Compliance Audit

We recently ran the entity index against the vault's own conventions and found a pile of problems:

43 of 58 people files were missing full-name: frontmatter -- meaning the entity index couldn't derive their formal name as an alias.
6 files used a naming convention that produced wrong aliases. A file named Jane-DO.md (where "DO" was an organizational abbreviation) would produce the alias "Jane Do" instead of the person's actual last name.
An empty stub file was creating confusion in entity resolution.

We fixed all of it in one pass: renamed 6 files, added full-name: frontmatter, and updated 51 wiki-link references across 17 files -- with zero remaining broken references.
The lesson: your knowledge graph conventions are only as good as your compliance with them. And running an automated audit is the fastest way to find the gaps.
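
For a flavor of what such an audit can look like, here is a hypothetical check for the first two problem classes. It is not the audit we actually ran: `auditPeople` and its types are invented for illustration, and the abbreviation test (an all-caps final filename part of 2 to 4 letters) is a heuristic I am assuming, so it will flag some legitimate names.

```typescript
interface PersonFile {
  path: string;
  fullName?: string; // value of `full-name:` frontmatter, if present
}

interface AuditFinding {
  path: string;
  issue: "missing-full-name" | "single-word-filename" | "suspicious-abbreviation";
}

function auditPeople(files: PersonFile[]): AuditFinding[] {
  const findings: AuditFinding[] = [];
  for (const f of files) {
    const stem = (f.path.split("/").pop() ?? f.path).replace(/\.md$/, "");
    const parts = stem.split("-");

    // Without full-name:, the formal name never becomes an alias.
    if (!f.fullName) findings.push({ path: f.path, issue: "missing-full-name" });

    // A one-word filename yields only a single-word alias.
    if (parts.length < 2) findings.push({ path: f.path, issue: "single-word-filename" });

    // Heuristic: an all-caps trailing part is probably an organizational
    // abbreviation rather than a surname, as in Jane-DO.md.
    if (parts.length >= 2 && /^[A-Z]{2,4}$/.test(parts[parts.length - 1])) {
      findings.push({ path: f.path, issue: "suspicious-abbreviation" });
    }
  }
  return findings;
}
```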

Lessons Learned

1. Start with naming conventions early

We didn't establish file naming conventions (like Firstname-Lastname.md) until well into the project. By then, inconsistencies had compounded -- abbreviations in filenames, missing frontmatter, wiki-links pointing at the wrong targets. Every file you create before establishing conventions is a file you'll rename later.
The fix: Define your naming rules in CLAUDE.md on day one. Have Claude enforce them as a pre-flight check on every file creation.

2. Entity resolution needs multi-word names

Single-word aliases ("Jane", "Matt") are nearly useless for mention scanning -- they produce too many false positives across hundreds of files. The signal is in multi-word names ("Jane Smith") and email addresses.
This has a practical implication: frontmatter full-name: fields are not optional polish -- they're load-bearing infrastructure for entity resolution. If your people files only have single-word filenames and no full-name: frontmatter, the mention scanner can't find them.

3. Context budget forces good design

Claude Code reads your CLAUDE.md and any files you tell it to read at session start. Every file burns tokens -- tokens that aren't available for actual work. This creates a healthy pressure:

Keep your session-state file (_current.md) ruthlessly lean. Active state only.
Move stable instructions into CLAUDE.md (loaded automatically as project instructions at the start of each session, with no extra file read).
Full narratives go to archives. Drill in only when needed.
Prefer pointers (wiki-links, file references) over inline duplication.

The constraint sounds annoying, but it produces a system that's actually well-organized. If you had unlimited context, you'd never clean anything up.

4. Claude-first design is different from human-first design

An Obsidian vault designed for human browsing wants rich inline content, embedded images, and visual hierarchy. A vault designed for Claude wants:

Structured frontmatter that Claude can parse predictably
Consistent linking conventions that the entity index can rely on
Lean files with pointers instead of fat files with everything inline
Cross-References sections that make the graph navigable without reading full content
Clear note-type taxonomy so Claude knows what kind of file it's looking at

This doesn't mean the vault is ugly in Obsidian -- it just means the design decisions optimize for machine-readability first, with human browsing as a nice side effect.

5. Build the index before you need it

The entity index started as a "Level 1" backlink tracker (just files and links) and grew to "Level 2" (aliases and mentions). Each level was built because we needed it -- we hit a wall where Claude couldn't answer a question without structural graph awareness.
If I started over, I'd build the entity index on day one. The ability to run vault_stats and see orphans and broken links is invaluable for maintaining vault health. The ability to run vault_resolve_entity saves Claude from guessing which file to read. And mention scanning catches the references that slip through without formal wiki-links.
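
The "Level 1" part is small enough that there is little reason to wait. As a rough sketch, here are backlinks and a `vault_stats`-style health check over an in-memory list of links instead of SQLite. The names `LinkEdge`, `backlinks`, and `vaultStats` are hypothetical, and "orphan" is defined here as a file with no inbound and no outbound links, which may not match the real tool's definition.

```typescript
// Hypothetical in-memory stand-in for the `links` table: one entry per wiki-link.
interface LinkEdge {
  source: string;        // path of the file containing the link
  target: string | null; // resolved path, or null when the target does not exist
  rawTarget: string;     // target text as written
}

// "Who links TO this file?"
function backlinks(edges: LinkEdge[], path: string): string[] {
  const sources = edges.filter((e) => e.target === path).map((e) => e.source);
  return Array.from(new Set(sources)).sort();
}

// "How healthy is the vault? Any orphans or broken links?"
function vaultStats(files: string[], edges: LinkEdge[]) {
  const linkedTo = new Set(
    edges.map((e) => e.target).filter((t): t is string => t !== null)
  );
  const linksOut = new Set(edges.map((e) => e.source));
  return {
    fileCount: files.length,
    linkCount: edges.length,
    // Assumed definition: no inbound and no outbound links.
    orphans: files.filter((f) => !linkedTo.has(f) && !linksOut.has(f)),
    brokenLinks: edges.filter((e) => e.target === null),
  };
}
```

Moving the same queries into SQLite mostly buys persistence and speed once the vault has hundreds of files.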

6. MCP is the right abstraction

The Model Context Protocol is what makes this all work. The entity index runs as a local subprocess, communicates over stdio, and exposes typed tools with zod-validated parameters. Claude Code discovers the tools at startup and can call them like any other function.
The alternative would be to bake graph queries into the CLAUDE.md instructions ("to find backlinks, run this grep command..."). That's fragile, slow, and error-prone. MCP gives you a clean boundary between "tool implementation" and "tool usage" -- and the tool descriptions are all Claude needs to figure out when and how to use them.

The Stack


| Component | Technology | Purpose |
| --- | --- | --- |
| Knowledge base | Obsidian (just the vault, no plugins required) | File storage, browsing, graph view |
| AI interface | Claude Code CLI | Primary interaction layer |
| Project instructions | `CLAUDE.md` | System conventions, loaded every session |
| Entity index | TypeScript + better-sqlite3 + MCP SDK | Knowledge graph tools |
| External integrations | MCP servers (Slack, Calendar, Drive, Linear) | Data access |
| Session state | `_current.md` | Lean startup file for each session |


Try It Yourself

The core idea is simple: an Obsidian vault + a CLAUDE.md file + Claude Code. You don't need the entity index to start -- that's an optimization for when the vault gets large enough that Claude can't navigate by filename alone.

Create an Obsidian vault with whatever folder structure makes sense for your life.
Write a CLAUDE.md that teaches Claude your conventions: folder structure, linking rules, what to read at session start, what domains you're managing.
Open the vault in Claude Code (claude from the vault directory).
Start a session. See what Claude surfaces. Iterate on the CLAUDE.md based on what works and what doesn't.

The system will evolve. Don't over-engineer it upfront. Add structure when life demands it -- not before.

Built with Claude Code and Model Context Protocol.