@denniswon
Created March 11, 2026 00:49
# Recall MCP — Self-Hosted Setup Guide
Self-hosted Recall MCP for Claude Code using Upstash Redis (cloud) and the open-source `@joseairosa/recall` NPM package. No local Redis, no Docker, no subscription.
Source: https://github.com/joseairosa/recall
---
## Prerequisites
- Node.js >= 18 (`node --version`)
- npm (`npm --version`)
- Claude Code installed (tested on v2.1.72)
- An Anthropic API key (for embeddings)
---
## Step 1: Create an Upstash Redis Instance
1. Go to https://upstash.com and sign up / log in
2. Create a new Redis database:
   - Name: anything (e.g., `recall-memory`)
   - Region: pick the closest to you
   - TLS: enabled (default)
3. Copy the **Redis URL** from the database details page
   - Format: `rediss://default:<password>@<host>.upstash.io:6379`
   - The `rediss://` prefix (double s) means TLS, which Upstash requires

**IMPORTANT:** Use `rediss://` (double s), not `redis://`. With `redis://` the connection fails, either silently or with a cryptic error.
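
A quick pre-flight check can catch the `redis://` mistake before Claude Code ever starts the server. The sketch below is a hypothetical helper (`checkRedisUrl` is not part of recall or Upstash) that uses Node's built-in `URL` class to verify the scheme and credentials. The hostname and password are example values, and an empty array means no problems were found.

```javascript
// Hypothetical helper, not part of recall: sanity-check a Redis URL
// before pasting it into the MCP config.
function checkRedisUrl(raw) {
  // Unfilled <placeholders> would also break URL parsing, so report them first.
  if (/[<>]/.test(raw)) return ["still contains a <placeholder>"];

  let url;
  try {
    url = new URL(raw);
  } catch {
    return ["not a parseable URL"];
  }

  const problems = [];
  if (url.protocol !== "rediss:") {
    problems.push(`scheme is ${url.protocol}// but Upstash needs rediss:// (TLS)`);
  }
  if (!url.password) problems.push("missing password");
  if (!url.hostname) problems.push("missing host");
  return problems;
}

// Example values only; substitute your own URL.
console.log(checkRedisUrl("redis://default:secret@example.upstash.io:6379"));
console.log(checkRedisUrl("rediss://default:secret@example.upstash.io:6379"));
```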
---
## Step 2: Install Recall Globally
```bash
npm install -g @joseairosa/recall
```
Verify:
```bash
which recall # should print a path
```
**Why global install?** The recall README specifies `"command": "recall"` (bare binary) for Claude Code, not `npx`. Claude Code spawns MCP server processes in a minimal environment. Global install ensures the binary is on a stable PATH. Using `npx` can also work but adds ~0.5s startup overhead.
**fnm/nvm users:** The global install path depends on your active Node version. If you switch Node versions, you may need to reinstall. The stable binary path is at:
```
~/.local/share/fnm/node-versions/<version>/installation/bin/recall
```
---
## Step 2.5: Patch the Deprecated Model ID
Recall v1.10.1 hardcodes `claude-3-5-haiku-20241022` for Anthropic embeddings, but this model is deprecated (HTTP 404). Patch it to the current Haiku model:
```bash
RECALL_DIST="$(npm root -g)/@joseairosa/recall/dist/chunk-52J47SPM.js"
sed -i.bak 's/claude-3-5-haiku-20241022/claude-haiku-4-5-20251001/g' "$RECALL_DIST"
# Verify: should print 4
grep -c 'claude-haiku-4-5-20251001' "$RECALL_DIST"
```
This patch is overwritten on `npm update`, so re-apply after updates (see "Updating Recall" section).
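
To make the substitution concrete, here is a minimal sketch of the same replacement as a pure function. `patchModelId` is a hypothetical name and the sample string is made up for illustration; the real chunk file is much larger. Note that this counts occurrences, whereas `grep -c` counts matching lines.

```javascript
const OLD_MODEL = "claude-3-5-haiku-20241022";
const NEW_MODEL = "claude-haiku-4-5-20251001";

// Hypothetical helper: replace every occurrence of the old model ID and
// report how many were found, mirroring the sed command above.
function patchModelId(source) {
  const parts = source.split(OLD_MODEL);
  return { patched: parts.join(NEW_MODEL), count: parts.length - 1 };
}

// Made-up sample standing in for the contents of the dist chunk.
const sample = `model: "${OLD_MODEL}", fallback: "${OLD_MODEL}"`;
const { patched, count } = patchModelId(sample);
console.log(count, patched.includes(OLD_MODEL));
```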
---
## Step 3: Create the Proxy Wrapper (Critical)
**This is the key step that most guides miss.** Recall v1.10.x exposes 20 tools, but 6 of them have non-conformant JSON schemas (`anyOf` at the root level instead of `"type": "object"`). Claude Code silently rejects ALL tools from a server if any tool has an invalid schema. The server shows "connected" in `/mcp` but zero tools are registered.
The broken tools: `memory_graph`, `memory_template`, `memory_category`, `rlm_process`, `workflow`, `memory_maintain`.
Create a proxy that filters them out:
```bash
cat > ~/.claude/recall-proxy.mjs << 'EOF'
#!/usr/bin/env node
// recall-proxy.mjs — MCP proxy that filters out tools with non-conformant schemas
// from @joseairosa/recall before they reach Claude Code.
//
// Problem: recall exposes tools with `anyOf` at the schema root (no "type": "object").
// Claude Code rejects ALL tools from a server if any tool has a malformed schema.
//
// Solution: intercept tools/list response and remove tools without top-level "type".
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

// Spawn the real recall server
const child = spawn("recall", [], {
  env: { ...process.env },
  stdio: ["pipe", "pipe", "inherit"], // inherit stderr for debugging
});

child.on("error", (err) => {
  process.stderr.write(`[recall-proxy] Failed to spawn recall: ${err.message}\n`);
  process.exit(1);
});

child.on("exit", (code) => {
  process.exit(code ?? 0);
});

// Forward stdin to child
process.stdin.pipe(child.stdin);

// Intercept child stdout, filter tools/list responses
const rl = createInterface({ input: child.stdout, crlfDelay: Infinity });
rl.on("line", (line) => {
  try {
    const msg = JSON.parse(line);
    // Intercept tools/list response
    if (msg.result?.tools && Array.isArray(msg.result.tools)) {
      const before = msg.result.tools.length;
      msg.result.tools = msg.result.tools.filter((t) => {
        const schema = t.inputSchema || {};
        // Keep tools that have "type": "object" at root
        // Drop tools with anyOf/oneOf root and no type
        if (!schema.type && (schema.anyOf || schema.oneOf)) {
          return false;
        }
        return true;
      });
      const after = msg.result.tools.length;
      if (before !== after) {
        process.stderr.write(
          `[recall-proxy] Filtered ${before - after} tools with non-conformant schemas (${before} -> ${after})\n`
        );
      }
      process.stdout.write(JSON.stringify(msg) + "\n");
    } else {
      // Pass through all other messages unchanged
      process.stdout.write(line + "\n");
    }
  } catch {
    // Not JSON: pass through
    process.stdout.write(line + "\n");
  }
});

// Handle signals
process.on("SIGTERM", () => child.kill("SIGTERM"));
process.on("SIGINT", () => child.kill("SIGINT"));
EOF
chmod +x ~/.claude/recall-proxy.mjs
```
Test the proxy:
```bash
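# `timeout` is a GNU coreutils command; if your system lacks it (e.g., stock macOS), drop `timeout 10` and stop the process with Ctrl-C.
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}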
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}' | \
REDIS_URL="rediss://default:<password>@<host>.upstash.io:6379" \
ANTHROPIC_API_KEY="<your-key>" \
timeout 10 node ~/.claude/recall-proxy.mjs 2>&1
```
You should see:
```
[recall-proxy] Filtered 6 tools with non-conformant schemas (20 -> 14)
```
And the tools/list response should contain 14 tools, all with `"type": "object"` in their schema.
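
The filter at the heart of the proxy can also be exercised on its own. The sketch below extracts the same predicate into a function and runs it on two made-up tool definitions; the names `ok_tool` and `broken_tool` and their schemas are illustrative, not recall's actual schemas. As in the proxy, a tool with no `inputSchema` at all is kept by this rule.

```javascript
// Same rule as the proxy: drop a tool only when its inputSchema has no
// root "type" and uses anyOf/oneOf at the root instead.
function hasConformantSchema(tool) {
  const schema = tool.inputSchema || {};
  return !(!schema.type && (schema.anyOf || schema.oneOf));
}

// Illustrative tool definitions (not copied from recall).
const tools = [
  { name: "ok_tool", inputSchema: { type: "object", properties: {} } },
  { name: "broken_tool", inputSchema: { anyOf: [{ type: "object" }, { type: "object" }] } },
];

const kept = tools.filter(hasConformantSchema).map((t) => t.name);
console.log(kept);
```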
---
## Step 4: Configure ~/.mcp.json
Open or create `~/.mcp.json` and add the recall server entry inside `mcpServers`:
```json
{
  "mcpServers": {
    "recall": {
      "command": "node",
      "args": ["/Users/<your-username>/.claude/recall-proxy.mjs"],
      "env": {
        "REDIS_URL": "rediss://default:<your-password>@<your-host>.upstash.io:6379",
        "ANTHROPIC_API_KEY": "<your-anthropic-api-key>",
        "EMBEDDING_PROVIDER": "anthropic",
        "WORKSPACE_MODE": "hybrid"
      }
    }
  }
}
```
Replace all placeholder values with your actual credentials and username.
**IMPORTANT: Hardcode the actual key values, not env var references.**
```jsonc
// CORRECT — hardcoded values
"ANTHROPIC_API_KEY": "sk-ant-api03-xxxxx..."
// WRONG — env var references don't work in ~/.mcp.json
"ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}"
```
In the setup tested here (Claude Code v2.1.72), the `env` block in `~/.mcp.json` did NOT expand `${VAR}` references; they reached the server as literal strings. The file is parsed as JSON and never passes through a shell, so shell interpolation does not apply no matter how Claude Code is launched (e.g., from a terminal alias such as `claude-yolo`). Hardcode the actual values directly.
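
One way to guard against this is to scan the config for leftover `${...}` references before restarting. This is a sketch only: `findUnexpandedRefs` is a hypothetical helper, and the config string is a trimmed example of the structure shown above.

```javascript
// Hypothetical helper: list env entries whose value still looks like a
// shell-style ${VAR} reference.
function findUnexpandedRefs(mcpConfig) {
  const hits = [];
  for (const [server, def] of Object.entries(mcpConfig.mcpServers || {})) {
    for (const [key, value] of Object.entries(def.env || {})) {
      if (typeof value === "string" && /\$\{[^}]+\}/.test(value)) {
        hits.push(`${server}.env.${key}`);
      }
    }
  }
  return hits;
}

// JSON.parse performs no variable expansion, so the reference survives as-is.
const config = JSON.parse(
  '{"mcpServers":{"recall":{"env":{"ANTHROPIC_API_KEY":"${ANTHROPIC_API_KEY}","EMBEDDING_PROVIDER":"anthropic"}}}}'
);
console.log(findUnexpandedRefs(config));
```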
**CRITICAL: Always set `EMBEDDING_PROVIDER` explicitly.** See Gotcha #8 below for why.
**Notes:**
- `ANTHROPIC_API_KEY` is used for generating embeddings (semantic search over memories). Same key format as your Claude API key (`sk-ant-api03-...`).
- Supported embedding providers: `anthropic`, `openai`, `voyage`, `cohere`, `deepseek`, `grok`, `ollama`. Only one API key + provider is needed.
---
## Step 5: Add to enabledMcpjsonServers
In `~/.claude/settings.json`, ensure `recall` is in the `enabledMcpjsonServers` array:
```json
{
  "enabledMcpjsonServers": [
    "context7",
    "exa",
    "github",
    "recall"
  ]
}
```
This pre-approves the MCP server so Claude Code doesn't prompt for permission on each restart. Add your other MCP servers to this list too — it's an allowlist, so any server NOT listed may require interactive approval.
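
Because the setting acts as an allowlist, it can help to compare it against the servers actually defined in `~/.mcp.json`. The sketch below shows one way to do that; `serversMissingFromAllowlist` is a hypothetical helper, and the two objects stand in for the parsed contents of both files.

```javascript
// Hypothetical helper: servers defined in ~/.mcp.json that are absent from
// the enabledMcpjsonServers allowlist in ~/.claude/settings.json.
function serversMissingFromAllowlist(mcpConfig, settings) {
  const allowed = new Set(settings.enabledMcpjsonServers || []);
  return Object.keys(mcpConfig.mcpServers || {}).filter((name) => !allowed.has(name));
}

// Example inputs; in practice these would be the parsed contents of both files.
const mcpConfig = { mcpServers: { context7: {}, github: {}, recall: {} } };
const settings = { enabledMcpjsonServers: ["context7", "recall"] };
console.log(serversMissingFromAllowlist(mcpConfig, settings));
```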
---
## Step 6: Restart Claude Code
MCP servers are loaded at startup. After editing `~/.mcp.json`, fully restart Claude Code (exit and reopen); toggling the server in `/mcp` is not enough.
After restart, the recall tools should appear in the available tools list:
- `recall_relevant_context` — search memory for relevant context
- `analyze_and_remember` — extract and store info from conversation
- `store_memory` — store a memory entry
- `search_memories` — semantic search with filters
- `quick_store_decision` — store a decision with reasoning
- `summarize_session` — create session snapshot
- `auto_session_start` — load context at session start
- Plus 7 more utility tools (14 total)
---
## Step 7: Test It
In Claude Code, try:
```
Store a memory that I prefer Rust for all backend services
```
Then in a new session:
```
What do you know about my coding preferences?
```
---
## Usage Guide (Hybrid Mode)
After setup, recall runs automatically in every Claude Code session. Here's how to use the 14 available tools effectively.
### Core Workflow
**Store important context as you work:**
```
"Remember that Newton uses alloy, not ethers-rs, for all Ethereum interactions"
"Store a decision: we chose Feldman VSS over Shamir because it provides verifiable commitments"
```
**Retrieve context when needed:**
```
"What do I know about the privacy layer architecture?"
"Recall any decisions about consensus protocol design"
```
**End-of-session snapshot:**
```
"Summarize this session" — creates a session snapshot with key decisions and patterns
```
### Tool Reference
| Tool | When to Use | Example |
|------|-------------|---------|
| `store_memory` | Save any fact, pattern, or context | Preferences, conventions, architecture decisions |
| `quick_store_decision` | After making a significant choice | Records decision + reasoning + alternatives considered |
| `search_memories` | Find specific memories by topic | Semantic search with optional filters (type, importance, tags) |
| `recall_relevant_context` | Get context for current task | Pass your current task description, returns relevant memories |
| `auto_session_start` | Load context at session start | Returns recent decisions, active directives, patterns |
| `analyze_and_remember` | After important discussions | Paste conversation text, auto-extracts and stores key info |
| `summarize_session` | End of work session | Creates a snapshot of what was done |
| `get_time_window_context` | "What did I do last 2 hours?" | Time-based retrieval, grouped by type or importance |
| `store_batch_memories` | Bulk import | Store multiple memories in one call |
| `update_memory` | Correct or refine a memory | Update content, importance, tags by memory ID |
| `delete_memory` | Remove outdated/wrong memory | Delete by memory ID |
| `organize_session` | Group related memories | Create named session snapshot from memory IDs |
| `convert_to_global` | Share across all projects | Promote a workspace memory to global |
| `convert_to_workspace` | Restrict a global memory | Demote a global memory to workspace-specific |
### Memory Types
Use `context_type` to categorize memories for filtered retrieval:
| Type | Use For |
|------|---------|
| `directive` | Rules, conventions ("always use snake_case") |
| `decision` | Architectural/design choices with reasoning |
| `code_pattern` | Reusable patterns, idioms, templates |
| `preference` | Personal/team preferences |
| `requirement` | Project requirements, constraints |
| `insight` | Learnings, non-obvious discoveries |
| `error` | Bug patterns, failure modes, fixes |
| `todo` | Deferred work, future plans |
| `information` | General facts (default) |
| `heading` | Section markers for organization |
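
As a rough illustration of how these types fit into a memory entry, the sketch below pre-checks a payload before it is stored. `describeMemory` is a hypothetical helper; the field names `content` and `context_type` come from this guide, but the exact schema recall enforces may differ.

```javascript
// The ten context_type values listed in the table above.
const CONTEXT_TYPES = new Set([
  "directive", "decision", "code_pattern", "preference", "requirement",
  "insight", "error", "todo", "information", "heading",
]);

// Hypothetical pre-check for a memory payload; not recall's own validation.
function describeMemory(memory) {
  const type = memory.context_type ?? "information"; // documented default
  if (!CONTEXT_TYPES.has(type)) {
    throw new Error(`unknown context_type: ${type}`);
  }
  return { ...memory, context_type: type };
}

console.log(describeMemory({ content: "always use snake_case", context_type: "directive" }));
console.log(describeMemory({ content: "I prefer Rust for backends" }).context_type);
```

An unknown type such as `"note"` makes the helper throw, which is a cheap way to catch typos before a memory ends up in the wrong category.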
### Workspace vs Global Memories
In `hybrid` mode (recommended):
- **Workspace-local** (default): Scoped to the current project directory. Newton memories stay in Newton, SDK memories stay in SDK.
- **Global** (`is_global: true`): Visible from every project. Use for cross-cutting preferences, tooling decisions, personal conventions.
**What to store globally:**
- Personal preferences ("I prefer Rust for backends")
- Tooling decisions ("use alloy not ethers-rs across all projects")
- Communication style ("no emojis in code comments")
- Cross-project patterns ("always use HKDF for key derivation")
**What to keep workspace-local:**
- Project-specific architecture decisions
- Codebase conventions unique to that project
- Bug fixes and error patterns specific to that codebase
### Importance Scoring
| Score | Meaning | Examples |
|-------|---------|---------|
| 1-3 | Low — ephemeral context | Temporary debugging notes, session-specific state |
| 4-6 | Medium — useful reference | Code patterns, general conventions |
| 7-8 | High — key decisions | Architecture choices, security decisions |
| 9-10 | Critical — must never forget | Breaking change patterns, security invariants |
Higher-importance memories are prioritized in search results. Use the `min_importance` filter to focus on high-signal memories.
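
The bands above can be expressed as a small lookup, shown here alongside a client-side stand-in for a `min_importance` filter. This is a sketch under assumptions: `importanceBand` is a hypothetical helper, and treating `min_importance` as an inclusive lower bound is a guess about recall's behavior, not something this guide confirms.

```javascript
// Map a 1-10 importance score to the bands in the table above.
function importanceBand(score) {
  if (!Number.isInteger(score) || score < 1 || score > 10) {
    throw new RangeError("importance must be an integer from 1 to 10");
  }
  if (score <= 3) return "low";
  if (score <= 6) return "medium";
  if (score <= 8) return "high";
  return "critical";
}

// Illustrative client-side equivalent of a min_importance filter
// (assumed to be an inclusive lower bound).
const memories = [
  { content: "temporary debugging note", importance: 2 },
  { content: "chose Feldman VSS over Shamir", importance: 8 },
];
const minImportance = 7;
const bands = memories
  .filter((m) => m.importance >= minImportance)
  .map((m) => importanceBand(m.importance));
console.log(bands);
```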
### Tips
- **Tag consistently**: Use tags like `["architecture", "security"]` for filtered search later
- **Include reasoning**: "We chose X because Y" is more valuable than "We use X"
- **Don't over-store**: Store decisions and patterns, not every line of code
- **Review periodically**: Use `search_memories` to find and delete outdated memories
- **Use `analyze_and_remember`** after complex discussions — it auto-extracts the important bits
---
## Gotchas and Lessons Learned
### 1. The schema bug is the #1 blocker (most people get stuck here)
Recall v1.10.x exposes 6 tools with `anyOf` at the schema root instead of `"type": "object"`. Claude Code's MCP tool validator requires `"type": "object"` at the root of every tool's `inputSchema`. When it encounters an invalid schema, it silently drops ALL tools from that server — not just the broken ones. The server still shows "connected ✓" in `/mcp`, making this extremely confusing to debug.
**Symptom:** Server shows connected in `/mcp`, but no recall tools appear anywhere.
**Fix:** Use the proxy wrapper from Step 3.
### 2. "connected" in /mcp does NOT mean tools are registered
The `/mcp` dialog shows connection status (MCP handshake), not tool registration status. A server can be "connected ✓" with zero tools registered if schema validation fails.
### 3. Hardcode keys — no ${VAR} references
`~/.mcp.json` env blocks pass values as literal strings. `"${ANTHROPIC_API_KEY}"` sends the literal string `${ANTHROPIC_API_KEY}` to the server, not the expanded value. Always hardcode.
### 4. enabledMcpjsonServers is an allowlist
If you add this setting to `settings.json`, it becomes an allowlist. Only servers listed will be auto-approved. If you add `recall` but forget your other servers (context7, github, etc.), those will stop auto-loading.
### 5. fnm/nvm PATH is session-specific
If you use fnm (Fast Node Manager) or nvm, the `node`, `npx`, and `recall` binaries live in session-specific PATH directories (e.g., `~/.local/state/fnm_multishells/<session-id>/bin/`). Claude Code spawns MCP servers outside your shell's PATH context, so a bare command may not resolve. Using `node <absolute-path>` in the MCP config avoids this: Claude Code resolves `node`, and the proxy at `~/.claude/recall-proxy.mjs` then spawns `recall` from the PATH it inherits.
### 6. Memories are scoped per workspace — and `is_global` needs hybrid mode
Recall stores memories per workspace (project directory). Memories stored while working in `/projects/foo` won't appear when working in `/projects/bar`.
Setting `is_global: true` on a memory stores it in a separate `global:memories:all` Redis set. However, the default workspace mode (`isolated`) **never searches the global set**. Global memories silently become unsearchable.
**Fix:** Either set `"WORKSPACE_MODE": "hybrid"` in the env block (searches both workspace + global), or avoid `is_global: true` and keep memories workspace-scoped (the default).
```json
"env": {
  "REDIS_URL": "rediss://...",
  "ANTHROPIC_API_KEY": "sk-ant-...",
  "EMBEDDING_PROVIDER": "anthropic",
  "WORKSPACE_MODE": "hybrid"
}
```
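
The difference between the two modes can be pictured as which Redis sets a search consults. The sketch below is a conceptual model only, not recall's source: `global:memories:all` is the set named in this guide, while the per-workspace key format and the function name `setsSearched` are made up for illustration.

```javascript
// Conceptual model only. "global:memories:all" is the set named in this guide;
// the per-workspace key format below is a made-up placeholder.
function setsSearched(mode, workspaceId) {
  const workspaceSet = `workspace:${workspaceId}:memories`; // hypothetical key
  const globalSet = "global:memories:all";
  if (mode === "hybrid") return [workspaceSet, globalSet];
  if (mode === "isolated") return [workspaceSet];
  throw new Error(`mode not covered by this sketch: ${mode}`);
}

console.log(setsSearched("isolated", "foo"));
console.log(setsSearched("hybrid", "foo"));
```

In this model a memory written with `is_global: true` lands only in the global set, so an `isolated` search never sees it.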
### 7. Don't install the cloud plugin for self-hosted
The `recall-claude-plugin` (hooks-based plugin from the one-liner install) is for the managed SaaS at recallmcp.com. It requires a `RECALL_API_KEY` (cloud API key) and conflicts with the self-hosted MCP server. If you see HTTP 401 errors from recall hooks, this is the cause.
### 8. Embedding provider auto-detection picks up stale OpenAI keys
Recall auto-detects the embedding provider by checking for API keys in this priority order: `VOYAGE_API_KEY` → `COHERE_API_KEY` → `OPENAI_API_KEY` → `DEEPSEEK_API_KEY` → `GROK_API_KEY` → `ANTHROPIC_API_KEY`. Anthropic is checked **last**.
The `env` block in `~/.mcp.json` **adds to** the inherited shell environment — it doesn't replace it. If your shell has `OPENAI_API_KEY` set globally (from another tool, `.zshrc`, etc.), recall will pick OpenAI over Anthropic even though you explicitly configured `ANTHROPIC_API_KEY` in the MCP config.
**Symptom:** `store_memory` fails with `openai API error: 429 - insufficient_quota` even though you configured an Anthropic key.
**Fix:** Always set `"EMBEDDING_PROVIDER": "anthropic"` explicitly in the env block. This overrides auto-detection entirely, ignoring any inherited API keys.
```json
"env": {
  "REDIS_URL": "rediss://...",
  "ANTHROPIC_API_KEY": "sk-ant-...",
  "EMBEDDING_PROVIDER": "anthropic"
}
```
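
The interaction between the inherited environment and the priority order is easier to see in code. The following is a sketch of the selection logic as described in this section, not recall's actual source; `pickProvider` and the key values are hypothetical.

```javascript
// Priority order as described in this guide; Anthropic is checked last.
const PRIORITY = [
  ["VOYAGE_API_KEY", "voyage"],
  ["COHERE_API_KEY", "cohere"],
  ["OPENAI_API_KEY", "openai"],
  ["DEEPSEEK_API_KEY", "deepseek"],
  ["GROK_API_KEY", "grok"],
  ["ANTHROPIC_API_KEY", "anthropic"],
];

// Sketch of the selection logic, not recall's actual source code.
function pickProvider(env) {
  if (env.EMBEDDING_PROVIDER) return env.EMBEDDING_PROVIDER; // explicit override wins
  const match = PRIORITY.find(([keyName]) => env[keyName]);
  return match ? match[1] : null;
}

// The MCP env block is merged on top of the inherited shell environment.
const shellEnv = { OPENAI_API_KEY: "sk-stale" };
const mcpEnv = { ANTHROPIC_API_KEY: "sk-ant-example" };
console.log(pickProvider({ ...shellEnv, ...mcpEnv }));
console.log(pickProvider({ ...shellEnv, ...mcpEnv, EMBEDDING_PROVIDER: "anthropic" }));
```

The first call selects `openai` because the stale shell key outranks the Anthropic key; the second selects `anthropic` because the explicit setting bypasses detection.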
### 9. Recall hardcodes a deprecated Anthropic model ID
Recall v1.10.1 hardcodes `claude-3-5-haiku-20241022` for its Anthropic embedding provider. This model was deprecated and returns HTTP 404 from the Anthropic API. There is no env var to override it.
**Symptom:** `store_memory` fails with `404 {"type":"error","error":{"type":"not_found_error","message":"model: claude-3-5-haiku-20241022"}}`
**Fix:** Patch the dist file directly:
```bash
RECALL_DIST="$(npm root -g)/@joseairosa/recall/dist/chunk-52J47SPM.js"
sed -i.bak 's/claude-3-5-haiku-20241022/claude-haiku-4-5-20251001/g' "$RECALL_DIST"
```
This replaces 4 occurrences, and the `.bak` file is your rollback. `npm update -g @joseairosa/recall` overwrites the patch, so re-apply it after each update until upstream fixes the model ID.
### 10. npx vs global install vs direct node path
| Method | Pros | Cons |
|--------|------|------|
| `"command": "npx", "args": ["-y", "@joseairosa/recall"]` | Auto-updates | ~0.5s slower startup, may not resolve in Claude Code's environment |
| `"command": "recall"` (global install) | Fastest startup | Tied to fnm/nvm version, needs reinstall on Node upgrade |
| `"command": "node", "args": ["<path>/recall-proxy.mjs"]` | Works with proxy, reliable | Requires proxy file, manual updates |
**Recommended:** Use the proxy wrapper (`node ~/.claude/recall-proxy.mjs`) which spawns `recall` internally. This gives you schema filtering + reliable startup.
---
## What NOT to Install
**Do NOT install the `recall-claude-plugin`** (the hooks-based plugin from `joseairosa/recall-claude-plugin` or the one-liner curl install). That plugin requires a `recallmcp.com` cloud API key and is for the managed SaaS tier, not self-hosted.
The self-hosted MCP server entry in `~/.mcp.json` + the proxy wrapper is the only thing you need.
---
## Cleanup (if old installs exist)
If you previously installed the cloud plugin or have stale configs, remove all traces:
```bash
# 1. Remove hooks plugin directory
rm -rf ~/.claude/recall/
# 2. Remove plugin cache and marketplace clone
rm -rf ~/.claude/plugins/cache/recall-claude-plugin/
rm -rf ~/.claude/plugins/marketplaces/joseairosa-recall-claude-plugin/
# 3. Remove from plugin registries
python3 -c "
import json
for path in ['installed_plugins.json', 'known_marketplaces.json']:
    full = f'$HOME/.claude/plugins/{path}'
    try:
        with open(full) as f: data = json.load(f)
        key = 'recall@recall-claude-plugin' if 'installed' in path else 'recall-claude-plugin'
        container = data.get('plugins', data) if 'installed' in path else data
        if key in container:
            del container[key]
            with open(full, 'w') as f: json.dump(data, f, indent=2)
            print(f'Removed from {path}')
    except: pass
"
# 4. Remove from settings.json (enabledPlugins + extraKnownMarketplaces)
python3 -c "
import json, os
path = os.path.expanduser('~/.claude/settings.json')
with open(path) as f: data = json.load(f)
changed = False
if 'recall@recall-claude-plugin' in data.get('enabledPlugins', {}):
    del data['enabledPlugins']['recall@recall-claude-plugin']; changed = True
if 'recall-claude-plugin' in data.get('extraKnownMarketplaces', {}):
    del data['extraKnownMarketplaces']['recall-claude-plugin']
    if not data['extraKnownMarketplaces']: del data['extraKnownMarketplaces']
    changed = True
if changed:
    with open(path, 'w') as f: json.dump(data, f, indent=2); f.write('\n')
    print('Cleaned settings.json')
"
# 5. Kill stale recall processes from previous sessions
pkill -f "recall" 2>/dev/null
```
After cleanup, only `~/.mcp.json` recall entry + `~/.claude/recall-proxy.mjs` should remain. Restart Claude Code.
---
## Troubleshooting
| Symptom | Cause | Fix |
|---------|-------|-----|
| Server "connected ✓" in `/mcp` but no tools | Schema validation failure (6 broken tools in v1.10.x) | Use the proxy wrapper (Step 3) |
| `ECONNREFUSED ::1:6379` or `127.0.0.1:6379` | `REDIS_URL` not set or wrong | Verify the env block in `~/.mcp.json` has the correct Upstash URL |
| Tools don't appear after restart | MCP server failed to start | Run the manual test from Step 3 to see the actual error |
| `Invalid API key (HTTP 401)` | The hooks-based cloud plugin is active | Remove `~/.claude/recall/` and disable in settings.json |
| `MaxRetriesPerRequestError` | Redis not reachable or wrong credentials | Verify the Upstash URL and password, check TLS (`rediss://`) |
| `recall: command not found` (in proxy) | Global install not on PATH | Run `npm install -g @joseairosa/recall` or use full path |
| Memories not persisting across sessions | Expected: memories are scoped by workspace | Use `is_global: true` together with `WORKSPACE_MODE=hybrid` for cross-workspace memories |
| `openai API error: 429 - insufficient_quota` | Shell has `OPENAI_API_KEY` which recall picks over Anthropic | Add `"EMBEDDING_PROVIDER": "anthropic"` to env block in `~/.mcp.json` |
| `404 not_found_error model: claude-3-5-haiku-20241022` | Recall hardcodes a deprecated Anthropic model | Patch dist file (see Step 2.5 and Gotcha #9) |
| `store_memory` succeeds but `search_memories` returns 0 | Memory stored with `is_global: true` but workspace mode is `isolated` | Set `WORKSPACE_MODE=hybrid` in env block, or don't use `is_global: true` |
| Multiple stale recall processes | Previous sessions didn't clean up | `pkill -f "recall"` then restart |
---
## Updating Recall
```bash
# Update global install
npm update -g @joseairosa/recall
# Verify version
npm list -g @joseairosa/recall
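# Re-apply model patch (until upstream fixes it).
# The hashed chunk filename may change between releases, so find the file by content.
for f in $(grep -rl --include='*.js' 'claude-3-5-haiku-20241022' "$(npm root -g)/@joseairosa/recall/dist"); do
  sed -i.bak 's/claude-3-5-haiku-20241022/claude-haiku-4-5-20251001/g' "$f"
done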
```
The proxy wrapper is version-independent — it filters based on schema structure, not tool names. It will continue working across recall updates. If a future recall version fixes the schemas, the proxy becomes a harmless no-op (filters 0 tools). The model patch must be re-applied after each update until upstream fixes it.
---
## Architecture
```
Claude Code
└── starts MCP server: node ~/.claude/recall-proxy.mjs
    ├── spawns: recall (global binary)
    │   ├── connects to Upstash Redis (TLS)
    │   │   └── stores/retrieves memories per workspace
    │   └── uses Anthropic API for embeddings (semantic search)
    ├── filters tools/list response (removes 6 broken schemas)
    └── passes all other MCP messages through unchanged
```
- Memories are scoped per workspace (project directory)
- Embeddings enable semantic search ("find memories about deployment") vs exact match
- Memory data stays in services you control: it is stored in your own Upstash Redis database, and Anthropic is called only to generate embeddings
- The same Upstash Redis instance works across all your machines (shared memory)
- The proxy adds negligible overhead (~1ms per tools/list call, runs once at startup)