denniswon/gist:c01abc0d3d3e5702a052e66dea11dbea

# Recall MCP — Self-Hosted Setup Guide

Self-hosted Recall MCP for Claude Code using Upstash Redis (cloud) and the open-source `@joseairosa/recall` NPM package. No local Redis, no Docker, no subscription.

Source: https://github.com/joseairosa/recall

---

## Prerequisites

- Node.js >= 18 (`node --version`)
- npm (`npm --version`)
- Claude Code installed (tested on v2.1.72)
- An Anthropic API key (for embeddings)

---

## Step 1: Create an Upstash Redis Instance

1. Go to https://upstash.com and sign up / log in
2. Create a new Redis database:
   - Name: anything (e.g., `recall-memory`)
   - Region: pick the closest to you
   - TLS: enabled (default)
3. Copy the **Redis URL** from the database details page
   - Format: `rediss://default:<password>@<host>.upstash.io:6379`
   - The `rediss://` prefix (double s) means TLS — this is required for Upstash

**IMPORTANT:** The URL must start with `rediss://` (double s, TLS), not `redis://`. With `redis://`, the connection fails, sometimes silently and sometimes with a cryptic error.
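
A quick sanity check on the URL before it goes into the config can catch the `redis://` mistake early. The sketch below relies only on the built-in `URL` class; the helper name `checkRedisUrl` and the sample host are illustrative assumptions, not part of recall or Upstash.

```javascript
// Hypothetical helper: checks that a Redis URL uses the TLS scheme Upstash expects.
function checkRedisUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }
  if (url.protocol !== "rediss:") {
    return { ok: false, reason: `scheme is ${url.protocol}// but Upstash needs rediss://` };
  }
  if (!url.password) {
    return { ok: false, reason: "no password in URL" };
  }
  return { ok: true, reason: "" };
}

// Placeholder credentials, not real ones.
console.log(checkRedisUrl("rediss://default:secret@example.upstash.io:6379").ok);
console.log(checkRedisUrl("redis://default:secret@example.upstash.io:6379").reason);
```

This only validates the shape of the URL. It does not open a connection, so a wrong password or host will still surface later as a connection error.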

---

## Step 2: Install Recall Globally

```bash
npm install -g @joseairosa/recall
```

Verify:
```bash
which recall   # should print a path
```

**Why global install?** The recall README specifies `"command": "recall"` (bare binary) for Claude Code, not `npx`. Claude Code spawns MCP server processes in a minimal environment. Global install ensures the binary is on a stable PATH. Using `npx` can also work but adds ~0.5s startup overhead.

**fnm/nvm users:** The global install path depends on your active Node version. If you switch Node versions, you may need to reinstall. With fnm, the stable binary path is typically:
```
~/.local/share/fnm/node-versions/<version>/installation/bin/recall
```

---

## Step 2.5: Patch the Deprecated Model ID

Recall v1.10.1 hardcodes `claude-3-5-haiku-20241022` for Anthropic embeddings, but that model ID is deprecated and the Anthropic API now returns HTTP 404 for it. Patch it to the current Haiku model:

```bash
RECALL_DIST="$(npm root -g)/@joseairosa/recall/dist/chunk-52J47SPM.js"
sed -i.bak 's/claude-3-5-haiku-20241022/claude-haiku-4-5-20251001/g' "$RECALL_DIST"
# Verify: should print 4 (grep -c counts matching lines)
grep -c 'claude-haiku-4-5-20251001' "$RECALL_DIST"
```

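This patch is overwritten on `npm update`, so re-apply it after updates (see "Updating Recall" section). The chunk filename is specific to this build and may differ in other versions; if the file is missing, `grep -rl 'claude-3-5-haiku-20241022' "$(npm root -g)/@joseairosa/recall/dist"` will locate the file to patch.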
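
For readers who want to see what the `sed` substitution amounts to, or who need to apply it from Node instead, the same operation can be sketched as a plain string replace with a count. `patchModelId` is a hypothetical helper, and the sample string merely stands in for the contents of the bundled file.

```javascript
const OLD_MODEL = "claude-3-5-haiku-20241022";
const NEW_MODEL = "claude-haiku-4-5-20251001";

// Replaces every occurrence of the old model ID and reports how many were changed.
function patchModelId(source) {
  const parts = source.split(OLD_MODEL);
  return { patched: parts.join(NEW_MODEL), count: parts.length - 1 };
}

// Stand-in for file contents; the real chunk is much larger.
const sample = `model: "${OLD_MODEL}", fallback: "${OLD_MODEL}"`;
const outcome = patchModelId(sample);
console.log(outcome.count);
console.log(outcome.patched.includes(OLD_MODEL));
```

A count of zero on the real file would suggest that the file was already patched or that the installed version no longer contains the old ID.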

---

## Step 3: Create the Proxy Wrapper (Critical)

**This is the key step that most guides miss.** Recall v1.10.x exposes 20 tools, but 6 of them have non-conformant JSON schemas (`anyOf` at the root level instead of `"type": "object"`). Claude Code silently rejects ALL tools from a server if any tool has an invalid schema. The server shows "connected" in `/mcp` but zero tools are registered.

The broken tools: `memory_graph`, `memory_template`, `memory_category`, `rlm_process`, `workflow`, `memory_maintain`.
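
To make the difference concrete, here is a sketch of the two schema shapes. Both tool definitions are invented for illustration (the real recall schemas are larger and use other property names), and `isRejectedShape` mirrors the check that the proxy in this step applies.

```javascript
// Illustrative tools only; names and properties are made up.
const conformant = {
  name: "example_ok",
  inputSchema: { type: "object", properties: { content: { type: "string" } } },
};
const nonConformant = {
  name: "example_broken",
  inputSchema: {
    anyOf: [
      { type: "object", properties: { action: { const: "create" } } },
      { type: "object", properties: { action: { const: "delete" } } },
    ],
  },
};

// True when the schema root has anyOf/oneOf but no "type" key.
function isRejectedShape(tool) {
  const schema = tool.inputSchema || {};
  return !schema.type && Boolean(schema.anyOf || schema.oneOf);
}

console.log(isRejectedShape(conformant));    // false
console.log(isRejectedShape(nonConformant)); // true
```

Note that each branch inside `anyOf` is itself a valid object schema; the problem is only that the root has no `"type"` key.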

Create a proxy that filters them out:

```bash
cat > ~/.claude/recall-proxy.mjs << 'EOF'
#!/usr/bin/env node
// recall-proxy.mjs — MCP proxy that filters out tools with non-conformant schemas
// from @joseairosa/recall before they reach Claude Code.
//
// Problem: recall exposes tools with `anyOf` at the schema root (no "type": "object").
// Claude Code rejects ALL tools from a server if any tool has a malformed schema.
//
// Solution: intercept the tools/list response and remove tools whose schema root has anyOf/oneOf but no "type".

import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

// Spawn the real recall server
const child = spawn("recall", [], {
  env: { ...process.env },
  stdio: ["pipe", "pipe", "inherit"], // inherit stderr for debugging
});

child.on("error", (err) => {
  process.stderr.write(`[recall-proxy] Failed to spawn recall: ${err.message}\n`);
  process.exit(1);
});

child.on("exit", (code) => {
  process.exit(code ?? 0);
});

// Forward stdin to child
process.stdin.pipe(child.stdin);

// Intercept child stdout, filter tools/list responses
const rl = createInterface({ input: child.stdout, crlfDelay: Infinity });

rl.on("line", (line) => {
  try {
    const msg = JSON.parse(line);

    // Intercept tools/list response
    if (msg.result?.tools && Array.isArray(msg.result.tools)) {
      const before = msg.result.tools.length;
      msg.result.tools = msg.result.tools.filter((t) => {
        const schema = t.inputSchema || {};
        // Keep tools that have "type": "object" at root
        // Drop tools with anyOf/oneOf root and no type
        if (!schema.type && (schema.anyOf || schema.oneOf)) {
          return false;
        }
        return true;
      });
      const after = msg.result.tools.length;
      if (before !== after) {
        process.stderr.write(
          `[recall-proxy] Filtered ${before - after} tools with non-conformant schemas (${before} -> ${after})\n`
        );
      }
      process.stdout.write(JSON.stringify(msg) + "\n");
    } else {
      // Pass through all other messages unchanged
      process.stdout.write(line + "\n");
    }
  } catch {
    // Not JSON — pass through
    process.stdout.write(line + "\n");
  }
});

// Handle signals
process.on("SIGTERM", () => child.kill("SIGTERM"));
process.on("SIGINT", () => child.kill("SIGINT"));
EOF
chmod +x ~/.claude/recall-proxy.mjs
```

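Test the proxy. `timeout` comes from GNU coreutils and is not installed on macOS by default; there, use `gtimeout` from Homebrew's coreutils, or drop `timeout 10` and stop the process with Ctrl-C: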
```bash
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}' | \
REDIS_URL="rediss://default:<password>@<host>.upstash.io:6379" \
ANTHROPIC_API_KEY="<your-key>" \
timeout 10 node ~/.claude/recall-proxy.mjs 2>&1
```

You should see:
```
[recall-proxy] Filtered 6 tools with non-conformant schemas (20 -> 14)
```

And the tools/list response should contain 14 tools, all with `"type": "object"` in their schema.
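
If you would rather check the response programmatically than read it by eye, something along these lines could be run over the captured output. `summarizeToolsList` is a hypothetical helper name; it assumes only newline-delimited JSON-RPC messages mixed with plain log lines, which is what the test command above prints.

```javascript
// Scans newline-delimited output for a tools/list result and reports the tool
// count plus any tools whose schema root is not "type": "object".
function summarizeToolsList(stdoutText) {
  for (const line of stdoutText.split("\n")) {
    let msg;
    try {
      msg = JSON.parse(line);
    } catch {
      continue; // log lines and blank lines are not JSON
    }
    const tools = msg && msg.result && msg.result.tools;
    if (Array.isArray(tools)) {
      const bad = tools
        .filter((t) => !t.inputSchema || t.inputSchema.type !== "object")
        .map((t) => t.name);
      return { total: tools.length, bad };
    }
  }
  return null; // no tools/list response found
}

// Shortened stand-in for real proxy output.
const captured = [
  "[recall-proxy] Filtered 1 tools with non-conformant schemas (3 -> 2)",
  JSON.stringify({ jsonrpc: "2.0", id: 1, result: { capabilities: {} } }),
  JSON.stringify({
    jsonrpc: "2.0",
    id: 2,
    result: {
      tools: [
        { name: "store_memory", inputSchema: { type: "object" } },
        { name: "search_memories", inputSchema: { type: "object" } },
      ],
    },
  }),
].join("\n");

console.log(summarizeToolsList(captured));
```

With real output from the setup described in this guide, `total` would be expected to be 14 and `bad` to be empty.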

---

## Step 4: Configure ~/.mcp.json

Open or create `~/.mcp.json` and add the recall server entry inside `mcpServers`:

```json
{
  "mcpServers": {
    "recall": {
      "command": "node",
      "args": ["/Users/<your-username>/.claude/recall-proxy.mjs"],
      "env": {
        "REDIS_URL": "rediss://default:<your-password>@<your-host>.upstash.io:6379",
        "ANTHROPIC_API_KEY": "<your-anthropic-api-key>",
        "EMBEDDING_PROVIDER": "anthropic",
        "WORKSPACE_MODE": "hybrid"
      }
    }
  }
}
```

Replace all placeholder values with your actual credentials and username.

**IMPORTANT: Hardcode the actual key values, not env var references.**

```jsonc
// CORRECT — hardcoded values
"ANTHROPIC_API_KEY": "sk-ant-api03-xxxxx..."

// WRONG — env var references don't work in ~/.mcp.json
"ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}"
```

MCP server `env` blocks in `~/.mcp.json` do NOT expand `${VAR}` shell variables in this setup: the values reach the server as literal strings. The file is read as JSON rather than evaluated by a shell, so launching Claude Code from a terminal or an alias (e.g., `claude-yolo`) does not add any shell interpolation. Hardcode the actual values directly.
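
A leftover placeholder is easy to miss in a long config. The sketch below scans an already-parsed config object for `env` values that still look like `${VAR}` references. The function name `findUnexpandedEnv` is an assumption for illustration; the object shape follows the `mcpServers` example above.

```javascript
// Returns "server.KEY" for every env value that still contains a ${VAR} placeholder.
function findUnexpandedEnv(config) {
  const hits = [];
  for (const [server, def] of Object.entries(config.mcpServers || {})) {
    for (const [key, value] of Object.entries(def.env || {})) {
      if (typeof value === "string" && /\$\{[A-Za-z_][A-Za-z0-9_]*\}/.test(value)) {
        hits.push(`${server}.${key}`);
      }
    }
  }
  return hits;
}

// Example config with one placeholder left in by mistake.
const exampleConfig = {
  mcpServers: {
    recall: {
      command: "node",
      env: {
        REDIS_URL: "rediss://default:secret@example.upstash.io:6379",
        ANTHROPIC_API_KEY: "${ANTHROPIC_API_KEY}",
        EMBEDDING_PROVIDER: "anthropic",
      },
    },
  },
};

console.log(findUnexpandedEnv(exampleConfig)); // [ 'recall.ANTHROPIC_API_KEY' ]
```

To run it against the real file, the object would come from `JSON.parse` on the contents of `~/.mcp.json`.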

**CRITICAL: Always set `EMBEDDING_PROVIDER` explicitly.** See Gotcha #8 below for why.

**Notes:**
- `ANTHROPIC_API_KEY` is used for generating embeddings (semantic search over memories). Same key format as your Claude API key (`sk-ant-api03-...`).
- Supported embedding providers: `anthropic`, `openai`, `voyage`, `cohere`, `deepseek`, `grok`, `ollama`. Only one API key + provider is needed.

---

## Step 5: Add to enabledMcpjsonServers

In `~/.claude/settings.json`, ensure `recall` is in the `enabledMcpjsonServers` array:

```json
{
  "enabledMcpjsonServers": [
    "context7",
    "exa",
    "github",
    "recall"
  ]
}
```

This pre-approves the MCP server so Claude Code doesn't prompt for permission on each restart. Add your other MCP servers to this list too — it's an allowlist, so any server NOT listed may require interactive approval.
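
Because the array is an allowlist, the safe way to edit it is to append rather than overwrite. A minimal sketch of that merge on a parsed settings object, assuming a hypothetical helper named `enableServer`:

```javascript
// Returns a copy of the settings with `name` added to enabledMcpjsonServers,
// keeping every server that was already listed.
function enableServer(settings, name) {
  const current = Array.isArray(settings.enabledMcpjsonServers)
    ? settings.enabledMcpjsonServers
    : [];
  const merged = current.includes(name) ? current : [...current, name];
  return { ...settings, enabledMcpjsonServers: merged };
}

const before = { enabledMcpjsonServers: ["context7", "github"] };
console.log(enableServer(before, "recall").enabledMcpjsonServers);
// [ 'context7', 'github', 'recall' ]
```

The function returns a new object and leaves its input untouched, so calling it twice with the same name does not create a duplicate entry.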

---

## Step 6: Restart Claude Code

MCP servers are loaded at startup. After editing `~/.mcp.json`, you must fully restart Claude Code (exit and reopen; toggling the server in `/mcp` is not enough).

After restart, the recall tools should appear in the available tools list:
- `recall_relevant_context` — search memory for relevant context
- `analyze_and_remember` — extract and store info from conversation
- `store_memory` — store a memory entry
- `search_memories` — semantic search with filters
- `quick_store_decision` — store a decision with reasoning
- `summarize_session` — create session snapshot
- `auto_session_start` — load context at session start
- Plus 7 more utility tools (14 total)

---

## Step 7: Test It

In Claude Code, try:

```
Store a memory that I prefer Rust for all backend services
```

Then in a new session:

```
What do you know about my coding preferences?
```

---

## Usage Guide (Hybrid Mode)

After setup, recall runs automatically in every Claude Code session. Here's how to use the 14 available tools effectively.

### Core Workflow

**Store important context as you work:**
```
"Remember that Newton uses alloy, not ethers-rs, for all Ethereum interactions"
"Store a decision: we chose Feldman VSS over Shamir because it provides verifiable commitments"
```

**Retrieve context when needed:**
```
"What do I know about the privacy layer architecture?"
"Recall any decisions about consensus protocol design"
```

**End-of-session snapshot:**
```
"Summarize this session" — creates a session snapshot with key decisions and patterns
```

### Tool Reference

| Tool | When to Use | Details |
|------|-------------|---------|
| `store_memory` | Save any fact, pattern, or context | Preferences, conventions, architecture decisions |
| `quick_store_decision` | After making a significant choice | Records decision + reasoning + alternatives considered |
| `search_memories` | Find specific memories by topic | Semantic search with optional filters (type, importance, tags) |
| `recall_relevant_context` | Get context for current task | Pass your current task description, returns relevant memories |
| `auto_session_start` | Load context at session start | Returns recent decisions, active directives, patterns |
| `analyze_and_remember` | After important discussions | Paste conversation text, auto-extracts and stores key info |
| `summarize_session` | End of work session | Creates a snapshot of what was done |
| `get_time_window_context` | "What did I do in the last 2 hours?" | Time-based retrieval, grouped by type or importance |
| `store_batch_memories` | Bulk import | Store multiple memories in one call |
| `update_memory` | Correct or refine a memory | Update content, importance, tags by memory ID |
| `delete_memory` | Remove outdated/wrong memory | Delete by memory ID |
| `organize_session` | Group related memories | Create named session snapshot from memory IDs |
| `convert_to_global` | Share across all projects | Promote a workspace memory to global |
| `convert_to_workspace` | Restrict a global memory | Demote a global memory to workspace-specific |

### Memory Types

Use `context_type` to categorize memories for filtered retrieval:

| Type | Use For |
|------|---------|
| `directive` | Rules, conventions ("always use snake_case") |
| `decision` | Architectural/design choices with reasoning |
| `code_pattern` | Reusable patterns, idioms, templates |
| `preference` | Personal/team preferences |
| `requirement` | Project requirements, constraints |
| `insight` | Learnings, non-obvious discoveries |
| `error` | Bug patterns, failure modes, fixes |
| `todo` | Deferred work, future plans |
| `information` | General facts (default) |
| `heading` | Section markers for organization |
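
As a rough illustration of how these types might be used when building arguments for a store call, the sketch below validates a `context_type` against the table and falls back to the documented default. `normalizeContextType` is a hypothetical helper, and the exact parameter names accepted by `store_memory` should be checked against the tool's own schema.

```javascript
// Context types from the table above; "information" is the documented default.
const CONTEXT_TYPES = new Set([
  "directive", "decision", "code_pattern", "preference", "requirement",
  "insight", "error", "todo", "information", "heading",
]);

// Hypothetical helper: falls back to the default for unknown or missing types.
function normalizeContextType(value) {
  return CONTEXT_TYPES.has(value) ? value : "information";
}

// Example argument object; field names are assumptions based on this guide.
const storeArgs = {
  content: "Always use snake_case for module names",
  context_type: normalizeContextType("directive"),
  importance: 7,
  tags: ["conventions"],
};

console.log(storeArgs.context_type);       // directive
console.log(normalizeContextType("note")); // information
```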

### Workspace vs Global Memories

In `hybrid` mode (recommended):

- **Workspace-local** (default): Scoped to the current project directory. Newton memories stay in Newton, SDK memories stay in SDK.
- **Global** (`is_global: true`): Visible from every project. Use for cross-cutting preferences, tooling decisions, personal conventions.

**What to store globally:**
- Personal preferences ("I prefer Rust for backends")
- Tooling decisions ("use alloy not ethers-rs across all projects")
- Communication style ("no emojis in code comments")
- Cross-project patterns ("always use HKDF for key derivation")

**What to keep workspace-local:**
- Project-specific architecture decisions
- Codebase conventions unique to that project
- Bug fixes and error patterns specific to that codebase

### Importance Scoring

| Score | Meaning | Examples |
|-------|---------|---------|
| 1-3 | Low — ephemeral context | Temporary debugging notes, session-specific state |
| 4-6 | Medium — useful reference | Code patterns, general conventions |
| 7-8 | High — key decisions | Architecture choices, security decisions |
| 9-10 | Critical — must never forget | Breaking change patterns, security invariants |

Higher-importance memories are prioritized in search results. Use the `min_importance` filter to focus on high-signal memories.
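
The bands in the table and the effect of a minimum-importance filter can be sketched as follows. This is a toy model over plain objects with an `importance` field; it does not reproduce recall's actual ranking, and both function names are hypothetical.

```javascript
// Maps a 1-10 importance score to the bands in the table above.
function importanceBand(score) {
  if (!Number.isInteger(score) || score < 1 || score > 10) {
    throw new RangeError("importance must be an integer from 1 to 10");
  }
  if (score <= 3) return "low";
  if (score <= 6) return "medium";
  if (score <= 8) return "high";
  return "critical";
}

// Keeps memories at or above the threshold, highest importance first.
function filterByImportance(memories, minImportance) {
  return memories
    .filter((m) => m.importance >= minImportance)
    .sort((a, b) => b.importance - a.importance);
}

const memories = [
  { content: "temporary debugging note", importance: 2 },
  { content: "security invariant", importance: 9 },
  { content: "architecture choice", importance: 7 },
];

console.log(importanceBand(7)); // high
console.log(filterByImportance(memories, 7).map((m) => m.content));
// [ 'security invariant', 'architecture choice' ]
```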

### Tips

- **Tag consistently**: Use tags like `["architecture", "security"]` for filtered search later
- **Include reasoning**: "We chose X because Y" is more valuable than "We use X"
- **Don't over-store**: Store decisions and patterns, not every line of code
- **Review periodically**: Use `search_memories` to find and delete outdated memories
- **Use `analyze_and_remember`** after complex discussions — it auto-extracts the important bits

---

## Gotchas and Lessons Learned

### 1. The schema bug is the #1 blocker (most people get stuck here)

Recall v1.10.x exposes 6 tools with `anyOf` at the schema root instead of `"type": "object"`. Claude Code's MCP tool validator requires `"type": "object"` at the root of every tool's `inputSchema`. When it encounters an invalid schema, it silently drops ALL tools from that server — not just the broken ones. The server still shows "connected ✓" in `/mcp`, making this extremely confusing to debug.

**Symptom:** Server shows connected in `/mcp`, but no recall tools appear anywhere.

**Fix:** Use the proxy wrapper from Step 3.

### 2. "connected" in /mcp does NOT mean tools are registered

The `/mcp` dialog shows connection status (MCP handshake), not tool registration status. A server can be "connected ✓" with zero tools registered if schema validation fails.

### 3. Hardcode keys — no ${VAR} references

`~/.mcp.json` env blocks pass values as literal strings. `"${ANTHROPIC_API_KEY}"` sends the literal string `${ANTHROPIC_API_KEY}` to the server, not the expanded value. Always hardcode.

### 4. enabledMcpjsonServers is an allowlist

If you add this setting to `settings.json`, it becomes an allowlist. Only servers listed will be auto-approved. If you add `recall` but forget your other servers (context7, github, etc.), those will stop auto-loading.

### 5. fnm/nvm PATH is session-specific

If you use fnm (Fast Node Manager) or nvm, the `node`/`npx`/`recall` binaries live in version-specific or session-specific directories (with fnm, e.g., `~/.local/state/fnm_multishells/<session-id>/bin/`). Claude Code may spawn MCP servers with a different PATH than your interactive shell. Passing an absolute script path to `node` in the MCP config avoids part of this problem. The proxy wrapper at `~/.claude/recall-proxy.mjs` uses `node` (resolved by Claude Code) to spawn `recall` (resolved from the process PATH inherited by the proxy). If `recall` cannot be found on that PATH, replace `"recall"` in the proxy's `spawn` call with the absolute path to the binary (see Step 2).

### 6. Memories are scoped per workspace — and `is_global` needs hybrid mode

Recall stores memories per workspace (project directory). Memories stored while working in `/projects/foo` won't appear when working in `/projects/bar`.

Setting `is_global: true` on a memory stores it in a separate `global:memories:all` Redis set. However, the default workspace mode (`isolated`) **never searches the global set**. Global memories silently become unsearchable.

**Fix:** Either set `"WORKSPACE_MODE": "hybrid"` in the env block (searches both workspace + global), or avoid `is_global: true` and keep memories workspace-scoped (the default).

```json
"env": {
  "REDIS_URL": "rediss://...",
  "ANTHROPIC_API_KEY": "sk-ant-...",
  "EMBEDDING_PROVIDER": "anthropic",
  "WORKSPACE_MODE": "hybrid"
}
```
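
The behavior described in this gotcha can be modeled in a few lines. This is a toy model, not recall's code: `global:memories:all` is the set name mentioned above, while the workspace set name and the function `setsSearched` are made up for illustration.

```javascript
// Toy model: which Redis sets a search consults in each workspace mode.
function setsSearched(mode, workspaceId) {
  const workspaceSet = `ws:${workspaceId}:memories:all`; // hypothetical key name
  const globalSet = "global:memories:all";               // name taken from this guide
  if (mode === "hybrid") return [workspaceSet, globalSet];
  return [workspaceSet]; // "isolated" (the default): the global set is never consulted
}

console.log(setsSearched("isolated", "foo")); // [ 'ws:foo:memories:all' ]
console.log(setsSearched("hybrid", "foo"));   // [ 'ws:foo:memories:all', 'global:memories:all' ]
```

In this model a memory stored with `is_global: true` lands only in the global set, which is why it cannot be found in `isolated` mode.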

### 7. Don't install the cloud plugin for self-hosted

The `recall-claude-plugin` (hooks-based plugin from the one-liner install) is for the managed SaaS at recallmcp.com. It requires a `RECALL_API_KEY` (cloud API key) and conflicts with the self-hosted MCP server. If you see HTTP 401 errors from recall hooks, this is the cause.

### 8. Embedding provider auto-detection picks up stale OpenAI keys

Recall auto-detects the embedding provider by checking for API keys in this priority order: `VOYAGE_API_KEY` → `COHERE_API_KEY` → `OPENAI_API_KEY` → `DEEPSEEK_API_KEY` → `GROK_API_KEY` → `ANTHROPIC_API_KEY`. Anthropic is checked **last**.

The `env` block in `~/.mcp.json` **adds to** the inherited shell environment — it doesn't replace it. If your shell has `OPENAI_API_KEY` set globally (from another tool, `.zshrc`, etc.), recall will pick OpenAI over Anthropic even though you explicitly configured `ANTHROPIC_API_KEY` in the MCP config.

**Symptom:** `store_memory` fails with `openai API error: 429 - insufficient_quota` even though you configured an Anthropic key.

**Fix:** Always set `"EMBEDDING_PROVIDER": "anthropic"` explicitly in the env block. This overrides auto-detection entirely, ignoring any inherited API keys.

```json
"env": {
  "REDIS_URL": "rediss://...",
  "ANTHROPIC_API_KEY": "sk-ant-...",
  "EMBEDDING_PROVIDER": "anthropic"
}
```
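
The interaction between the inherited shell environment and the priority order can be reproduced with a small sketch. It follows the order listed in this gotcha but is not recall's actual implementation; `resolveProvider` and the sample key values are illustrative.

```javascript
// Priority order as described above; Anthropic is checked last.
const PRIORITY = [
  ["VOYAGE_API_KEY", "voyage"],
  ["COHERE_API_KEY", "cohere"],
  ["OPENAI_API_KEY", "openai"],
  ["DEEPSEEK_API_KEY", "deepseek"],
  ["GROK_API_KEY", "grok"],
  ["ANTHROPIC_API_KEY", "anthropic"],
];

// An explicit EMBEDDING_PROVIDER wins; otherwise the first key found decides.
function resolveProvider(env) {
  if (env.EMBEDDING_PROVIDER) return env.EMBEDDING_PROVIDER;
  for (const [key, provider] of PRIORITY) {
    if (env[key]) return provider;
  }
  return null;
}

// The MCP env block is merged on top of the shell environment, not substituted for it.
const shellEnv = { OPENAI_API_KEY: "sk-stale-example" };
const mcpEnv = { ANTHROPIC_API_KEY: "sk-ant-example" };

console.log(resolveProvider({ ...shellEnv, ...mcpEnv }));
// openai
console.log(resolveProvider({ ...shellEnv, ...mcpEnv, EMBEDDING_PROVIDER: "anthropic" }));
// anthropic
```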

### 9. Recall hardcodes a deprecated Anthropic model ID

Recall v1.10.1 hardcodes `claude-3-5-haiku-20241022` for its Anthropic embedding provider. This model was deprecated and returns HTTP 404 from the Anthropic API. There is no env var to override it.

**Symptom:** `store_memory` fails with `404 {"type":"error","error":{"type":"not_found_error","message":"model: claude-3-5-haiku-20241022"}}`

**Fix:** Patch the dist file directly:
```bash
RECALL_DIST="$(npm root -g)/@joseairosa/recall/dist/chunk-52J47SPM.js"
sed -i.bak 's/claude-3-5-haiku-20241022/claude-haiku-4-5-20251001/g' "$RECALL_DIST"
```

This replaces 4 occurrences. The `.bak` file is your rollback. The patch is overwritten by `npm update -g @joseairosa/recall`, so re-apply it after each update until upstream fixes the model ID.

### 10. npx vs global install vs direct node path

| Method | Pros | Cons |
|--------|------|------|
| `"command": "npx", "args": ["-y", "@joseairosa/recall"]` | Auto-updates | ~0.5s slower startup, may not resolve in Claude Code's environment |
| `"command": "recall"` (global install) | Fastest startup | Tied to fnm/nvm version, needs reinstall on Node upgrade |
| `"command": "node", "args": ["<path>/recall-proxy.mjs"]` | Works with proxy, reliable | Requires proxy file, manual updates |

**Recommended:** Use the proxy wrapper (`node ~/.claude/recall-proxy.mjs`) which spawns `recall` internally. This gives you schema filtering + reliable startup.

---

## What NOT to Install

**Do NOT install the `recall-claude-plugin`** (the hooks-based plugin from `joseairosa/recall-claude-plugin` or the one-liner curl install). That plugin requires a `recallmcp.com` cloud API key and is for the managed SaaS tier, not self-hosted.

The self-hosted MCP server entry in `~/.mcp.json` + the proxy wrapper is the only thing you need.

---

## Cleanup (if old installs exist)

If you previously installed the cloud plugin or have stale configs, remove all traces:

```bash
# 1. Remove hooks plugin directory
rm -rf ~/.claude/recall/

# 2. Remove plugin cache and marketplace clone
rm -rf ~/.claude/plugins/cache/recall-claude-plugin/
rm -rf ~/.claude/plugins/marketplaces/joseairosa-recall-claude-plugin/

# 3. Remove from plugin registries
python3 -c "
import json
for path in ['installed_plugins.json', 'known_marketplaces.json']:
    full = f'$HOME/.claude/plugins/{path}'
    try:
        with open(full) as f: data = json.load(f)
        key = 'recall@recall-claude-plugin' if 'installed' in path else 'recall-claude-plugin'
        container = data.get('plugins', data) if 'installed' in path else data
        if key in container:
            del container[key]
            with open(full, 'w') as f: json.dump(data, f, indent=2)
            print(f'Removed from {path}')
    except: pass
"

# 4. Remove from settings.json (enabledPlugins + extraKnownMarketplaces)
python3 -c "
import json, os
path = os.path.expanduser('~/.claude/settings.json')
with open(path) as f: data = json.load(f)
changed = False
if 'recall@recall-claude-plugin' in data.get('enabledPlugins', {}):
    del data['enabledPlugins']['recall@recall-claude-plugin']; changed = True
if 'recall-claude-plugin' in data.get('extraKnownMarketplaces', {}):
    del data['extraKnownMarketplaces']['recall-claude-plugin']
    if not data['extraKnownMarketplaces']: del data['extraKnownMarketplaces']
    changed = True
if changed:
    with open(path, 'w') as f: json.dump(data, f, indent=2); f.write('\n')
    print('Cleaned settings.json')
"

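# 5. Kill stale recall processes from previous sessions
#    (pkill -f matches any command line containing "recall"; run pgrep -f recall first to see what would be killed)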
pkill -f "recall" 2>/dev/null
```

After cleanup, only the recall entry in `~/.mcp.json` and `~/.claude/recall-proxy.mjs` should remain. Restart Claude Code.

---

## Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| Server "connected ✓" in `/mcp` but no tools | Schema validation failure (6 broken tools in v1.10.x) | Use the proxy wrapper (Step 3) |
| `ECONNREFUSED ::1:6379` or `127.0.0.1:6379` | `REDIS_URL` not set or wrong | Verify the env block in `~/.mcp.json` has the correct Upstash URL |
| Tools don't appear after restart | MCP server failed to start | Run the manual test from Step 3 to see the actual error |
| `Invalid API key (HTTP 401)` | The hooks-based cloud plugin is active | Remove `~/.claude/recall/` and disable in settings.json |
| `MaxRetriesPerRequestError` | Redis not reachable or wrong credentials | Verify the Upstash URL and password, check TLS (`rediss://`) |
| `recall: command not found` (in proxy) | Global install not on PATH | Run `npm install -g @joseairosa/recall` or use full path |
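| Memories not persisting across sessions | Expected when the sessions run in different project directories: memories are scoped by workspace | Use `is_global: true` together with `WORKSPACE_MODE=hybrid` for cross-workspace memories |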
| `openai API error: 429 - insufficient_quota` | Shell has `OPENAI_API_KEY` which recall picks over Anthropic | Add `"EMBEDDING_PROVIDER": "anthropic"` to env block in `~/.mcp.json` |
| `404 not_found_error model: claude-3-5-haiku-20241022` | Recall hardcodes a deprecated Anthropic model | Patch dist file (see Step 2.5 and Gotcha #9) |
| `store_memory` succeeds but `search_memories` returns 0 | Memory stored with `is_global: true` but workspace mode is `isolated` | Set `WORKSPACE_MODE=hybrid` in env block, or don't use `is_global: true` |
| Multiple stale recall processes | Previous sessions didn't clean up | `pkill -f "recall"` then restart |

---

## Updating Recall

```bash
# Update global install
npm update -g @joseairosa/recall

# Verify version
npm list -g @joseairosa/recall

# Re-apply model patch (until upstream fixes it)
RECALL_DIST="$(npm root -g)/@joseairosa/recall/dist/chunk-52J47SPM.js"
sed -i.bak 's/claude-3-5-haiku-20241022/claude-haiku-4-5-20251001/g' "$RECALL_DIST"
```

The proxy wrapper is version-independent — it filters based on schema structure, not tool names. It will continue working across recall updates. If a future recall version fixes the schemas, the proxy becomes a harmless no-op (filters 0 tools). The model patch must be re-applied after each update until upstream fixes it.

---

## Architecture

```
Claude Code
  └── starts MCP server: node ~/.claude/recall-proxy.mjs
        └── spawns: recall (global binary)
              └── connects to Upstash Redis (TLS)
              │     └── stores/retrieves memories per workspace
              └── uses Anthropic API for embeddings (semantic search)
        └── filters tools/list response (removes 6 broken schemas)
        └── passes all other MCP messages through unchanged
```

- Memories are scoped per workspace (project directory)
- Embeddings enable semantic search ("find memories about deployment") vs exact match
- Data stays in accounts you control: memories live in your own Upstash Redis database, and Anthropic is used only for embeddings
- The same Upstash Redis instance works across all your machines (shared memory)
- The proxy adds negligible overhead (~1ms per tools/list call, runs once at startup)