@Shehryar
Created February 26, 2026 19:46
Dumb CLI + Skill Prompt: an architecture for giving AI agents tool access without context bloat

Skill CLI Architecture

Problem

AI agents (Claude Code, Claude Desktop, custom apps) suffer from context bloat when many tools are loaded. MCP servers alone don't solve this: they organize where tools come from, but by default every tool definition is still injected into every conversation.

Solution: Dumb CLI + Skill Prompt

The recommended pattern is a "dumb" CLI (pure API wrapper, no AI) combined with a Claude Code skill prompt that teaches Claude how to use it.

/gdocs "edit the intro in my Q1 planning doc"
    │
    └── Skill prompt loads (teaches Claude the CLI commands)
            │
            └── Claude Code orchestrates:
                    ├── gdocs-cli list --query "Q1 planning"
                    ├── gdocs-cli read <doc_id>
                    ├── gdocs-cli replace <doc_id> "old text" "new text"
                    └── Returns summary to user

Claude Code handles the reasoning. The CLI just executes API calls.

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                              CLAUDE CODE                                     │
│                                                                             │
│  ┌─────────────────┐                                                        │
│  │  Base Context   │  ← minimal, no tool definitions loaded                 │
│  └─────────────────┘                                                        │
│           │                                                                 │
│           │  user types: /gdocs "fix typos in Project Brief"                │
│           ▼                                                                 │
│  ┌─────────────────┐                                                        │
│  │  Skill Prompt   │  ← loads only when invoked                             │
│  │  (SKILL.md)     │                                                        │
│  │                 │    teaches Claude:                                     │
│  │  - commands     │    • available CLI commands                            │
│  │  - workflows    │    • how to chain them                                 │
│  │  - guidelines   │    • best practices                                    │
│  └─────────────────┘                                                        │
│           │                                                                 │
│           │  Claude orchestrates                                            │
│           ▼                                                                 │
└───────────┼─────────────────────────────────────────────────────────────────┘
            │
            │  bash calls
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              DUMB CLI                                        │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐     │
│  │   cli.py        │      │   service.py    │      │   External API  │     │
│  │                 │      │                 │      │                 │     │
│  │  argparse       │ ──▶  │  API wrapper    │ ──▶  │  Google Docs    │     │
│  │  subcommands    │      │  functions      │      │  Gmail          │     │
│  │  output format  │ ◀──  │  auth handling  │ ◀──  │  Calendar       │     │
│  │                 │      │                 │      │  etc.           │     │
│  └─────────────────┘      └─────────────────┘      └─────────────────┘     │
│                                                                             │
│  NO AI, NO CLAUDE API CALLS                                                 │
│  just: input → API call → structured output                                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Flow:
  1. User invokes skill (/gdocs, /email, etc.)
  2. Skill prompt loads into context (teaches Claude the CLI)
  3. Claude reasons about what commands to run
  4. Claude executes CLI commands via bash
  5. CLI calls external APIs, returns structured output
  6. Claude interprets results, continues or summarizes
  7. Skill prompt unloads when task complete

Key insight: the CLI is "dumb" - it has no intelligence.
All reasoning happens in Claude Code, which already exists.
No extra API costs, no context bloat, no sub-agents needed.

Comparison with Other Patterns

┌─────────────────────────────────────────────────────────────────────────────┐
│                         MCP SERVER PATTERN                                   │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐                               │
│  │  Claude Code    │      │  MCP Server     │                               │
│  │                 │      │                 │                               │
│  │  ALL tools      │ ◀──▶ │  tool handlers  │  ← tools always in context    │
│  │  always loaded  │      │                 │                               │
│  └─────────────────┘      └─────────────────┘                               │
│                                                                             │
│  problem: context bloat (20+ tools = slower, more expensive)                │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                      DUMB CLI + SKILL PATTERN                                │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐     │
│  │  Claude Code    │      │  Dumb CLI       │      │  External API   │     │
│  │                 │      │                 │      │                 │     │
│  │  skill loads    │ ──▶  │  no AI          │ ──▶  │  Google, etc.   │     │
│  │  on demand      │ bash │  just API calls │      │                 │     │
│  └─────────────────┘      └─────────────────┘      └─────────────────┘     │
│                                                                             │
│  use for: simple operations, user invokes /skillname                        │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                   TASK AGENT + MULTI-SKILL PATTERN (recommended)             │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐     │
│  │  Claude Code    │      │  Task Agent     │      │  Dumb CLIs      │     │
│  │                 │      │                 │      │                 │     │
│  │  spawns agent   │ ──▶  │  reads skill    │ ──▶  │  bq query       │     │
│  │  via Task tool  │      │  docs on demand │ bash │  gsheets-cli    │     │
│  └─────────────────┘      │                 │      │  slack-cli      │     │
│                           │  ┌───────────┐  │      │  etc.           │     │
│                           │  │ bigquery/ │  │      └─────────────────┘     │
│                           │  │ gsheets/  │  │              │               │
│                           │  │ slack/    │  │              ▼               │
│                           │  └───────────┘  │      ┌─────────────────┐     │
│                           │   skill docs    │      │  External APIs  │     │
│                           └─────────────────┘      └─────────────────┘     │
│                                                                             │
│  use for: complex domains, agent loads relevant skills as needed            │
│  benefit: no extra API costs, specialized reasoning, self-improving         │
└─────────────────────────────────────────────────────────────────────────────┘

Benefits

| Aspect | MCP Server | Sub-agent CLI | Dumb CLI + Skill | Task Agent + Skill |
| --- | --- | --- | --- | --- |
| Context bloat | Yes (always) | No | No | No |
| Extra API costs | No | Yes (sub-agent) | No | No |
| Needs separate API key | No | Yes | No | No |
| Works in Claude Code | Yes | Yes (Bash) | Yes (/skill) | Yes (Task tool) |
| Isolated context | No | Yes | Partial* | Partial* |
| Specialized reasoning | No | Yes | No | Yes |
| Self-improving docs | No | No | Manual | Yes |

*Skill prompts only load when invoked, keeping base context clean.

Recommended patterns:

  • Simple operations: Dumb CLI + Skill (user invokes /skillname)
  • Complex domains: Task Agent + Skill (agent reads skill docs, orchestrates CLI)

Architecture

Directory Structure

skill-clis/
  gdocs-skill/
    ├── pyproject.toml       # Package config
    ├── README.md
    └── gdocs_skill/
        ├── __init__.py
        ├── cli.py           # CLI entry point (argparse)
        └── service.py       # API client (Google Docs)

~/.claude/commands/
  gdocs.md                   # Claude Code skill prompt

The CLI (Dumb API Wrapper)

The CLI does one thing: execute API operations and return results.

# gdocs_skill/cli.py
from gdocs_skill import service  # API wrapper module (service.py)

def cmd_list(args):
    # One document per line: "<id>\t<name>"
    result = service.list_documents(query=args.query)
    for doc in result["documents"]:
        print(f"{doc['id']}\t{doc['name']}")

def cmd_read(args):
    # Title as a heading, then the document body
    result = service.read_document(args.document_id)
    print(f"# {result['title']}\n")
    print(result["content"])

def cmd_replace(args):
    # Single-line confirmation with the replacement count
    result = service.replace_text(args.document_id, args.find, args.replace)
    print(f"Replaced {result['occurrences_replaced']} occurrence(s)")

No AI, no Claude API calls, no tool definitions. Just API in, result out.
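
The handlers above still need an argparse parser to dispatch to them. A minimal sketch of that wiring follows, under stated assumptions: `_FAKE_DOCS` and the in-memory `list_documents` are hypothetical stand-ins for service.py so the example runs offline, and `build_parser` and `main` are illustrative names, not taken from the real package.

```python
import argparse

# Hypothetical stand-in for gdocs_skill/service.py so the sketch runs offline.
_FAKE_DOCS = [
    {"id": "doc-1", "name": "Q1 Planning Doc"},
    {"id": "doc-2", "name": "Project Brief"},
]


def list_documents(query=None):
    """Return documents whose name contains the query (case-insensitive)."""
    docs = [d for d in _FAKE_DOCS if not query or query.lower() in d["name"].lower()]
    return {"documents": docs}


def cmd_list(args):
    """Print one 'id<TAB>name' line per matching document."""
    result = list_documents(query=args.query)
    for doc in result["documents"]:
        print(f"{doc['id']}\t{doc['name']}")
    return 0


def build_parser():
    """Create the top-level parser with one subcommand per operation."""
    parser = argparse.ArgumentParser(prog="gdocs-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    p_list = sub.add_parser("list", help="List documents")
    p_list.add_argument("--query", default=None, help="Filter by name")
    p_list.set_defaults(func=cmd_list)  # dispatch target for this subcommand
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and run the chosen subcommand."""
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling `main(["list", "--query", "q1"])` prints the single matching row. The other subcommands (read, replace, append) would be registered the same way, each with its own `set_defaults(func=...)`.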

The Skill Prompt

Located at ~/.claude/commands/gdocs.md:

# Google Docs

Work with Google Docs using the `gdocs-cli` tool.

## Available Commands

gdocs-cli list [--query "search"]     # List documents
gdocs-cli read <doc_id>               # Read content
gdocs-cli replace <doc_id> "a" "b"    # Replace text
gdocs-cli append <doc_id> "text"      # Append text

## Workflow

1. Find the document (list/search by name)
2. Read if needed to understand content
3. Make targeted edits
4. Confirm what changed

When user types /gdocs "do something", this prompt loads and Claude knows how to use the CLI.

Invocation

From Claude Code

User: /gdocs "fix typos in my Project Brief"

Claude Code:
1. [Reads skill prompt, understands available commands]
2. [Bash] gdocs-cli list --query "Project Brief"
3. [Bash] gdocs-cli read 1AbCdEf...
4. [Bash] gdocs-cli replace 1AbCdEf... "teh" "the"
5. "Fixed 3 typos in Project Brief"

From Terminal

gdocs-cli list
gdocs-cli read 1AbCdEf2GhI3JkLmNoPqRsTuVwXyZ
gdocs-cli replace 1AbCdEf... "draft" "final"

From Custom App

import subprocess
result = subprocess.run(["gdocs-cli", "list"], capture_output=True, text=True)
print(result.stdout)
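
A custom app will usually want to check the exit code and split the output into fields rather than print raw stdout. The sketch below shows one way to do that, assuming the `list` output is one `id<TAB>name` pair per line as in the CLI example. `run_cli` and `parse_listing` are hypothetical helper names.

```python
import subprocess


def run_cli(argv, timeout=30):
    """Run a CLI command and return its stdout; raise if it exits non-zero."""
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout


def parse_listing(stdout):
    """Split 'id<TAB>name' lines into (id, name) tuples, skipping blank lines."""
    rows = []
    for line in stdout.splitlines():
        if line.strip():
            doc_id, _, name = line.partition("\t")
            rows.append((doc_id, name))
    return rows
```

With the CLI installed, `parse_listing(run_cli(["gdocs-cli", "list"]))` would yield a list of `(doc_id, name)` pairs.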

Authentication & Credentials

All credentials are stored in ~/.agent-skills/ - never in shell profiles. Environment variables are supported only as a fallback when no config file exists.

OAuth (Google Services)

  • OAuth tokens cached in ~/.agent-skills/token_<service>.json
  • Credentials from ~/.agent-skills/credentials.json or project root
  • First run triggers browser OAuth flow

API Keys

Store as plain text files in the config directory:

~/.agent-skills/gemini_api_key    # Gemini/Imagen
~/.agent-skills/<service>_api_key # Other services

CLIs should check config file first, fall back to environment variable:

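import os
from pathlib import Path
from typing import Optional

CONFIG_DIR = Path.home() / ".agent-skills"
API_KEY_PATH = CONFIG_DIR / "gemini_api_key"

def get_api_key() -> Optional[str]:
    # The config file wins; the environment variable is only a fallback.
    if API_KEY_PATH.exists():
        return API_KEY_PATH.read_text().strip()
    # Returns None when neither source is configured.
    return os.environ.get("GEMINI_API_KEY")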
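
The same lookup can be generalized to the `<service>_api_key` naming shown earlier. This is a sketch only: the `config_dir` parameter (added so the function can be tested against a temporary directory) and the `<SERVICE>_API_KEY` environment variable convention are assumptions, not something the existing CLIs are known to follow.

```python
import os
from pathlib import Path


def get_api_key(service, config_dir=None):
    """Return the key for `service`, preferring the config file over the environment."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".agent-skills"
    key_path = config_dir / f"{service}_api_key"
    if key_path.exists():
        return key_path.read_text().strip()
    # Assumed convention: GEMINI_API_KEY for "gemini", and so on.
    env_name = f"{service.upper()}_API_KEY"
    key = os.environ.get(env_name)
    if not key:
        raise RuntimeError(f"No API key for {service}: create {key_path} or set {env_name}")
    return key
```

Raising with a message that names both locations lets Claude relay an actionable error instead of a bare `None`.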

Creating a New Skill CLI

  1. Create the service (service.py)

    • Pure API wrapper functions
    • Return dicts with success, data, error keys
    • Handle auth internally
  2. Create the CLI (cli.py)

    • Use argparse with subcommands
    • Print terse, parseable text (Claude reads it directly)
    • Exit codes: 0 success, 1 error
  3. Create the skill prompt (~/.claude/commands/<name>.md)

    • Document available commands
    • Show example workflows
    • Keep it concise
  4. Install

    pip install -e skill-clis/<name>-skill/
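
Steps 1 and 2 meet where the service's result dict becomes CLI output and an exit code. One possible shape for that boundary is sketched below. `safe_call` and `report` are hypothetical helper names, and catching a bare `Exception` is a simplification that a real service would narrow to its API's error types.

```python
import sys


def safe_call(fn, *args, **kwargs):
    """Run an API wrapper and normalize the outcome into success/data/error keys."""
    try:
        return {"success": True, "data": fn(*args, **kwargs), "error": None}
    except Exception as exc:  # simplification: a real service would catch narrower errors
        return {"success": False, "data": None, "error": str(exc)}


def report(result, stdout=None, stderr=None):
    """Print the result tersely and return the process exit code (0 or 1)."""
    if result["success"]:
        print(result["data"], file=stdout)  # file=None means sys.stdout
        return 0
    print(f"error: {result['error']}", file=stderr if stderr is not None else sys.stderr)
    return 1
```

A subcommand handler could then end with `return report(safe_call(service_function, ...))`, so failures surface as exit code 1 with a one-line message on stderr.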

Alternative: Task Agent + Multi-Skill Pattern (Recommended)

For complex operations that benefit from specialized behavior, use a Claude Code Task agent that reads skill documentation on demand. The agent can load multiple skills as needed for the task.

User: "analyze last week's orders and share findings in #analytics slack"
    │
    └── Parent Claude spawns data-analyst agent (via Task tool)
            │
            └── Agent determines which skills are needed:
                    │
                    ├── Reads .claude/skills/bigquery/  (for querying)
                    │       • SKILL.md - query commands
                    │       • DATA_CATALOG.md - schemas, patterns
                    │
                    └── Reads .claude/skills/slack/     (for sharing)
                            • SKILL.md - messaging commands
                    │
                    └── Agent orchestrates via Bash:
                            bq query "SELECT ... FROM ..."
                            slack-cli post #analytics "Here's what I found..."
                            │
                            └── Returns summary to user

Multi-Skill Loading

Agents don't need to load all skills upfront. They can:

  1. Start with core skills - skills always needed for the domain (e.g., bigquery for data-analyst)
  2. Load additional skills on demand - when the task requires them (e.g., slack for sharing, gsheets for export)
  3. Skip irrelevant skills - no context wasted on unused capabilities

## Required Skills (always load)
- `.claude/skills/bigquery/` - Core querying capability

## Optional Skills (load when needed)
- `.claude/skills/slack/` - For sharing results to channels
- `.claude/skills/gsheets/` - For exporting to spreadsheets
- `.claude/skills/gdocs/` - For creating reports

This keeps context lean while giving agents access to a wide toolkit.

Agent Definition (.claude/agents/data-analyst.md)

---
name: data-analyst
description: "Use when user asks about data, metrics, or business performance..."
model: opus
---

You are a senior data analyst...

## Required Skills (always load first)

Read these before answering any data question:
- `.claude/skills/bigquery/SKILL.md` - Query commands and SQL guidelines
- `.claude/skills/bigquery/DATA_CATALOG.md` - Table schemas, relationships, patterns

## Optional Skills (load when task requires)

- `.claude/skills/slack/SKILL.md` - When sharing results to Slack channels
- `.claude/skills/gsheets/SKILL.md` - When exporting data to spreadsheets
- `.claude/skills/gdocs/SKILL.md` - When creating written reports

## Workflow

1. **Read required skill docs** (bigquery)
2. **Assess if optional skills needed** (slack, gsheets, etc.)
3. **Load additional skill docs** if task requires sharing/exporting
4. **Query the data** using `bq query` via Bash
5. **Share/export** using relevant CLIs if requested
6. **Interpret and present results**

How It Differs from Sub-agents

| Aspect | Sub-agent (own API key) | Task Agent + Skill |
| --- | --- | --- |
| API costs | Extra (separate calls) | None (same session) |
| Context isolation | Complete | Partial (shared tools) |
| Skill knowledge | Hardcoded in CLI | Reads docs at runtime |
| Invocation | CLI with own Claude | Task tool spawns agent |
| Updates | Redeploy CLI | Edit .md files |

Benefits of Task Agent + Skill:

  • No separate API key needed
  • No extra API costs
  • Skill docs can be updated without code changes
  • Agent can update docs when discovering new patterns (self-improving)
  • Works within Claude Code's existing Task infrastructure

When to use this pattern:

  • Domain requires specialized reasoning (data analysis, code review, etc.)
  • You want consistent behavior across invocations
  • The skill documentation is substantial (schemas, patterns, examples)
  • You want the agent to maintain/improve its own knowledge base

Skill Documentation Structure

.claude/skills/bigquery/
├── SKILL.md           # Commands, SQL guidelines, example flows
└── DATA_CATALOG.md    # Schemas, relationships, query patterns

The agent reads these files before doing work, ensuring it always has current information about available tables, correct join patterns, metric definitions, etc.

Alternative: Sub-agent Pattern (Legacy)

For truly heavyweight operations where you want complete context isolation, you can have the CLI run its own Claude agent:

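# __main__.py (sub-agent version)
# Pseudocode: `agent`, SKILL_SYSTEM_PROMPT, and SKILL_TOOLS stand in for
# whatever agent framework and skill definitions the CLI bundles.
import sys

def main():
    task = sys.argv[1]  # natural-language task passed by the caller
    result = agent.run(
        system_prompt=SKILL_SYSTEM_PROMPT,
        tools=SKILL_TOOLS,
        user_message=task
    )
    print(result.summary)  # only the summary reaches the parent context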

Tradeoffs:

  • Requires separate ANTHROPIC_API_KEY
  • Additional API costs per invocation
  • But provides true context isolation

Use this only when:

  • You need complete isolation from parent context
  • The operation must hide all intermediate steps
  • The skill needs different tools than the parent context

Prefer Task Agent + Skill pattern for most use cases.

Alternative: MCP Backend Pattern

If you have existing MCP servers, you can use them as the backend for your dumb CLI without injecting tools into Claude's context:

┌─────────────────────────────────────────────────────────────────────────────┐
│  Claude Code                                                                 │
│  (skill prompt loaded, NO MCP tools in context)                             │
└───────────┬─────────────────────────────────────────────────────────────────┘
            │ bash
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  Dumb CLI                                                                    │
│  (argparse wrapper, no AI)                                                  │
└───────────┬─────────────────────────────────────────────────────────────────┘
            │ MCP protocol (internal, hidden from Claude)
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  MCP Server                                                                  │
│  (existing tool handlers, auth, connection management)                      │
└───────────┬─────────────────────────────────────────────────────────────────┘
            │ API calls
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  External API (Google, Slack, databases, etc.)                              │
└─────────────────────────────────────────────────────────────────────────────┘

The CLI acts as a thin wrapper that translates argparse commands into MCP tool calls:

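# cli.py - wraps MCP server internally
import json
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the existing MCP server runs over stdio; the command name is hypothetical.
server_params = StdioServerParameters(command="gdocs-mcp-server")

async def cmd_list(args):
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("list_documents", {"query": args.query})
            # Assumption: the tool returns its payload as JSON in the first text block.
            for doc in json.loads(result.content[0].text)["documents"]:
                print(f"{doc['id']}\t{doc['name']}")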

Benefits:

  • Reuse existing MCP servers without rewriting
  • No context bloat (MCP tools never exposed to Claude)
  • Still get MCP's auth handling, connection management, etc.
  • Skill prompt teaches Claude the CLI commands, not the underlying MCP tools

When to use this:

  • You already have MCP servers you want to leverage
  • The MCP server handles complex auth/connection logic you don't want to reimplement
  • You want to keep MCP as internal infrastructure, not user-facing

This pattern treats MCP as an implementation detail rather than a Claude-facing interface.

MCP Tool Search (if building MCP servers)

As of January 2026, Claude Code supports MCP Tool Search - a feature that lazy-loads MCP tools when they would otherwise consume more than 10% of context. If you decide to build MCP servers instead of CLI-based skills, keep these best practices in mind:

How Tool Search works:

  • Claude Code detects when MCP tool descriptions exceed 10% of context
  • When triggered, tools are loaded via search instead of preloaded
  • The "server instructions" field tells Claude when to search for your tools

Best practices for MCP servers:

  1. Write good server instructions - This field becomes critical with Tool Search. It helps Claude know when to search for your tools (similar to how skill prompts work).
  2. Keep tool descriptions searchable - Concise but descriptive names and descriptions help Claude find the right tool.
  3. Don't rely on tools always being in context - With Tool Search, your tools may not be visible until Claude searches for them.

Why we still prefer CLI + Skill:

  • On-demand loading is explicit (user invokes /gsheets)
  • No dependency on Tool Search heuristics
  • Easier to debug (you can run CLI commands directly)
  • Works the same regardless of how many other MCP servers are installed

Tool Search solves the problem of users with 7+ MCP servers consuming 67k+ tokens. Our skill-based approach sidesteps this entirely by only loading context when explicitly invoked.
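
To make the 10% figure concrete, the arithmetic can be sketched as below. This only illustrates the stated threshold: Claude Code's actual detection logic is internal, `should_defer_tools` is a hypothetical name, and the 200,000-token context window is an assumption used for the example.

```python
def should_defer_tools(tool_tokens, context_window, threshold=0.10):
    """Return True when tool descriptions exceed `threshold` of the context window."""
    return tool_tokens > threshold * context_window


# Assumed 200,000-token window: 67,000 tokens of tool definitions is 33.5% of it.
share = 67_000 / 200_000
```

Under that assumption, `should_defer_tools(67_000, 200_000)` is `True`, while a 15,000-token tool set (7.5%) stays below the threshold and would be preloaded.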

Implemented Skills

| Skill | CLI | Prompt | Status |
| --- | --- | --- | --- |
| Google Docs | gdocs-cli | /gdocs | ✓ Ready |
| Google Sheets | gsheets-cli | /gsheets | ✓ Ready |
| Google Slides | gslides-cli | /gslides | ✓ Ready |
| Image Generation | imagen-cli | /imagen | ✓ Ready |
| Gmail | gmail-cli | /email | ✓ Ready |
| Google Calendar | gcal-cli | /calendar | ✓ Ready |
| Remote Session | bash/tmux | /remote-session | ✓ Ready (global) |

Implemented Task Agents

| Agent | Skill Docs | CLI | Status |
| --- | --- | --- | --- |
| Data Analyst | .claude/skills/bigquery/ | bq query | ✓ Ready |

Output Guidelines

CLIs should output structured, parseable text:

# Good - Claude can parse this
1AbCdEf2GhI	Project Brief
2BcDeFg3HiJ	Q1 Planning Doc

# Good - clear confirmation
Replaced 3 occurrence(s)

# Bad - too verbose
Successfully completed the replacement operation. The system found and replaced 3 instances of the specified text pattern in the document...

Keep it terse. Claude will summarize for the user.
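
Tab-separated rows stay parseable only if the fields themselves contain no tabs or newlines. The sketch below shows one way a CLI might enforce that before printing. `clean`, `format_rows`, and `format_count` are hypothetical helper names, and collapsing whitespace inside a document name is an assumed policy, not something the existing CLIs are known to do.

```python
def clean(value):
    """Collapse tabs, newlines, and repeated spaces so a field cannot break a row."""
    return " ".join(str(value).split())


def format_rows(rows):
    """Render (id, name) pairs as one tab-separated line each."""
    return "\n".join(f"{clean(doc_id)}\t{clean(name)}" for doc_id, name in rows)


def format_count(verb, count):
    """Render a one-line confirmation such as 'Replaced 3 occurrence(s)'."""
    return f"{verb} {count} occurrence(s)"
```

Keeping formatting in small pure functions like these also makes the output contract easy to unit test without calling any external API.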
