Shehryar/SKILL_CLI_ARCHITECTURE.md

Skill CLI Architecture

Problem

AI agents (Claude Code, Claude Desktop, custom apps) suffer from context bloat when many tools are loaded. MCP servers don't solve this: they only organize where tools come from, and every tool definition is still injected into every conversation.
Solution: Dumb CLI + Skill Prompt

The recommended pattern is a "dumb" CLI (pure API wrapper, no AI) combined with a Claude Code skill prompt that teaches Claude how to use it.
/gdocs "edit the intro in my Q1 planning doc"
    │
    └── Skill prompt loads (teaches Claude the CLI commands)
            │
            └── Claude Code orchestrates:
                    ├── gdocs-cli list --query "Q1 planning"
                    ├── gdocs-cli read <doc_id>
                    ├── gdocs-cli replace <doc_id> "old text" "new text"
                    └── Returns summary to user

Claude Code handles the reasoning. The CLI just executes API calls.
Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                              CLAUDE CODE                                     │
│                                                                             │
│  ┌─────────────────┐                                                        │
│  │  Base Context   │  ← minimal, no tool definitions loaded                 │
│  └─────────────────┘                                                        │
│           │                                                                 │
│           │  user types: /gdocs "fix typos in Project Brief"                │
│           ▼                                                                 │
│  ┌─────────────────┐                                                        │
│  │  Skill Prompt   │  ← loads only when invoked                             │
│  │  (SKILL.md)     │                                                        │
│  │                 │    teaches Claude:                                     │
│  │  - commands     │    • available CLI commands                            │
│  │  - workflows    │    • how to chain them                                 │
│  │  - guidelines   │    • best practices                                    │
│  └─────────────────┘                                                        │
│           │                                                                 │
│           │  Claude orchestrates                                            │
│           ▼                                                                 │
└───────────┼─────────────────────────────────────────────────────────────────┘
            │
            │  bash calls
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              DUMB CLI                                        │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐     │
│  │   cli.py        │      │   service.py    │      │   External API  │     │
│  │                 │      │                 │      │                 │     │
│  │  argparse       │ ──▶  │  API wrapper    │ ──▶  │  Google Docs    │     │
│  │  subcommands    │      │  functions      │      │  Gmail          │     │
│  │  output format  │ ◀──  │  auth handling  │ ◀──  │  Calendar       │     │
│  │                 │      │                 │      │  etc.           │     │
│  └─────────────────┘      └─────────────────┘      └─────────────────┘     │
│                                                                             │
│  NO AI, NO CLAUDE API CALLS                                                 │
│  just: input → API call → structured output                                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Flow:
  1. User invokes skill (/gdocs, /email, etc.)
  2. Skill prompt loads into context (teaches Claude the CLI)
  3. Claude reasons about what commands to run
  4. Claude executes CLI commands via bash
  5. CLI calls external APIs, returns structured output
  6. Claude interprets results, continues or summarizes
  7. Skill prompt unloads when task complete

Key insight: the CLI is "dumb" - it has no intelligence.
All reasoning happens in Claude Code, which already exists.
No extra API costs, no context bloat, no sub-agents needed.

Comparison with Other Patterns

┌─────────────────────────────────────────────────────────────────────────────┐
│                         MCP SERVER PATTERN                                   │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐                               │
│  │  Claude Code    │      │  MCP Server     │                               │
│  │                 │      │                 │                               │
│  │  ALL tools      │ ◀──▶ │  tool handlers  │  ← tools always in context    │
│  │  always loaded  │      │                 │                               │
│  └─────────────────┘      └─────────────────┘                               │
│                                                                             │
│  problem: context bloat (20+ tools = slower, more expensive)                │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                      DUMB CLI + SKILL PATTERN                                │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐     │
│  │  Claude Code    │      │  Dumb CLI       │      │  External API   │     │
│  │                 │      │                 │      │                 │     │
│  │  skill loads    │ ──▶  │  no AI          │ ──▶  │  Google, etc.   │     │
│  │  on demand      │ bash │  just API calls │      │                 │     │
│  └─────────────────┘      └─────────────────┘      └─────────────────┘     │
│                                                                             │
│  use for: simple operations, user invokes /skillname                        │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                   TASK AGENT + MULTI-SKILL PATTERN (recommended)             │
│                                                                             │
│  ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐     │
│  │  Claude Code    │      │  Task Agent     │      │  Dumb CLIs      │     │
│  │                 │      │                 │      │                 │     │
│  │  spawns agent   │ ──▶  │  reads skill    │ ──▶  │  bq query       │     │
│  │  via Task tool  │      │  docs on demand │ bash │  gsheets-cli    │     │
│  └─────────────────┘      │                 │      │  slack-cli      │     │
│                           │  ┌───────────┐  │      │  etc.           │     │
│                           │  │ bigquery/ │  │      └─────────────────┘     │
│                           │  │ gsheets/  │  │              │               │
│                           │  │ slack/    │  │              ▼               │
│                           │  └───────────┘  │      ┌─────────────────┐     │
│                           │   skill docs    │      │  External APIs  │     │
│                           └─────────────────┘      └─────────────────┘     │
│                                                                             │
│  use for: complex domains, agent loads relevant skills as needed            │
│  benefit: no extra API costs, specialized reasoning, self-improving         │
└─────────────────────────────────────────────────────────────────────────────┘

Benefits


| Aspect | MCP Server | Sub-agent CLI | Dumb CLI + Skill | Task Agent + Skill |
|---|---|---|---|---|
| Context bloat | Yes (always) | No | No | No |
| Extra API costs | No | Yes (sub-agent) | No | No |
| Needs separate API key | No | Yes | No | No |
| Works in Claude Code | Yes | Yes (Bash) | Yes (/skill) | Yes (Task tool) |
| Isolated context | No | Yes | Partial* | Partial* |
| Specialized reasoning | No | Yes | No | Yes |
| Self-improving docs | No | No | Manual | Yes |

*Skill prompts only load when invoked, keeping base context clean.
Recommended patterns:

Simple operations: Dumb CLI + Skill (user invokes /skillname)
Complex domains: Task Agent + Skill (agent reads skill docs, orchestrates CLI)

Architecture

Directory Structure

skill-clis/
  gdocs-skill/
    ├── pyproject.toml       # Package config
    ├── README.md
    └── gdocs_skill/
        ├── __init__.py
        ├── cli.py           # CLI entry point (argparse)
        └── service.py       # API client (Google Docs)

~/.claude/commands/
  gdocs.md                   # Claude Code skill prompt

The CLI (Dumb API Wrapper)

The CLI does one thing: execute API operations and return results.
# gdocs_skill/cli.py
from gdocs_skill import service  # API wrapper module (service.py)

def cmd_list(args):
    result = service.list_documents(query=args.query)
    for doc in result["documents"]:
        print(f"{doc['id']}\t{doc['name']}")

def cmd_read(args):
    result = service.read_document(args.document_id)
    print(f"# {result['title']}\n")
    print(result["content"])

def cmd_replace(args):
    result = service.replace_text(args.document_id, args.find, args.replace)
    print(f"Replaced {result['occurrences_replaced']} occurrence(s)")
No AI, no Claude API calls, no tool definitions. Just API in, result out.
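
The command functions above still need to be wired to argparse subcommands. A minimal sketch of that wiring follows, assuming a `build_parser`/`main` layout and using a hypothetical in-memory `FakeService` in place of the real `service.py` so it runs without credentials; none of these names come from the original project.

```python
import argparse


class FakeService:
    """Hypothetical stand-in for service.py; returns canned documents."""

    def list_documents(self, query=None):
        docs = [
            {"id": "1AbCdEf", "name": "Project Brief"},
            {"id": "2BcDeFg", "name": "Q1 Planning Doc"},
        ]
        if query:
            docs = [d for d in docs if query.lower() in d["name"].lower()]
        return {"documents": docs}


service = FakeService()


def cmd_list(args):
    # Same shape as the cmd_list shown above: one tab-separated line per document.
    result = service.list_documents(query=args.query)
    for doc in result["documents"]:
        print(f"{doc['id']}\t{doc['name']}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gdocs-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    p_list = sub.add_parser("list", help="List documents")
    p_list.add_argument("--query", default=None, help="Filter by name")
    # Each subcommand stores its handler so main() can dispatch without if/else.
    p_list.set_defaults(func=cmd_list)
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    args.func(args)
    return 0


if __name__ == "__main__":
    # Demo with explicit arguments; a real entry point would call main() with no
    # arguments so argparse reads sys.argv.
    main(["list", "--query", "planning"])
```

Storing the handler with `set_defaults(func=...)` keeps adding a new subcommand to a few lines: one `add_parser` call plus one `cmd_*` function.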
The Skill Prompt

Located at ~/.claude/commands/gdocs.md:
# Google Docs

Work with Google Docs using the `gdocs-cli` tool.

## Available Commands

gdocs-cli list [--query "search"]     # List documents
gdocs-cli read <doc_id>               # Read content
gdocs-cli replace <doc_id> "a" "b"    # Replace text
gdocs-cli append <doc_id> "text"      # Append text

## Workflow

1. Find the document (list/search by name)
2. Read if needed to understand content
3. Make targeted edits
4. Confirm what changed
When the user types /gdocs "do something", this prompt loads and Claude knows how to use the CLI.
Invocation

From Claude Code

User: /gdocs "fix typos in my Project Brief"

Claude Code:
1. [Reads skill prompt, understands available commands]
2. [Bash] gdocs-cli list --query "Project Brief"
3. [Bash] gdocs-cli read 1AbCdEf...
4. [Bash] gdocs-cli replace 1AbCdEf... "teh" "the"
5. "Fixed 3 typos in Project Brief"

From Terminal

gdocs-cli list
gdocs-cli read 1AbCdEf2GhI3JkLmNoPqRsTuVwXyZ
gdocs-cli replace 1AbCdEf... "draft" "final"
From Custom App

import subprocess
result = subprocess.run(["gdocs-cli", "list"], capture_output=True, text=True)
print(result.stdout)
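
A custom app will usually want to check the exit code and split the tab-separated output rather than print raw stdout. The sketch below shows one way to do that; `run_cli` and `CliError` are hypothetical helper names, and the demo uses a child Python process as a stand-in for `gdocs-cli list` so it runs without the CLI installed.

```python
import subprocess
import sys


class CliError(RuntimeError):
    """Raised when the wrapped CLI exits with a non-zero status."""


def run_cli(argv: list[str]) -> list[list[str]]:
    """Run a CLI command and split its tab-separated stdout into rows."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        raise CliError(proc.stderr.strip() or f"exit code {proc.returncode}")
    return [line.split("\t") for line in proc.stdout.splitlines() if line]


if __name__ == "__main__":
    # Stand-in for ["gdocs-cli", "list"]: prints one TSV row.
    demo = [sys.executable, "-c", r"print('1AbCdEf\tProject Brief')"]
    for doc_id, name in run_cli(demo):
        print(doc_id, name)
```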
Authentication & Credentials

All credentials are stored in ~/.agent-skills/ rather than in shell profiles. Environment variables are supported only as a fallback for API keys.
OAuth (Google Services)


OAuth tokens cached in ~/.agent-skills/token_<service>.json
Credentials from ~/.agent-skills/credentials.json or project root
First run triggers browser OAuth flow

API Keys

Store as plain text files in the config directory:
~/.agent-skills/gemini_api_key    # Gemini/Imagen
~/.agent-skills/<service>_api_key # Other services
CLIs should check config file first, fall back to environment variable:
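import os
from pathlib import Path

CONFIG_DIR = Path.home() / ".agent-skills"
API_KEY_PATH = CONFIG_DIR / "gemini_api_key"

def get_api_key() -> str:
    # Prefer the key file; fall back to the environment variable.
    if API_KEY_PATH.exists():
        return API_KEY_PATH.read_text().strip()
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError(f"No API key found in {API_KEY_PATH} or GEMINI_API_KEY")
    return key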
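
The lookup above assumes the key file already exists. A minimal sketch of the writing side follows, assuming you want the file readable only by its owner; `save_api_key` is a hypothetical helper, not part of the original CLIs, and the permission step is a suggestion rather than something the source prescribes.

```python
import os
import tempfile
from pathlib import Path


def save_api_key(service: str, key: str, config_dir: Path) -> Path:
    """Write <service>_api_key into config_dir and restrict it to the owner."""
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / f"{service}_api_key"
    path.touch(mode=0o600, exist_ok=True)
    os.chmod(path, 0o600)  # also tightens a file that already existed (POSIX only)
    path.write_text(key.strip() + "\n")
    return path


if __name__ == "__main__":
    # Demo in a temporary directory instead of the real ~/.agent-skills.
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_api_key("gemini", "example-key", Path(tmp) / ".agent-skills")
        print(saved.name)
```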
Creating a New Skill CLI


1. Create the service (service.py)
   - Pure API wrapper functions
   - Return dicts with success, data, error keys
   - Handle auth internally

2. Create the CLI (cli.py)
   - Use argparse with subcommands
   - Print human-readable output (for Claude to parse)
   - Exit codes: 0 success, 1 error

3. Create the skill prompt (~/.claude/commands/<name>.md)
   - Document available commands
   - Show example workflows
   - Keep it concise

4. Install:
   pip install -e skill-clis/<name>-skill/
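
The first two steps above can be sketched together: a service function that always returns the success/data/error dict, and a CLI function that maps that dict to terse output and an exit code. This is an illustration under assumptions, with an in-memory `document` dict standing in for the Google Docs API and hypothetical function signatures.

```python
import sys


def replace_text(document: dict, find: str, replace: str) -> dict:
    """Service layer: returns the same envelope for success and failure."""
    if not find:
        return {"success": False, "data": None, "error": "search text is empty"}
    count = document["content"].count(find)
    document["content"] = document["content"].replace(find, replace)
    return {"success": True, "data": {"occurrences_replaced": count}, "error": None}


def cmd_replace(document: dict, find: str, replace: str) -> int:
    """CLI layer: prints one terse line and returns the process exit code."""
    result = replace_text(document, find, replace)
    if not result["success"]:
        print(f"Error: {result['error']}", file=sys.stderr)
        return 1
    print(f"Replaced {result['data']['occurrences_replaced']} occurrence(s)")
    return 0


if __name__ == "__main__":
    doc = {"content": "teh cat and teh dog"}
    cmd_replace(doc, "teh", "the")
```

Keeping the envelope in the service layer means the CLI never has to catch API exceptions itself; it only decides what to print and which code to return.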


Alternative: Task Agent + Multi-Skill Pattern (Recommended)

For complex operations that benefit from specialized behavior, use a Claude Code Task agent that reads skill documentation on demand. The agent can load multiple skills as needed for the task.
User: "analyze last week's orders and share findings in #analytics slack"
    │
    └── Parent Claude spawns data-analyst agent (via Task tool)
            │
            └── Agent determines which skills are needed:
                    │
                    ├── Reads .claude/skills/bigquery/  (for querying)
                    │       • SKILL.md - query commands
                    │       • DATA_CATALOG.md - schemas, patterns
                    │
                    └── Reads .claude/skills/slack/     (for sharing)
                            • SKILL.md - messaging commands
                    │
                    └── Agent orchestrates via Bash:
                            bq query "SELECT ... FROM ..."
                            slack-cli post #analytics "Here's what I found..."
                            │
                            └── Returns summary to user

Multi-Skill Loading

Agents don't need to load all skills upfront. They can:

Start with core skills - skills always needed for the domain (e.g., bigquery for data-analyst)
Load additional skills on demand - when the task requires them (e.g., slack for sharing, gsheets for export)
Skip irrelevant skills - no context wasted on unused capabilities

## Required Skills (always load)
- `.claude/skills/bigquery/` - Core querying capability

## Optional Skills (load when needed)
- `.claude/skills/slack/` - For sharing results to channels
- `.claude/skills/gsheets/` - For exporting to spreadsheets
- `.claude/skills/gdocs/` - For creating reports
This keeps context lean while giving agents access to a wide toolkit.
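
The required/optional split can be pictured as a small selection function. In practice the agent makes this decision by reasoning over the task, not by keyword matching, so the sketch below is only an illustration; the trigger words and the `skills_to_load` name are hypothetical.

```python
REQUIRED_SKILLS = [".claude/skills/bigquery/"]

# Hypothetical trigger words per optional skill (illustration only).
OPTIONAL_SKILLS = {
    ".claude/skills/slack/": ("slack", "channel"),
    ".claude/skills/gsheets/": ("spreadsheet", "sheet"),
    ".claude/skills/gdocs/": ("report", "doc"),
}


def skills_to_load(task: str) -> list[str]:
    """Return required skills plus any optional skill the task mentions."""
    text = task.lower()
    selected = list(REQUIRED_SKILLS)
    for path, triggers in OPTIONAL_SKILLS.items():
        if any(word in text for word in triggers):
            selected.append(path)
    return selected


if __name__ == "__main__":
    task = "analyze last week's orders and share findings in #analytics slack"
    for skill in skills_to_load(task):
        print(skill)
```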
Agent Definition (.claude/agents/data-analyst.md)

---
name: data-analyst
description: "Use when user asks about data, metrics, or business performance..."
model: opus
---

You are a senior data analyst...

## Required Skills (always load first)

Read these before answering any data question:
- `.claude/skills/bigquery/SKILL.md` - Query commands and SQL guidelines
- `.claude/skills/bigquery/DATA_CATALOG.md` - Table schemas, relationships, patterns

## Optional Skills (load when task requires)

- `.claude/skills/slack/SKILL.md` - When sharing results to Slack channels
- `.claude/skills/gsheets/SKILL.md` - When exporting data to spreadsheets
- `.claude/skills/gdocs/SKILL.md` - When creating written reports

## Workflow

1. **Read required skill docs** (bigquery)
2. **Assess if optional skills needed** (slack, gsheets, etc.)
3. **Load additional skill docs** if task requires sharing/exporting
4. **Query the data** using `bq query` via Bash
5. **Share/export** using relevant CLIs if requested
6. **Interpret and present results**
How It Differs from Sub-agents


| Aspect | Sub-agent (own API key) | Task Agent + Skill |
|---|---|---|
| API costs | Extra (separate calls) | None (same session) |
| Context isolation | Complete | Partial (shared tools) |
| Skill knowledge | Hardcoded in CLI | Reads docs at runtime |
| Invocation | CLI with own Claude | Task tool spawns agent |
| Updates | Redeploy CLI | Edit .md files |


Benefits of Task Agent + Skill:

No separate API key needed
No extra API costs
Skill docs can be updated without code changes
Agent can update docs when discovering new patterns (self-improving)
Works within Claude Code's existing Task infrastructure

When to use this pattern:

Domain requires specialized reasoning (data analysis, code review, etc.)
You want consistent behavior across invocations
The skill documentation is substantial (schemas, patterns, examples)
You want the agent to maintain/improve its own knowledge base

Skill Documentation Structure

.claude/skills/bigquery/
├── SKILL.md           # Commands, SQL guidelines, example flows
└── DATA_CATALOG.md    # Schemas, relationships, query patterns

The agent reads these files before doing work, ensuring it always has current information about available tables, correct join patterns, metric definitions, etc.
Alternative: Sub-agent Pattern (Legacy)

For truly heavyweight operations where you want complete context isolation, you can have the CLI run its own Claude agent:
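# __main__.py (sub-agent version)
import sys

# `agent`, SKILL_SYSTEM_PROMPT, and SKILL_TOOLS are placeholders for your own
# Claude agent loop and skill definition; they are not defined in this sketch.
def main():
    task = sys.argv[1]  # task description passed on the command line
    result = agent.run(
        system_prompt=SKILL_SYSTEM_PROMPT,
        tools=SKILL_TOOLS,
        user_message=task
    )
    print(result.summary)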
Tradeoffs:

Requires separate ANTHROPIC_API_KEY
Additional API costs per invocation
But provides true context isolation

Use this only when:

You need complete isolation from parent context
The operation must hide all intermediate steps
The skill needs different tools than the parent context

Prefer Task Agent + Skill pattern for most use cases.
Alternative: MCP Backend Pattern

If you have existing MCP servers, you can use them as the backend for your dumb CLI without injecting tools into Claude's context:
┌─────────────────────────────────────────────────────────────────────────────┐
│  Claude Code                                                                 │
│  (skill prompt loaded, NO MCP tools in context)                             │
└───────────┬─────────────────────────────────────────────────────────────────┘
            │ bash
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  Dumb CLI                                                                    │
│  (argparse wrapper, no AI)                                                  │
└───────────┬─────────────────────────────────────────────────────────────────┘
            │ MCP protocol (internal, hidden from Claude)
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  MCP Server                                                                  │
│  (existing tool handlers, auth, connection management)                      │
└───────────┬─────────────────────────────────────────────────────────────────┘
            │ API calls
            ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  External API (Google, Slack, databases, etc.)                              │
└─────────────────────────────────────────────────────────────────────────────┘

The CLI acts as a thin wrapper that translates argparse commands into MCP tool calls:
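# cli.py - wraps MCP server internally
import json
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the existing server starts over stdio with this (hypothetical) command.
server_params = StdioServerParameters(command="gdocs-mcp-server")

async def cmd_list(args):
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("list_documents", {"query": args.query})
            # Assumption: the tool returns its payload as JSON in the first text block.
            for doc in json.loads(result.content[0].text)["documents"]:
                print(f"{doc['id']}\t{doc['name']}")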
Benefits:

Reuse existing MCP servers without rewriting
No context bloat (MCP tools never exposed to Claude)
Still get MCP's auth handling, connection management, etc.
Skill prompt teaches Claude the CLI commands, not the underlying MCP tools

When to use this:

You already have MCP servers you want to leverage
The MCP server handles complex auth/connection logic you don't want to reimplement
You want to keep MCP as internal infrastructure, not user-facing

This pattern treats MCP as an implementation detail rather than a Claude-facing interface.
MCP Tool Search (if building MCP servers)

As of January 2026, Claude Code supports MCP Tool Search, a feature that lazy-loads MCP tools when their descriptions would otherwise consume more than 10% of context. If you decide to build MCP servers instead of CLI-based skills, keep these best practices in mind:
How Tool Search works:

Claude Code detects when MCP tool descriptions exceed 10% of context
When triggered, tools are loaded via search instead of preloaded
The "server instructions" field tells Claude when to search for your tools
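
The 10% trigger described above is simple arithmetic. The sketch below illustrates it with an assumed 200,000-token context window; the window size, the function name, and the exact comparison are assumptions for illustration, not Claude Code's actual implementation.

```python
def should_defer_tools(tool_tokens: int, context_window: int, threshold: float = 0.10) -> bool:
    """Return True when tool descriptions would exceed the share of context
    at which tools are loaded via search instead of preloaded."""
    return tool_tokens > threshold * context_window


if __name__ == "__main__":
    # Assumed window of 200,000 tokens gives a 20,000-token threshold.
    print(should_defer_tools(67_000, 200_000))  # heavy multi-server setup
    print(should_defer_tools(15_000, 200_000))  # small setup stays preloaded
```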

Best practices for MCP servers:

Write good server instructions - This field becomes critical with Tool Search. It helps Claude know when to search for your tools (similar to how skill prompts work).
Keep tool descriptions searchable - Concise but descriptive names and descriptions help Claude find the right tool.
Don't rely on tools always being in context - With Tool Search, your tools may not be visible until Claude searches for them.

Why we still prefer CLI + Skill:

On-demand loading is explicit (user invokes /gsheets)
No dependency on Tool Search heuristics
Easier to debug (you can run CLI commands directly)
Works the same regardless of how many other MCP servers are installed

Tool Search solves the problem of users with 7+ MCP servers consuming 67k+ tokens. Our skill-based approach sidesteps this entirely by only loading context when explicitly invoked.
Implemented Skills


| Skill | CLI | Prompt | Status |
|---|---|---|---|
| Google Docs | `gdocs-cli` | `/gdocs` | ✓ Ready |
| Google Sheets | `gsheets-cli` | `/gsheets` | ✓ Ready |
| Google Slides | `gslides-cli` | `/gslides` | ✓ Ready |
| Image Generation | `imagen-cli` | `/imagen` | ✓ Ready |
| Gmail | `gmail-cli` | `/email` | ✓ Ready |
| Google Calendar | `gcal-cli` | `/calendar` | ✓ Ready |
| Remote Session | bash/tmux | `/remote-session` | ✓ Ready (global) |


Implemented Task Agents


| Agent | Skill Docs | CLI | Status |
|---|---|---|---|
| Data Analyst | .claude/skills/bigquery/ | bq query | ✓ Ready |


Output Guidelines

CLIs should output structured, parseable text:
# Good - Claude can parse this
1AbCdEf2GhI	Project Brief
2BcDeFg3HiJ	Q1 Planning Doc

# Good - clear confirmation
Replaced 3 occurrence(s)

# Bad - too verbose
Successfully completed the replacement operation. The system found and replaced 3 instances of the specified text pattern in the document...

Keep it terse. Claude will summarize for the user.
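
One way to keep list output in the terse, tab-separated shape shown above is a small formatting helper. This is a sketch under assumptions: `format_rows` is a hypothetical name, and replacing embedded tabs and newlines with spaces is one possible choice for keeping each record on a single line.

```python
def format_rows(rows: list[dict], columns: list[str]) -> str:
    """Render rows as tab-separated lines, one record per line."""
    lines = []
    for row in rows:
        # Tabs or newlines inside a value would break the one-line-per-record
        # format, so they are replaced with spaces.
        cells = [str(row[col]).replace("\t", " ").replace("\n", " ") for col in columns]
        lines.append("\t".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    docs = [
        {"id": "1AbCdEf2GhI", "name": "Project Brief"},
        {"id": "2BcDeFg3HiJ", "name": "Q1 Planning Doc"},
    ]
    print(format_rows(docs, ["id", "name"]))
```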