@jordotech
Created March 7, 2026 15:54
RFC: Git-Tracked Claude Skill Management — replace manual platform.claude.com uploads with CLI-driven, version-pinned deployments

RFC: Git-Tracked Claude Skill Management

Team: Agentic Backend
Author: Engineering
Date: 2026-03-07
Status: Implementation planned (plan at docs/plans/2026-03-07-cli-skill-management.md)


Problem

We currently manage 7 custom Claude skills by manually uploading zip files to platform.claude.com, then updating environment variables (PPT_SKILL_ID, etc.) in Pydantic settings or k8s config. This causes:

  • No git tracking — skill content is opaque; nobody can read, diff, or review it
  • Outages on ID mismatch — wrong env var = broken skill = broken document generation
  • Version coupling failures — different k8s clusters run different image tags, but all share "version": "latest", so a skill update for one cluster breaks another
  • No admin recoverability — if a skill breaks, you can't inspect or fix it in platform.claude.com

Affected Skills

| Env Var | Service | Skill Type |
| --- | --- | --- |
| PPT_SKILL_ID | PowerPoint generation | custom |
| EY_PPT_SKILL_ID | EY-branded PowerPoint | custom |
| XLSX_SKILL_ID | Spreadsheet generation | custom |
| DOCX_SKILL_ID | Word document generation | custom |
| ORG_CHART_SKILL_ID | Org chart / network graph | custom |
| DATA_ANALYSIS_SKILL_ID | Data analysis | custom |
| D3_DATA_VIZ_SKILL_ID | D3 visualizations | custom |

Solution

Store skill source files in git and deploy them via the Anthropic Skills API (POST /v1/skills/{id}/versions), pinning versions per build.

Key Discovery

Anthropic exposes a full Skills CRUD API in the Python SDK:

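import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Upload a new version of an existing skill
client.beta.skills.versions.create(skill_id=..., files=[...], betas=["skills-2025-10-02"])

# List the versions of a skill
client.beta.skills.versions.list(skill_id=..., betas=["skills-2025-10-02"])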

This means we can automate everything — no more manual uploads.
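
As a rough sketch of what one automated upload could look like, the function below gathers every file under a skill directory and hands it to `versions.create`. The helpers `collect_skill_files` and `deploy_skill` are hypothetical names. The `(path, bytes)` tuple shape, the choice to prefix each path with the skill's directory name, and the `.version` attribute on the response are assumptions to verify against the SDK documentation. The client is passed in as a parameter so the logic can be exercised without network access.

```python
from pathlib import Path

SKILLS_BETA = "skills-2025-10-02"  # beta flag quoted in this RFC


def collect_skill_files(skill_dir: Path) -> list[tuple[str, bytes]]:
    """Return (relative path, content) pairs for every file under skill_dir."""
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md at its root")
    files = []
    for path in sorted(skill_dir.rglob("*")):
        if path.is_file():
            # Assumption: paths are sent relative to the parent, e.g. "pptx/SKILL.md".
            rel = path.relative_to(skill_dir.parent).as_posix()
            files.append((rel, path.read_bytes()))
    return files


def deploy_skill(client, skill_id: str, skill_dir: Path) -> str:
    """Upload skill_dir as a new version of skill_id and return the version ID."""
    response = client.beta.skills.versions.create(
        skill_id=skill_id,
        files=collect_skill_files(skill_dir),
        betas=[SKILLS_BETA],
    )
    return str(response.version)  # assumption: the response exposes `.version`
```

Failing fast on a missing `SKILL.md` keeps a malformed directory from ever reaching the API.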


Architecture

Before (manual, error-prone)

flowchart LR
    Engineer["Engineer"]
    Platform["platform.claude.com"]
    EnvVar["K8s Env Vars"]
    App["Agentic Backend"]

    Engineer -->|"1. manually upload zip"| Platform
    Platform -->|"2. returns skill_id"| Engineer
    Engineer -->|"3. manually update env var"| EnvVar
    EnvVar -->|"4. app reads skill_id"| App
    App -->|"5. uses skill_id + version=latest"| Platform

    style Engineer fill:#f99,stroke:#c00
    style Platform fill:#fcc,stroke:#c00
    style EnvVar fill:#fcc,stroke:#c00

After (git-tracked, automated)

flowchart LR
    Git["Git Repo<br/>skills/ directory"]
    CI["CI/CD Pipeline"]
    API["Anthropic Skills API"]
    K8s["K8s ConfigMap"]
    App["Agentic Backend"]

    Git -->|"1. push to branch"| CI
    CI -->|"2. deploy-skills CLI"| API
    API -->|"3. returns version ID"| CI
    CI -->|"4. inject VERSION env var"| K8s
    K8s -->|"5. app reads pinned version"| App
    App -->|"6. uses skill_id + version=pinned"| API

    style Git fill:#9f9,stroke:#0a0
    style CI fill:#9f9,stroke:#0a0
    style API fill:#ccf,stroke:#00c

Version Isolation Across Clusters

flowchart TB
    subgraph "Git Tags"
        v320["v3.2.0<br/>skills/ @ commit abc"]
        v342["v3.4.2<br/>skills/ @ commit xyz"]
    end

    subgraph "Anthropic Skills API"
        sv1["Skill Version<br/>1738000000"]
        sv2["Skill Version<br/>1741318800"]
    end

    subgraph "K8s Clusters"
        A["Cluster A<br/>image: v3.2.0<br/>PPT_SKILL_VERSION=1738000000"]
        B["Cluster B<br/>image: v3.4.2<br/>PPT_SKILL_VERSION=1741318800"]
    end

    v320 --> sv1
    v342 --> sv2
    sv1 --> A
    sv2 --> B

    style A fill:#ffe,stroke:#cc0
    style B fill:#efe,stroke:#0a0
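
One way to guard against the shared-`latest` failure mode is a check that runs before a cluster rollout and flags any skill whose version is not pinned. The sketch below is illustrative only: `unpinned_skills` is a hypothetical helper, and it assumes the version variables follow the `*_SKILL_VERSION` naming used in this RFC.

```python
VERSION_SUFFIX = "_SKILL_VERSION"

# Prefixes derived from the seven *_SKILL_ID variables listed under Affected Skills.
SKILL_PREFIXES = [
    "PPT", "EY_PPT", "XLSX", "DOCX", "ORG_CHART", "DATA_ANALYSIS", "D3_DATA_VIZ",
]


def unpinned_skills(env: dict[str, str]) -> list[str]:
    """Return the version variables that are missing or still set to "latest"."""
    unpinned = []
    for prefix in SKILL_PREFIXES:
        name = prefix + VERSION_SUFFIX
        if env.get(name, "latest") == "latest":
            unpinned.append(name)
    return unpinned


# Illustrative configs mirroring Cluster A and Cluster B in the diagram.
cluster_a = {"PPT_SKILL_VERSION": "1738000000"}
cluster_b = {"PPT_SKILL_VERSION": "1741318800"}
print(unpinned_skills(cluster_a))  # every variable except PPT_SKILL_VERSION
```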

Repo Structure Change

agentic-backend/
├── skills/                          # NEW — git-tracked skill source
│   ├── README.md
│   ├── pptx/
│   │   ├── SKILL.md                 # Required entry point
│   │   ├── scripts/                 # Optional executable code
│   │   └── assets/                  # Optional templates
│   ├── ey-pptx/
│   │   └── SKILL.md
│   ├── xlsx/
│   │   └── SKILL.md
│   ├── docx/
│   │   └── SKILL.md
│   ├── org-chart/
│   │   └── SKILL.md
│   ├── data-analysis/
│   │   └── SKILL.md
│   └── d3-data-viz/
│       └── SKILL.md
├── src/
│   ├── core/
│   │   └── settings.py              # MODIFIED — adds *_SKILL_VERSION fields
│   ├── cli.py                       # MODIFIED — adds deploy-skills, list-skill-versions
│   └── services/
│       └── claude_skills/
│           ├── registry.py          # NEW — skill name -> settings mapping
│           └── config.py            # UNCHANGED — already has skill_version field
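
A minimal sketch of what `registry.py` might contain, assuming each skill directory maps to one `*_SKILL_ID` setting and one `*_SKILL_VERSION` setting. The `SkillEntry` dataclass, the `_entry` helper, and the `REGISTRY` name are hypothetical; the directory names and ID variables come from the tree above and the Affected Skills table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillEntry:
    directory: str    # folder under skills/
    id_var: str       # existing env var holding the stable skill ID
    version_var: str  # new env var holding the pinned version


def _entry(directory: str, prefix: str) -> SkillEntry:
    """Build an entry from a directory name and its env var prefix."""
    return SkillEntry(directory, f"{prefix}_SKILL_ID", f"{prefix}_SKILL_VERSION")


# Keyed by directory name so `deploy-skill pptx` can look up its settings.
REGISTRY: dict[str, SkillEntry] = {
    e.directory: e
    for e in [
        _entry("pptx", "PPT"),
        _entry("ey-pptx", "EY_PPT"),
        _entry("xlsx", "XLSX"),
        _entry("docx", "DOCX"),
        _entry("org-chart", "ORG_CHART"),
        _entry("data-analysis", "DATA_ANALYSIS"),
        _entry("d3-data-viz", "D3_DATA_VIZ"),
    ]
}
```

Keeping the mapping in one place means the directory names and env var prefixes, which do not match one-to-one (`pptx` vs `PPT`), are reconciled exactly once.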

New CLI Commands

deploy-skills — Upload skill files and get version IDs

# Deploy all skills
just deploy-skills

# Deploy one skill
just deploy-skill pptx

# Dry run (validate only)
just deploy-skills --dry-run

Output:

Deploying pptx...
Deploying xlsx...
...

┌─────────────┬──────────────────────────────┬─────────────┬──────────────────────────────────┐
│ Skill       │ Skill ID                     │ New Version │ Env Var                          │
├─────────────┼──────────────────────────────┼─────────────┼──────────────────────────────────┤
│ pptx        │ skill_01PU1cCLSDx4Ao8g3ke9m… │ 1741318800  │ PPT_SKILL_VERSION=1741318800     │
│ xlsx        │ skill_014vWE9w1c9CVRmoie7np… │ 1741318801  │ XLSX_SKILL_VERSION=1741318801    │
│ ...         │ ...                          │ ...         │ ...                              │
└─────────────┴──────────────────────────────┴─────────────┴──────────────────────────────────┘

Set these version values as env vars in your k8s config to pin this deployment.
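
To avoid copying values out of the table by hand, the deploy step could also emit the pins in a machine-readable form. The sketch below turns a mapping of version variables to version IDs into `KEY=value` lines, a format that `kubectl create configmap --from-env-file` accepts. `render_env_file` is a hypothetical helper, and the input mapping simply reuses the sample values from the output above.

```python
def render_env_file(versions: dict[str, str]) -> str:
    """Render pinned versions as sorted KEY=value lines, one per skill."""
    for name, value in versions.items():
        if not name.endswith("_SKILL_VERSION"):
            raise ValueError(f"unexpected variable name: {name}")
        if not value.isdigit():
            # Assumption: version IDs are numeric strings, as in the sample output.
            raise ValueError(f"{name} has a non-numeric version: {value!r}")
    return "\n".join(f"{name}={versions[name]}" for name in sorted(versions)) + "\n"


pins = {"XLSX_SKILL_VERSION": "1741318801", "PPT_SKILL_VERSION": "1741318800"}
print(render_env_file(pins), end="")
```

Rejecting non-numeric values means an accidental `latest` cannot slip into a pinned config.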

list-skill-versions — See deployed versions

just skill-versions pptx
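
Under the hood, this command presumably wraps `client.beta.skills.versions.list`. A hedged sketch follows, assuming the call returns an iterable of objects with a `.version` attribute and that version IDs are numeric strings that sort chronologically. Both assumptions are inferred from the sample output and are not confirmed. `newest_version` is a hypothetical helper.

```python
from typing import Optional


def newest_version(client, skill_id: str) -> Optional[str]:
    """Return the highest-numbered version of skill_id, or None if it has none."""
    page = client.beta.skills.versions.list(
        skill_id=skill_id,
        betas=["skills-2025-10-02"],
    )
    versions = [str(item.version) for item in page]
    # Compare numerically so "999" sorts before "1000".
    return max(versions, key=int, default=None)
```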

Migration Flow

flowchart TD
    A["1. Extract content from existing<br/>skill zips on platform.claude.com"] --> B["2. Place files in skills/<name>/<br/>with SKILL.md at root"]
    B --> C["3. Run: just deploy-skills<br/>(uploads via API, prints versions)"]
    C --> D["4. Set *_SKILL_VERSION env vars<br/>in k8s configmaps per cluster"]
    D --> E["5. Deploy — app uses<br/>pinned versions instead of 'latest'"]
    E --> F["6. Remove manual upload<br/>workflow from team process"]

    style A fill:#fcc
    style F fill:#9f9
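
Step 2 is easy to get subtly wrong, so a dry run could check the layout before anything is uploaded. The sketch below reports every directory under `skills/` that lacks a root `SKILL.md`. `find_invalid_skills` is a hypothetical helper, and the real `--dry-run` may validate more than this.

```python
from pathlib import Path


def find_invalid_skills(skills_root: Path) -> list[str]:
    """Return names of skill directories that have no SKILL.md at their root."""
    invalid = []
    for child in sorted(skills_root.iterdir()):
        # Plain files such as skills/README.md are not skills, so skip them.
        if child.is_dir() and not (child / "SKILL.md").is_file():
            invalid.append(child.name)
    return invalid
```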

What Doesn't Change

  • Skill IDs are stable: PPT_SKILL_ID etc. don't change; they're permanent identifiers
  • Default is backward-compatible: *_SKILL_VERSION defaults to "latest", so existing deployments keep working
  • No API contract changes — the Anthropic container config just gets a pinned version string instead of "latest"
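
To make the backward-compatible default concrete, here is a sketch of how one skill entry in the container config could be built. It reads a plain mapping as a stand-in for the Pydantic settings, and the dict shape (`type`, `skill_id`, `version`) is our reading of the custom-skill entry format, which should be checked against the API reference. `skill_entry` is a hypothetical helper and `skill_example` is a placeholder ID.

```python
import os
from typing import Mapping


def skill_entry(prefix: str, env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build one custom-skill entry, falling back to "latest" when no pin is set."""
    return {
        "type": "custom",
        "skill_id": env[f"{prefix}_SKILL_ID"],
        "version": env.get(f"{prefix}_SKILL_VERSION", "latest"),
    }


# Pinned deployment: the version comes from PPT_SKILL_VERSION.
pinned_env = {"PPT_SKILL_ID": "skill_example", "PPT_SKILL_VERSION": "1741318800"}
container = {"skills": [skill_entry("PPT", pinned_env)]}

# Existing deployment with no version variable: behaves exactly as before.
legacy_entry = skill_entry("PPT", {"PPT_SKILL_ID": "skill_example"})
```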

Action Items

| # | Action | Owner | Status |
| --- | --- | --- | --- |
| 1 | Review this RFC and the implementation plan | Team | Pending |
| 2 | Extract existing skill zip contents into skills/ directories | Skill owners | Not started |
| 3 | Implement the 10 tasks in the plan (settings, wiring, registry, CLI, tests) | Assignee TBD | Not started |
| 4 | Add deploy-skills step to CI/CD pipeline | DevOps | Not started |
| 5 | Update k8s manifests to include *_SKILL_VERSION env vars | DevOps | Not started |
| 6 | Deprecate manual platform.claude.com upload process | Team | After rollout |

Questions / Discussion

  • Who has the existing skill zips? We need to extract their contents into the skills/ directories.
  • CI/CD integration timing — should we add the deploy step to the existing release pipeline or keep it manual via just deploy-skills initially?
  • Rollout strategy — suggest starting with one skill (e.g., pptx) to validate, then rolling out to all 7.