jordotech/skills-repo-split-analysis.md

## skills-repo-split-analysis.md

      
Should skills/ Live in a Separate Repo?

TL;DR: The skills/ directory is 3.6MB / 282 files — smaller than a single retina screenshot. Splitting it into a separate repo adds real operational complexity to solve a non-problem, and partially reintroduces the version-coupling failures that caused the outages we're trying to fix.

Actual Size of skills/

Total:  3.6 MB, 282 files

By type:
  145 .js files   (bundled JS libs in pptx skill)
   78 .xsd files  (Office XML schemas in pptx skill)
   28 .py files   (skill scripts)
   23 .md files   (SKILL.md + references)
    5 .xml files
    2 .html files
    1 .json file
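A breakdown like the one above can be reproduced with a few lines of Python. This is a minimal sketch, not the script that produced the numbers here: `summarize_tree` is a hypothetical helper name, and it sums apparent file sizes (`st_size`), which can differ slightly from what `du` reports on disk.

```python
from collections import Counter
from pathlib import Path


def summarize_tree(root: str) -> tuple[int, int, Counter]:
    """Return (total_bytes, file_count, count_by_extension) for files under root."""
    total_bytes = 0
    by_ext: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            total_bytes += path.stat().st_size
            # Files without a suffix are grouped under "(none)".
            by_ext[path.suffix or "(none)"] += 1
    return total_bytes, sum(by_ext.values()), by_ext
```

Calling `summarize_tree("skills")` from the repo root and printing `by_ext.most_common()` gives a per-extension listing in the same shape as the one above.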

For perspective:


| Thing | Size |
| --- | --- |
| `skills/` directory | 3.6 MB |
| Typical `node_modules/` | 200–500 MB |
| One retina screenshot PNG | 4–8 MB |
| Anthropic Python SDK | ~12 MB |
| A single Alembic migration | ~10 KB |

Why We Put Skills in the App Repo

The whole point of this effort is to couple skill versions to code versions. The outages happened because:

1. Someone uploads a skill zip to platform.claude.com
2. Someone else updates an env var in k8s
3. A third person deploys a new image tag

These three actions are not atomic: any mismatch between them breaks document generation.

The fix: skills live next to the code that uses them, versioned together, deployed together.

  
flowchart LR
    subgraph "Same Repo (current plan)"
        Code["App Code"] --- Skills["skills/"]
        Code --> Image["Docker Image"]
        Skills --> Deploy["deploy-skills CLI"]
        Deploy --> Pin["VERSION env var"]
        Pin --> Image
    end

    style Code fill:#9f9,stroke:#0a0
    style Skills fill:#9f9,stroke:#0a0

    
  
A separate repo breaks this coupling and requires a synchronization mechanism to re-establish it:

  
flowchart LR
    subgraph "Separate Repos"
        Code["App Repo"] -. "must reference" .-> SkillsRepo["Skills Repo"]
        SkillsRepo --> Deploy["deploy-skills"]
        Deploy --> Pin["VERSION env var"]
        Code --> Image["Docker Image"]
        Pin --> Image
    end

    subgraph "Sync Mechanism Needed"
        Sub["Option A: git submodule"]
        File["Option B: .skills-version file"]
        Pkg["Option C: pip package"]
    end

    SkillsRepo --> Sub
    SkillsRepo --> File
    SkillsRepo --> Pkg

    style Code fill:#ffc,stroke:#cc0
    style SkillsRepo fill:#fcc,stroke:#c00

    
  
If We Must Split: Options Ranked

Option 1: Git Submodule (least bad)

agentic-backend/
├── skills/  → submodule @ pinned commit SHA
└── src/

How it works:

- skills/ is a git submodule pointing to the skills repo at a specific commit
- The commit SHA is tracked in the parent repo, so the app repo's history records exactly which skills commit each app commit uses
- Updating skills = update submodule pointer + commit to app repo
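To make the "SHA is tracked in the parent repo" point concrete: git stores a submodule as a "gitlink" tree entry (mode `160000`, type `commit`), which `git ls-tree` prints like any other entry. The sketch below reads that entry for a given app revision. The function names are illustrative, and it assumes the submodule is mounted at `skills` as in the layout above.

```python
import subprocess
from typing import Optional


def parse_gitlink(ls_tree_line: str) -> Optional[str]:
    """Return the pinned commit SHA if the ls-tree entry is a submodule, else None."""
    meta, _, _path = ls_tree_line.partition("\t")
    fields = meta.split()
    # A submodule is stored as a "gitlink": mode 160000, object type "commit".
    if len(fields) == 3 and fields[0] == "160000" and fields[1] == "commit":
        return fields[2]
    return None


def pinned_skills_sha(rev: str = "HEAD", path: str = "skills") -> Optional[str]:
    """Ask git which skills commit the given app revision points at."""
    out = subprocess.run(
        ["git", "ls-tree", rev, path],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return parse_gitlink(out) if out else None
```

Running `pinned_skills_sha("some-release-tag")` against two release tags would show whether the skills pointer moved between them.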

Pros:

- Version coupling is preserved (submodule SHA is in app repo history)
- git diff shows when skills changed between releases
- CI can pin exact commit

Cons:

- Submodules are widely disliked by developers and easy to get wrong
- Every git clone needs --recurse-submodules (or a post-clone git submodule update --init)
- Changes that touch both repos require two PRs, coordinated manually
- New developers will forget to init submodules and wonder why skills/ is empty
- Every CI/CD pipeline needs updating to handle submodules
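The "forgot to init submodules" failure can at least be made loud. `git submodule status` prefixes each line with a flag character, and `-` means the submodule is not initialized. Below is a small sketch of a check that a setup script or pre-flight test could run; `uninitialized_submodules` is a hypothetical helper, and it assumes submodule paths contain no spaces.

```python
def uninitialized_submodules(status_output: str) -> list[str]:
    """Return paths of submodules that `git submodule status` marks as not initialized."""
    missing = []
    for line in status_output.splitlines():
        # The first character is a status flag; "-" means "not initialized".
        if line.startswith("-"):
            fields = line[1:].split()
            if len(fields) >= 2:
                missing.append(fields[1])
    return missing
```

If the returned list is non-empty, the script can tell the developer to run `git submodule update --init --recursive` instead of letting the app fail later on an empty skills/ directory.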

Option 2: .skills-version File (lighter, weaker coupling)

A file in the app repo pins a tag/SHA from the skills repo:
# .skills-version
skills-repo-ref=v3.4.2

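CI fetches at build time (note that git clone --branch accepts a tag or branch name, not a raw commit SHA):
# Read the value after "skills-repo-ref=" and clone that tag
ref="$(sed -n 's/^skills-repo-ref=//p' .skills-version)"
git clone --depth 1 --branch "$ref" \
  git@github.com:org/cap-skills.git skills/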
Pros:

- Simple to understand
- No submodule complexity

Cons:

- We're back to a version string that must be manually updated, which is a lighter version of the original env var problem
- Easy to forget to bump the ref when skills change
- Local development requires a manual fetch step
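If this option were chosen, the local fetch step and CI would both need to read the pin the same way. Here is a minimal sketch of such a parser, assuming the `key=value` format shown above with `#` comment lines. The function name and the allowed-character rule are assumptions, not an existing tool.

```python
import re

# Conservative allow-list for tag names; refs starting with "-" are
# rejected so the value can never be mistaken for a git option.
_REF_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._/-]*$")


def parse_skills_ref(text: str, key: str = "skills-repo-ref") -> str:
    """Extract the pinned ref from the contents of a .skills-version file."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            ref = value.strip()
            if not _REF_PATTERN.match(ref):
                raise ValueError(f"suspicious ref: {ref!r}")
            return ref
    raise ValueError(f"no {key}= line found")
```

Failing hard on a missing or malformed pin is deliberate: a silent fallback to a default branch would reintroduce the desync this file is meant to prevent.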

Option 3: Pip Package (over-engineered)

Skills repo publishes cap-skills as a pip package. App pins it:
dependencies = ["cap-skills==3.4.2"]
Pros:

- Standard dependency management
- Version pinned in lockfile

Cons:

- Massive over-engineering for 3.6MB of static files
- Publish pipeline needed for the skills repo
- Import path gymnastics to find skill files at runtime
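For a sense of what the "import path gymnastics" would look like: once skills ship inside a package, the app can no longer open `skills/...` by relative path and has to resolve files through the installed package instead. The sketch below uses the standard library's `importlib.resources.files` (Python 3.9+). The import name `cap_skills` and the `pptx/SKILL.md` layout are assumptions for illustration, and the package would also have to be built with its non-Python files included.

```python
from importlib.resources import files


def skill_file_text(package: str, *parts: str) -> str:
    """Read a text file bundled inside an installed package."""
    resource = files(package)
    # Walk down one path component at a time; this works for packages
    # installed on disk as well as those imported from a zip.
    for part in parts:
        resource = resource.joinpath(part)
    return resource.read_text(encoding="utf-8")
```

A call such as `skill_file_text("cap_skills", "pptx", "SKILL.md")` would replace what is a plain file read today when skills/ sits next to the code.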


Decision Matrix


| Factor | Same Repo | Submodule | .skills-version | Pip Package |
| --- | --- | --- | --- | --- |
| Version coupling | Automatic | Strong (SHA tracked) | Manual (must bump) | Strong (lockfile) |
| Developer experience | Normal git workflow | Painful (submodule footguns) | Extra fetch step | Normal pip workflow |
| CI complexity | None | Moderate (recurse-submodules) | Moderate (clone step) | Low |
| Cross-repo PRs | N/A | 2 PRs needed | 2 PRs needed | 2 PRs needed |
| Risk of desync | None | Low (but possible) | Medium | Low |
| Setup for new devs | Zero | Must remember --recurse-submodules | Must run fetch script | Just pip install |
| Blast radius of mistakes | Low | Medium | Medium | Low |
| Added complexity | None | Significant | Moderate | Significant |
| Repo size impact | +3.6 MB | None | None | None |

Recommendation

Keep skills/ in the app repo. 3.6MB is not bloat — it's a rounding error. The operational cost of any split option is far higher than the disk space saved.
If the concern is really about ownership (skills team doesn't want their changes gated by backend PR review), address that with CODEOWNERS:
# .github/CODEOWNERS
/skills/          @org/skills-team
/src/             @org/backend-team

This gives the skills team autonomy over their directory while keeping everything in one repo with one version history.

Questions for the Team


1. Is the concern disk space, or ownership/autonomy? These have different solutions.
2. If we split, who owns the sync mechanism? Submodule bumps and .skills-version updates have to be done by someone for every release.
3. Are we comfortable with cross-repo PRs? Any skill change that requires a code change (new setting, new config wiring) becomes a coordinated two-repo dance.