@jwmatthews
Created March 2, 2026 18:38

Camel-Kit Deep Dive for Onboarding Engineers

What This Product Actually Is

Camel-Kit is not a conventional "AI application" where a backend calls an LLM API and orchestrates tool invocations itself. It is a prompt-packaging and workflow bootstrap tool for Apache Camel engineering.

Its job is to:

  1. Initialize a new Camel integration workspace.
  2. Install assistant-specific slash commands and workflow skills into that workspace.
  3. Configure Apache Camel's MCP server for live catalog lookup and validation.
  4. Guide an AI coding assistant through a structured, artifact-driven delivery flow:
    • Business Requirements Document
    • Technical Design Document
    • Camel YAML route
    • Validation report
    • Citrus integration tests

The core product idea is:

  • Keep the Java CLI thin.
  • Put the domain intelligence into versioned SKILL.md assets.
  • Use MCP for authoritative, version-specific Camel knowledge.
  • Force the agent to generate intermediate artifacts before code.

For onboarding purposes, the most important mental model is:

The repo's primary runtime behavior lives in prompt assets and file conventions, not in Java business logic.


High-Level Repo Structure

Root modules

  • camel-kit-core/
    • The real product core.
    • Contains the Picocli CLI, init workflow, TUI, downloaders, templates, and all packaged AI skills.
  • camel-kit-main/
    • JBang entry point.
    • Wraps camel-kit-core so users can install and run Camel-Kit as a JBang app.
  • camel-jbang-plugin-kit/
    • Adapter layer for camel kit init inside Camel JBang.
    • Delegates back to camel-kit-core.
  • camel-kit-plugins/
    • Currently an empty Maven aggregator with no child modules.

Non-code content

  • docs/
    • Human documentation for users and contributors.
  • examples/
    • Example workflow, mainly explanatory.
  • website/
    • Separate Hugo site for published docs and marketing pages.

Important implication

This is a content-heavy repo:

  • Java source: about 2.4k lines
  • Packaged workflow skills: about 5.9k lines
  • Resources/templates/docs are a large part of the product surface

That distribution is a strong signal about where enhancements usually belong.


What the Codebase Does at Runtime

Runtime surfaces

Camel-Kit has two user-facing execution modes:

  1. Standalone CLI via JBang
  2. Camel JBang plugin via camel kit init

In both cases, the only real Java command implemented today is init.

Actual entry points

  • Standalone CLI: CamelKitMain.java in camel-kit-core
  • JBang wrapper: camel-kit-main/src/main/jbang/main/CamelKit.java
  • Camel JBang plugin: KitInitCommand in camel-jbang-plugin-kit

What init does

init is the product's bootstrapper. The implementation lives in camel-kit-core/src/main/java/io/github/luigidemasi/camelkit/command/InitCommand.java.

The command performs a fixed sequence:

  1. Validate target AI assistant from AgentRegistry
  2. Resolve target project directory
  3. Create .camel-kit working structure
  4. Write project config and constitution
  5. Install slash command wrappers for the chosen assistant
  6. Copy bundled workflow skills into the assistant's skills/ folder
  7. Create Maven wrapper files
  8. Create assistant-specific MCP configuration
  9. Optionally download Citrus schemas and generate a quick reference
  10. Print next-step instructions

This is the key architectural boundary:

  • Java code bootstraps the workspace.
  • The AI assistant then carries the workflow forward by reading installed SKILL.md files.
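
Steps 2 and 3 of the sequence can be sketched as follows. This is a hypothetical illustration, not the real InitCommand code; the directory names are assumptions drawn from the artifact list later in this document:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of workspace bootstrap: resolve the project dir
// and create the .camel-kit working structure.
class InitWorkspaceSketch {

    static Path createWorkspace(Path projectDir) throws IOException {
        Path kit = projectDir.resolve(".camel-kit");
        for (String sub : new String[] {"templates", "flows"}) {
            Files.createDirectories(kit.resolve(sub));
        }
        // placeholder project config (step 4 writes the real one)
        Files.writeString(kit.resolve("config.yaml"), "# project config\n");
        return kit;
    }

    // Self-contained demo: bootstrap into a temp dir and verify the layout.
    static boolean demo() {
        try {
            Path kit = createWorkspace(Files.createTempDirectory("camel-kit-demo"));
            return Files.exists(kit.resolve("config.yaml"))
                    && Files.isDirectory(kit.resolve("flows"));
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```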

Module-by-Module Architecture

camel-kit-core

This module owns nearly all product behavior.

CLI shell and command wiring

  • CamelKitMain.java
    • Creates JLine terminal/printer
    • Prints banner/logo
    • Registers Picocli subcommands
    • Exposes default Camel and Citrus versions
  • CamelKitCommand.java
    • Thin base class for commands

Bootstrap logic

  • InitCommand.java
    • The main operational command
    • Owns workspace creation, template generation, resource copying, MCP config writing, and optional schema fetch

Agent abstraction

This is intentionally simple:

  • bob -> .bob/commands
  • gemini -> .gemini/commands
  • claude -> .claude/commands

The registry abstracts:

  • command folder
  • file format (md vs toml)
  • assistant label

This is the seam you would extend for a new AI assistant.
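
The seam can be sketched as a small lookup table. This is an illustration of the shape of the abstraction, not the repo's actual AgentRegistry API; the labels and the class/record names are assumptions:

```java
import java.util.Map;

// Hypothetical registry sketch: each assistant maps to a command folder,
// a wrapper file format, and a human-readable label.
class AgentRegistrySketch {

    record Agent(String commandDir, String fileFormat, String label) {}

    static final Map<String, Agent> AGENTS = Map.of(
        "bob",    new Agent(".bob/commands",    "md",   "Bob"),
        "gemini", new Agent(".gemini/commands", "toml", "Gemini"),
        "claude", new Agent(".claude/commands", "md",   "Claude"));

    // Adding a new assistant is one more entry here, plus its MCP config shape.
    static Agent lookup(String name) {
        Agent a = AGENTS.get(name);
        if (a == null) {
            throw new IllegalArgumentException("Unknown assistant: " + name);
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(lookup("claude").commandDir()); // .claude/commands
    }
}
```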

Output and UX

This layer does not change product semantics. It improves the perceived quality of init:

  • native-image-aware terminal rendering
  • split-screen progress UI when supported
  • graceful fallback to text/banner mode

Catalog and schema utilities

Important nuance:

  • CitrusSchemaDownloader is used by init.
  • CatalogDownloader exists, but the current init flow does not use it.
  • The preferred runtime architecture in skills is to query Camel live via MCP rather than rely on bundled static catalog files.

camel-kit-main

This module is packaging, not product logic.

  • It provides the JBang script wrapper.
  • Maven copies the JBang source into dist/.
  • The entry point just forwards into camel-kit-core.

If you are changing behavior, this module is usually not where the work belongs.

camel-jbang-plugin-kit

This is an adapter so Camel-Kit can appear as a Camel JBang plugin.

Key point:

  • It does not reimplement features.
  • It maps Camel JBang command parameters to the same InitCommand used by standalone mode.

That is a good design choice: one init implementation, multiple front doors.
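
The delegation idea can be sketched like this; all names are illustrative stand-ins for CamelKitMain, KitInitCommand, and the shared InitCommand:

```java
// Sketch of "one init implementation, multiple front doors": both entry
// points forward to a single shared bootstrap rather than reimplementing it.
class FrontDoorsSketch {

    // Stands in for the shared InitCommand.
    static String init(String assistant, String dir) {
        return "initialized " + dir + " for " + assistant;
    }

    // Standalone CLI front door.
    static String standaloneCli(String assistant, String dir) {
        return init(assistant, dir);
    }

    // Camel JBang plugin front door: maps plugin parameters onto the same call.
    static String jbangPlugin(String assistant, String dir) {
        return init(assistant, dir);
    }

    public static void main(String[] args) {
        System.out.println(standaloneCli("claude", "."));
    }
}
```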

camel-kit-plugins

Currently a placeholder aggregator. No active plugins live here.

Treat it as future expansion space, not current architecture.


The Real Product Architecture: Artifact-Driven AI Workflow

Camel-Kit uses a staged artifact model.

Stage 0: Bootstrap

camel-kit init creates:

  • assistant command files
  • assistant skill files
  • .camel-kit/config.yaml
  • .camel-kit/constitution.md
  • .camel-kit/templates/*
  • MCP config for the selected assistant
  • Maven wrapper
  • schemas/ and test/data/
  • optional Citrus schema cache

Stage 1: Business requirements

Installed slash command:

  • camel-project

Primary output:

  • .camel-kit/business-requirements.md

This captures business intent and integration landscape before any implementation details.

Stage 2: Flow design

Installed slash command:

  • camel-flow

Primary output:

  • .camel-kit/flows/{flow-name}/{flow-name}.tdd.md

This is the main design artifact. It encodes:

  • source system
  • sink system
  • processing steps
  • transformations
  • dependencies
  • error handling
  • test scenarios

Stage 3: Migration path

Installed slash commands:

  • camel-migrate
  • internal camel-migrate-mule

Outputs:

  • .camel-kit/business-requirements.md
  • .camel-kit/flows/{flow-name}/{flow-name}.tdd.md

This path converges on the same artifacts as the greenfield path. That is a strong design choice because implementation, validation, and testing stay unchanged downstream.

Stage 4: Implementation

Installed slash command:

  • camel-implement

Expected outputs in project root:

  • {flow-name}.camel.yaml
  • application.properties
  • docker-compose.yaml
  • run.sh
  • DataMapper artifacts when applicable

Stage 5: Validation

Installed slash command:

  • camel-validate

Expected outputs:

  • validation findings
  • optionally corrected YAML
  • a validation report file per skill instructions

Stage 6: Test generation

Installed slash command:

  • camel-test

Expected outputs:

  • test/{flow-name}.camel.it.yaml
  • test/application-test.properties
  • run-tests.sh
  • test data files

Architectural consequence

The repo is built around a file-mediated workflow:

  • Each step reads prior artifacts.
  • Each step produces a more concrete artifact.
  • The LLM is expected to operate with those files as shared memory and handoff state.

This is the repo's single most important architectural pattern.


How the AI Workflow Is Encoded

The main prompt assets live under camel-kit-core/src/main/resources/skills/.

Current workflow skills in the repo:

  • camel-project
  • camel-flow
  • camel-implement
  • camel-validate
  • camel-test
  • camel-migrate
  • camel-migrate-mule
  • shared datamapper-canonicalize.md

The command files generated by init are intentionally tiny. They mostly say:

Read <assistant>/skills/<skill>/SKILL.md and follow those instructions.

That means the installed slash commands are really just dispatch shims into packaged workflow prompts.


Specific Generative AI and Agentic Patterns Used

1. Prompt-as-product

The product's core intelligence is not hardcoded in Java classes. It is stored as versioned prompt artifacts:

  • SKILL.md
  • prompt guides
  • constitution template
  • YAML generation guide
  • validation guide

This makes the repo feel closer to:

  • a compiler toolchain for AI workflows
  • a prompt operating system
  • a spec-driven assistant kit

than a typical application backend.

2. Role-based sub-agents

Each skill assigns the model a narrow working identity:

  • Business Analyst
  • Integration Architect
  • Developer/Implementer
  • Quality Assurance Engineer
  • Test Engineer
  • Migration Specialist
  • Data Mapping Specialist

That is a classic agentic decomposition pattern: constrain the LLM with a role, a goal, allowed inputs, required outputs, and explicit stop conditions.

3. Artifact-gated progression

Every major skill checks for prerequisite files before it proceeds.

Examples:

  • camel-flow requires business requirements and config
  • camel-implement requires BRD and TDD
  • camel-test requires TDD, implementation, and Citrus reference

This reduces open-ended reasoning. The agent is not asked to improvise the whole system at once; it is forced through explicit gates.
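
A minimal sketch of such a gate, assuming a simple prerequisite-file check (the real skills express this check in prompt text, not Java):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Artifact-gate sketch: a stage lists which prerequisite files are missing
// and refuses to proceed until the list is empty.
class ArtifactGateSketch {

    static List<Path> missingPrerequisites(Path root, List<String> required) {
        return required.stream()
                .map(root::resolve)
                .filter(p -> !Files.exists(p))
                .toList();
    }

    // Demo: the gate stays closed until BRD/TDD stand-ins exist.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("gate-demo");
            List<String> needed = List.of("business-requirements.md", "demo.tdd.md");
            boolean blockedBefore = !missingPrerequisites(root, needed).isEmpty();
            for (String f : needed) {
                Files.writeString(root.resolve(f), "");
            }
            boolean openAfter = missingPrerequisites(root, needed).isEmpty();
            return blockedBefore && openAfter;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```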

4. Structured interviews instead of open prompting

The skills repeatedly enforce:

  • ask one question at a time
  • wait for user response
  • only ask conditional questions when relevant
  • avoid re-asking already known facts

This is a deliberate anti-chaos pattern. It narrows the search space and improves consistency.

5. Externalized memory through files

The system uses files as durable working memory:

  • BRD = business memory
  • TDD = technical memory
  • constitution = policy memory
  • config.yaml = runtime/version memory
  • generated YAML/test files = implementation memory

This is important because it avoids relying on transient chat context for long-running flows.

6. Hybrid knowledge strategy: static prompts + live MCP

The skills consistently instruct the assistant to prefer MCP tool calls for authoritative, version-specific answers:

  • camel_catalog_component_doc
  • camel_catalog_components
  • camel_catalog_eip_doc
  • camel_catalog_dataformat_doc
  • camel_catalog_language_doc
  • camel_validate_route
  • camel_route_context
  • camel_route_harden_context

This is a strong retrieval architecture:

  • Static skills provide process and guardrails
  • MCP provides current, versioned truth

That sharply reduces hallucination risk in a domain where option names and YAML structure are version-sensitive.

7. "Never trust model memory" as a first-class rule

This pattern shows up all over the skills:

  • do not suggest components before querying the catalog
  • do not use training data for option names
  • do not assume expression language names
  • verify all component property names against catalog docs

This is one of the repo's best design decisions. It treats the model as a planner/generator, not as an authoritative source of framework truth.

8. Deterministic fallback paths

The skills usually define:

  1. primary path via MCP
  2. fallback via bundled skills or static guides
  3. user escalation if neither is available

That is a robust agent pattern because tool failure does not automatically collapse the workflow.
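
The three-tier ladder can be sketched as an ordered list of knowledge sources; the Supplier stand-ins below are assumptions, not real MCP client calls:

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Sketch of primary -> fallback -> escalate: try each knowledge source in
// order and surface a user-facing escalation only when every source fails.
class FallbackLadderSketch {

    static String resolve(List<Supplier<Optional<String>>> sources) {
        for (Supplier<Optional<String>> source : sources) {
            Optional<String> answer = source.get();
            if (answer.isPresent()) {
                return answer.get();
            }
        }
        return "ESCALATE: no knowledge source available, ask the user";
    }

    // Demo: the MCP stand-in fails, the bundled-guide stand-in answers.
    static String demo() {
        return resolve(List.<Supplier<Optional<String>>>of(
                Optional::empty,                      // MCP tool call failed
                () -> Optional.of("bundled guide"))); // static guide available
    }

    public static void main(String[] args) {
        System.out.println(demo()); // bundled guide
    }
}
```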

9. Progressive disclosure

The skills are designed to load more guidance only when the context demands it.

Examples:

  • load performance.md only if throughput/latency matters
  • load security.md only if compliance/security matters
  • load monitoring.md only if observability matters
  • load DataMapper guides only for relevant format pairs

This controls token usage and keeps the assistant focused.
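
A sketch of the trigger logic, assuming simple keyword matching (the real skills phrase these conditions in prose, and the guide file names follow the examples above):

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Progressive-disclosure sketch: load optional guides only when the
// requirements mention the matching concern, keeping the prompt small.
class ProgressiveDisclosureSketch {

    static Set<String> guidesToLoad(String requirements) {
        Set<String> guides = new LinkedHashSet<>();
        String text = requirements.toLowerCase();
        if (text.contains("throughput") || text.contains("latency")) {
            guides.add("performance.md");
        }
        if (text.contains("compliance") || text.contains("security")) {
            guides.add("security.md");
        }
        if (text.contains("observability")) {
            guides.add("monitoring.md");
        }
        return guides;
    }

    public static void main(String[] args) {
        System.out.println(guidesToLoad("low latency, strict compliance"));
    }
}
```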

10. Prompt-enforced validation loops

camel-implement does not stop at generation. It instructs the assistant to:

  • generate YAML
  • validate it
  • fix errors
  • re-query official docs when needed
  • retry until valid

That is an agentic "generate -> verify -> repair" loop embedded directly in the prompt architecture.
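
The loop can be sketched as bounded generate/verify/repair; the predicate and repair function below are stand-ins for camel_validate_route and the model's fix step:

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Generate -> verify -> repair sketch, with a bounded attempt count so a
// persistent failure cannot loop forever.
class RepairLoopSketch {

    static String generateUntilValid(String draft,
                                     Predicate<String> isValid,
                                     Function<String, String> repair,
                                     int maxAttempts) {
        String candidate = draft;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isValid.test(candidate)) {
                return candidate;                    // validation passed
            }
            candidate = repair.apply(candidate);     // fix errors, retry
        }
        throw new IllegalStateException("still invalid after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Toy example: "valid" means the route ends with a marker.
        String result = generateUntilValid("route",
                s -> s.endsWith("!"), s -> s + "!", 3);
        System.out.println(result); // route!
    }
}
```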

11. Canonicalization before generation

The DataMapper flow is the clearest example.

Instead of asking the model to jump straight from semantic mapping to XSLT, Camel-Kit inserts an intermediate canonicalization step:

  • collect semantic mappings
  • compute canonical Source XPath values
  • compute canonical Target Element values
  • store them in the TDD
  • then generate XSLT from that canonical form

This is a high-quality agent pattern:

  • convert fuzzy user intent into a stable intermediate representation
  • generate code from the representation, not directly from prose

That is essentially IR-driven code generation for prompt systems.
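
The canonical form can be sketched as a tiny record-based IR; the rendering below is illustrative only, not Camel-Kit's real XSLT generation, and the field names simply echo the canonical Source XPath / Target Element idea above:

```java
import java.util.List;

// IR-driven generation sketch: code is derived from stable structured
// fields, never directly from conversational prose.
class CanonicalMappingSketch {

    record Mapping(String sourceXPath, String targetElement) {}

    // Render an XSLT-style fragment purely from the IR rows.
    static String render(List<Mapping> mappings) {
        StringBuilder sb = new StringBuilder();
        for (Mapping m : mappings) {
            sb.append("<xsl:element name=\"").append(m.targetElement())
              .append("\"><xsl:value-of select=\"").append(m.sourceXPath())
              .append("\"/></xsl:element>\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(List.of(new Mapping("/Order/Id", "orderId"))));
    }
}
```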

12. Specialized sub-workflows for difficult transformations

The DataMapper path is a mini pipeline:

  1. datamapper-interview.md or datamapper-migrate.md
  2. datamapper-canonicalize.md
  3. datamapper-implement.md

This breaks a hard problem into manageable phases:

  • elicitation
  • normalization
  • code generation
  • self-validation

That is more sophisticated than typical prompt kits and is one of the strongest agentic design patterns in the repo.

13. Convergent workflow design

Greenfield and migration both converge on the same BRD/TDD artifacts.

That means:

  • fewer downstream branches
  • shared implementation logic
  • shared validation logic
  • shared testing logic

This is not just good product design. It is good agent design, because it limits prompt divergence.


DataMapper: The Most Specialized Agentic Subsystem

The DataMapper subsystem is the most engineered prompt workflow in the repo.

Relevant files:

  • datamapper-interview.md
  • datamapper-migrate.md
  • datamapper-canonicalize.md (shared)
  • datamapper-implement.md

Why it matters

Data transformation is where LLMs tend to become unreliable:

  • path semantics drift
  • type assumptions drift
  • JSON/XML conversions are error-prone
  • generated XSLT is easy to get subtly wrong

Camel-Kit counters that by forcing structure.

Pattern used

  1. Gather schema or schema-like field information
  2. Infer semantic mappings
  3. Confirm mappings with the user
  4. Canonicalize into machine-usable structural fields
  5. Write canonical mapping section into TDD
  6. Generate XSLT from canonical data
  7. Self-validate generated XSLT against the TDD
  8. Inject YAML step and .kaoto metadata

Engineering takeaway

If you need to enhance any transformation-heavy feature, follow this same pattern:

  • do not generate final code directly from conversational requirements
  • first create a constrained, explicit intermediate representation

Packaging and Distribution Architecture

Maven

The root pom.xml is a standard multi-module aggregator:

  • camel-kit-main
  • camel-kit-core
  • camel-kit-plugins
  • camel-jbang-plugin-kit

JBang

The JBang alias points to:

  • camel-kit-main/src/main/jbang/main/CamelKit.java

Camel JBang plugin

The plugin module depends on:

  • camel-jbang-core with provided scope
  • camel-kit-core

That keeps feature logic centralized while exposing it inside Camel's CLI ecosystem.


Where To Make Changes

If you want to change the user workflow

Edit the skill files under camel-kit-core/src/main/resources/skills/ first.

That is usually more important than changing Java.

If you want to add a new AI assistant

Start with the agent abstraction (AgentRegistry) in camel-kit-core.

You will need to define:

  • assistant command folder
  • file format for command wrappers
  • MCP config file shape/location
  • expected command invocation convention

If you want to add a new migration vendor

Pattern to follow:

  1. extend camel-migrate/SKILL.md
  2. create a new internal vendor sub-skill
  3. add vendor-specific mapping guides
  4. keep outputs identical to BRD/TDD produced by greenfield flow

That last rule is critical. The downstream pipeline depends on convergence.

If you want to harden generated code quality

Primary hotspots:

  • camel-implement/SKILL.md and its YAML generation guide
  • camel-validate/SKILL.md and its validation guide

The quality system is prompt-enforced more than code-enforced.

If you want to improve bootstrap behavior

Primary hotspots:

  • InitCommand.java in camel-kit-core
  • the bundled templates and resources it copies into the workspace


What Is Strong About the Current Design

  1. Clear separation between bootstrap code and AI workflow content.
  2. Strong bias toward version-aware, authoritative MCP lookup.
  3. Good use of intermediate artifacts to reduce prompt ambiguity.
  4. Convergent greenfield/migration architecture.
  5. DataMapper pipeline shows disciplined prompt engineering, not ad hoc prompting.
  6. Thin Java surface area makes the system easy to reason about operationally.

Current Gaps and Repo Realities

These are important for a new engineer because the repo contains some documentation drift.

1. Docs describe more bundled component skills than the repo currently contains

The architecture docs describe hundreds of pre-generated component skills, but the checked-in repo currently contains:

  • seven workflow SKILL.md files
  • one shared DataMapper guide
  • guide documents under those workflow folders

I did not find any bundled camel-component-* skill directories in the current checkout.

That means one of these is true:

  • the docs are ahead of the repo
  • the component skills are generated elsewhere and not checked in
  • the architecture changed and the docs were not fully updated

Treat the current codebase as the source of truth unless maintainers clarify otherwise.

2. The CLI surface is narrower than the documentation implies

Today, the Java CLI primarily implements init.

Commands like camel-flow and camel-implement are not Java subcommands. They are installed assistant commands that forward into packaged prompts.

That distinction matters when debugging "why a command behaved this way."

3. Gemini MCP docs and implementation appear inconsistent

The docs commonly refer to .gemini/mcp.json, but InitCommand writes Gemini config to:

  • .gemini/settings.json

That should be reviewed and normalized.

4. Some contributor docs are outdated

CONTRIBUTING.md still references paths and modules that do not match the current repo state, including:

  • old package paths
  • non-existent plugin/module structure
  • non-existent Python utilities

Do not treat it as fully current without verification.

5. There are no automated tests in the repo right now

I found no src/test files in the current checkout.

Quality is currently enforced through:

  • build success
  • prompt constraints
  • generated validation steps
  • runtime validation instructions

That is workable, but it raises the importance of careful regression checking whenever skill text changes.


Recommended Reading Order for a New Engineer

  1. README.md
  2. camel-kit-core/src/main/java/io/github/luigidemasi/camelkit/command/InitCommand.java
  3. camel-kit-core/src/main/resources/skills/camel-flow/SKILL.md
  4. camel-kit-core/src/main/resources/skills/camel-implement/SKILL.md
  5. camel-kit-core/src/main/resources/skills/camel-validate/SKILL.md
  6. The DataMapper guides
  7. camel-jbang-plugin-kit/src/main/java/io/github/luigidemasi/camelkit/jbang/KitInitCommand.java

That order moves from:

  • product concept
  • to bootstrap implementation
  • to prompt architecture
  • to specialized generation logic
  • to packaging adapters

Practical Advice for Enhancing the Product

  1. Treat skill text changes like code changes. Small wording edits can materially change behavior.
  2. Preserve the staged artifact model. It is the main defense against prompt drift.
  3. Prefer new intermediate representations over larger prompts when adding complexity.
  4. Keep using MCP as the truth source for version-sensitive Camel knowledge.
  5. Tighten documentation drift early. In this repo, stale docs can mislead contributors faster than stale code.
  6. Add regression fixtures if you extend the workflow. Even simple golden-file examples would materially improve confidence.

Build Status During This Review

I verified the current checkout builds successfully with:

./mvnw -q -DskipTests package

That confirms the multi-module build is healthy at a packaging level, but it does not compensate for the current lack of automated test coverage.
