marccampbell/helm-reference-design.md

## helm-reference-design.md

      
    Helm Chart Reference Docs — Design

Overview

When a vendor promotes a release, we generate structured Helm chart reference documentation and serve it through Enterprise Portal. Vendors declare which charts they want documented in toc.yaml. The worker extracts chart content from the release, parses values.yaml deterministically, enriches with Claude, and stores structured JSON. The EP frontend renders it using the same component pattern as Terraform module references.
toc.yaml Declaration

- title: Reference
  icon: file-text
  items:
    - title: Helm Reference
      helm_chart:
        name: chartsmith
        version: "1.0.0"   # optional — see Chart Resolution below
      visible_when:
        entitlements:
          - isHelmInstallEnabled
Fields


| Field | Required | Description |
| --- | --- | --- |
| `helm_chart.name` | Yes | Chart name as it appears in `Chart.yaml` |
| `helm_chart.version` | No | Specific chart version. If omitted, the release must contain exactly one version of this chart name. |


Trigger

Event: Release promote
Queue: replicated-helm-reference-generate-{env} (existing SQS queue)
Message type: helm-chart-reference (new, alongside existing content_repo_synced and terraform_module_llm)
Queue Message

{
  "type": "helm-chart-reference",
  "app_id": "...",
  "channel_id": "...",
  "channel_sequence": 123
}
Enqueued from the release promote handler (release_promote.go), gated behind the ep_customization_terraform feature flag (or a new ep_helm_reference flag — TBD).
One message per channel that receives the promote. If a release is promoted to 3 channels simultaneously, 3 messages are enqueued.
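
As a rough illustration of the fan-out described above, the following Go sketch builds one message body per promoted channel. `HelmChartReferenceMessage`, `PromotedChannel`, and `BuildMessages` are hypothetical names, not taken from `release_promote.go`; the actual SQS send and the feature-flag check are left out.

```go
package main

import "encoding/json"

// HelmChartReferenceMessage mirrors the queue message shown above.
type HelmChartReferenceMessage struct {
	Type            string `json:"type"`
	AppID           string `json:"app_id"`
	ChannelID       string `json:"channel_id"`
	ChannelSequence int64  `json:"channel_sequence"`
}

// PromotedChannel is a hypothetical pairing of a channel with the sequence
// that the promote produced on that channel.
type PromotedChannel struct {
	ChannelID       string
	ChannelSequence int64
}

// BuildMessages returns one JSON-encoded message body per promoted channel.
func BuildMessages(appID string, channels []PromotedChannel) ([]string, error) {
	bodies := make([]string, 0, len(channels))
	for _, ch := range channels {
		b, err := json.Marshal(HelmChartReferenceMessage{
			Type:            "helm-chart-reference",
			AppID:           appID,
			ChannelID:       ch.ChannelID,
			ChannelSequence: ch.ChannelSequence,
		})
		if err != nil {
			return nil, err
		}
		bodies = append(bodies, string(b))
	}
	return bodies, nil
}
```

A promote to three channels yields three bodies, each carrying its own `channel_id` and `channel_sequence`.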
Worker Flow (ep-reference-generate)

Step 1: Resolve toc.yaml


Look up the content repo for the app (ep_content_repo by app_id)
If no content repo exists, skip (vendor hasn't configured EP customization)
Parse the cached toc.yaml (from ep_content_toc)
Extract all helm_chart: entries
If no helm_chart: entries, skip

Step 2: Chart Resolution

For each helm_chart: entry in toc.yaml:

Query kots_channel_release_chart by (channel_id, channel_sequence, chart_name)
If version specified → filter by chart_version too
0 rows → error: Chart '{name}' not found in release on this channel
1 row → use it
>1 rows and no version specified → error: Multiple versions of chart '{name}' found in release. Specify a version in toc.yaml to disambiguate.
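
The resolution rules above can be sketched as a pure function over rows already fetched for the `(channel_id, channel_sequence)` pair. `ChartRow` and `ResolveChart` are hypothetical names. The final branch (several rows left even after filtering by version) is not covered by the design and is an assumption here.

```go
package main

import "fmt"

// ChartRow is a hypothetical projection of a kots_channel_release_chart row.
type ChartRow struct {
	ChartName    string
	ChartVersion string
}

// ResolveChart applies the 0 / 1 / >1 rules. version may be empty, in which
// case the release must contain exactly one version of the chart.
func ResolveChart(name, version string, rows []ChartRow) (ChartRow, error) {
	matches := make([]ChartRow, 0, len(rows))
	for _, r := range rows {
		if r.ChartName != name {
			continue
		}
		if version != "" && r.ChartVersion != version {
			continue
		}
		matches = append(matches, r)
	}
	switch {
	case len(matches) == 0:
		return ChartRow{}, fmt.Errorf("Chart '%s' not found in release on this channel", name)
	case len(matches) == 1:
		return matches[0], nil
	case version == "":
		return ChartRow{}, fmt.Errorf("Multiple versions of chart '%s' found in release. Specify a version in toc.yaml to disambiguate.", name)
	default:
		// Assumption: duplicates of the same name and version are an error.
		return ChartRow{}, fmt.Errorf("chart '%s' version '%s' matched %d rows", name, version, len(matches))
	}
}
```

Keeping the decision separate from the query makes the three outcomes easy to unit test without a database.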

Step 3: Get Release Sequence


Look up kots_channel_release by (channel_id, channel_sequence) → get release_sequence
This maps the channel-specific sequence to the app-wide release sequence needed to fetch spec files

Step 4: Extract Chart Content


Call spec.GetSpecFiles(ctx, db, appID, releaseSequence) to get the compressed release spec
Walk spec files to find the matching .tar.gz chart archive (by chart name)
Base64-decode and extract the tarball
Pull out key files:

Chart.yaml — metadata (name, version, description, dependencies)
values.yaml — the primary input for documentation
README.md — additional context (if present)


We intentionally do NOT send templates/ to Claude. The prompt focuses on values documentation, and templates would blow up the token budget without proportional value.
Step 5: Deterministic Parse

Parse values.yaml to extract structure before LLM enrichment:

All key paths (dot-notation: database.type, web.ingress.enabled)
Inferred types from default values (string, boolean, integer, float, array, object)
Default values
Inline comments (YAML comments above or beside a key → raw description)

This gives us a baseline that works even if the LLM call fails.
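
To make the path and type extraction concrete, here is a sketch that walks an already-decoded `values.yaml` document. It assumes some YAML library has produced a `map[string]any`; the type names and `Flatten` are hypothetical. Comment extraction is not shown because it needs a comment-preserving parser, and keys are sorted here, so file order is not preserved.

```go
package main

import "sort"

// ValueEntry is one key discovered in values.yaml.
type ValueEntry struct {
	Path    string
	Type    string
	Default any
}

// inferType maps a decoded value to the type names listed above.
func inferType(v any) string {
	switch v.(type) {
	case string:
		return "string"
	case bool:
		return "boolean"
	case int, int64, uint64:
		return "integer"
	case float64:
		return "float"
	case []any:
		return "array"
	case map[string]any:
		return "object"
	default:
		return "unknown" // e.g. a key with no value decodes to nil
	}
}

// Flatten returns every key path in dot notation, parents before children.
// Pass an empty prefix for the document root.
func Flatten(prefix string, node map[string]any) []ValueEntry {
	keys := make([]string, 0, len(node))
	for k := range node {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output regardless of map iteration order

	var out []ValueEntry
	for _, k := range keys {
		p := k
		if prefix != "" {
			p = prefix + "." + k
		}
		v := node[k]
		out = append(out, ValueEntry{Path: p, Type: inferType(v), Default: v})
		if child, ok := v.(map[string]any); ok {
			out = append(out, Flatten(p, child)...)
		}
	}
	return out
}
```

For a document with `database.type`, `database.port`, and `web.ingress.enabled`, this yields six entries: the three leaves plus the `database`, `web`, and `web.ingress` objects.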
Step 6: LLM Enrichment (Claude)

Model: claude-sonnet-4-6
Input: values.yaml + Chart.yaml + README.md (if present)
Prompt (adapted from Marc's helm-values-documenter agent):
Core instructions:

Analyze the values.yaml structure and document every configuration option
For each value: description, valid options/ranges, dependencies, security implications, enterprise considerations
Group values into logical sections (Database, Ingress, Auth, Resources, etc.)
Provide deployment scenario examples (production, HA, cloud-specific)
Highlight production-critical and security-sensitive settings
Output as structured JSON matching our schema

Output: Structured JSON (see schema below)
Change detection: Hash values.yaml + Chart.yaml content. Skip LLM call if hash matches the previously generated reference for this chart on this channel.
Step 7: Store

Merge deterministic parse + LLM output into HelmChartReference JSON. Store it in reference_json on the ep_helm_reference row (see Data Model).
Data Model

ep_helm_reference (new table)


| Column | Type | Description |
| --- | --- | --- |
| `id` | `varchar(27)` | KSUID primary key |
| `app_id` | `varchar(255)` | FK to app |
| `chart_name` | `varchar(255)` | Chart name from toc.yaml |
| `channel_id` | `varchar(255)` | Channel this reference is for |
| `chart_version` | `varchar(255)` | Resolved chart version |
| `release_sequence` | `bigint` | Source release |
| `reference_json` | `mediumtext` | Structured JSON (see schema) |
| `content_hash` | `varchar(64)` | SHA-256 of values.yaml + Chart.yaml (change detection) |
| `last_generated_at` | `datetime` | When reference was last generated |
| `last_generation_error` | `text` | Error message if generation failed |
| `created_at` | `datetime` | |
| `updated_at` | `datetime` | |


Unique index: (app_id, chart_name, channel_id)
This is simpler than the TF module pattern (no separate version table). Each channel gets one reference per chart, updated on each promote. We don't need version history — the reference always reflects the latest promoted release.
Why no separate version table?

For TF modules, versions are meaningful (tagged releases, multiple versions coexist). For Helm references, the customer only ever sees the docs for their current channel's release. Previous versions are irrelevant once a new release is promoted. One row per (app, chart, channel) is sufficient.
Structured JSON Schema

{
  "schema_version": 1,
  "chart_name": "chartsmith",
  "chart_version": "1.0.0",
  "app_version": "2.5.0",
  "description": "ChartSmith is a collaborative chart creation platform...",

  "sections": [
    {
      "title": "Database Configuration",
      "description": "Configure the database backend for ChartSmith.",
      "values": [
        {
          "path": "database.type",
          "type": "string",
          "default": "internal",
          "required": false,
          "description": "Controls whether ChartSmith uses an internal PostgreSQL instance or an external database.",
          "valid_options": ["internal", "external"],
          "security_note": "For production, use 'external' with a managed database service.",
          "dependencies": ["If 'external', database.host and database.credentials must be set."],
          "examples": [
            {
              "scenario": "External PostgreSQL on AWS RDS",
              "yaml": "database:\n  type: external\n  host: mydb.cluster-xxx.us-east-1.rds.amazonaws.com\n  port: 5432"
            }
          ]
        }
      ]
    }
  ],

  "deployment_scenarios": [
    {
      "title": "AWS Production Deployment",
      "description": "Recommended configuration for production on AWS with RDS, S3, and ALB ingress.",
      "yaml": "# Full values override example\ndatabase:\n  type: external\n  ..."
    }
  ],

  "notes": [
    {
      "severity": "warning",
      "title": "TLS Required in Production",
      "content": "Running without TLS is only appropriate for development environments."
    }
  ]
}
Section Hierarchy

Values are grouped into sections by the LLM based on functional area. Common sections:

Global Settings
Database / Storage
Ingress / Networking
Authentication / Security
Resources / Scaling
Monitoring / Observability
Feature Flags

The deterministic parse provides the raw list of all paths+defaults. The LLM groups them, writes descriptions, and adds the enterprise context.
Market-API Endpoints

GET /v3/ep/helm/references

Returns all Helm chart references available to this customer (based on their channel).
{
  "references": [
    {
      "chart_name": "chartsmith",
      "chart_version": "1.0.0",
      "last_generated_at": "2026-02-23T12:00:00Z"
    }
  ]
}
Resolution: Customer's license → channel_id → ep_helm_reference rows for that channel (one per chart).
GET /v3/ep/helm/references/:chartName

Returns the full reference JSON for a specific chart.
{
  "found": true,
  "chart_name": "chartsmith",
  "chart_version": "1.0.0",
  "reference": { ... }
}
EP Frontend

toc.yaml Routing

helm_chart: entries in toc.yaml map to /content/helm/{chartName} via itemToHref in custom-navigation.tsx (same pattern as terraform_module → /content/infrastructure/{moduleName}).
Component: HelmChartReferenceView

New React component in enterprise-portal/templates/docs/app/content/[...slug]/helm-chart-reference.tsx.
Renders the structured JSON:

Header: Chart name, version, description
Sections: Expandable/collapsible groups of values
Per-value: Path (monospace), type badge, default value, description, valid options, security notes, dependency callouts
Deployment scenarios: Tabbed code blocks with full YAML examples
Notes: Callout banners (warning/info)

Reuses the same styling patterns as TerraformModuleReferenceView.
Generating State

Same pattern as TF modules: if ep_helm_reference exists but reference_json is null, show spinner with "Documentation is being generated."
Content-Status API Extension

GET /v3/app/{appId}/enterprise-portal/content-status already returns TF module status. Extend it to include Helm references:
{
  "helm_references": [
    {
      "chart_name": "chartsmith",
      "chart_version": "1.0.0",
      "channel_id": "...",
      "channel_name": "Stable",
      "status": "generated",
      "last_generated_at": "2026-02-23T12:00:00Z",
      "error": null
    }
  ]
}
The ContentTab UI shows these alongside TF modules.
Vendor-Web: ContentTab Updates

Add a "Helm References" section to ContentTab showing:

Chart name + version
Per-channel generation status
Last generated timestamp
Error messages (if any)
"Regenerate" button (re-enqueues the message; whether to ship this is still listed under Open Items)

Change Detection

Hash: SHA-256 of values.yaml content + Chart.yaml content (concatenated).
On promote:

Worker extracts chart, computes hash
Compares to content_hash on existing ep_helm_reference row
If unchanged → skip LLM call, update release_sequence only
If changed → full regeneration

This avoids unnecessary LLM calls when a release is promoted but the chart content hasn't changed (e.g., only application code changed, not Helm values).
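
The hash and the skip decision can be sketched in a few lines of Go. `ContentHash` and `NeedsRegeneration` are hypothetical names; the hex encoding is an assumption, chosen because a hex SHA-256 digest is 64 characters and fits the `varchar(64)` column.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// ContentHash returns the hex SHA-256 of values.yaml followed by Chart.yaml.
func ContentHash(valuesYAML, chartYAML []byte) string {
	h := sha256.New()
	h.Write(valuesYAML) // hash.Hash writes never return an error
	h.Write(chartYAML)
	return hex.EncodeToString(h.Sum(nil))
}

// NeedsRegeneration reports whether the stored content_hash differs from the
// hash of the freshly extracted chart files.
func NeedsRegeneration(storedHash string, valuesYAML, chartYAML []byte) bool {
	return storedHash != ContentHash(valuesYAML, chartYAML)
}
```

Plain concatenation does not record where `values.yaml` ends and `Chart.yaml` begins, so two different file pairs can in principle hash the same. If that matters, a separator or length prefix between the two inputs would remove the ambiguity.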
Feature Gating

New feature flag: ep_helm_reference (or reuse ep_customization_terraform if we want a single EP docs flag).
Gates:

Promote handler enqueue logic
Worker processing
Market-API endpoints
ContentTab Helm References section

Error Handling


| Scenario | Behavior |
| --- | --- |
| No content repo configured | Skip silently (no toc.yaml to check) |
| No `helm_chart:` in toc.yaml | Skip silently |
| Chart not found in release | Store error on `ep_helm_reference`, show in ContentTab |
| Multiple chart versions, no version in toc.yaml | Store error with disambiguation message |
| Chart extraction fails | Store error, log details |
| LLM call fails | Store deterministic-only reference (paths + defaults, no descriptions), set error flag |
| LLM returns invalid JSON | Fall back to deterministic-only, store error |


Sequence Diagram

Vendor promotes release
    │
    ▼
release_promote.go
    │ enqueue per channel
    ▼
SQS message: helm-chart-reference
    │
    ▼
ep-reference-generate worker
    │
    ├─ Fetch toc.yaml from ep_content_toc
    ├─ Find helm_chart: entries
    ├─ For each chart:
    │   ├─ Resolve chart in kots_channel_release_chart
    │   ├─ Get release_sequence from kots_channel_release
    │   ├─ Extract chart from release spec (GetSpecFiles → tarball)
    │   ├─ Compute content hash → change detection
    │   ├─ Parse values.yaml deterministically
    │   ├─ Call Claude with values.yaml + Chart.yaml + README.md
    │   ├─ Merge deterministic + LLM → HelmChartReference JSON
    │   └─ Store in ep_helm_reference
    │
    ▼
Market-API serves reference JSON
    │
    ▼
EP frontend renders HelmChartReferenceView

Open Items


Feature flag naming — new ep_helm_reference or reuse existing EP flag?
Release sequence lookup — need to confirm the exact query path from (channel_id, channel_sequence) → release_sequence via kots_channel_release
Token budget — large values.yaml files (500+ lines) may need truncation or splitting. Most charts should be well within Sonnet's context window with just values.yaml + Chart.yaml + README.md.
Regenerate trigger — besides promote, should vendors be able to manually trigger regeneration from ContentTab? (Useful if LLM output needs improvement.)