@justsml
Last active February 26, 2026 07:30
llm-provider/SKILL.md
---
name: llm-provider
description: >
  Implement, upgrade, or refactor JS/TS projects to use configurable LLM Provider + Model via llm:// URI strings — unifying scattered ENV vars, hardcoded model slugs, DB options, and AI SDK calls into a single composable resolution chain. Use when setting up LLM connections, migrating from env-only config, adding org-scoped encrypted credentials, or replacing multi-variable AI config with LLM strings.
---

LLM Provider & Model Configuration Skill

Implement or refactor JS/TS AI projects to use LLM strings — composable llm:// URIs that unify provider, model, credentials, and inference options (temperature, maxTokens, etc.) from any source (ENV, database, URL slug, job config, org connection) into a single resolution chain that produces an AI SDK model instance.

When to Use

  • User wants to "implement LLM strings" or "add llm:// URI support"
  • Project has ≥3 separate ENV vars for a single LLM call (e.g. MODEL, API_KEY, TEMPERATURE)
  • Code mixes generateText/streamText calls with provider construction inline (openai("gpt-4o"))
  • User wants org-scoped or user-scoped LLM credentials stored in DB with encryption
  • User wants model overrides configurable per-job, per-request, or per-feature without changing function signatures
  • User says "unify model config", "LLM connection management", "configurable AI provider"
  • User is refactoring away from OPENAI_MODEL=gpt-4o + OPENAI_TEMPERATURE=0.7 style config

When NOT to Use

  • Project has a single hardcoded model and no plans to make it configurable
  • User just wants to make a single AI call — answer directly
  • Refactoring is purely about prompt engineering, not provider/model configuration
  • User is using a non-Vercel AI SDK (e.g. raw OpenAI SDK, LangChain) — adapt patterns accordingly but note the difference

The llm:// URI Format

Spec: https://danlevy.net/llm-connection-strings/
Package: `pnpm add llm-strings`

Mirrors database URL patterns — one string replaces many scattered ENV vars.
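For instance, side by side (the database URL is only for comparison):

```
# Familiar database connection string:
postgres://myuser:secret@db.example.com:5432/mydb

# Same shape for LLM config:
llm://my-app:sk-proj-abc@api.openai.com/gpt-5-mini?temp=0.7&max_tokens=4096
```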

Full Grammar

llm://[appName:apiKey@]<host>/<model>[?<params>]
| Component | Examples | Notes |
|---|---|---|
| `appName:apiKey@` | `myapp:sk-proj-abc@` | Optional auth prefix — app name + API key separated by `:` |
| `host` | `api.openai.com`, `api.anthropic.com`, `generativelanguage.googleapis.com` | Identifies the provider |
| `model` | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Model ID |
| `?temp=` | `?temp=0.7` | Temperature (0.0–2.0) |
| `?max_tokens=` | `?max_tokens=4096` | Max output tokens |
| `?topP=` | `?topP=0.9` | Top-P sampling |
| `?seed=` | `?seed=42` | Reproducibility seed |
| `?cache=` | `?cache=true`, `?cache=5m` | Prompt caching (Anthropic/Bedrock) |
| `?effort=` | `?effort=high` | Reasoning effort (o1/o3/o4 models) |

Shorthand Aliases

All shorthands expand automatically during normalize():

| Shorthand | Canonical |
|---|---|
| `temp` | `temperature` |
| `max`, `max_out`, `maxTokens` | `max_tokens` |
| `topp`, `topP`, `nucleus` | `top_p` |
| `topk`, `topK` | `top_k` |
| `freq`, `freq_penalty` | `frequency_penalty` |
| `pres`, `pres_penalty` | `presence_penalty` |
| `stop_sequences`, `stopSequences` | `stop` |
| `reasoning`, `reasoning_effort` | `effort` |
| `cache_control`, `cacheControl` | `cache` |
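As a minimal sketch of what the alias pass inside `normalize()` does — using only the table above; the real function also applies provider-specific renames and special cases:

```typescript
// Alias map transcribed from the shorthand table above.
const ALIASES: Record<string, string> = {
  temp: "temperature",
  max: "max_tokens", max_out: "max_tokens", maxTokens: "max_tokens",
  topp: "top_p", topP: "top_p", nucleus: "top_p",
  topk: "top_k", topK: "top_k",
  freq: "frequency_penalty", freq_penalty: "frequency_penalty",
  pres: "presence_penalty", pres_penalty: "presence_penalty",
  stop_sequences: "stop", stopSequences: "stop",
  reasoning: "effort", reasoning_effort: "effort",
  cache_control: "cache", cacheControl: "cache",
};

// Rewrite each param key to its canonical name; unknown keys pass through.
function expandAliases(params: Record<string, string>): Record<string, string> {
  return Object.fromEntries(
    Object.entries(params).map(([k, v]) => [ALIASES[k] ?? k, v])
  );
}

// expandAliases({ temp: "0.7", max: "2000" })
// → { temperature: "0.7", max_tokens: "2000" }
```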

Supported Providers

| Provider | Host | Param style |
|---|---|---|
| OpenAI | `api.openai.com` | snake_case |
| Anthropic | `api.anthropic.com` | snake_case |
| Google | `generativelanguage.googleapis.com` | camelCase |
| Mistral | `api.mistral.ai` | snake_case |
| Cohere | `api.cohere.com` | snake_case |
| AWS Bedrock | `bedrock-runtime.{region}.amazonaws.com` | camelCase |
| OpenRouter | `openrouter.ai` | snake_case |
| Vercel AI | `gateway.ai.vercel.sh` | snake_case |

normalize() handles all provider-specific param name differences automatically.


llm-strings Package API

import { parse, build, normalize, validate, detectProvider, detectBedrockModelFamily } from "llm-strings";
import type { LlmConnectionConfig, NormalizeResult, ValidationIssue, Provider } from "llm-strings";

parse(connectionString): LlmConnectionConfig

Splits the URI into structured parts. Throws if scheme is not llm://.

const config = parse("llm://api.openai.com/gpt-5.2?temp=0.7&max=2000");
// {
//   raw:    "llm://api.openai.com/gpt-5.2?temp=0.7&max=2000",
//   host:   "api.openai.com",
//   model:  "gpt-5.2",
//   label:  undefined,
//   apiKey: undefined,
//   params: { temp: "0.7", max: "2000" }   // raw, un-normalized
// }

// With auth:
parse("llm://my-app:sk-proj-abc@api.openai.com/gpt-5.2?temp=0.7")
// → { label: "my-app", apiKey: "sk-proj-abc", host: "api.openai.com", ... }

LlmConnectionConfig type:

interface LlmConnectionConfig {
  raw:     string;             // original string
  host:    string;             // provider hostname
  model:   string;             // model ID
  label?:  string;             // optional app name
  apiKey?: string;             // optional API key (from userinfo position)
  params:  Record<string, string>; // query params, values always strings
}

normalize(config, options?): NormalizeResult

The key function. Expands aliases, maps to provider-specific param names, handles special cases. Call this before passing params to any API.

const { config: normalized, provider, changes } = normalize(parse(llmString));
// provider: "openai" | "anthropic" | "google" | "mistral" | "cohere" | "bedrock" | "openrouter" | "vercel" | undefined

// Provider normalization differences:
// OpenAI:    { temperature: "0.7", max_tokens: "2000",        top_p: "0.9" }
// Anthropic: { temperature: "0.7", max_tokens: "2000",        top_p: "0.9" }
// Google:    { temperature: "0.7", maxOutputTokens: "2000",   topP:  "0.9" }
// Bedrock:   { temperature: "0.7", maxTokens: "2000" }  (camelCase via Converse API)

Special normalizations:

  • OpenAI reasoning models (o1/o3/o4): max_tokens → max_completion_tokens, blocks unsupported sampling params
  • Anthropic cache=true → cache_control=ephemeral
  • Anthropic cache=5m → cache_control=ephemeral + cache_ttl=5m

Pass { verbose: true } to get a changes array showing each transformation:

const { changes } = normalize(config, { verbose: true });
// [{ from: "temp", to: "temperature", value: "0.7", reason: 'alias: "temp" → "temperature"' }, ...]

validate(connectionString, options?): ValidationIssue[]

Parses + normalizes + checks against provider specs. Returns [] if valid.

const issues = validate("llm://api.anthropic.com/claude-sonnet-4-5?temp=0.7&top_p=0.9");
// [{ param: "temperature", severity: "error",
//    message: 'Cannot specify both "temperature" and "top_p" for Anthropic models.' }]

const issues = validate("llm://api.openai.com/o3?temp=0.7");
// [{ severity: "error", message: '"temperature" is not supported by OpenAI reasoning model "o3".' }]

Checks: type correctness, value ranges, mutual exclusions, reasoning model restrictions, Bedrock model family constraints.

{ strict: true } promotes warnings (unknown provider, unknown params) to errors.

ValidationIssue type:

interface ValidationIssue {
  param:    string;
  value:    string;
  message:  string;
  severity: "error" | "warning";
}

build(config): string

Inverse of parse() — reconstructs a URI from a config object.

build({
  host: "api.openai.com", model: "gpt-5.2",
  label: "my-app", apiKey: "sk-proj-abc",
  params: { temperature: "0.7", max_tokens: "2000" },
});
// → "llm://my-app:sk-proj-abc@api.openai.com/gpt-5.2?temperature=0.7&max_tokens=2000"

detectProvider(host): Provider | undefined

detectProvider("api.openai.com")   // → "openai"
detectProvider("api.anthropic.com") // → "anthropic"
detectProvider("bedrock-runtime.us-east-1.amazonaws.com") // → "bedrock"

detectBedrockModelFamily(model): BedrockModelFamily | undefined

detectBedrockModelFamily("anthropic.claude-sonnet-4-5-20250929-v1:0") // → "anthropic"
detectBedrockModelFamily("us.anthropic.claude-sonnet-4-5-20250929-v1:0") // → "anthropic" (cross-region)
detectBedrockModelFamily("meta.llama3-8b-instruct-v1:0")               // → "meta"

Typical usage with Vercel AI SDK

import { parse, normalize } from "llm-strings";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";

function resolveFromLLMString(llmString: string) {
  if (!llmString.startsWith("llm://")) {
    // simple slug — fall back to your provider detection logic
    return { model: detectAndCreate(llmString), params: {} };
  }

  const raw = parse(llmString);
  const { config, provider } = normalize(raw);

  // A per-call API key needs a custom provider instance via create*();
  // the default openai()/anthropic()/google() instances read the ENV keys.
  const model = (() => {
    switch (provider) {
      case "openai":     return config.apiKey ? createOpenAI({ apiKey: config.apiKey })(config.model) : openai(config.model);
      case "anthropic":  return config.apiKey ? createAnthropic({ apiKey: config.apiKey })(config.model) : anthropic(config.model);
      case "google":     return config.apiKey ? createGoogleGenerativeAI({ apiKey: config.apiKey })(config.model) : google(config.model);
      case "openrouter": return openrouter(config.model); // from @openrouter/ai-sdk-provider
      default:           return openai(config.model); // fallback
    }
  })();

  // Convert normalized string params to numbers where applicable
  const params = Object.fromEntries(
    Object.entries(config.params).map(([k, v]) => [k, isNaN(+v) ? v : +v])
  );

  return { model, params };
}

// Usage:
const { model, params } = resolveFromLLMString(process.env.LLM_URL!);
const result = await generateText({
  model,
  temperature: params.temperature,
  maxTokens:   params.max_tokens ?? params.maxOutputTokens ?? params.maxTokens,
  topP:        params.top_p ?? params.topP,
  prompt,
});

Architecture: The Resolution Chain

Every LLM call resolves through this chain, from highest to lowest priority:

Job/Request options override
        ↓
DB LLM connection (org-scoped, encrypted credentials)
        ↓
LLM string / model slug (from job config, URL, or default)
        ↓
ENV API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
        ↓
AI SDK provider instance (openai(), anthropic(), google(), etc.)
        ↓
streamText / generateText / generateObject / streamObject
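The top three tiers of the chain reduce to one precedence expression (names here are illustrative; they mirror the fallback chain used in Pattern 2 below):

```typescript
// Pick the effective LLM string: request override → org connection →
// job-level default → ENV default → hardcoded fallback.
function pickLlmString(
  req: { override?: string },
  conn?: { llmString: string },
  jobDefault?: string,
): string {
  return (
    req.override ??
    conn?.llmString ??
    jobDefault ??
    process.env.LLM_STRING ??
    "gpt-5-mini"
  );
}
```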

Resolution in Code

// src/lib/llm/resolver.ts
import { parse, validate } from "llm-strings";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";

export function resolveModelInstance(llmString: string) {
  // Simple slug — use provider detection
  if (!llmString.startsWith("llm://")) {
    return { model: detectAndCreate(llmString), params: {} };
  }

  const config = parse(llmString);
  // config.model, config.params.temp, config.params.maxTokens, etc.

  // apiKey may come from the userinfo position or an injected ?apiKey= param
  const model = detectAndCreate(config.model, config.apiKey ?? config.params.apiKey);
  return { model, params: config.params };
}

function detectAndCreate(modelId: string, apiKey?: string) {
  // An explicit apiKey needs a custom provider instance via create*();
  // the default instances read the ENV keys.
  const oa = apiKey ? createOpenAI({ apiKey }) : openai;
  const an = apiKey ? createAnthropic({ apiKey }) : anthropic;
  const gg = apiKey ? createGoogleGenerativeAI({ apiKey }) : google;

  // Check explicit prefixes first — otherwise "openai/gpt-5-mini" would hit
  // the generic .includes("openai") check below without the prefix stripped.
  if (modelId.startsWith("openai/"))    return oa(modelId.slice(7));
  if (modelId.startsWith("anthropic/")) return an(modelId.slice(10));
  if (modelId.startsWith("google/"))    return gg(modelId.slice(7));
  if (modelId.startsWith("local/"))     return createOpenAI({ baseURL: "http://localhost:1234/v1", apiKey: "lm-studio" })(modelId.slice(6));
  if (modelId.startsWith("gpt") || modelId.includes("openai")) return oa(modelId);
  if (modelId.startsWith("claude"))     return an(modelId);
  if (modelId.startsWith("gemini"))     return gg(modelId);
  return oa(modelId); // fallback
}

Implementation Patterns

Pattern 1: Simple Unified Resolver (Start Here)

Replace scattered openai("gpt-4o"), anthropic("claude-3") calls with a single resolver:

// Before: scattered provider construction
const model = process.env.PROVIDER === "openai"
  ? openai(process.env.OPENAI_MODEL ?? "gpt-4o")
  : anthropic(process.env.ANTHROPIC_MODEL ?? "claude-3");

const result = await generateText({
  model,
  temperature: parseFloat(process.env.TEMPERATURE ?? "0.7"),
  maxTokens: parseInt(process.env.MAX_TOKENS ?? "4096"),
  prompt,
});

// After: single resolver + LLM string
const llmString = process.env.LLM_STRING ?? "gpt-5-mini"; // or "llm://api.openai.com/gpt-5-mini?temp=0.7"
const { model, params } = resolveModelInstance(llmString);

const result = await generateText({
  model,
  temperature: params?.temp,
  maxTokens: params?.maxTokens,
  prompt,
});

Pattern 2: Per-Job / Per-Request Override

Options stored in DB JSONB (or passed as request params) override defaults:

// In a job queue handler:
async function processJob(job: { primaryModel?: string; options: JobOptions }) {
  const llmString = job.options.textExtraction
    ?? job.primaryModel
    ?? process.env.LLM_STRING
    ?? "gpt-5-mini";

  const { model, params } = resolveModelInstance(llmString);

  const result = await streamText({
    model,
    temperature: params?.temp ?? 0.2,
    maxTokens: params?.maxTokens ?? 4096,
    messages: [...],
  });
}

Pattern 3: Org-Scoped Encrypted Connections (DB + Encryption)

For multi-tenant apps where each org has its own API key:

// DB schema (Drizzle example)
export const llmConnections = pgTable("llm_connections", {
  id:                   text("id").primaryKey(),
  organizationId:       text("organization_id").notNull(),
  name:                 text("name").notNull(),
  llmString:            text("llm_string").notNull(), // WITHOUT apiKey: "llm://api.openai.com/gpt-5-mini?temp=0.3"
  encryptedCredentials: text("encrypted_credentials"), // AES-256-GCM encrypted API key
  isActive:             boolean("is_active").default(true),
  lastUsedAt:           timestamp("last_used_at"),
});

// Resolution with DB connection
async function resolveJobModel(job: { primaryModel?: string; llmConnectionId?: string }): Promise<LanguageModel> {
  let llmString = job.primaryModel ?? "gpt-5-mini";

  if (job.llmConnectionId) {
    const [conn] = await db.select().from(llmConnections)
      .where(and(eq(llmConnections.id, job.llmConnectionId), eq(llmConnections.isActive, true)))
      .limit(1);

    if (conn) {
      llmString = conn.llmString;
      if (conn.encryptedCredentials) {
        const apiKey = decrypt(conn.encryptedCredentials); // AES-256-GCM decrypt
        // Inject apiKey into URI: "llm://...?apiKey=sk-..."
        llmString = llmString.includes("?")
          ? `${llmString}&apiKey=${encodeURIComponent(apiKey)}`
          : `${llmString}?apiKey=${encodeURIComponent(apiKey)}`;
      }
      // Fire-and-forget: update lastUsedAt
      db.update(llmConnections).set({ lastUsedAt: new Date() })
        .where(eq(llmConnections.id, conn.id)).catch(() => {});
    }
  }

  const { model } = resolveModelInstance(llmString);
  return model;
}

Pattern 4: Feature-Specific Model Overrides

Different pipeline stages can use different models via the same JSONB options:

// Job options: all optional, each feature can specify its own model
interface JobOptions {
  primaryModel?: string;       // Main extraction model
  summarization?: string | boolean; // "claude-sonnet-4-5" or true (use primary) or false (skip)
  graphRAG?: string | boolean;
  routingModelSimple?: string; // Fast model for simple pages
  routingModelComplex?: string; // Powerful model for complex pages
  secondPassModel?: string;    // Model for verification pass
}

// In the summarize stage:
function getSummarizationModel(job: { primaryModel?: string; options: JobOptions }): string {
  const opt = job.options.summarization;
  if (!opt) throw new Error("Summarization disabled"); // false or unset → skip
  if (typeof opt === "string") return opt; // Feature-specific override
  return job.primaryModel ?? "gpt-5-mini"; // Fall back to primary
}

Pattern 5: Multi-Model Comma List (Ensemble)

// Parse comma-separated model list for ensemble extraction
function parseModelList(primaryModel: string = "gpt-5-mini"): {
  extractionModels: string[];
  mergeModel: string;
} {
  const models = primaryModel.split(",").map(m => m.trim()).filter(Boolean);
  return {
    extractionModels: models.length ? models : ["gpt-5-mini"],
    mergeModel: models[0] ?? "gpt-5-mini",
  };
}

// Usage: "gpt-5-mini,claude-sonnet-4-5,gemini-2.5-flash"
// → runs all 3 in parallel, then merges with mergeModel (first one)
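The fan-out/merge step can be sketched as follows — `extractWith` and `mergeWith` are hypothetical stand-ins for your real `generateText()` calls:

```typescript
// Run extraction against every model in parallel, then merge the drafts
// with a single call to the merge model.
async function ensembleExtract(
  models: string[],
  mergeModel: string,
  extractWith: (model: string) => Promise<string>,
  mergeWith: (model: string, drafts: string[]) => Promise<string>,
): Promise<string> {
  const drafts = await Promise.all(models.map(extractWith)); // parallel fan-out
  return mergeWith(mergeModel, drafts);                      // single merge pass
}
```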

Pattern 6: AI SDK Call with Extracted Params

Always pass params from the parsed URI through to the AI SDK call:

async function callLLM(llmString: string, prompt: string) {
  const { model, params } = resolveModelInstance(llmString);

  // params come from ?temp=0.7&maxTokens=4000 in the URI
  return generateText({
    model,
    prompt,
    temperature: params?.temp !== undefined ? Number(params.temp) : undefined,
    maxTokens: params?.maxTokens !== undefined ? Number(params.maxTokens) : undefined,
    topP: params?.topP !== undefined ? Number(params.topP) : undefined,
    seed: params?.seed !== undefined ? Number(params.seed) : undefined,
  });
}

Pattern 7: Encryption Utilities

// src/lib/encryption.ts
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const KEY = Buffer.from(process.env.LLM_CREDENTIALS_KEY!, "hex"); // 32-byte key

export function encrypt(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv.toString("hex"), encrypted.toString("hex"), tag.toString("hex")].join(":");
}

export function decrypt(ciphertext: string): string {
  const [ivHex, encHex, tagHex] = ciphertext.split(":");
  const decipher = createDecipheriv("aes-256-gcm", KEY, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(tagHex, "hex"));
  // Concatenate Buffers before decoding — string-concatenating update() and
  // final() can corrupt multi-byte characters split across the boundary.
  return Buffer.concat([decipher.update(Buffer.from(encHex, "hex")), decipher.final()]).toString("utf8");
}

export function reassembleWithCredentials(llmString: string, apiKey: string): string {
  const sep = llmString.includes("?") ? "&" : "?";
  return `${llmString}${sep}apiKey=${encodeURIComponent(apiKey)}`;
}
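A quick round-trip check of this scheme — same `iv:ciphertext:tag` hex layout and AES-256-GCM parameters, but self-contained with a locally generated key instead of `LLM_CREDENTIALS_KEY`:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const KEY = randomBytes(32); // in production: Buffer.from(process.env.LLM_CREDENTIALS_KEY!, "hex")

function encrypt(plaintext: string): string {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const enc = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return [iv.toString("hex"), enc.toString("hex"), cipher.getAuthTag().toString("hex")].join(":");
}

function decrypt(ciphertext: string): string {
  const [ivHex, encHex, tagHex] = ciphertext.split(":");
  const decipher = createDecipheriv("aes-256-gcm", KEY, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(tagHex, "hex")); // throws if the blob was tampered with
  return Buffer.concat([decipher.update(Buffer.from(encHex, "hex")), decipher.final()]).toString("utf8");
}

// decrypt(encrypt("sk-proj-abc")) === "sk-proj-abc"
```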

Audit Checklist: Finding Scattered LLM Config

Before implementing, search the codebase for patterns to consolidate:

# Grep for common anti-patterns
grep -rn "process.env.OPENAI_MODEL\|process.env.ANTHROPIC_MODEL\|process.env.GOOGLE_MODEL" src/
grep -rn "openai(\"gpt\|anthropic(\"claude\|google(\"gemini" src/ --include="*.ts"
grep -rn "temperature:\|maxTokens:\|max_tokens:" src/ --include="*.ts"
grep -rn "generateText\|streamText\|generateObject\|streamObject" src/ --include="*.ts"
grep -rn "OPENAI_API_KEY\|ANTHROPIC_API_KEY\|GOOGLE_API_KEY" src/ --include="*.ts"

Red flags to consolidate:

  • Multiple process.env.*MODEL vars → one LLM_STRING or per-feature LLM string
  • Provider constructed inline at call site → use resolver
  • Temperature/maxTokens in ENV → embed in llm://...?temp=X&maxTokens=Y
  • if provider === "openai" ... else if provider === "anthropic" branches → resolver handles this

ENV Variable Migration

Before (scattered):

OPENAI_MODEL=gpt-4o
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=4096
ANTHROPIC_MODEL=claude-3-sonnet
SUMMARY_MODEL=claude-3-haiku

After (unified):

LLM_STRING=llm://api.openai.com/gpt-5-mini?temp=0.7&maxTokens=4096
LLM_SUMMARY_STRING=llm://api.anthropic.com/claude-haiku-4-5?temp=0.3
LLM_CREDENTIALS_KEY=<32-byte hex for encrypted org connections>

# Provider API keys remain as-is (used as fallback when no apiKey in URI)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_GENERATIVE_AI_API_KEY=...

DB Schema for LLM Config (Drizzle + PostgreSQL)

Two tables cover most use cases:

llm_configs — User-scoped saved presets

export const llmConfigs = pgTable("llm_configs", {
  id:          text("id").primaryKey().$defaultFn(ulid),
  userId:      text("user_id").notNull().references(() => users.id),
  name:        text("name").notNull(),          // "Fast & Cheap", "High Quality"
  llmString:   text("llm_string").notNull(),    // "llm://api.openai.com/gpt-5-mini?temp=0.3"
  description: text("description"),
  isDefault:   boolean("is_default").default(false),
  createdAt:   timestamp("created_at").defaultNow(),
});

llm_connections — Org-scoped with encrypted API keys

export const llmConnections = pgTable("llm_connections", {
  id:                   text("id").primaryKey().$defaultFn(ulid),
  organizationId:       text("organization_id").notNull().references(() => organizations.id),
  name:                 text("name").notNull(),
  llmString:            text("llm_string").notNull(),   // WITHOUT apiKey in URI
  encryptedCredentials: text("encrypted_credentials"),  // AES-256-GCM
  isActive:             boolean("is_active").default(true),
  isDefault:            boolean("is_default").default(false),
  lastUsedAt:           timestamp("last_used_at"),
  lastFailAt:           timestamp("last_fail_at"),
  lastFailError:        text("last_fail_error"),
  createdAt:            timestamp("created_at").defaultNow(),
  updatedAt:            timestamp("updated_at").defaultNow(),
});

Frontend: Model Selection UI

Inline Connection Creation Pattern

When a user uploads/creates a job, they should be able to:

  1. Pick from built-in model slugs (no API key needed if ENV key set)
  2. Pick from saved org connections (pre-configured API keys)
  3. Create a new connection inline (name + API key → encrypted → saved to org)
// ModelSelector component
type ModelOption =
  | { type: "builtin"; model: string; label: string }
  | { type: "connection"; connectionId: string; label: string; model: string };

function ModelSelector({ value, onChange, connections }: {
  value: string;
  onChange: (model: string, connectionId?: string) => void;
  connections: OrgConnection[];
}) {
  return (
    <select onChange={e => {
      const opt = JSON.parse(e.target.value);
      if (opt.type === "builtin") onChange(opt.model);
      else onChange(opt.model, opt.connectionId);
    }}>
      <optgroup label="Built-in Models">
        {BUILTIN_MODELS.map(m => (
          <option key={m.id} value={JSON.stringify({ type: "builtin", model: m.id })}>
            {m.label} ({m.provider})
          </option>
        ))}
      </optgroup>
      {connections.length > 0 && (
        <optgroup label="Org Connections">
          {connections.map(c => (
            <option key={c.id} value={JSON.stringify({ type: "connection", connectionId: c.id, model: c.model })}>
              {c.name} ({c.model})
            </option>
          ))}
        </optgroup>
      )}
    </select>
  );
}

Cost Tracking Pattern

After every AI SDK call, extract usage and calculate cost:

// Extract from AI SDK result (handles v4 and v5 field naming differences)
const usage = await result.usage;
const promptTokens    = (usage as any).inputTokens  ?? (usage as any).promptTokens    ?? 0;
const completionTokens = (usage as any).outputTokens ?? (usage as any).completionTokens ?? 0;
const cachedTokens    = (usage as any).cachedInputTokens ?? (usage as any).cacheReadInputTokens ?? 0;
const totalTokens     = usage.totalTokens ?? promptTokens + completionTokens;
const modelName       = (result as any).modelId ?? llmString;
const estimatedCost   = calculateCost(modelName, promptTokens, completionTokens, cachedTokens);
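A sketch of what `calculateCost()` might look like — the per-million-token prices below are placeholders, not real pricing; keep an accurate table in `src/lib/llm/pricing.ts`:

```typescript
interface ModelPricing { inputPerM: number; outputPerM: number; cachedInputPerM?: number }

// PLACEHOLDER prices per 1M tokens — substitute real provider pricing.
const PRICING: Record<string, ModelPricing> = {
  "gpt-5-mini": { inputPerM: 0.25, outputPerM: 1.0, cachedInputPerM: 0.025 },
};

function calculateCost(
  model: string,
  promptTokens: number,
  completionTokens: number,
  cachedTokens = 0,
): number {
  // Strip "openai/gpt-5-mini"-style prefixes before the table lookup
  // (see "Common Mistakes" below).
  const slug = model.includes("/") ? model.split("/").pop()! : model;
  const p = PRICING[slug];
  if (!p) return 0; // unknown model — report zero rather than guessing

  const uncached = promptTokens - cachedTokens;
  return (
    (uncached * p.inputPerM +
      cachedTokens * (p.cachedInputPerM ?? p.inputPerM) +
      completionTokens * p.outputPerM) / 1_000_000
  );
}
```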

Rate Limit & Retry Pattern

async function callWithRetry(llmString: string, fn: (model: LanguageModel) => Promise<any>, maxRetries = 2) {
  const { model } = resolveModelInstance(llmString);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn(model);
    } catch (err: any) {
      const is429 = err?.status === 429 || err?.message?.includes("rate limit");
      if (!is429 || attempt === maxRetries) throw err;
      // Parse "retry in Xs" from error message
      const retryMatch = err?.message?.match(/retry in (\d+)s/i);
      const delay = retryMatch ? parseInt(retryMatch[1]) * 1000 : 5000 * (attempt + 1);
      console.warn(`[LLM] Rate limited, retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}

Quick-Start: New Project Setup

pnpm add llm-strings @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google ai

Minimum files to create:

  1. src/lib/llm/resolver.ts — resolveModelInstance(llmString) → { model, params }
  2. src/lib/encryption.ts — encrypt() / decrypt() / reassembleWithCredentials() (if using DB connections)
  3. src/lib/llm/pricing.ts — cost calculation table keyed by model name
  4. src/db/schema.ts additions — llmConnections and/or llmConfigs tables

Implementation Steps

When asked to implement LLM strings in a project:

  1. Audit — Run the grep checklist above. List all scattered LLM config locations.
  2. Install — pnpm add llm-strings if not already present.
  3. Write resolver — src/lib/llm/resolver.ts with resolveModelInstance().
  4. Write encryption — src/lib/encryption.ts if org connections needed.
  5. Update DB schema — Add llmConnections/llmConfigs tables. Generate migration.
  6. Refactor call sites — Replace inline provider construction with resolver.
  7. Update ENV — Replace multi-var config with LLM_STRING (+ LLM_CREDENTIALS_KEY if using DB).
  8. Add API endpoints — CRUD for connections + test endpoint.
  9. Build frontend selector — Model/connection picker with inline create.
  10. Test end-to-end — Create connection, submit job, verify resolution chain logs.

After each major step, log the resolved model name and any options applied so the chain is observable.


Common Mistakes to Avoid

| Mistake | Fix |
|---|---|
| Storing apiKey in plain text in DB | Always encrypt with AES-256-GCM; only store encrypted blob |
| Storing apiKey in the llmString column | Store URI without apiKey; inject at resolution time |
| Ignoring params from parsed URI | Pass temp, maxTokens etc. through to the AI SDK call |
| Multiple provider env vars → multiple code branches | Resolver handles all providers; ENV keys are fallback only |
| Passing model name string to calculateCost() without normalizing | Strip provider prefix first: "openai/gpt-5-mini" → "gpt-5-mini" |
| Forgetting result.usage is a Promise in streaming calls | Always await result.usage after consuming the stream |
| Hardcoding temperature at call site when URI has ?temp= | Read params.temp from resolver result first |