| name | description |
|---|---|
| llm-provider | Implement, upgrade, or refactor JS/TS projects to use a configurable LLM provider + model via llm:// URI strings — unifying scattered ENV vars, hardcoded model slugs, DB options, and AI SDK calls into a single composable resolution chain. Use when setting up LLM connections, migrating from env-only config, adding org-scoped encrypted credentials, or replacing multi-variable AI config with LLM strings. |

Implement or refactor JS/TS AI projects to use LLM strings — composable llm:// URIs that unify provider, model, credentials, and inference options (temperature, maxTokens, etc.) from any source (ENV, database, URL slug, job config, org connection) into a single resolution chain that produces an AI SDK model instance.
## When to use

- User wants to "implement LLM strings" or "add llm:// URI support"
- Project has ≥3 separate ENV vars for a single LLM call (e.g. `MODEL`, `API_KEY`, `TEMPERATURE`)
- Code mixes `generateText`/`streamText` calls with inline provider construction (`openai("gpt-4o")`)
- User wants org-scoped or user-scoped LLM credentials stored in DB with encryption
- User wants model overrides configurable per-job, per-request, or per-feature without changing function signatures
- User says "unify model config", "LLM connection management", "configurable AI provider"
- User is refactoring away from `OPENAI_MODEL=gpt-4o` + `OPENAI_TEMPERATURE=0.7` style config
## When NOT to use

- Project has a single hardcoded model and no plans to make it configurable
- User just wants to make a single AI call — answer directly
- Refactoring is purely about prompt engineering, not provider/model configuration
- User is using a non-Vercel AI SDK (e.g. raw OpenAI SDK, LangChain) — adapt patterns accordingly but note the difference
## References

- Spec: https://danlevy.net/llm-connection-strings/
- Package: `pnpm add llm-strings`

Mirrors database URL patterns — one string replaces many scattered ENV vars.

## URI format

`llm://[appName:apiKey@]<host>/<model>[?<params>]`
| Component | Examples | Notes |
|---|---|---|
| `appName:apiKey@` | `myapp:sk-proj-abc@` | Optional auth prefix — app name + API key separated by `:` |
| `host` | `api.openai.com`, `api.anthropic.com`, `generativelanguage.googleapis.com` | Identifies the provider |
| `model` | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Model ID |
| `?temp=` | `?temp=0.7` | Temperature (0.0–2.0) |
| `?max_tokens=` | `?max_tokens=4096` | Max output tokens |
| `?topP=` | `?topP=0.9` | Top-P sampling |
| `?seed=` | `?seed=42` | Reproducibility seed |
| `?cache=` | `?cache=true`, `?cache=5m` | Prompt caching (Anthropic/Bedrock) |
| `?effort=` | `?effort=high` | Reasoning effort (o1/o3/o4 models) |
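Because the grammar is URL-shaped, the standard WHATWG `URL` class can already split it into the components above. This is a hand-rolled sketch for illustration only — the real `parse()` from `llm-strings` also validates and preserves the raw string:

```ts
// Minimal illustration: llm:// URIs are URL-shaped, so the WHATWG URL class
// splits them into the components listed above. Sketch only — use parse()
// from llm-strings in real code.
function sketchParse(uri: string) {
  const u = new URL(uri);
  if (u.protocol !== "llm:") throw new Error(`expected llm:// scheme, got ${u.protocol}`);
  return {
    label: u.username || undefined,       // appName from the auth prefix
    apiKey: u.password || undefined,
    host: u.hostname,
    model: u.pathname.replace(/^\//, ""), // strip leading slash
    params: Object.fromEntries(u.searchParams),
  };
}

const parts = sketchParse("llm://myapp:sk-proj-abc@api.openai.com/gpt-5-mini?temp=0.7");
// parts.host === "api.openai.com", parts.model === "gpt-5-mini", parts.params.temp === "0.7"
```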
## Param aliases

All shorthands expand automatically during `normalize()`:

| Shorthand | Canonical |
|---|---|
| `temp` | `temperature` |
| `max`, `max_out`, `maxTokens` | `max_tokens` |
| `topp`, `topP`, `nucleus` | `top_p` |
| `topk`, `topK` | `top_k` |
| `freq`, `freq_penalty` | `frequency_penalty` |
| `pres`, `pres_penalty` | `presence_penalty` |
| `stop_sequences`, `stopSequences` | `stop` |
| `reasoning`, `reasoning_effort` | `effort` |
| `cache_control`, `cacheControl` | `cache` |
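The table reduces to a flat lookup. A sketch of how such expansion could work — illustrative only, since `normalize()` also applies provider-specific renames and special cases:

```ts
// Illustrative alias map mirroring the table above; treat as a sketch,
// not the llm-strings internals.
const ALIASES: Record<string, string> = {
  temp: "temperature",
  max: "max_tokens", max_out: "max_tokens", maxTokens: "max_tokens",
  topp: "top_p", topP: "top_p", nucleus: "top_p",
  topk: "top_k", topK: "top_k",
  freq: "frequency_penalty", freq_penalty: "frequency_penalty",
  pres: "presence_penalty", pres_penalty: "presence_penalty",
  stop_sequences: "stop", stopSequences: "stop",
  reasoning: "effort", reasoning_effort: "effort",
  cache_control: "cache", cacheControl: "cache",
};

// Rewrite each key through the alias map; unknown keys pass through untouched.
function expandAliases(params: Record<string, string>): Record<string, string> {
  return Object.fromEntries(
    Object.entries(params).map(([k, v]) => [ALIASES[k] ?? k, v])
  );
}

const expanded = expandAliases({ temp: "0.7", max: "2000", seed: "42" });
// → { temperature: "0.7", max_tokens: "2000", seed: "42" }
```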
## Supported providers

| Provider | Host | Param style |
|---|---|---|
| OpenAI | `api.openai.com` | snake_case |
| Anthropic | `api.anthropic.com` | snake_case |
| Google | `generativelanguage.googleapis.com` | camelCase |
| Mistral | `api.mistral.ai` | snake_case |
| Cohere | `api.cohere.com` | snake_case |
| AWS Bedrock | `bedrock-runtime.{region}.amazonaws.com` | camelCase |
| OpenRouter | `openrouter.ai` | snake_case |
| Vercel AI | `gateway.ai.vercel.sh` | snake_case |

`normalize()` handles all provider-specific param name differences automatically.
## API

```ts
import { parse, build, normalize, validate, detectProvider, detectBedrockModelFamily } from "llm-strings";
import type { LlmConnectionConfig, NormalizeResult, ValidationIssue, Provider } from "llm-strings";
```

### parse()

Splits the URI into structured parts. Throws if scheme is not `llm://`.
```ts
const config = parse("llm://api.openai.com/gpt-5.2?temp=0.7&max=2000");
// {
//   raw: "llm://api.openai.com/gpt-5.2?temp=0.7&max=2000",
//   host: "api.openai.com",
//   model: "gpt-5.2",
//   label: undefined,
//   apiKey: undefined,
//   params: { temp: "0.7", max: "2000" } // raw, un-normalized
// }

// With auth:
parse("llm://my-app:sk-proj-abc@api.openai.com/gpt-5.2?temp=0.7")
// → { label: "my-app", apiKey: "sk-proj-abc", host: "api.openai.com", ... }
```

`LlmConnectionConfig` type:
```ts
interface LlmConnectionConfig {
  raw: string;                    // original string
  host: string;                   // provider hostname
  model: string;                  // model ID
  label?: string;                 // optional app name
  apiKey?: string;                // optional API key (from userinfo position)
  params: Record<string, string>; // query params, values always strings
}
```

### normalize()

The key function. Expands aliases, maps to provider-specific param names, and handles special cases. Call this before passing params to any API.
```ts
const { config: normalized, provider, changes } = normalize(parse(llmString));
// provider: "openai" | "anthropic" | "google" | "mistral" | "cohere" | "bedrock" | "openrouter" | "vercel" | undefined

// Provider normalization differences:
// OpenAI:    { temperature: "0.7", max_tokens: "2000", top_p: "0.9" }
// Anthropic: { temperature: "0.7", max_tokens: "2000", top_p: "0.9" }
// Google:    { temperature: "0.7", maxOutputTokens: "2000", topP: "0.9" }
// Bedrock:   { temperature: "0.7", maxTokens: "2000" } (camelCase via Converse API)
```

Special normalizations:
- OpenAI reasoning models (`o1`/`o3`/`o4`): `max_tokens` → `max_completion_tokens`, blocks unsupported sampling params
- Anthropic: `cache=true` → `cache_control=ephemeral`
- Anthropic: `cache=5m` → `cache_control=ephemeral` + `cache_ttl=5m`
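The Anthropic cache rule can be pictured as a small rewrite step. This is a sketch of the behavior described above, not the library's internals:

```ts
// Sketch of the Anthropic cache expansion described above (illustrative only).
function expandCache(params: Record<string, string>): Record<string, string> {
  const { cache, ...rest } = params;
  if (cache === undefined) return params;
  if (cache === "true") return { ...rest, cache_control: "ephemeral" };
  // Duration form, e.g. "5m": ephemeral cache with an explicit TTL
  return { ...rest, cache_control: "ephemeral", cache_ttl: cache };
}

expandCache({ cache: "true" }); // → { cache_control: "ephemeral" }
expandCache({ cache: "5m" });   // → { cache_control: "ephemeral", cache_ttl: "5m" }
```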
Pass `{ verbose: true }` to get a `changes` array showing each transformation:

```ts
const { changes } = normalize(config, { verbose: true });
// [{ from: "temp", to: "temperature", value: "0.7", reason: 'alias: "temp" → "temperature"' }, ...]
```

### validate()

Parses + normalizes + checks against provider specs. Returns `[]` if valid.
```ts
const issues = validate("llm://api.anthropic.com/claude-sonnet-4-5?temp=0.7&top_p=0.9");
// [{ param: "temperature", severity: "error",
//    message: 'Cannot specify both "temperature" and "top_p" for Anthropic models.' }]

validate("llm://api.openai.com/o3?temp=0.7");
// [{ severity: "error", message: '"temperature" is not supported by OpenAI reasoning model "o3".' }]
```

Checks: type correctness, value ranges, mutual exclusions, reasoning model restrictions, Bedrock model family constraints.

`{ strict: true }` promotes warnings (unknown provider, unknown params) to errors.

`ValidationIssue` type:
```ts
interface ValidationIssue {
  param: string;
  value: string;
  message: string;
  severity: "error" | "warning";
}
```

### build()

Inverse of `parse()` — reconstructs a URI from a config object.
```ts
build({
  host: "api.openai.com", model: "gpt-5.2",
  label: "my-app", apiKey: "sk-proj-abc",
  params: { temperature: "0.7", max_tokens: "2000" },
});
// → "llm://my-app:sk-proj-abc@api.openai.com/gpt-5.2?temperature=0.7&max_tokens=2000"
```

### detectProvider() / detectBedrockModelFamily()

```ts
detectProvider("api.openai.com")                          // → "openai"
detectProvider("api.anthropic.com")                       // → "anthropic"
detectProvider("bedrock-runtime.us-east-1.amazonaws.com") // → "bedrock"

detectBedrockModelFamily("anthropic.claude-sonnet-4-5-20250929-v1:0")    // → "anthropic"
detectBedrockModelFamily("us.anthropic.claude-sonnet-4-5-20250929-v1:0") // → "anthropic" (cross-region)
detectBedrockModelFamily("meta.llama3-8b-instruct-v1:0")                 // → "meta"
```

## AI SDK integration

```ts
import { parse, normalize } from "llm-strings";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";

function resolveFromLLMString(llmString: string) {
  if (!llmString.startsWith("llm://")) {
    // simple slug — fall back to your provider detection logic
    return { model: detectAndCreate(llmString), params: {} };
  }
  const raw = parse(llmString);
  const { config, provider } = normalize(raw);
  // API keys go on the provider factory (createX), not the model call
  const model = (() => {
    switch (provider) {
      case "openai":
        return config.apiKey ? createOpenAI({ apiKey: config.apiKey })(config.model) : openai(config.model);
      case "anthropic":
        return config.apiKey ? createAnthropic({ apiKey: config.apiKey })(config.model) : anthropic(config.model);
      case "google":
        return config.apiKey ? createGoogleGenerativeAI({ apiKey: config.apiKey })(config.model) : google(config.model);
      case "openrouter":
        return openrouter(config.model); // requires @openrouter/ai-sdk-provider
      default:
        return openai(config.model); // fallback
    }
  })();
  // Convert normalized string params to numbers where applicable
  const params = Object.fromEntries(
    Object.entries(config.params).map(([k, v]) => [k, isNaN(+v) ? v : +v])
  );
  return { model, params };
}

// Usage:
const { model, params } = resolveFromLLMString(process.env.LLM_URL!);
const result = await generateText({
  model,
  temperature: params.temperature,
  maxTokens: params.max_tokens ?? params.maxOutputTokens,
  topP: params.top_p ?? params.topP,
  prompt,
});
```

## Resolution chain

Every LLM call resolves through this chain, from highest to lowest priority:
```
Job/Request options override
        ↓
DB LLM connection (org-scoped, encrypted credentials)
        ↓
LLM string / model slug (from job config, URL, or default)
        ↓
ENV API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
        ↓
AI SDK provider instance (openai(), anthropic(), google(), etc.)
        ↓
streamText / generateText / generateObject / streamObject
```
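In code, the chain is nullish coalescing from highest- to lowest-priority source. A sketch with hypothetical field names (each field stands in for wherever your app stores that layer):

```ts
// Sketch: the highest-priority defined source wins. All names here are
// hypothetical placeholders, not a prescribed API.
function pickLlmString(sources: {
  requestOverride?: string;    // job/request options override
  dbConnectionString?: string; // org-scoped DB connection
  envString?: string;          // e.g. process.env.LLM_STRING
}): string {
  return sources.requestOverride
    ?? sources.dbConnectionString
    ?? sources.envString
    ?? "gpt-5-mini"; // final fallback slug
}

pickLlmString({ envString: "llm://api.openai.com/gpt-5-mini" });
// → "llm://api.openai.com/gpt-5-mini"
pickLlmString({ requestOverride: "claude-sonnet-4-5", envString: "gpt-5-mini" });
// → "claude-sonnet-4-5"
```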
## Core resolver

```ts
// src/lib/llm/resolver.ts
import { parse } from "llm-strings";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";

export function resolveModelInstance(llmString: string) {
  // Simple slug — use provider detection
  if (!llmString.startsWith("llm://")) {
    return { model: detectAndCreate(llmString), params: {} as Record<string, string> };
  }
  const config = parse(llmString);
  // apiKey may come from the userinfo position or an injected ?apiKey= param
  const model = detectAndCreate(config.model, config.apiKey ?? config.params.apiKey);
  // Options from URI params are passed as AI SDK settings at the call site
  return { model, params: config.params };
}

function detectAndCreate(modelId: string, apiKey?: string) {
  // Check explicit provider prefixes first, so "openai/gpt-5-mini" is stripped
  // before the "gpt" heuristic can match the un-stripped ID
  if (modelId.startsWith("openai/")) return mkOpenai(modelId.slice(7), apiKey);
  if (modelId.startsWith("anthropic/")) return mkAnthropic(modelId.slice(10), apiKey);
  if (modelId.startsWith("google/")) return mkGoogle(modelId.slice(7), apiKey);
  if (modelId.startsWith("local/"))
    return createOpenAI({ baseURL: "http://localhost:1234/v1", apiKey: "lm-studio" })(modelId.slice(6));
  if (modelId.startsWith("gpt")) return mkOpenai(modelId, apiKey);
  if (modelId.startsWith("claude")) return mkAnthropic(modelId, apiKey);
  if (modelId.startsWith("gemini")) return mkGoogle(modelId, apiKey);
  return mkOpenai(modelId, apiKey); // fallback
}

// API keys go on the provider factory, not the model call
const mkOpenai = (id: string, key?: string) => (key ? createOpenAI({ apiKey: key })(id) : openai(id));
const mkAnthropic = (id: string, key?: string) => (key ? createAnthropic({ apiKey: key })(id) : anthropic(id));
const mkGoogle = (id: string, key?: string) => (key ? createGoogleGenerativeAI({ apiKey: key })(id) : google(id));
```

## Refactoring call sites

Replace scattered `openai("gpt-4o")` / `anthropic("claude-3")` calls with a single resolver:
```ts
// Before: scattered provider construction
const model = process.env.PROVIDER === "openai"
  ? openai(process.env.OPENAI_MODEL ?? "gpt-4o")
  : anthropic(process.env.ANTHROPIC_MODEL ?? "claude-3");
const result = await generateText({
  model,
  temperature: parseFloat(process.env.TEMPERATURE ?? "0.7"),
  maxTokens: parseInt(process.env.MAX_TOKENS ?? "4096"),
  prompt,
});
```

```ts
// After: single resolver + LLM string
const llmString = process.env.LLM_STRING ?? "gpt-5-mini"; // or "llm://api.openai.com/gpt-5-mini?temp=0.7"
const { model, params } = resolveModelInstance(llmString);
const result = await generateText({
  model,
  temperature: params?.temp ? Number(params.temp) : undefined,
  maxTokens: params?.maxTokens ? Number(params.maxTokens) : undefined,
  prompt,
});
```

## Per-job / per-request overrides

Options stored in DB JSONB (or passed as request params) override defaults:
```ts
// In a job queue handler:
async function processJob(job: { primaryModel?: string; options: JobOptions }) {
  const llmString = job.options.textExtraction
    ?? job.primaryModel
    ?? process.env.LLM_STRING
    ?? "gpt-5-mini";
  const { model, params } = resolveModelInstance(llmString);
  const result = await streamText({
    model,
    temperature: params?.temp ? Number(params.temp) : 0.2,
    maxTokens: params?.maxTokens ? Number(params.maxTokens) : 4096,
    messages: [...],
  });
}
```

## Org-scoped connections

For multi-tenant apps where each org has its own API key:
```ts
// DB schema (Drizzle example)
export const llmConnections = pgTable("llm_connections", {
  id: text("id").primaryKey(),
  organizationId: text("organization_id").notNull(),
  name: text("name").notNull(),
  llmString: text("llm_string").notNull(), // WITHOUT apiKey: "llm://api.openai.com/gpt-5-mini?temp=0.3"
  encryptedCredentials: text("encrypted_credentials"), // AES-256-GCM encrypted API key
  isActive: boolean("is_active").default(true),
  lastUsedAt: timestamp("last_used_at"),
});

// Resolution with DB connection
async function resolveJobModel(job: { primaryModel?: string; llmConnectionId?: string }): Promise<LanguageModel> {
  let llmString = job.primaryModel ?? "gpt-5-mini";
  if (job.llmConnectionId) {
    const [conn] = await db.select().from(llmConnections)
      .where(and(eq(llmConnections.id, job.llmConnectionId), eq(llmConnections.isActive, true)))
      .limit(1);
    if (conn) {
      llmString = conn.llmString;
      if (conn.encryptedCredentials) {
        const apiKey = decrypt(conn.encryptedCredentials); // AES-256-GCM decrypt
        // Inject apiKey into URI: "llm://...?apiKey=sk-..."
        llmString = llmString.includes("?")
          ? `${llmString}&apiKey=${encodeURIComponent(apiKey)}`
          : `${llmString}?apiKey=${encodeURIComponent(apiKey)}`;
      }
      // Fire-and-forget: update lastUsedAt
      db.update(llmConnections).set({ lastUsedAt: new Date() })
        .where(eq(llmConnections.id, conn.id)).catch(() => {});
    }
  }
  const { model } = resolveModelInstance(llmString);
  return model;
}
```

## Per-feature model routing

Different pipeline stages can use different models via the same JSONB options:
```ts
// Job options: all optional, each feature can specify its own model
interface JobOptions {
  primaryModel?: string;            // Main extraction model
  summarization?: string | boolean; // "claude-sonnet-4-5" or true (use primary) or false (skip)
  graphRAG?: string | boolean;
  routingModelSimple?: string;      // Fast model for simple pages
  routingModelComplex?: string;     // Powerful model for complex pages
  secondPassModel?: string;         // Model for verification pass
}

// In the summarize stage:
function getSummarizationModel(job: { primaryModel?: string; options: JobOptions }): string {
  const opt = job.options.summarization;
  if (!opt) throw new Error("Summarization disabled"); // false or unset → skip
  if (typeof opt === "string") return opt;             // Feature-specific override
  return job.primaryModel ?? "gpt-5-mini";             // true → fall back to primary
}
```

## Ensemble model lists

```ts
// Parse comma-separated model list for ensemble extraction
function parseModelList(primaryModel: string = "gpt-5-mini"): {
  extractionModels: string[];
  mergeModel: string;
} {
  const models = primaryModel.split(",").map(m => m.trim()).filter(Boolean);
  return {
    extractionModels: models.length ? models : ["gpt-5-mini"],
    mergeModel: models[0] ?? "gpt-5-mini",
  };
}

// Usage: "gpt-5-mini,claude-sonnet-4-5,gemini-2.5-flash"
// → runs all 3 in parallel, then merges with mergeModel (first one)
```
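Fanning the parsed list out is then a `Promise.all` over the extraction models. A sketch with a stand-in runner — `runWithModel` is a hypothetical placeholder for your `generateText` call:

```ts
// Sketch: run each extraction model in parallel, then hand the results to the
// merge model. runWithModel is a hypothetical stand-in for a generateText call.
async function ensembleExtract(
  extractionModels: string[],
  runWithModel: (model: string) => Promise<string>,
): Promise<string[]> {
  return Promise.all(extractionModels.map(m => runWithModel(m)));
}
```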
## Passing URI params through

Always pass params from the parsed URI through to the AI SDK call:

```ts
async function callLLM(llmString: string, prompt: string) {
  const { model, params } = resolveModelInstance(llmString);
  // params come from ?temp=0.7&maxTokens=4000 in the URI
  return generateText({
    model,
    prompt,
    temperature: params?.temp !== undefined ? Number(params.temp) : undefined,
    maxTokens: params?.maxTokens !== undefined ? Number(params.maxTokens) : undefined,
    topP: params?.topP !== undefined ? Number(params.topP) : undefined,
    seed: params?.seed !== undefined ? Number(params.seed) : undefined,
  });
}
```
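The `Number(...)` conversions repeat at every call site; a small helper can centralize them. The helper name and the set of handled keys are assumptions for illustration, not part of `llm-strings`:

```ts
// Sketch: coerce string params into numeric AI SDK settings.
// Helper name and handled keys are assumptions, not a prescribed API.
function toNum(v?: string): number | undefined {
  const n = v !== undefined && v !== "" ? Number(v) : NaN;
  return Number.isNaN(n) ? undefined : n;
}

function toAiSettings(params: Record<string, string>) {
  return {
    temperature: toNum(params.temperature ?? params.temp),
    maxTokens: toNum(params.max_tokens ?? params.maxTokens),
    topP: toNum(params.top_p ?? params.topP),
    seed: toNum(params.seed),
  };
}

const settings = toAiSettings({ temp: "0.7", maxTokens: "4096" });
// settings.temperature === 0.7, settings.maxTokens === 4096
```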
## Credential encryption

```ts
// src/lib/encryption.ts
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const KEY = Buffer.from(process.env.LLM_CREDENTIALS_KEY!, "hex"); // 32-byte key

export function encrypt(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv.toString("hex"), encrypted.toString("hex"), tag.toString("hex")].join(":");
}

export function decrypt(ciphertext: string): string {
  const [ivHex, encHex, tagHex] = ciphertext.split(":");
  const decipher = createDecipheriv("aes-256-gcm", KEY, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(tagHex, "hex"));
  return Buffer.concat([decipher.update(Buffer.from(encHex, "hex")), decipher.final()]).toString("utf8");
}

export function reassembleWithCredentials(llmString: string, apiKey: string): string {
  const sep = llmString.includes("?") ? "&" : "?";
  return `${llmString}${sep}apiKey=${encodeURIComponent(apiKey)}`;
}
```
## Codebase audit

Before implementing, search the codebase for patterns to consolidate:

```sh
# Grep for common anti-patterns
grep -rn "process.env.OPENAI_MODEL\|process.env.ANTHROPIC_MODEL\|process.env.GOOGLE_MODEL" src/
grep -rn "openai(\"gpt\|anthropic(\"claude\|google(\"gemini" src/ --include="*.ts"
grep -rn "temperature:\|maxTokens:\|max_tokens:" src/ --include="*.ts"
grep -rn "generateText\|streamText\|generateObject\|streamObject" src/ --include="*.ts"
grep -rn "OPENAI_API_KEY\|ANTHROPIC_API_KEY\|GOOGLE_API_KEY" src/ --include="*.ts"
```

Red flags to consolidate:

- Multiple `process.env.*MODEL` vars → one `LLM_STRING` or per-feature LLM string
- Provider constructed inline at call site → use resolver
- Temperature/maxTokens in ENV → embed in `llm://...?temp=X&maxTokens=Y`
- `if (provider === "openai") ... else if (provider === "anthropic")` branches → resolver handles this
## ENV migration

Before:

```sh
OPENAI_MODEL=gpt-4o
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=4096
ANTHROPIC_MODEL=claude-3-sonnet
SUMMARY_MODEL=claude-3-haiku
```

After:

```sh
LLM_STRING=llm://api.openai.com/gpt-5-mini?temp=0.7&maxTokens=4096
LLM_SUMMARY_STRING=llm://api.anthropic.com/claude-haiku-4-5?temp=0.3
LLM_CREDENTIALS_KEY=<32-byte hex for encrypted org connections>
# Provider API keys remain as-is (used as fallback when no apiKey in URI)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_GENERATIVE_AI_API_KEY=...
```

## DB schema

Two tables cover most use cases:
```ts
// Per-user saved configs (no credentials)
export const llmConfigs = pgTable("llm_configs", {
  id: text("id").primaryKey().$defaultFn(ulid),
  userId: text("user_id").notNull().references(() => users.id),
  name: text("name").notNull(),            // "Fast & Cheap", "High Quality"
  llmString: text("llm_string").notNull(), // "llm://api.openai.com/gpt-5-mini?temp=0.3"
  description: text("description"),
  isDefault: boolean("is_default").default(false),
  createdAt: timestamp("created_at").defaultNow(),
});

// Org-scoped connections with encrypted credentials
export const llmConnections = pgTable("llm_connections", {
  id: text("id").primaryKey().$defaultFn(ulid),
  organizationId: text("organization_id").notNull().references(() => organizations.id),
  name: text("name").notNull(),
  llmString: text("llm_string").notNull(), // WITHOUT apiKey in URI
  encryptedCredentials: text("encrypted_credentials"), // AES-256-GCM
  isActive: boolean("is_active").default(true),
  isDefault: boolean("is_default").default(false),
  lastUsedAt: timestamp("last_used_at"),
  lastFailAt: timestamp("last_fail_at"),
  lastFailError: text("last_fail_error"),
  createdAt: timestamp("created_at").defaultNow(),
  updatedAt: timestamp("updated_at").defaultNow(),
});
```

## Frontend selection

When a user uploads/creates a job, they should be able to:
- Pick from built-in model slugs (no API key needed if ENV key set)
- Pick from saved org connections (pre-configured API keys)
- Create a new connection inline (name + API key → encrypted → saved to org)

```tsx
// ModelSelector component
type ModelOption =
  | { type: "builtin"; model: string; label: string }
  | { type: "connection"; connectionId: string; label: string; model: string };

function ModelSelector({ value, onChange, connections }: {
  value: string;
  onChange: (model: string, connectionId?: string) => void;
  connections: OrgConnection[];
}) {
  return (
    <select onChange={e => {
      const opt = JSON.parse(e.target.value);
      if (opt.type === "builtin") onChange(opt.model);
      else onChange(opt.model, opt.connectionId);
    }}>
      <optgroup label="Built-in Models">
        {BUILTIN_MODELS.map(m => (
          <option key={m.id} value={JSON.stringify({ type: "builtin", model: m.id })}>
            {m.label} ({m.provider})
          </option>
        ))}
      </optgroup>
      {connections.length > 0 && (
        <optgroup label="Org Connections">
          {connections.map(c => (
            <option key={c.id} value={JSON.stringify({ type: "connection", connectionId: c.id, model: c.model })}>
              {c.name} ({c.model})
            </option>
          ))}
        </optgroup>
      )}
    </select>
  );
}
```

## Usage tracking

After every AI SDK call, extract usage and calculate cost:
```ts
// Extract from AI SDK result (handles v4 and v5 field naming differences)
const usage = await result.usage;
const promptTokens     = (usage as any).inputTokens ?? (usage as any).promptTokens ?? 0;
const completionTokens = (usage as any).outputTokens ?? (usage as any).completionTokens ?? 0;
const cachedTokens     = (usage as any).cachedInputTokens ?? (usage as any).cacheReadInputTokens ?? 0;
const totalTokens      = usage.totalTokens ?? promptTokens + completionTokens;
const modelName = (result as any).modelId ?? llmString;
const estimatedCost = calculateCost(modelName, promptTokens, completionTokens, cachedTokens);
```

## Rate-limit retry

```ts
async function callWithRetry(llmString: string, fn: (model: LanguageModel) => Promise<any>, maxRetries = 2) {
  const { model } = resolveModelInstance(llmString);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn(model);
    } catch (err: any) {
      const is429 = err?.status === 429 || err?.message?.includes("rate limit");
      if (!is429 || attempt === maxRetries) throw err;
      // Parse "retry in Xs" from error message
      const retryMatch = err?.message?.match(/retry in (\d+)s/i);
      const delay = retryMatch ? parseInt(retryMatch[1]) * 1000 : 5000 * (attempt + 1);
      console.warn(`[LLM] Rate limited, retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```

## Install

```sh
pnpm add llm-strings @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google ai
```

Minimum files to create:
- `src/lib/llm/resolver.ts` — `resolveModelInstance(llmString)` → `{ model, params }`
- `src/lib/encryption.ts` — `encrypt()` / `decrypt()` / `reassembleWithCredentials()` (if using DB connections)
- `src/lib/llm/pricing.ts` — cost calculation table keyed by model name
- `src/db/schema.ts` additions — `llmConnections` and/or `llmConfigs` tables
## Implementation workflow

When asked to implement LLM strings in a project:

- **Audit** — Run the grep checklist above. List all scattered LLM config locations.
- **Install** — `pnpm add llm-strings` if not already present.
- **Write resolver** — `src/lib/llm/resolver.ts` with `resolveModelInstance()`.
- **Write encryption** — `src/lib/encryption.ts` if org connections needed.
- **Update DB schema** — Add `llmConnections`/`llmConfigs` tables. Generate migration.
- **Refactor call sites** — Replace inline provider construction with resolver.
- **Update ENV** — Replace multi-var config with `LLM_STRING` (+ `LLM_CREDENTIALS_KEY` if using DB).
- **Add API endpoints** — CRUD for connections + test endpoint.
- **Build frontend selector** — Model/connection picker with inline create.
- **Test end-to-end** — Create connection, submit job, verify resolution chain logs.

After each major step, log the resolved model name and any options applied so the chain is observable.
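A minimal way to make each step observable — the function name and log format here are illustrative, not a prescribed API:

```ts
// Sketch: one log line per resolution step so the chain is auditable.
// Name and format are illustrative only.
function formatResolutionLog(step: string, model: string, params: Record<string, string>): string {
  return `[llm] ${step}: model=${model} params=${JSON.stringify(params)}`;
}

const line = formatResolutionLog("summarize", "claude-haiku-4-5", { temperature: "0.3" });
// → '[llm] summarize: model=claude-haiku-4-5 params={"temperature":"0.3"}'
```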
## Common mistakes

| Mistake | Fix |
|---|---|
| Storing `apiKey` in plain text in DB | Always encrypt with AES-256-GCM; only store the encrypted blob |
| Storing `apiKey` in the `llmString` column | Store URI without `apiKey`; inject at resolution time |
| Ignoring params from the parsed URI | Pass `temp`, `maxTokens`, etc. through to the AI SDK call |
| Multiple provider env vars → multiple code branches | Resolver handles all providers; ENV keys are fallback only |
| Passing model name string to `calculateCost()` without normalizing | Strip provider prefix first: `"openai/gpt-5-mini"` → `"gpt-5-mini"` |
| Forgetting `result.usage` is a Promise in streaming calls | Always `await result.usage` after consuming the stream |
| Hardcoding temperature at call site when URI has `?temp=` | Read `params.temp` from the resolver result first |