| name | description |
|---|---|
| llm-provider | Implement, upgrade, or refactor JS/TS projects to use a configurable LLM provider + model via llm:// URI strings — unifying scattered ENV vars, hardcoded model slugs, DB options, and AI SDK calls into a single composable resolution chain. Use when setting up LLM connections, migrating from env-only config, adding org-scoped encrypted credentials, or replacing multi-variable AI config with LLM strings. |

Implement or refactor JS/TS AI projects to use LLM strings — composable llm:// URIs that unify provider, model, credentials, and inference options (temperature, maxTokens, etc.) from any source (ENV, database, URL slug, job config, org connection) into a single resolution chain that produces an AI SDK model instance.
## When to use

- User wants to "implement LLM strings" or "add llm:// URI support"
- Project has ≥3 separate ENV vars for a single LLM call (e.g. `MODEL`, `API_KEY`, `TEMPERATURE`)
- Code mixes `generateText`/`streamText` calls with inline provider construction (`openai("gpt-4o")`)
- User wants org-scoped or user-scoped LLM credentials stored in DB with encryption
- User wants model overrides configurable per-job, per-request, or per-feature without changing function signatures
- User says "unify model config", "LLM connection management", "configurable AI provider"
- User is refactoring away from `OPENAI_MODEL=gpt-4o` + `OPENAI_TEMPERATURE=0.7` style config
## When NOT to use

- Project has a single hardcoded model and no plans to make it configurable
- User just wants to make a single AI call — answer directly
- Refactoring is purely about prompt engineering, not provider/model configuration
- User is using a non-Vercel AI SDK (e.g. raw OpenAI SDK, LangChain) — adapt patterns accordingly but note the difference
## References

- Spec: https://danlevy.net/llm-connection-strings/
- Package: `pnpm add llm-strings`

Mirrors database URL patterns — one string replaces many scattered ENV vars.

## URI format

`llm://[appName:apiKey@]<host>/<model>[?<params>]`
| Component | Examples | Notes |
|---|---|---|
| `appName:apiKey@` | `myapp:sk-proj-abc@` | Optional auth prefix — app name + API key separated by `:` |
| `host` | `api.openai.com`, `api.anthropic.com`, `generativelanguage.googleapis.com` | Identifies the provider |
| `model` | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Model ID |
| `?temp=` | `?temp=0.7` | Temperature (0.0–2.0) |
| `?max_tokens=` | `?max_tokens=4096` | Max output tokens |
| `?topP=` | `?topP=0.9` | Top-P sampling |
| `?seed=` | `?seed=42` | Reproducibility seed |
| `?cache=` | `?cache=true`, `?cache=5m` | Prompt caching (Anthropic/Bedrock) |
| `?effort=` | `?effort=high` | Reasoning effort (o1/o3/o4 models) |
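Because the grammar is URL-shaped, the standard WHATWG `URL` class can already split it into the components above. This is a hand-rolled sketch for illustration only — the real `parse()` from `llm-strings` also validates and preserves the raw string:

```ts
// Minimal illustration: llm:// URIs are URL-shaped, so the WHATWG URL class
// splits them into the components listed above. Sketch only — use parse()
// from llm-strings in real code.
function sketchParse(uri: string) {
  const u = new URL(uri);
  if (u.protocol !== "llm:") throw new Error(`expected llm:// scheme, got ${u.protocol}`);
  return {
    label: u.username || undefined,       // appName from the auth prefix
    apiKey: u.password || undefined,
    host: u.hostname,
    model: u.pathname.replace(/^\//, ""), // strip leading slash
    params: Object.fromEntries(u.searchParams),
  };
}

const parts = sketchParse("llm://myapp:sk-proj-abc@api.openai.com/gpt-5-mini?temp=0.7");
// parts.host === "api.openai.com", parts.model === "gpt-5-mini", parts.params.temp === "0.7"
```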
## Param aliases

All shorthands expand automatically during `normalize()`:

| Shorthand | Canonical |
|---|---|
| `temp` | `temperature` |
| `max`, `max_out`, `maxTokens` | `max_tokens` |
| `topp`, `topP`, `nucleus` | `top_p` |
| `topk`, `topK` | `top_k` |
| `freq`, `freq_penalty` | `frequency_penalty` |
| `pres`, `pres_penalty` | `presence_penalty` |
| `stop_sequences`, `stopSequences` | `stop` |
| `reasoning`, `reasoning_effort` | `effort` |
| `cache_control`, `cacheControl` | `cache` |
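The table reduces to a flat lookup. A sketch of how such expansion could work — illustrative only, since `normalize()` also applies provider-specific renames and special cases:

```ts
// Illustrative alias map mirroring the table above; treat as a sketch,
// not the llm-strings internals.
const ALIASES: Record<string, string> = {
  temp: "temperature",
  max: "max_tokens", max_out: "max_tokens", maxTokens: "max_tokens",
  topp: "top_p", topP: "top_p", nucleus: "top_p",
  topk: "top_k", topK: "top_k",
  freq: "frequency_penalty", freq_penalty: "frequency_penalty",
  pres: "presence_penalty", pres_penalty: "presence_penalty",
  stop_sequences: "stop", stopSequences: "stop",
  reasoning: "effort", reasoning_effort: "effort",
  cache_control: "cache", cacheControl: "cache",
};

// Rewrite each key through the alias map; unknown keys pass through untouched.
function expandAliases(params: Record<string, string>): Record<string, string> {
  return Object.fromEntries(
    Object.entries(params).map(([k, v]) => [ALIASES[k] ?? k, v])
  );
}

const expanded = expandAliases({ temp: "0.7", max: "2000", seed: "42" });
// → { temperature: "0.7", max_tokens: "2000", seed: "42" }
```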
## Supported providers

| Provider | Host | Param style |
|---|---|---|
| OpenAI | `api.openai.com` | snake_case |
| Anthropic | `api.anthropic.com` | snake_case |
| Google | `generativelanguage.googleapis.com` | camelCase |
| Mistral | `api.mistral.ai` | snake_case |
| Cohere | `api.cohere.com` | snake_case |
| AWS Bedrock | `bedrock-runtime.{region}.amazonaws.com` | camelCase |
| OpenRouter | `openrouter.ai` | snake_case |
| Vercel AI | `gateway.ai.vercel.sh` | snake_case |

`normalize()` handles all provider-specific param name differences automatically.
## API

```ts
import { parse, build, normalize, validate, detectProvider, detectBedrockModelFamily } from "llm-strings";
import type { LlmConnectionConfig, NormalizeResult, ValidationIssue, Provider } from "llm-strings";
```

### parse()

Splits the URI into structured parts. Throws if scheme is not `llm://`.
```ts
const config = parse("llm://api.openai.com/gpt-5.2?temp=0.7&max=2000");
// {
//   raw: "llm://api.openai.com/gpt-5.2?temp=0.7&max=2000",
//   host: "api.openai.com",
//   model: "gpt-5.2",
//   label: undefined,
//   apiKey: undefined,
//   params: { temp: "0.7", max: "2000" } // raw, un-normalized
// }

// With auth:
parse("llm://my-app:sk-proj-abc@api.openai.com/gpt-5.2?temp=0.7")
// → { label: "my-app", apiKey: "sk-proj-abc", host: "api.openai.com", ... }
```

`LlmConnectionConfig` type:
```ts
interface LlmConnectionConfig {
  raw: string;                    // original string
  host: string;                   // provider hostname
  model: string;                  // model ID
  label?: string;                 // optional app name
  apiKey?: string;                // optional API key (from userinfo position)
  params: Record<string, string>; // query params, values always strings
}
```

### normalize()

The key function. Expands aliases, maps to provider-specific param names, and handles special cases. Call this before passing params to any API.
```ts
const { config: normalized, provider, changes } = normalize(parse(llmString));
// provider: "openai" | "anthropic" | "google" | "mistral" | "cohere" | "bedrock" | "openrouter" | "vercel" | undefined

// Provider normalization differences:
// OpenAI:    { temperature: "0.7", max_tokens: "2000", top_p: "0.9" }
// Anthropic: { temperature: "0.7", max_tokens: "2000", top_p: "0.9" }
// Google:    { temperature: "0.7", maxOutputTokens: "2000", topP: "0.9" }
// Bedrock:   { temperature: "0.7", maxTokens: "2000" } (camelCase via Converse API)
```

Special normalizations:
- OpenAI reasoning models (`o1`/`o3`/`o4`): `max_tokens` → `max_completion_tokens`, blocks unsupported sampling params
- Anthropic: `cache=true` → `cache_control=ephemeral`
- Anthropic: `cache=5m` → `cache_control=ephemeral` + `cache_ttl=5m`
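The Anthropic cache rule can be pictured as a small rewrite step. This is a sketch of the behavior described above, not the library's internals:

```ts
// Sketch of the Anthropic cache expansion described above (illustrative only).
function expandCache(params: Record<string, string>): Record<string, string> {
  const { cache, ...rest } = params;
  if (cache === undefined) return params;
  if (cache === "true") return { ...rest, cache_control: "ephemeral" };
  // Duration form, e.g. "5m": ephemeral cache with an explicit TTL
  return { ...rest, cache_control: "ephemeral", cache_ttl: cache };
}

expandCache({ cache: "true" }); // → { cache_control: "ephemeral" }
expandCache({ cache: "5m" });   // → { cache_control: "ephemeral", cache_ttl: "5m" }
```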
Pass `{ verbose: true }` to get a `changes` array showing each transformation:

```ts
const { changes } = normalize(config, { verbose: true });
// [{ from: "temp", to: "temperature", value: "0.7", reason: 'alias: "temp" → "temperature"' }, ...]
```

### validate()

Parses + normalizes + checks against provider specs. Returns `[]` if valid.
```ts
const issues = validate("llm://api.anthropic.com/claude-sonnet-4-5?temp=0.7&top_p=0.9");
// [{ param: "temperature", severity: "error",
//    message: 'Cannot specify both "temperature" and "top_p" for Anthropic models.' }]

validate("llm://api.openai.com/o3?temp=0.7");
// [{ severity: "error", message: '"temperature" is not supported by OpenAI reasoning model "o3".' }]
```

Checks: type correctness, value ranges, mutual exclusions, reasoning model restrictions, Bedrock model family constraints.

`{ strict: true }` promotes warnings (unknown provider, unknown params) to errors.

`ValidationIssue` type:
```ts
interface ValidationIssue {
  param: string;
  value: string;
  message: string;
  severity: "error" | "warning";
}
```

### build()

Inverse of `parse()` — reconstructs a URI from a config object.
```ts
build({
  host: "api.openai.com", model: "gpt-5.2",
  label: "my-app", apiKey: "sk-proj-abc",
  params: { temperature: "0.7", max_tokens: "2000" },
});
// → "llm://my-app:sk-proj-abc@api.openai.com/gpt-5.2?temperature=0.7&max_tokens=2000"
```

### detectProvider() / detectBedrockModelFamily()

```ts
detectProvider("api.openai.com")                          // → "openai"
detectProvider("api.anthropic.com")                       // → "anthropic"
detectProvider("bedrock-runtime.us-east-1.amazonaws.com") // → "bedrock"

detectBedrockModelFamily("anthropic.claude-sonnet-4-5-20250929-v1:0")    // → "anthropic"
detectBedrockModelFamily("us.anthropic.claude-sonnet-4-5-20250929-v1:0") // → "anthropic" (cross-region)
detectBedrockModelFamily("meta.llama3-8b-instruct-v1:0")                 // → "meta"
```

## AI SDK integration

```ts
import { parse, normalize } from "llm-strings";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";

function resolveFromLLMString(llmString: string) {
  if (!llmString.startsWith("llm://")) {
    // simple slug — fall back to your provider detection logic
    return { model: detectAndCreate(llmString), params: {} };
  }
  const raw = parse(llmString);
  const { config, provider } = normalize(raw);
  // API keys go on the provider factory (createX), not the model call
  const model = (() => {
    switch (provider) {
      case "openai":
        return config.apiKey ? createOpenAI({ apiKey: config.apiKey })(config.model) : openai(config.model);
      case "anthropic":
        return config.apiKey ? createAnthropic({ apiKey: config.apiKey })(config.model) : anthropic(config.model);
      case "google":
        return config.apiKey ? createGoogleGenerativeAI({ apiKey: config.apiKey })(config.model) : google(config.model);
      case "openrouter":
        return openrouter(config.model); // requires @openrouter/ai-sdk-provider
      default:
        return openai(config.model); // fallback
    }
  })();
  // Convert normalized string params to numbers where applicable
  const params = Object.fromEntries(
    Object.entries(config.params).map(([k, v]) => [k, isNaN(+v) ? v : +v])
  );
  return { model, params };
}

// Usage:
const { model, params } = resolveFromLLMString(process.env.LLM_URL!);
const result = await generateText({
  model,
  temperature: params.temperature,
  maxTokens: params.max_tokens ?? params.maxOutputTokens,
  topP: params.top_p ?? params.topP,
  prompt,
});
```

## Resolution chain

Every LLM call resolves through this chain, from highest to lowest priority:
```
Job/Request options override
        ↓
DB LLM connection (org-scoped, encrypted credentials)
        ↓
LLM string / model slug (from job config, URL, or default)
        ↓
ENV API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
        ↓
AI SDK provider instance (openai(), anthropic(), google(), etc.)
        ↓
streamText / generateText / generateObject / streamObject
```
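In code, the chain is nullish coalescing from highest- to lowest-priority source. A sketch with hypothetical field names (each field stands in for wherever your app stores that layer):

```ts
// Sketch: the highest-priority defined source wins. All names here are
// hypothetical placeholders, not a prescribed API.
function pickLlmString(sources: {
  requestOverride?: string;    // job/request options override
  dbConnectionString?: string; // org-scoped DB connection
  envString?: string;          // e.g. process.env.LLM_STRING
}): string {
  return sources.requestOverride
    ?? sources.dbConnectionString
    ?? sources.envString
    ?? "gpt-5-mini"; // final fallback slug
}

pickLlmString({ envString: "llm://api.openai.com/gpt-5-mini" });
// → "llm://api.openai.com/gpt-5-mini"
pickLlmString({ requestOverride: "claude-sonnet-4-5", envString: "gpt-5-mini" });
// → "claude-sonnet-4-5"
```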
## Core resolver

```ts
// src/lib/llm/resolver.ts
import { parse } from "llm-strings";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";

export function resolveModelInstance(llmString: string) {
  // Simple slug — use provider detection
  if (!llmString.startsWith("llm://")) {
    return { model: detectAndCreate(llmString), params: {} as Record<string, string> };
  }
  const config = parse(llmString);
  // apiKey may come from the userinfo position or an injected ?apiKey= param
  const model = detectAndCreate(config.model, config.apiKey ?? config.params.apiKey);
  // Options from URI params are passed as AI SDK settings at the call site
  return { model, params: config.params };
}

function detectAndCreate(modelId: string, apiKey?: string) {
  // Check explicit provider prefixes first, so "openai/gpt-5-mini" is stripped
  // before the "gpt" heuristic can match the un-stripped ID
  if (modelId.startsWith("openai/")) return mkOpenai(modelId.slice(7), apiKey);
  if (modelId.startsWith("anthropic/")) return mkAnthropic(modelId.slice(10), apiKey);
  if (modelId.startsWith("google/")) return mkGoogle(modelId.slice(7), apiKey);
  if (modelId.startsWith("local/"))
    return createOpenAI({ baseURL: "http://localhost:1234/v1", apiKey: "lm-studio" })(modelId.slice(6));
  if (modelId.startsWith("gpt")) return mkOpenai(modelId, apiKey);
  if (modelId.startsWith("claude")) return mkAnthropic(modelId, apiKey);
  if (modelId.startsWith("gemini")) return mkGoogle(modelId, apiKey);
  return mkOpenai(modelId, apiKey); // fallback
}

// API keys go on the provider factory, not the model call
const mkOpenai = (id: string, key?: string) => (key ? createOpenAI({ apiKey: key })(id) : openai(id));
const mkAnthropic = (id: string, key?: string) => (key ? createAnthropic({ apiKey: key })(id) : anthropic(id));
const mkGoogle = (id: string, key?: string) => (key ? createGoogleGenerativeAI({ apiKey: key })(id) : google(id));
```

## Refactoring call sites

Replace scattered `openai("gpt-4o")` / `anthropic("claude-3")` calls with a single resolver:
```ts
// Before: scattered provider construction
const model = process.env.PROVIDER === "openai"
  ? openai(process.env.OPENAI_MODEL ?? "gpt-4o")
  : anthropic(process.env.ANTHROPIC_MODEL ?? "claude-3");
const result = await generateText({
  model,
  temperature: parseFloat(process.env.TEMPERATURE ?? "0.7"),
  maxTokens: parseInt(process.env.MAX_TOKENS ?? "4096"),
  prompt,
});
```

```ts
// After: single resolver + LLM string
const llmString = process.env.LLM_STRING ?? "gpt-5-mini"; // or "llm://api.openai.com/gpt-5-mini?temp=0.7"
const { model, params } = resolveModelInstance(llmString);
const result = await generateText({
  model,
  temperature: params?.temp ? Number(params.temp) : undefined,
  maxTokens: params?.maxTokens ? Number(params.maxTokens) : undefined,
  prompt,
});
```

## Per-job / per-request overrides

Options stored in DB JSONB (or passed as request params) override defaults:
```ts
// In a job queue handler:
async function processJob(job: { primaryModel?: string; options: JobOptions }) {
  const llmString = job.options.textExtraction
    ?? job.primaryModel
    ?? process.env.LLM_STRING
    ?? "gpt-5-mini";
  const { model, params } = resolveModelInstance(llmString);
  const result = await streamText({
    model,
    temperature: params?.temp ? Number(params.temp) : 0.2,
    maxTokens: params?.maxTokens ? Number(params.maxTokens) : 4096,
    messages: [...],
  });
}
```

## Org-scoped connections

For multi-tenant apps where each org has its own API key:
```ts
// DB schema (Drizzle example)
export const llmConnections = pgTable("llm_connections", {
  id: text("id").primaryKey(),
  organizationId: text("organization_id").notNull(),
  name: text("name").notNull(),
  llmString: text("llm_string").notNull(), // WITHOUT apiKey: "llm://api.openai.com/gpt-5-mini?temp=0.3"
  encryptedCredentials: text("encrypted_credentials"), // AES-256-GCM encrypted API key
  isActive: boolean("is_active").default(true),
  lastUsedAt: timestamp("last_used_at"),
});

// Resolution with DB connection
async function resolveJobModel(job: { primaryModel?: string; llmConnectionId?: string }): Promise<LanguageModel> {
  let llmString = job.primaryModel ?? "gpt-5-mini";
  if (job.llmConnectionId) {
    const [conn] = await db.select().from(llmConnections)
      .where(and(eq(llmConnections.id, job.llmConnectionId), eq(llmConnections.isActive, true)))
      .limit(1);
    if (conn) {
      llmString = conn.llmString;
      if (conn.encryptedCredentials) {
        const apiKey = decrypt(conn.encryptedCredentials); // AES-256-GCM decrypt
        // Inject apiKey into URI: "llm://...?apiKey=sk-..."
        llmString = llmString.includes("?")
          ? `${llmString}&apiKey=${encodeURIComponent(apiKey)}`
          : `${llmString}?apiKey=${encodeURIComponent(apiKey)}`;
      }
      // Fire-and-forget: update lastUsedAt
      db.update(llmConnections).set({ lastUsedAt: new Date() })
        .where(eq(llmConnections.id, conn.id)).catch(() => {});
    }
  }
  const { model } = resolveModelInstance(llmString);
  return model;
}
```

## Per-feature model routing

Different pipeline stages can use different models via the same JSONB options:
```ts
// Job options: all optional, each feature can specify its own model
interface JobOptions {
  primaryModel?: string;            // Main extraction model
  summarization?: string | boolean; // "claude-sonnet-4-5" or true (use primary) or false (skip)
  graphRAG?: string | boolean;
  routingModelSimple?: string;      // Fast model for simple pages
  routingModelComplex?: string;     // Powerful model for complex pages
  secondPassModel?: string;         // Model for verification pass
}

// In the summarize stage:
function getSummarizationModel(job: { primaryModel?: string; options: JobOptions }): string {
  const opt = job.options.summarization;
  if (!opt) throw new Error("Summarization disabled"); // false or unset → skip
  if (typeof opt === "string") return opt;             // Feature-specific override
  return job.primaryModel ?? "gpt-5-mini";             // true → fall back to primary
}
```

## Ensemble model lists

```ts
// Parse comma-separated model list for ensemble extraction
function parseModelList(primaryModel: string = "gpt-5-mini"): {
  extractionModels: string[];
  mergeModel: string;
} {
  const models = primaryModel.split(",").map(m => m.trim()).filter(Boolean);
  return {
    extractionModels: models.length ? models : ["gpt-5-mini"],
    mergeModel: models[0] ?? "gpt-5-mini",
  };
}

// Usage: "gpt-5-mini,claude-sonnet-4-5,gemini-2.5-flash"
// → runs all 3 in parallel, then merges with mergeModel (first one)
```
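Fanning the parsed list out is then a `Promise.all` over the extraction models. A sketch with a stand-in runner — `runWithModel` is a hypothetical placeholder for your `generateText` call:

```ts
// Sketch: run each extraction model in parallel, then hand the results to the
// merge model. runWithModel is a hypothetical stand-in for a generateText call.
async function ensembleExtract(
  extractionModels: string[],
  runWithModel: (model: string) => Promise<string>,
): Promise<string[]> {
  return Promise.all(extractionModels.map(m => runWithModel(m)));
}
```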
## Passing URI params through

Always pass params from the parsed URI through to the AI SDK call:

```ts
async function callLLM(llmString: string, prompt: string) {
  const { model, params } = resolveModelInstance(llmString);
  // params come from ?temp=0.7&maxTokens=4000 in the URI
  return generateText({
    model,
    prompt,
    temperature: params?.temp !== undefined ? Number(params.temp) : undefined,
    maxTokens: params?.maxTokens !== undefined ? Number(params.maxTokens) : undefined,
    topP: params?.topP !== undefined ? Number(params.topP) : undefined,
    seed: params?.seed !== undefined ? Number(params.seed) : undefined,
  });
}
```
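The `Number(...)` conversions repeat at every call site; a small helper can centralize them. The helper name and the set of handled keys are assumptions for illustration, not part of `llm-strings`:

```ts
// Sketch: coerce string params into numeric AI SDK settings.
// Helper name and handled keys are assumptions, not a prescribed API.
function toNum(v?: string): number | undefined {
  const n = v !== undefined && v !== "" ? Number(v) : NaN;
  return Number.isNaN(n) ? undefined : n;
}

function toAiSettings(params: Record<string, string>) {
  return {
    temperature: toNum(params.temperature ?? params.temp),
    maxTokens: toNum(params.max_tokens ?? params.maxTokens),
    topP: toNum(params.top_p ?? params.topP),
    seed: toNum(params.seed),
  };
}

const settings = toAiSettings({ temp: "0.7", maxTokens: "4096" });
// settings.temperature === 0.7, settings.maxTokens === 4096
```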
## Credential encryption

```ts
// src/lib/encryption.ts
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const KEY = Buffer.from(process.env.LLM_CREDENTIALS_KEY!, "hex"); // 32-byte key

export function encrypt(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv.toString("hex"), encrypted.toString("hex"), tag.toString("hex")].join(":");
}

export function decrypt(ciphertext: string): string {
  const [ivHex, encHex, tagHex] = ciphertext.split(":");
  const decipher = createDecipheriv("aes-256-gcm", KEY, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(tagHex, "hex"));
  return Buffer.concat([decipher.update(Buffer.from(encHex, "hex")), decipher.final()]).toString("utf8");
}

export function reassembleWithCredentials(llmString: string, apiKey: string): string {
  const sep = llmString.includes("?") ? "&" : "?";
  return `${llmString}${sep}apiKey=${encodeURIComponent(apiKey)}`;
}
```
## Codebase audit

Before implementing, search the codebase for patterns to consolidate:

```sh
# Grep for common anti-patterns
grep -rn "process.env.OPENAI_MODEL\|process.env.ANTHROPIC_MODEL\|process.env.GOOGLE_MODEL" src/
grep -rn "openai(\"gpt\|anthropic(\"claude\|google(\"gemini" src/ --include="*.ts"
grep -rn "temperature:\|maxTokens:\|max_tokens:" src/ --include="*.ts"
grep -rn "generateText\|streamText\|generateObject\|streamObject" src/ --include="*.ts"
grep -rn "OPENAI_API_KEY\|ANTHROPIC_API_KEY\|GOOGLE_API_KEY" src/ --include="*.ts"
```

Red flags to consolidate:

- Multiple `process.env.*MODEL` vars → one `LLM_STRING` or per-feature LLM string
- Provider constructed inline at call site → use resolver
- Temperature/maxTokens in ENV → embed in `llm://...?temp=X&maxTokens=Y`
- `if (provider === "openai") ... else if (provider === "anthropic")` branches → resolver handles this
## ENV migration

Before:

```sh
OPENAI_MODEL=gpt-4o
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=4096
ANTHROPIC_MODEL=claude-3-sonnet
SUMMARY_MODEL=claude-3-haiku
```

After:

```sh
LLM_STRING=llm://api.openai.com/gpt-5-mini?temp=0.7&maxTokens=4096
LLM_SUMMARY_STRING=llm://api.anthropic.com/claude-haiku-4-5?temp=0.3
LLM_CREDENTIALS_KEY=<32-byte hex for encrypted org connections>
# Provider API keys remain as-is (used as fallback when no apiKey in URI)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_GENERATIVE_AI_API_KEY=...
```

## DB schema

Two tables cover most use cases:
```ts
// Per-user saved configs (no credentials)
export const llmConfigs = pgTable("llm_configs", {
  id: text("id").primaryKey().$defaultFn(ulid),
  userId: text("user_id").notNull().references(() => users.id),
  name: text("name").notNull(),            // "Fast & Cheap", "High Quality"
  llmString: text("llm_string").notNull(), // "llm://api.openai.com/gpt-5-mini?temp=0.3"
  description: text("description"),
  isDefault: boolean("is_default").default(false),
  createdAt: timestamp("created_at").defaultNow(),
});

// Org-scoped connections with encrypted credentials
export const llmConnections = pgTable("llm_connections", {
  id: text("id").primaryKey().$defaultFn(ulid),
  organizationId: text("organization_id").notNull().references(() => organizations.id),
  name: text("name").notNull(),
  llmString: text("llm_string").notNull(), // WITHOUT apiKey in URI
  encryptedCredentials: text("encrypted_credentials"), // AES-256-GCM
  isActive: boolean("is_active").default(true),
  isDefault: boolean("is_default").default(false),
  lastUsedAt: timestamp("last_used_at"),
  lastFailAt: timestamp("last_fail_at"),
  lastFailError: text("last_fail_error"),
  createdAt: timestamp("created_at").defaultNow(),
  updatedAt: timestamp("updated_at").defaultNow(),
});
```

## Frontend selection

When a user uploads/creates a job, they should be able to:
- Pick from built-in model slugs (no API key needed if ENV key set)
- Pick from saved org connections (pre-configured API keys)
- Create a new connection inline (name + API key → encrypted → saved to org)

```tsx
// ModelSelector component
type ModelOption =
  | { type: "builtin"; model: string; label: string }
  | { type: "connection"; connectionId: string; label: string; model: string };

function ModelSelector({ value, onChange, connections }: {
  value: string;
  onChange: (model: string, connectionId?: string) => void;
  connections: OrgConnection[];
}) {
  return (
    <select onChange={e => {
      const opt = JSON.parse(e.target.value);
      if (opt.type === "builtin") onChange(opt.model);
      else onChange(opt.model, opt.connectionId);
    }}>
      <optgroup label="Built-in Models">
        {BUILTIN_MODELS.map(m => (
          <option key={m.id} value={JSON.stringify({ type: "builtin", model: m.id })}>
            {m.label} ({m.provider})
          </option>
        ))}
      </optgroup>
      {connections.length > 0 && (
        <optgroup label="Org Connections">
          {connections.map(c => (
            <option key={c.id} value={JSON.stringify({ type: "connection", connectionId: c.id, model: c.model })}>
              {c.name} ({c.model})
            </option>
          ))}
        </optgroup>
      )}
    </select>
  );
}
```

## Usage tracking

After every AI SDK call, extract usage and calculate cost:
```ts
// Extract from AI SDK result (handles v4 and v5 field naming differences)
const usage = await result.usage;
const promptTokens     = (usage as any).inputTokens ?? (usage as any).promptTokens ?? 0;
const completionTokens = (usage as any).outputTokens ?? (usage as any).completionTokens ?? 0;
const cachedTokens     = (usage as any).cachedInputTokens ?? (usage as any).cacheReadInputTokens ?? 0;
const totalTokens      = usage.totalTokens ?? promptTokens + completionTokens;
const modelName = (result as any).modelId ?? llmString;
const estimatedCost = calculateCost(modelName, promptTokens, completionTokens, cachedTokens);
```

## Rate-limit retry

```ts
async function callWithRetry(llmString: string, fn: (model: LanguageModel) => Promise<any>, maxRetries = 2) {
  const { model } = resolveModelInstance(llmString);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn(model);
    } catch (err: any) {
      const is429 = err?.status === 429 || err?.message?.includes("rate limit");
      if (!is429 || attempt === maxRetries) throw err;
      // Parse "retry in Xs" from error message
      const retryMatch = err?.message?.match(/retry in (\d+)s/i);
      const delay = retryMatch ? parseInt(retryMatch[1]) * 1000 : 5000 * (attempt + 1);
      console.warn(`[LLM] Rate limited, retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```

## Install

```sh
pnpm add llm-strings @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google ai
```

Minimum files to create:
- `src/lib/llm/resolver.ts` — `resolveModelInstance(llmString)` → `{ model, params }`
- `src/lib/encryption.ts` — `encrypt()` / `decrypt()` / `reassembleWithCredentials()` (if using DB connections)
- `src/lib/llm/pricing.ts` — cost calculation table keyed by model name
- `src/db/schema.ts` additions — `llmConnections` and/or `llmConfigs` tables
## Implementation workflow

When asked to implement LLM strings in a project:

- **Audit** — Run the grep checklist above. List all scattered LLM config locations.
- **Install** — `pnpm add llm-strings` if not already present.
- **Write resolver** — `src/lib/llm/resolver.ts` with `resolveModelInstance()`.
- **Write encryption** — `src/lib/encryption.ts` if org connections needed.
- **Update DB schema** — Add `llmConnections`/`llmConfigs` tables. Generate migration.
- **Refactor call sites** — Replace inline provider construction with resolver.
- **Update ENV** — Replace multi-var config with `LLM_STRING` (+ `LLM_CREDENTIALS_KEY` if using DB).
- **Add API endpoints** — CRUD for connections + test endpoint.
- **Build frontend selector** — Model/connection picker with inline create.
- **Test end-to-end** — Create connection, submit job, verify resolution chain logs.

After each major step, log the resolved model name and any options applied so the chain is observable.
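A minimal way to make each step observable — the function name and log format here are illustrative, not a prescribed API:

```ts
// Sketch: one log line per resolution step so the chain is auditable.
// Name and format are illustrative only.
function formatResolutionLog(step: string, model: string, params: Record<string, string>): string {
  return `[llm] ${step}: model=${model} params=${JSON.stringify(params)}`;
}

const line = formatResolutionLog("summarize", "claude-haiku-4-5", { temperature: "0.3" });
// → '[llm] summarize: model=claude-haiku-4-5 params={"temperature":"0.3"}'
```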
## Common mistakes

| Mistake | Fix |
|---|---|
| Storing `apiKey` in plain text in DB | Always encrypt with AES-256-GCM; only store the encrypted blob |
| Storing `apiKey` in the `llmString` column | Store URI without `apiKey`; inject at resolution time |
| Ignoring params from the parsed URI | Pass `temp`, `maxTokens`, etc. through to the AI SDK call |
| Multiple provider env vars → multiple code branches | Resolver handles all providers; ENV keys are fallback only |
| Passing model name string to `calculateCost()` without normalizing | Strip provider prefix first: `"openai/gpt-5-mini"` → `"gpt-5-mini"` |
| Forgetting `result.usage` is a Promise in streaming calls | Always `await result.usage` after consuming the stream |
| Hardcoding temperature at call site when URI has `?temp=` | Read `params.temp` from the resolver result first |