This comprehensive SPARC specification provides a production-ready blueprint for building a high-performance synthetic data generator in TypeScript, optimized for low latency as the primary metric. The system leverages both Gemini models and OpenRouter for intelligent routing, supporting 7+ data domains with streaming architecture.
Key Performance Targets:
- P99 latency: < 100ms per record
- Throughput: 4,000-10,000 records/minute
- Cost: $0.000022 per record (using Batch API + context caching)
- Memory: Constant usage (streaming architecture)
Build a production-grade synthetic data generator that streams realistic data across multiple domains (stocks, media, advertising, purchases, medical records, IoT sensors) with low latency as the primary performance metric.
Functional Requirements:
- Generate 7+ data types: time-series (stocks, IoT), event-based (purchases), structured (medical, advertising), media content, streaming data
- Schema-driven generation with Zod runtime validation
- Referential integrity across related entities
- Temporal consistency for time-series and events
- Realistic distributions (normal, long-tail, seasonal)
- Batch and streaming modes
Performance Requirements:
- P99 latency < 100ms per record
- P95 latency < 50ms per record
- Throughput: 4,000+ records/minute minimum
- Event loop lag < 10ms
- Constant memory usage via streaming
Technical Stack:
- TypeScript with Node.js 22+
- Gemini API (primary): Flash-Lite (887 tok/s), Flash (217 tok/s), Batch API
- OpenRouter (fallback): Intelligent routing, multiple model support
- Message queues: NATS (< 1.2ms P99), Kafka (> 500MB/s)
- Zod for schema validation
Non-Functional Requirements:
- 99.9% uptime with automatic fallbacks
- Horizontal scaling via clustering
- Full observability (Prometheus, OpenTelemetry)
- Security: Zero data retention, schema validation, no PII leakage
CLASS SyntheticDataGenerator<T>
PROPERTIES:
schema: ZodSchema<T>
geminiClient: GoogleGenAI
openRouterClient: OpenRouter
redisCache: RedisClient
config: GeneratorConfig
contextCacheKey: string
CONSTRUCTOR(schema, config):
this.schema = schema
this.config = DEFAULT_CONFIG.merge(config)
this.geminiClient = NEW GoogleGenAI({ apiKey: ENV.GEMINI_API_KEY })
this.openRouterClient = NEW OpenRouter({ apiKey: ENV.OPENROUTER_API_KEY })
this.redisCache = AWAIT createRedisClient()
this.contextCacheKey = AWAIT this.initializeContextCache()
ASYNC GENERATOR generate(count, options):
IF count > 1000 AND options.allowBatch:
YIELD FROM this.generateBatch(count, options)
ELSE:
YIELD FROM this.generateStreaming(count, options)
ASYNC GENERATOR generateStreaming(count, options):
queue = []
concurrencyLimit = options.concurrency // 100
FOR i = 0 TO count - 1:
promise = this.generateOne(i, options)
queue.push(promise)
IF queue.length >= concurrencyLimit:
result = AWAIT queue.shift()
validated = this.schema.parse(result)
YIELD validated
WHILE queue.length > 0:
result = AWAIT queue.shift()
validated = this.schema.parse(result)
YIELD validated
ASYNC FUNCTION generateOne(index, options):
TRY:
// Primary: Gemini Flash-Lite (fastest)
response = AWAIT geminiClient.models.generateContent({
model: "gemini-2.5-flash-lite",
cachedContent: this.contextCacheKey,
contents: this.buildRecordPrompt(index),
config: {
responseMimeType: "application/json",
responseJsonSchema: zodToJsonSchema(this.schema),
thinkingConfig: { budget: 0 } // Max speed
}
})
RETURN JSON.parse(response.text)
CATCH RateLimitError:
// Fallback: OpenRouter
response = AWAIT openRouterClient.chat.send({
model: "openai/gpt-4o:nitro",
response_format: { type: "json_object" },
provider: { sort: "latency", allow_fallbacks: true },
models: ["anthropic/claude-3.5-sonnet:nitro", "google/gemini-pro"]
})
RETURN JSON.parse(response.choices[0].message.content)
Time-Series Generator (Stocks, IoT):
CLASS TimeSeriesGenerator
FUNCTION generateSequence(length, startTime, frequency):
trend = generateTrend(length)
seasonal = generateSeasonality(length, frequency)
noise = generateNoise(length) // GARCH model
FOR i = 0 TO length:
YIELD {
timestamp: startTime + (i * frequency),
value: trend[i] + seasonal[i] + noise[i]
}
Event-Based Generator (Purchases):
CLASS EventGenerator
FUNCTION generateEventStream(duration, avgRate):
currentTime = NOW()
WHILE currentTime < duration:
interarrivalTime = EXPONENTIAL(1 / avgRate)
currentTime += interarrivalTime
YIELD {
id: UUID(),
timestamp: currentTime,
userId: maintainReferentialIntegrity("users"),
type: SAMPLE(eventTypes, probabilities)
}
┌─────────────────────────────────────────────────────┐
│ Client Applications │
│ (Web UI, API Clients, ML Pipelines) │
└────────────────────┬────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────┐
│ API Gateway (Express) │
│ REST + WebSocket | Auth | Rate Limiting │
└────────────────────┬────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────┐
│ Core Generation Engine (Node.js Cluster) │
│ ┌────────────────────────────────────────────┐ │
│ │ SyntheticDataGenerator │ │
│ │ - Model Router - Cache Manager │ │
│ │ - Stream Orchestrator - Domain Generators │ │
│ └────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────┐ │
│ │ Worker Pool (CPU-Intensive Tasks) │ │
│ │ [Worker 1] [Worker 2] ... [Worker N] │ │
│ └────────────────────────────────────────────┘ │
└─────────┬─────────────────────┬────────────────────┘
│ │
┌─────────▼──────┐ ┌────────▼────────┐
│ Model APIs │ │ Message Queue │
│ - Gemini │ │ - NATS (latency)│
│ - OpenRouter │ │ - Kafka (volume)│
└────────────────┘ └─────────────────┘
1. API Gateway:
- Express.js with WebSocket support
- Rate limiting via Redis
- REST endpoints + streaming WebSocket
- Request validation with Zod
2. Core Generator:
- Async generator-based streaming
- Intelligent model routing (Gemini primary, OpenRouter fallback)
- Context caching (90% cost savings)
- Parallel request processing (100 concurrent)
3. Domain Generators:
- TimeSeriesGenerator: GARCH models, seasonality
- EventGenerator: Poisson processes, referential integrity
- StructuredDataGenerator: Schema-driven with compliance
4. Message Queue Manager:
- NATS for low-latency (< 1.2ms P99)
- Kafka for high-volume (> 500MB/s)
- Automatic routing based on batch size
Core: Node.js 22+, TypeScript 5.3+, esbuild
APIs: @google/genai, @openrouter/sdk
Validation: Zod, zod-to-json-schema
Queues: NATS, Kafka (kafkajs)
Cache: Redis
Testing: Vitest, fast-check (property-based)
Observability: Prometheus, OpenTelemetry
1. Context Caching:
- Gemini context caching: 90% input token savings
- Redis prompt caching: Sub-ms retrieval
- Cost: $0.03/M cached tokens vs $0.30/M regular
- Break-even: 3+ requests per context
2. Model Routing Strategy:
// Ultra-low latency: Gemini Flash-Lite (887 tok/s)
if (latencyRequirement < 50ms) {
use 'gemini-2.5-flash-lite' with thinkingConfig: { budget: 0 }
}
// Balanced: Gemini Flash (217 tok/s)
if (latencyRequirement < 200ms) {
use 'gemini-2.5-flash' with thinkingConfig: { budget: 1000 }
}
// High volume: Batch API (50% cost savings)
if (count > 1000) {
use Batch API with 'gemini-2.5-flash-lite'
}3. Connection Pooling:
- HTTP keep-alive: 71% performance improvement
- Database pool: Max 20 connections, min 5 idle
- Redis connection reuse
4. Worker Threads:
- CPU-intensive tasks in worker pool
- Pool size: CPU count (typically 8)
- Parallelize statistical generation
5. Memory Management:
- Object pooling for buffers
- Streaming architecture (constant memory)
- GC monitoring with alerts on > 100ms pauses
6. Rate Limit Management:
- Token bucket algorithm
- Circuit breakers for automatic fallback
- Exponential backoff on rate limits
1. Property-Based Testing (fast-check):
fc.assert(
fc.asyncProperty(
fc.integer({ min: 1, max: 1000 }),
async (count) => {
const generator = new SyntheticDataGenerator(schema);
for await (const item of generator.generate(count)) {
expect(schema.safeParse(item).success).toBe(true);
}
}
)
);2. Performance Benchmarks:
- P99 latency < 100ms
- Throughput > 4,000 records/minute
- Memory growth < 50MB per 100K records
- Event loop lag < 10ms
3. Load Testing (k6):
- Ramp up to 100 concurrent users
- 5-minute sustained load
- P99 < 200ms threshold
- Error rate < 1%
4. Integration Tests:
- End-to-end generation pipeline
- Message queue integration
- Schema validation
- Error handling and fallbacks
Prometheus Metrics:
synthetic_data_generated_total(counter)synthetic_data_generation_duration_seconds(histogram)event_loop_lag_seconds(gauge)model_api_errors_total(counter)
OpenTelemetry Tracing:
- Distributed traces across API → Generator → Models
- Span events for record generation
- Performance bottleneck identification
Dashboards:
- Real-time throughput and latency
- Error rates by provider
- Cache hit rates
- Resource utilization (CPU, memory, event loop)
Pre-Production:
- ✅ All tests passing (unit, integration, E2E, load)
- ✅ TypeScript strict mode enabled
- ✅ Performance benchmarks met (P99 < 100ms)
- ✅ Memory leak testing completed
- ✅ Security audit (dependencies, API keys, validation)
- ✅ Documentation complete
- ✅ Monitoring dashboards configured
- ✅ CI/CD pipeline validated
Environment Variables:
GEMINI_API_KEY=<key>
OPENROUTER_API_KEY=<key>
REDIS_URL=redis://localhost:6379
KAFKA_BROKERS=kafka1:9092,kafka2:9092
NATS_URL=nats://localhost:4222
DATABASE_URL=postgresql://localhost:5432/db
NODE_ENV=productionDocker Deployment:
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY dist ./dist
USER node
EXPOSE 3000 9090
CMD ["node", "--max-old-space-size=4096", "dist/index.js"]Kubernetes Deployment:
- 3 replicas for high availability
- HPA: Scale 3-10 pods based on CPU (70%) and memory (80%)
- Health checks: /health, /ready endpoints
- Resource limits: 2Gi memory, 1 CPU per pod
- Load balancer service on port 80
REST API:
POST /api/v1/generate
Request:
{
"schema": ZodSchema,
"count": number,
"options": {
"domain": "stocks" | "medical" | "iot" | "purchase" | "media",
"streaming": boolean,
"model": "gemini" | "openrouter" | "auto"
}
}
Response (streaming): NDJSON stream
Response (batch): { jobId, status, estimatedCompletion }
WebSocket API:
ws://host/api/v1/stream
Send: { type: "generate", schema, count, options }
Receive: { type: "data", record } | { type: "complete" }
Phase 1 (Weeks 1-2): Foundation
- Core TypeScript project setup
- Zod schema validation system
- Gemini API integration (streaming, batch, caching)
- OpenRouter integration and fallback logic
Phase 2 (Weeks 3-4): Streaming Architecture
- Async generator implementation
- Worker thread pool
- Connection pooling
- NATS and Kafka integration
Phase 3 (Weeks 5-6): Domain Generators
- Time-series generator (stocks, IoT)
- Event-based generator (purchases)
- Structured data generator (medical, advertising)
- Referential integrity system
Phase 4 (Week 7): API Layer
- Express REST API
- WebSocket streaming
- Authentication and rate limiting
Phase 5 (Week 8): Optimization
- Context caching implementation
- Model routing optimization
- Performance tuning
Phase 6 (Week 9): Observability
- Prometheus metrics
- OpenTelemetry tracing
- Monitoring dashboards
Phase 7 (Week 10): Production Readiness
- Docker containerization
- Kubernetes deployment
- CI/CD pipeline
- Production launch
Monthly Cost (30M records):
Gemini API:
- Flash-Lite Batch: $6.75/month for 1M records/day
- Context caching: 90% input token savings
OpenRouter Fallback (5%):
- $50/month for 1.5M fallback records
Infrastructure:
- Node.js servers (3× t3.large): $150/month
- Redis cluster: $50/month
- Kafka cluster: $300/month
- PostgreSQL: $100/month
- Total: $600/month
Grand Total: ~$657/month for 30M records Per Record Cost: $0.000022
Performance:
- ✅ P99 latency < 100ms
- ✅ Throughput > 4,000 records/minute
- ✅ Event loop lag < 10ms
- ✅ Memory growth < 50MB per 100K records
Reliability:
- ✅ Uptime > 99.9%
- ✅ Error rate < 0.1%
- ✅ Successful fallback rate > 95%
Quality:
- ✅ Schema validation pass rate: 100%
- ✅ Referential integrity maintained: 100%
- ✅ Test coverage > 90%
Cost:
- ✅ Per-record cost < $0.0005
- ✅ Budget variance < 10%
// src/index.ts
import { z } from 'zod';
import { SyntheticDataGenerator } from './core/generator';
// Define schema
const IoTSensorSchema = z.object({
id: z.string().uuid(),
timestamp: z.number(),
temperature: z.number().min(-50).max(150),
humidity: z.number().min(0).max(100),
location: z.object({ lat: z.number(), lng: z.number() })
});
// Initialize generator
const generator = new SyntheticDataGenerator(IoTSensorSchema, {
concurrency: 100,
primaryModel: 'gemini',
fallbackEnabled: true
});
// Generate data
async function main() {
console.log('Generating 10,000 IoT sensor readings...');
const start = Date.now();
let count = 0;
for await (const record of generator.generate(10000)) {
count++;
if (count % 1000 === 0) {
console.log(`Generated ${count} records`);
}
}
const duration = Date.now() - start;
console.log(`Complete! Generated ${count} records in ${duration}ms`);
console.log(`Throughput: ${(count / duration * 1000).toFixed(0)} records/sec`);
}
main();This SPARC specification provides a complete, production-ready blueprint for building a high-performance synthetic data generator optimized for low latency. Key achievements:
Performance: P99 < 100ms, 4,000+ records/minute, constant memory Cost: $0.000022 per record using Batch API + context caching Architecture: TypeScript streaming with Gemini + OpenRouter, NATS/Kafka queues Domains: 7+ data types with realistic patterns Production-Ready: Full testing, monitoring, deployment configs, 10-week roadmap
The system leverages Gemini 2.5 Flash-Lite (887 tok/s) for maximum speed, OpenRouter for intelligent fallbacks, and modern TypeScript patterns (async generators, Zod validation, worker threads) to achieve production-grade performance at minimal cost.