When building AI-powered applications that generate images or process intensive tasks, rate limiting becomes crucial for preventing abuse and managing costs. Traditional rate limiting solutions face several challenges in serverless environments:
Stateless Nature: Serverless functions are ephemeral and cannot maintain state between requests, making it difficult to track request counts across multiple function invocations.
Distributed Environment: Requests are distributed across multiple edge locations and instances, requiring a centralized state management solution.
Persistence Requirements: Rate limiting data must persist beyond individual function lifecycles to maintain accurate counts and time windows.
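To see why the stateless problem bites, consider the naive approach: a counter held in module scope. The sketch below (names are illustrative) works in a single long-lived process, but in a serverless runtime each isolate gets its own copy of the map, so the limit is never enforced consistently across invocations:

```typescript
// Naive in-memory rate limiter. Correct in one long-lived process,
// but every serverless isolate holds its own `counts` Map, so users
// can exceed the limit simply by hitting different instances.
const counts = new Map<string, number>();

function naiveAllow(userId: string, limit: number): boolean {
  const used = counts.get(userId) ?? 0;
  if (used >= limit) return false;
  counts.set(userId, used + 1);
  return true;
}
```

This is exactly the gap that a centralized, persistent state holder has to fill.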
Cloudflare Durable Objects provide an elegant solution by offering stateful, persistent compute primitives that run at the edge. Each Durable Object instance maintains its own state and can handle concurrent requests while ensuring data consistency.
Our rate limiter implements a cycle-based approach: each user gets a fixed number of requests within a time window. Once the limit is exceeded, further requests are blocked until the cycle resets.
Here's a complete implementation of a distributed rate limiter using Durable Objects:
```typescript
import { DurableObject } from "cloudflare:workers";

interface ThrottleState {
  limitTimes: number;               // Requests rejected over the limit
  limitEndTimeMs: number;           // When the current cycle ends (0 = no active cycle)
  executedTimesCurrentCycle: number;
  currentCycle: number;
}

export interface TryApplyOptions {
  limitCycleExecutionTimes: number;
  limitCycleTimeMs: number;
}

export interface ThrottlerResponse {
  granted: boolean;
  state: ThrottleState;
}

export class ThrottlerDO extends DurableObject {
  limitCycleExecutionTimes = 10;     // Default: 10 requests per cycle
  limitCycleTimeMs = 10 * 60 * 1000; // Default: 10 minutes

  constructor(ctx: DurableObjectState, env: Env) {
    super(ctx, env);
  }

  // Check current state without modifying it
  async getState(): Promise<ThrottlerResponse> {
    let state =
      (await this.ctx.storage.get<ThrottleState>("throttle_state")) ?? {
        limitTimes: 0,
        limitEndTimeMs: 0,
        executedTimesCurrentCycle: 0,
        currentCycle: 0,
      };

    const currentMs = Date.now();

    // Treat an expired cycle as reset (read-only check, nothing persisted)
    if (state.limitEndTimeMs > 0 && currentMs > state.limitEndTimeMs) {
      state = { ...state, limitEndTimeMs: 0, executedTimesCurrentCycle: 0 };
    }

    const granted = state.executedTimesCurrentCycle < this.limitCycleExecutionTimes;
    return { granted, state };
  }

  // Attempt to acquire permission for execution
  async tryApply(options?: TryApplyOptions): Promise<ThrottlerResponse> {
    if (options) {
      this.limitCycleExecutionTimes = options.limitCycleExecutionTimes;
      this.limitCycleTimeMs = options.limitCycleTimeMs;
    }

    let granted = false;
    let state =
      (await this.ctx.storage.get<ThrottleState>("throttle_state")) ?? {
        limitTimes: 0,
        limitEndTimeMs: 0,
        executedTimesCurrentCycle: 0,
        currentCycle: 0,
      };

    const currentMs = Date.now();

    // Reset cycle if expired
    if (state.limitEndTimeMs > 0 && currentMs > state.limitEndTimeMs) {
      state.limitEndTimeMs = 0;
      state.executedTimesCurrentCycle = 0;
    }

    // Check if the request can be granted
    if (state.executedTimesCurrentCycle < this.limitCycleExecutionTimes) {
      state.executedTimesCurrentCycle++;
      granted = true;
    } else {
      state.limitTimes++;
      granted = false;
    }

    // Initialize a new cycle if needed
    if (state.limitEndTimeMs === 0) {
      state.limitEndTimeMs = currentMs + this.limitCycleTimeMs;
      state.currentCycle++;
      // Wrap around to prevent overflow
      if (state.currentCycle >= 65535) {
        state.currentCycle = 1;
      }
    }

    // Persist state
    await this.ctx.storage.put("throttle_state", state);
    return { granted, state };
  }
}
```

A Worker then routes each request to a per-user instance of the Durable Object and consults it before doing any work:

```typescript
// Worker that uses the rate limiter.
// Env is expected to declare THROTTLER: DurableObjectNamespace<ThrottlerDO>.
export default {
  async fetch(request: Request, env: Env) {
    // Create a Durable Object instance keyed by user identifier
    const userId = request.headers.get("user-id") || "anonymous";
    const id = env.THROTTLER.idFromName(userId);
    const throttler = env.THROTTLER.get(id);

    // Check the rate limit: 5 requests per minute
    const result = await throttler.tryApply({
      limitCycleExecutionTimes: 5,
      limitCycleTimeMs: 60 * 1000, // 1 minute
    });

    if (!result.granted) {
      return new Response("Rate limit exceeded", { status: 429 });
    }

    // Process the actual request
    return new Response("Request processed successfully");
  },
};
```

This implementation provides several key benefits:
Persistent State: Each user's rate limit data persists across requests and deployments.
Edge Distribution: Durable Objects run close to users, reducing latency while maintaining consistency.
Flexible Configuration: Rate limits can be adjusted per request based on user type or subscription level.
Automatic Cleanup: Expired cycles are reset on the next request, so stale counts never block a user beyond their time window.
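As an example of the flexible configuration point, tier-based limits reduce to a lookup that feeds `tryApply`. The tier names and quotas below are hypothetical, not part of the original implementation:

```typescript
// Hypothetical subscription tiers mapped to rate-limit options.
// The names and quotas here are illustrative only.
type Tier = "free" | "pro";

interface TierLimits {
  limitCycleExecutionTimes: number;
  limitCycleTimeMs: number;
}

const TIER_LIMITS: Record<Tier, TierLimits> = {
  free: { limitCycleExecutionTimes: 5, limitCycleTimeMs: 60 * 1000 },  // 5/minute
  pro: { limitCycleExecutionTimes: 60, limitCycleTimeMs: 60 * 1000 },  // 60/minute
};

function limitsFor(tier: Tier): TierLimits {
  return TIER_LIMITS[tier];
}
```

A Worker would resolve the user's tier (for example from a session or JWT claim) and pass `limitsFor(tier)` into `tryApply`, so the same Durable Object class serves every plan.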
The solution scales automatically with Cloudflare's global network while providing precise rate limiting controls essential for production AI applications.
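For completeness, deploying a Durable Object also requires a binding and a migration in the Worker's configuration. A minimal wrangler.toml sketch (project name and entry point are assumptions):

```toml
name = "throttler-worker"   # assumed project name
main = "src/index.ts"       # assumed entry point

[[durable_objects.bindings]]
name = "THROTTLER"
class_name = "ThrottlerDO"

[[migrations]]
tag = "v1"
new_classes = ["ThrottlerDO"]
```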
This rate limiting implementation has been successfully deployed in Fastjrsy, an AI-powered jersey design generator that processes thousands of image generation requests daily while maintaining fair usage across users.