@yahyafakhroji
Created December 4, 2025 06:05
Cloud Portal v3
# Cloud Portal Architecture V3: Hybrid Domain-Driven Architecture
## Summary
This proposal introduces a comprehensive architectural refactoring of the Cloud Portal application, transitioning from the current Express-based BFF architecture to a modern Hono + React Query hybrid architecture. The new architecture separates concerns into a domain-driven data layer and a scalable UI layer, while replacing the current server infrastructure with a lightweight, type-safe Hono server.
- **Status:** Proposal
- **Stage:** Beta
- **Version:** 3.0.0
---
## Table of Contents
- [Summary](#summary)
- [Motivation](#motivation)
- [Current Pain Points](#current-pain-points)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Architecture Overview](#architecture-overview)
- [User Stories](#user-stories)
- [Domain Structure Pattern](#domain-structure-pattern)
- [New Modules (extending app/modules/)](#new-modules-extending-appmodules)
- [Notes/Constraints/Caveats](#notesconstraintscaveats)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Hono Server Structure](#hono-server-structure)
- [Load Context with QueryClient](#load-context-with-queryclient)
- [Query Key Factory Pattern](#query-key-factory-pattern)
- [SSR Prefetch + Hydration Pattern](#ssr-prefetch--hydration-pattern)
- [Mutation with Optimistic Updates](#mutation-with-optimistic-updates)
- [Error Handling & Logging](#error-handling--logging)
- [Architecture Overview](#architecture-overview-1)
- [Error Module](#error-module)
- [Logger Module](#logger-module)
- [Monitoring Module](#monitoring-module)
- [Server Middleware (using modules)](#server-middleware-using-modules)
- [React Query Error Handling](#react-query-error-handling)
- [Client-Side API Error Type](#client-side-api-error-type)
- [Control Class Error Handling Pattern](#control-class-error-handling-pattern)
- [React Error Boundaries](#react-error-boundaries)
- [Error Flow Summary](#error-flow-summary)
- [Comparison: Old vs New Architecture](#comparison-old-vs-new-architecture)
- [Structure Comparison](#structure-comparison)
- [Code Organization](#code-organization)
- [Pros and Cons](#pros-and-cons)
- [Benchmarks](#benchmarks)
- [Bundle Size Comparison](#bundle-size-comparison)
- [Performance Metrics (Expected)](#performance-metrics-expected)
- [Developer Experience Metrics](#developer-experience-metrics)
- [What Makes This Architecture Interesting](#what-makes-this-architecture-interesting)
- [Production Readiness Review](#production-readiness-review)
- [Feature Enablement and Rollout](#feature-enablement-and-rollout)
- [Monitoring Requirements](#monitoring-requirements)
- [Dependencies](#dependencies)
- [Scalability](#scalability)
- [Environment Variables](#environment-variables)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Appendix A: Quick Reference](#appendix-a-quick-reference)
- [Import Cheat Sheet](#import-cheat-sheet)
- [Adding a New Domain Checklist](#adding-a-new-domain-checklist)
- [Appendix B: Production Debugging Guide](#appendix-b-production-debugging-guide)
- [When User Reports an Error](#when-user-reports-an-error)
- [Log Search Patterns](#log-search-patterns)
- [Error Response Format](#error-response-format)
---
## Motivation
### Current Pain Points
The existing Cloud Portal architecture has several limitations that impact developer productivity, application performance, and maintainability:
1. **Fragmented Data Layer**: Data fetching logic is scattered across loaders, components, and API routes without a unified pattern
2. **No Client-Side Caching**: Every navigation triggers server-side data fetching, leading to unnecessary API calls and poor user experience
3. **Express Overhead**: The Express server adds ~150KB to the bundle and lacks native TypeScript support
4. **Coupled UI and Data Logic**: Hooks mixing data fetching with UI concerns make testing and reuse difficult
5. **Inconsistent API Patterns**: Different domains follow different patterns for control classes, types, and API calls
6. **SSR/CSR Mismatch**: Without a hydration strategy, client-side navigation shows a flash of loading states
7. **Inconsistent Error Handling**: No standardized approach to error handling and logging makes production debugging difficult
### Goals
- **Unified Data Layer**: Establish a single source of truth for all data operations using domain-driven organization
- **Client-Side Caching**: Implement React Query for intelligent caching, background refetching, and optimistic updates
- **Lightweight Server**: Replace Express with Hono for 40-60% smaller bundle size and better performance
- **Clear Separation**: Strict separation between data layer (resources) and UI layer (features/components)
- **Type Safety**: End-to-end type safety from API to UI using Zod schemas and TypeScript
- **SSR Hydration**: Seamless server-to-client data transfer with dehydration/hydration pattern
- **Scalability**: Pattern-based architecture that scales with team size and codebase growth
- **Production Debugging**: Structured error handling and logging for easy debugging in production
### Non-Goals
- Complete rewrite of all existing features (incremental migration)
- Changing the underlying API structure (control plane APIs remain unchanged)
- Modifying authentication/authorization flows
- Restructuring the routing system
---
## Proposal
### Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              HONO SERVER (NEW)                              │
│                                                                             │
│  app/server/                                                                │
│  ├── entry.ts        → Main Hono app entry                                  │
│  ├── context.ts      → Load context (QueryClient, API clients)              │
│  ├── middleware/     → Auth, security, rate-limit, request-context          │
│  └── routes/         → API endpoints (/api/organizations, etc.)             │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│                         DATA LAYER (Domain-Driven)                          │
│                                                                             │
│  app/resources/                                                             │
│  ├── core/           → Query client, options, utilities, error types        │
│  ├── organizations/  → types, schemas, keys, api, hooks, control            │
│  ├── projects/       → types, schemas, keys, api, hooks, control            │
│  ├── dns/            → NESTED: zones/, records/                             │
│  ├── iam/            → NESTED: roles/, groups/, policies/                   │
│  └── ...             → Other domains                                        │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│                        UI LAYER (Shared & Scalable)                         │
│                                                                             │
│  app/components/     → Shared UI components (buttons, inputs, etc.)         │
│  app/features/       → Feature-specific UI (NO hooks, NO data)              │
│  app/layouts/        → Page layouts                                         │
│  app/hooks/          → Utility hooks only (useDebounce, etc.)               │
└─────────────────────────────────────────────────────────────────────────────┘
```
> [!NOTE]
> The existing `app/modules/` folder continues to be used for reusable infrastructure. New modules for errors, logger, and monitoring will be added there alongside existing modules like `control-plane/`.
### User Stories
#### Story 1: Developer Adding a New Domain
As a developer, I want to add a new resource domain (e.g., "workloads") by following a consistent pattern, so that I can be productive immediately without learning new conventions.
**Acceptance Criteria:**
- Copy existing domain folder structure
- Implement types, schemas, keys, api, hooks, control
- All imports follow established patterns
- Tests can be written in isolation
#### Story 2: Component Developer Using Data
As a component developer, I want to access organization data using simple hooks, so that I don't need to understand the underlying API structure.
**Acceptance Criteria:**
```typescript
// Simple, discoverable API
import { useOrganizations, useCreateOrganization } from '@/resources/organizations';

function MyComponent() {
  const { data, isLoading } = useOrganizations();
  const createMutation = useCreateOrganization();
  // ...
}
```
#### Story 3: SSR with Client Hydration
As a user, I want pages to load with data already rendered (SSR) and then seamlessly transition to client-side interactions without loading flashes.
**Acceptance Criteria:**
- Initial page load shows server-rendered content
- Client hydrates with same data (no refetch)
- Subsequent navigations use cached data when available
- Background refetching updates stale data automatically
#### Story 4: Production Error Debugging
As a developer, I want to quickly trace and debug production errors using a request ID, so that I can identify root causes without extensive log searching.
**Acceptance Criteria:**
- Every request has a unique `requestId` attached
- All logs include request context (requestId, userId, orgId)
- Error responses include requestId for user reporting
- Structured JSON logs in production for easy parsing
- Errors are categorized and typed for appropriate handling
### Domain Structure Pattern
Each domain in `resources/` follows this consistent pattern:
```
resources/{domain}/
├── index.ts # Public exports (barrel file)
├── types.ts # TypeScript interfaces
├── schemas.ts # Zod validation schemas
├── keys.ts # React Query key factory
├── api.ts # Client-side query options
├── api.server.ts # Server-side prefetch (loaders only)
├── hooks.ts # React Query hooks
└── control.ts # API client wrapper (moved from modules/)
```
| File | Purpose | Can Import | Used By |
|------|---------|------------|---------|
| `types.ts` | TypeScript types | Nothing | Everything |
| `schemas.ts` | Zod validation | `types.ts` | Forms, API validation |
| `keys.ts` | Query key factory | Nothing | `api.ts`, `api.server.ts`, `hooks.ts` |
| `api.ts` | Client query options | `keys.ts`, `types.ts` | `hooks.ts` |
| `api.server.ts` | SSR prefetch | `control.ts`, `keys.ts` | Route loaders |
| `hooks.ts` | React Query hooks | `api.ts`, `keys.ts` | Components, routes |
| `control.ts` | API wrapper | `types.ts` | `api.server.ts`, server routes |
### New Modules (extending app/modules/)
The following new modules will be added to the existing `app/modules/` folder:
```
modules/
├── errors/              # Typed error classes (NEW)
│   ├── index.ts
│   ├── base.ts
│   ├── http.ts
│   ├── api.ts
│   └── validation.ts
├── logger/              # Structured logging (NEW)
│   ├── index.ts
│   ├── logger.ts
│   └── request-logger.ts
├── monitoring/          # Error tracking & observability (NEW)
│   ├── index.ts
│   └── sentry.ts
└── control-plane/       # Existing (unchanged)
    └── ...
```
### Notes/Constraints/Caveats
> [!NOTE]
> **Server-Only Modules**: Files with the `.server.ts` suffix follow React Router's convention and are automatically excluded from client bundles. They must NOT be re-exported from barrel files (`index.ts`), because a barrel is imported by client code.
> [!NOTE]
> **Control Class Migration**: Control classes are moved from `modules/control-plane/` to their respective domain folders, maintaining the same API but with better colocation.
> [!NOTE]
> **Extending app/modules/**: New infrastructure (errors, logger, monitoring) will be added to the existing `app/modules/` folder to maintain consistency with current patterns.
> [!WARNING]
> **Import Rules**: The architecture enforces strict import rules to maintain separation:
> - `resources/` cannot import from `features/` or `components/`
> - `features/` cannot import from other `features/`
> - `components/` cannot import from `features/`
> - `modules/` should not import from `resources/`, `features/`, or `components/`
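
The import rules above can be written down as data and checked mechanically, for example from a lint rule or a CI script. The following is a minimal sketch and not part of the proposal: `Layer`, `FORBIDDEN_IMPORTS`, `parsePath`, and `isImportAllowed` are hypothetical names, and it assumes every file lives under `app/<layer>/<unit>/...`.

```typescript
// Hypothetical sketch: all names below are illustrative, not part of the proposal.
type Layer = 'resources' | 'features' | 'components' | 'modules';

// For each importing layer, the layers it must not import from.
const FORBIDDEN_IMPORTS: Record<Layer, readonly Layer[]> = {
  resources: ['features', 'components'],
  features: ['features'], // a feature may not import from *other* features
  components: ['features'],
  modules: ['resources', 'features', 'components'],
};

// Extracts the layer and first sub-folder from a path such as
// "app/features/billing/form.tsx" -> { layer: 'features', unit: 'billing' }.
function parsePath(path: string): { layer: Layer; unit: string } | null {
  const match = /^app\/(resources|features|components|modules)\/([^/]+)/.exec(path);
  return match ? { layer: match[1] as Layer, unit: match[2] } : null;
}

function isImportAllowed(fromPath: string, toPath: string): boolean {
  const from = parsePath(fromPath);
  const to = parsePath(toPath);
  if (!from || !to) return true; // outside the four layers: not covered by these rules
  if (!FORBIDDEN_IMPORTS[from.layer].includes(to.layer)) return true;
  // A same-layer restriction only applies across units (two different features).
  return from.layer === to.layer && from.unit === to.unit;
}
```

Keeping the rules in one table makes it straightforward to add a layer later without touching the checking logic.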
### Risks and Mitigations
| Risk | Impact | Mitigation |
|------|--------|------------|
| Large migration scope | High | Incremental migration, domain by domain |
| Breaking changes to imports | Medium | Automated codemods for import updates |
| Learning curve for team | Medium | Comprehensive documentation and examples |
| Performance regression | Low | Benchmarking at each phase |
| Cache invalidation bugs | Medium | Thorough testing of mutation handlers |
| Production debugging complexity | Medium | Structured logging + request tracing |
---
## Design Details
### Hono Server Structure
```typescript
// app/server/entry.ts
import { createHonoServer } from 'react-router-hono-server/node';
import { createLoadContext } from './context';
import { requestContextMiddleware } from './middleware/request-context';
import { requestLoggerMiddleware } from './middleware/request-logger';
import { globalErrorHandler, notFoundHandler } from './middleware/error-handler';
// The four imports below are not defined in this proposal; their paths are assumed.
import { securityMiddleware } from './middleware/security';
import { rateLimitMiddleware } from './middleware/rate-limit';
import { authMiddleware } from './middleware/auth';
import { apiRoutes } from './routes';
import { logger } from '@/modules/logger';

export default await createHonoServer({
  configure: (app) => {
    // Register the global error and 404 handlers
    app.onError(globalErrorHandler);
    app.notFound(notFoundHandler);

    // Health check: registered before the middleware stack so that it skips
    // request logging, rate limiting, and auth (handlers run in registration order)
    app.get('/_healthz', (c) => c.json({ status: 'ok', timestamp: Date.now() }));

    // Core middleware stack (order matters!)
    app.use('*', requestContextMiddleware);
    app.use('*', requestLoggerMiddleware);
    app.use('*', securityMiddleware());
    app.use('*', rateLimitMiddleware());
    app.use('*', authMiddleware());

    // API routes
    app.route('/api', apiRoutes);

    logger.info('Server configured successfully');
  },
  getLoadContext: (c) => createLoadContext(c),
});
```
### Load Context with QueryClient
```typescript
// app/server/context.ts
import type { QueryClient } from '@tanstack/react-query';
import type { Context } from 'hono';
import { createQueryClient } from '@/resources/core/query-client';
import type { RequestLogger } from '@/modules/logger';
// `Client`, `SessionData`, and `HonoEnv` are existing project types (imports omitted).

export interface AppLoadContext {
  readonly appVersion: string;
  readonly cspNonce: string;
  readonly requestId: string;
  readonly controlPlaneClient: Client;
  readonly iamResourceClient: Client;
  readonly session: SessionData | null;
  readonly cache: Storage;
  readonly queryClient: QueryClient; // NEW: Per-request QueryClient
  readonly log: RequestLogger; // NEW: Request-scoped logger
}

export function createLoadContext(c: Context<HonoEnv>): AppLoadContext {
  return {
    // ... existing context
    queryClient: createQueryClient(), // Fresh QueryClient per request
    log: c.get('log'), // Request-scoped logger
  };
}
```
### Query Key Factory Pattern
```typescript
// app/resources/organizations/keys.ts
export const organizationKeys = {
  all: ['organizations'] as const,
  lists: () => [...organizationKeys.all, 'list'] as const,
  list: (filters?: { search?: string }) =>
    [...organizationKeys.lists(), filters] as const,
  details: () => [...organizationKeys.all, 'detail'] as const,
  detail: (name: string) => [...organizationKeys.details(), name] as const,
};
```
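The hierarchy matters because React Query matches query keys by prefix unless told otherwise, so invalidating `lists()` reaches every filtered list while leaving detail queries alone. The sketch below illustrates that behaviour without the library: `isKeyPrefix` is a hypothetical, simplified stand-in for React Query's own partial matching, and the factory is repeated so the snippet runs on its own.

```typescript
const organizationKeys = {
  all: ['organizations'] as const,
  lists: () => [...organizationKeys.all, 'list'] as const,
  list: (filters?: { search?: string }) =>
    [...organizationKeys.lists(), filters] as const,
  details: () => [...organizationKeys.all, 'detail'] as const,
  detail: (name: string) => [...organizationKeys.details(), name] as const,
};

// Simplified stand-in for partial key matching: a filter key matches
// every cached key that starts with the same segments.
function isKeyPrefix(prefix: readonly unknown[], key: readonly unknown[]): boolean {
  return (
    prefix.length <= key.length &&
    prefix.every((part, i) => JSON.stringify(part) === JSON.stringify(key[i]))
  );
}

const cachedKeys = [
  organizationKeys.list(),
  organizationKeys.list({ search: 'acme' }),
  organizationKeys.detail('acme'),
];

// Invalidating `lists()` would affect both list queries but not the detail query.
const affected = cachedKeys.filter((key) => isKeyPrefix(organizationKeys.lists(), key));
console.log(affected.length); // 2
console.log(JSON.stringify(organizationKeys.detail('acme'))); // ["organizations","detail","acme"]
```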
### SSR Prefetch + Hydration Pattern
```typescript
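import { dehydrate } from '@tanstack/react-query';
import type { LoaderFunctionArgs } from 'react-router';
import { useOrganizations } from '@/resources/organizations';
// Server-only module, imported directly because it is not re-exported from the barrel file
import { prefetchList } from '@/resources/organizations/api.server';
// `OrganizationList` is a presentational component (import omitted).

// Route loader (server-side)
export async function loader({ context }: LoaderFunctionArgs) {
  const { queryClient, controlPlaneClient, log } = context;
  try {
    // Prefetch data into queryClient
    await prefetchList(queryClient, controlPlaneClient);
  } catch (error) {
    // Log but don't throw - graceful degradation
    log.warn('SSR prefetch failed, will retry on client', {
      error: error instanceof Error ? error.message : 'Unknown',
    });
  }
  return {
    dehydratedState: dehydrate(queryClient),
  };
}

// Component (client-side)
// Assumes a <HydrationBoundary> higher in the tree receives `dehydratedState`
// from the loader; without one, the prefetched data never reaches the client cache.
export default function Page() {
  // Uses prefetched data, no loading state on initial render
  const { data } = useOrganizations();
  return <OrganizationList data={data} />;
}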
```
### Mutation with Optimistic Updates
```typescript
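// app/resources/organizations/hooks.ts
import { useMutation, useQueryClient } from '@tanstack/react-query';
import { organizationKeys } from './keys';
import type { Organization } from './types';
// `api` is the domain's client-side API object (import omitted).

// Assumed shape of the mutation variables
type UpdateVariables = { name: string; data: Partial<Organization> };

export function useUpdateOrganization() {
  const queryClient = useQueryClient();
  return useMutation({
    mutationFn: ({ name, data }: UpdateVariables) => api.update(name, data),
    // Optimistic update
    onMutate: async ({ name, data }: UpdateVariables) => {
      // Cancel in-flight fetches so they cannot overwrite the optimistic value
      await queryClient.cancelQueries({ queryKey: organizationKeys.detail(name) });
      const previous = queryClient.getQueryData<Organization>(organizationKeys.detail(name));
      // Leave the cache untouched when nothing is cached yet, instead of
      // writing a partial object
      queryClient.setQueryData<Organization>(organizationKeys.detail(name), (old) =>
        old ? { ...old, ...data } : old
      );
      return { previous, name };
    },
    // Rollback on error
    onError: (err, vars, context) => {
      if (context?.previous) {
        queryClient.setQueryData(
          organizationKeys.detail(context.name),
          context.previous
        );
      }
    },
    // Refetch on settle
    onSettled: (_, __, { name }) => {
      queryClient.invalidateQueries({ queryKey: organizationKeys.detail(name) });
      queryClient.invalidateQueries({ queryKey: organizationKeys.lists() });
    },
  });
}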
```
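The snapshot, optimistic write, and rollback steps can be seen in isolation without React Query. The following is a minimal sketch under stated assumptions: a `Map` stands in for the query cache, and `applyOptimistic` plus the two-field `Organization` shape are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-ins: a Map plays the role of the query cache.
interface Organization {
  name: string;
  displayName: string;
}

const cache = new Map<string, Organization>();

// Applies `patch` to the cached entry and returns a function that undoes it.
function applyOptimistic(name: string, patch: Partial<Organization>): () => void {
  const previous = cache.get(name); // snapshot, as in onMutate
  if (previous) {
    cache.set(name, { ...previous, ...patch }); // optimistic write
  }
  return () => {
    // rollback, as in onError
    if (previous) cache.set(name, previous);
  };
}

cache.set('acme', { name: 'acme', displayName: 'Acme' });
const rollback = applyOptimistic('acme', { displayName: 'Acme Inc.' });
console.log(cache.get('acme')?.displayName); // "Acme Inc."
rollback(); // pretend the server rejected the change
console.log(cache.get('acme')?.displayName); // "Acme"
```

Returning the rollback as a closure mirrors what the `onMutate` return value does in the hook: the snapshot travels with the mutation, so the error path never has to guess what the previous value was.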
---
## Error Handling & Logging
### Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                            ERROR HANDLING LAYERS                            │
│                                                                             │
│   ┌──────────────────┐   ┌──────────────────┐   ┌──────────────────────┐    │
│   │  Hono Server     │   │  React Query     │   │  React Error         │    │
│   │  Error Handler   │──▶│  Error Boundary  │──▶│  Boundaries (UI)     │    │
│   │  (app.onError)   │   │  (QueryCache)    │   │  (Per-feature)       │    │
│   └──────────────────┘   └──────────────────┘   └──────────────────────┘    │
│            │                      │                        │                │
│            ▼                      ▼                        ▼                │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                         CENTRALIZED LOGGER                          │   │
│   │                         app/modules/logger/                         │   │
│   │  - Structured JSON logs (production)                                │   │
│   │  - Pretty console logs (development)                                │   │
│   │  - Request-scoped context (requestId, userId, orgId)                │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Error Module
```typescript
// app/modules/errors/base.ts
export interface ErrorContext {
  requestId?: string;
  userId?: string;
  organizationId?: string;
  resource?: string;
  operation?: string;
  metadata?: Record<string, unknown>;
}

export class AppError extends Error {
  public readonly code: string;
  public readonly statusCode: number;
  public readonly isOperational: boolean;
  public readonly context: ErrorContext;
  public readonly timestamp: string;
  public readonly cause?: Error;

  constructor(
    message: string,
    options: {
      code: string;
      statusCode?: number;
      isOperational?: boolean;
      context?: ErrorContext;
      cause?: Error;
    }
  ) {
    super(message);
    this.name = this.constructor.name;
    this.code = options.code;
    this.statusCode = options.statusCode ?? 500;
    this.isOperational = options.isOperational ?? true;
    this.context = options.context ?? {};
    this.timestamp = new Date().toISOString();
    this.cause = options.cause;
    // captureStackTrace is a V8 extension; guard it so the class also
    // works in runtimes that do not provide it
    if (typeof Error.captureStackTrace === 'function') {
      Error.captureStackTrace(this, this.constructor);
    }
  }

  // Safe serialization for client (no sensitive data)
  toClientResponse() {
    return {
      error: {
        code: this.code,
        message: this.isOperational ? this.message : 'An unexpected error occurred',
        requestId: this.context.requestId,
      },
    };
  }

  // Full serialization for logging (includes everything)
  toLogEntry() {
    return {
      name: this.name,
      code: this.code,
      message: this.message,
      statusCode: this.statusCode,
      isOperational: this.isOperational,
      context: this.context,
      timestamp: this.timestamp,
      stack: this.stack,
      cause: this.cause?.message,
    };
  }
}

// app/modules/errors/http.ts
import { AppError, type ErrorContext } from './base';

export class NotFoundError extends AppError {
  constructor(resource: string, identifier: string, context?: ErrorContext) {
    super(`${resource} with identifier '${identifier}' not found`, {
      code: 'NOT_FOUND',
      statusCode: 404,
      context: { ...context, resource },
    });
  }
}

export class BadRequestError extends AppError {
  constructor(message: string, context?: ErrorContext) {
    super(message, { code: 'BAD_REQUEST', statusCode: 400, context });
  }
}

export class UnauthorizedError extends AppError {
  constructor(message = 'Authentication required', context?: ErrorContext) {
    super(message, { code: 'UNAUTHORIZED', statusCode: 401, context });
  }
}

export class ForbiddenError extends AppError {
  constructor(message = 'Access denied', context?: ErrorContext) {
    super(message, { code: 'FORBIDDEN', statusCode: 403, context });
  }
}

export class ConflictError extends AppError {
  constructor(message: string, context?: ErrorContext) {
    super(message, { code: 'CONFLICT', statusCode: 409, context });
  }
}

export class RateLimitError extends AppError {
  constructor(retryAfter?: number, context?: ErrorContext) {
    super('Too many requests', {
      code: 'RATE_LIMITED',
      statusCode: 429,
      // Merge with any caller-supplied metadata instead of replacing it
      context: { ...context, metadata: { ...context?.metadata, retryAfter } },
    });
  }
}

// app/modules/errors/api.ts
import { AppError, type ErrorContext } from './base';

export class ControlPlaneError extends AppError {
  public readonly grpcCode?: number;
  public readonly upstream: string;

  constructor(
    message: string,
    options: {
      upstream: string; // e.g., 'organizations-api', 'iam-api'
      grpcCode?: number;
      statusCode?: number;
      context?: ErrorContext;
      cause?: Error;
    }
  ) {
    super(message, {
      code: 'CONTROL_PLANE_ERROR',
      statusCode: options.statusCode ?? 502,
      context: {
        ...options.context,
        // Merge with any caller-supplied metadata instead of replacing it
        metadata: {
          ...options.context?.metadata,
          upstream: options.upstream,
          grpcCode: options.grpcCode,
        },
      },
      cause: options.cause,
    });
    this.upstream = options.upstream;
    this.grpcCode = options.grpcCode;
  }
}

export class NetworkError extends AppError {
  constructor(message: string, context?: ErrorContext, cause?: Error) {
    super(message, { code: 'NETWORK_ERROR', statusCode: 503, context, cause });
  }
}

export class TimeoutError extends AppError {
  constructor(operation: string, timeout: number, context?: ErrorContext) {
    super(`Operation '${operation}' timed out after ${timeout}ms`, {
      code: 'TIMEOUT',
      statusCode: 504,
      context: { ...context, operation, metadata: { ...context?.metadata, timeout } },
    });
  }
}

// app/modules/errors/validation.ts
import type { ZodError, ZodIssue } from 'zod';
import { AppError, type ErrorContext } from './base';

export class ValidationError extends AppError {
  public readonly issues: ZodIssue[];

  constructor(zodError: ZodError, context?: ErrorContext) {
    // Guard against an empty issue list so message construction cannot throw
    const firstIssue = zodError.issues[0];
    const message = firstIssue
      ? `Validation failed: ${firstIssue.path.join('.')} - ${firstIssue.message}`
      : 'Validation failed';
    super(message, {
      code: 'VALIDATION_ERROR',
      statusCode: 400,
      context: { ...context, metadata: { ...context?.metadata, issues: zodError.issues } },
    });
    this.issues = zodError.issues;
  }

  // Adds per-field details to the base client response
  toClientResponse() {
    return {
      error: {
        code: this.code,
        message: 'Validation failed',
        requestId: this.context.requestId,
        details: this.issues.map((issue) => ({
          field: issue.path.join('.'),
          message: issue.message,
        })),
      },
    };
  }
}

// app/modules/errors/index.ts
export * from './base';
export * from './http';
export * from './api';
export * from './validation';
```
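`ControlPlaneError` accepts both a `grpcCode` and a `statusCode`, but the proposal does not say how one is derived from the other. One possible approach, sketched below, follows the conventional gRPC-to-HTTP status mapping and falls back to 502 (the class default) for anything unmapped. `GRPC_TO_HTTP` and `grpcToHttpStatus` are hypothetical names and are not part of the proposal.

```typescript
// Hypothetical helper: numeric gRPC status codes (a subset) mapped to their
// conventional HTTP equivalents.
const GRPC_TO_HTTP: Record<number, number> = {
  3: 400, // INVALID_ARGUMENT
  4: 504, // DEADLINE_EXCEEDED
  5: 404, // NOT_FOUND
  6: 409, // ALREADY_EXISTS
  7: 403, // PERMISSION_DENIED
  8: 429, // RESOURCE_EXHAUSTED
  14: 503, // UNAVAILABLE
  16: 401, // UNAUTHENTICATED
};

// Returns the HTTP status to report for an upstream gRPC failure.
// Unknown or missing codes map to 502 Bad Gateway.
function grpcToHttpStatus(grpcCode: number | undefined): number {
  if (grpcCode === undefined) return 502;
  return GRPC_TO_HTTP[grpcCode] ?? 502;
}

console.log(grpcToHttpStatus(5)); // 404
console.log(grpcToHttpStatus(13)); // 502 (INTERNAL is not mapped)
```

A caller could pass the result as `statusCode` when constructing a `ControlPlaneError`, which keeps 4xx upstream failures from being logged and reported as server faults.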
### Logger Module
```typescript
// app/modules/logger/logger.ts
export type LogLevel = 'debug' | 'info' | 'warn' | 'error' | 'fatal';

interface LogEntry {
  level: LogLevel;
  message: string;
  timestamp: string;
  requestId?: string;
  userId?: string;
  organizationId?: string;
  duration?: number;
  [key: string]: unknown;
}

const LOG_LEVELS: Record<LogLevel, number> = {
  debug: 0, info: 1, warn: 2, error: 3, fatal: 4,
};

// Fall back to 'info' when LOG_LEVEL is unset or not a recognised level;
// an unknown level would otherwise silence every log call
const configuredLevel = process.env.LOG_LEVEL ?? '';
const currentLevel: LogLevel = Object.keys(LOG_LEVELS).includes(configuredLevel)
  ? (configuredLevel as LogLevel)
  : 'info';
const isProduction = process.env.NODE_ENV === 'production';

function shouldLog(level: LogLevel): boolean {
  return LOG_LEVELS[level] >= LOG_LEVELS[currentLevel];
}

function formatEntry(entry: LogEntry): string {
  if (isProduction) {
    // Structured JSON for production (easy to parse in logging systems)
    return JSON.stringify(entry);
  }
  // Pretty format for development
  const { level, message, timestamp, requestId, duration, ...rest } = entry;
  const prefix = `[${timestamp}] ${level.toUpperCase().padEnd(5)}`;
  const reqInfo = requestId ? ` [${requestId.slice(0, 8)}]` : '';
  const durationInfo = duration !== undefined ? ` (${duration}ms)` : '';
  const extra = Object.keys(rest).length > 0 ? `\n ${JSON.stringify(rest, null, 2)}` : '';
  return `${prefix}${reqInfo} ${message}${durationInfo}${extra}`;
}

// Exported so that request-logger.ts can reuse the same formatting and filtering
export function log(level: LogLevel, message: string, meta?: Record<string, unknown>): void {
  if (!shouldLog(level)) return;
  const entry: LogEntry = {
    level,
    message,
    timestamp: new Date().toISOString(),
    ...meta,
  };
  const output = formatEntry(entry);
  switch (level) {
    case 'error':
    case 'fatal':
      console.error(output);
      break;
    case 'warn':
      console.warn(output);
      break;
    default:
      console.log(output);
  }
}

// Main logger interface
export const logger = {
  debug: (message: string, meta?: Record<string, unknown>) => log('debug', message, meta),
  info: (message: string, meta?: Record<string, unknown>) => log('info', message, meta),
  warn: (message: string, meta?: Record<string, unknown>) => log('warn', message, meta),
  error: (message: string, meta?: Record<string, unknown>) => log('error', message, meta),
  fatal: (message: string, meta?: Record<string, unknown>) => log('fatal', message, meta),
};

// app/modules/logger/request-logger.ts
import type { Context } from 'hono';
import { log, type LogLevel } from './logger';

export function createRequestLogger(c: Context) {
  // Resolve the context at log time rather than at creation time: this logger is
  // created in requestContextMiddleware, before later middleware (such as auth)
  // has had a chance to populate `session`.
  const getContext = () => ({
    requestId: c.get('requestId') as string | undefined,
    userId: c.get('session')?.user?.id as string | undefined,
    organizationId: c.req.param('orgId') || c.req.query('orgId'),
    path: c.req.path,
    method: c.req.method,
  });

  const write = (level: LogLevel) => (message: string, meta?: Record<string, unknown>) =>
    log(level, message, { ...getContext(), ...meta });

  return {
    debug: write('debug'),
    info: write('info'),
    warn: write('warn'),
    error: write('error'),
  };
}

export type RequestLogger = ReturnType<typeof createRequestLogger>;

// app/modules/logger/index.ts
export { logger } from './logger';
export { createRequestLogger, type RequestLogger } from './request-logger';
```
### Monitoring Module
```typescript
// app/modules/monitoring/sentry.ts
// NOTE: `@sentry/react` is the browser SDK. Because `captureError` is also called from
// server code (the Hono error handler), the server side would need a Node SDK such as
// `@sentry/node`.
import * as Sentry from '@sentry/react';

export function initMonitoring() {
  if (process.env.NODE_ENV !== 'production') return;
  Sentry.init({
    dsn: process.env.SENTRY_DSN,
    environment: process.env.DEPLOYMENT_ENV,
    tracesSampleRate: 0.1,
    beforeSend(event, hint) {
      const error = hint.originalException;
      // Filter out operational errors. Note that this also drops operational 5xx
      // errors (e.g. ControlPlaneError) that globalErrorHandler passes to captureError.
      if (error && typeof error === 'object' && 'isOperational' in error) {
        if ((error as { isOperational: boolean }).isOperational) {
          return null;
        }
      }
      return event;
    },
  });
}

export function captureError(error: Error, context?: Record<string, unknown>) {
  if (process.env.NODE_ENV !== 'production') {
    console.error('[captureError]', error, context);
    return;
  }
  Sentry.captureException(error, { extra: context });
}

export function setUserContext(user: { id: string; email?: string }) {
  Sentry.setUser(user);
}

// app/modules/monitoring/index.ts
export { initMonitoring, captureError, setUserContext } from './sentry';
```
### Server Middleware (using modules)
```typescript
// app/server/middleware/request-context.ts
import { createMiddleware } from 'hono/factory';
import { createRequestLogger, type RequestLogger } from '@/modules/logger';
import { nanoid } from 'nanoid';

declare module 'hono' {
  interface ContextVariableMap {
    requestId: string;
    requestStart: number;
    log: RequestLogger;
  }
}

export const requestContextMiddleware = createMiddleware(async (c, next) => {
  // Generate or extract request ID (supports distributed tracing)
  const requestId = c.req.header('x-request-id') || nanoid(12);
  c.set('requestId', requestId);
  c.set('requestStart', Date.now());

  // Create request-scoped logger
  const log = createRequestLogger(c);
  c.set('log', log);

  // Set response header for tracing
  c.header('x-request-id', requestId);
  await next();
});

// app/server/middleware/request-logger.ts
import { createMiddleware } from 'hono/factory';

export const requestLoggerMiddleware = createMiddleware(async (c, next) => {
  const log = c.get('log');
  const start = c.get('requestStart');
  log.info('Request started', {
    userAgent: c.req.header('user-agent'),
    contentType: c.req.header('content-type'),
  });

  await next();

  const duration = Date.now() - start;
  const status = c.res.status;
  if (status >= 500) {
    log.error('Request failed', { status, duration });
  } else if (status >= 400) {
    log.warn('Request client error', { status, duration });
  } else {
    log.info('Request completed', { status, duration });
  }
});

// app/server/middleware/error-handler.ts
import type { Context } from 'hono';
import { HTTPException } from 'hono/http-exception';
import { ZodError } from 'zod';
import { AppError, ValidationError } from '@/modules/errors';
import { logger } from '@/modules/logger';
import { captureError } from '@/modules/monitoring';

const isProduction = process.env.NODE_ENV === 'production';

export function globalErrorHandler(err: Error, c: Context) {
  const requestId = c.get('requestId') as string | undefined;
  // May be undefined if the error was thrown before requestContextMiddleware ran
  const log = c.get('log');

  // Handle our custom AppError types
  if (err instanceof AppError) {
    err.context.requestId = requestId;
    if (err.statusCode >= 500 || !err.isOperational) {
      log?.error('Application error', err.toLogEntry());
      captureError(err, err.context);
    } else {
      log?.warn('Client error', err.toLogEntry());
    }
    return c.json(err.toClientResponse(), err.statusCode as any);
  }

  // Handle Zod validation errors
  if (err instanceof ZodError) {
    const validationError = new ValidationError(err, { requestId });
    log?.warn('Validation error', validationError.toLogEntry());
    return c.json(validationError.toClientResponse(), 400);
  }

  // Handle Hono's HTTPException
  if (err instanceof HTTPException) {
    log?.warn('HTTP exception', { status: err.status, message: err.message, requestId });
    return c.json({
      error: { code: 'HTTP_ERROR', message: err.message, requestId },
    }, err.status);
  }

  // Unknown/unexpected errors - log full details but sanitize response.
  // Prefer the request-scoped logger so path, method, and user are included.
  (log ?? logger).error('Unexpected error', {
    requestId,
    error: err.message,
    stack: err.stack,
    name: err.name,
  });
  captureError(err, { requestId });
  return c.json({
    error: {
      code: 'INTERNAL_ERROR',
      message: isProduction ? 'An unexpected error occurred' : err.message,
      requestId,
    },
  }, 500);
}

export function notFoundHandler(c: Context) {
  const requestId = c.get('requestId');
  return c.json({
    error: {
      code: 'NOT_FOUND',
      message: `Route ${c.req.method} ${c.req.path} not found`,
      requestId,
    },
  }, 404);
}
```
### React Query Error Handling
```typescript
// app/resources/core/query-client.ts
import { QueryClient, QueryCache, MutationCache } from '@tanstack/react-query';
import { captureError } from '@/modules/monitoring';

export function createQueryClient() {
  return new QueryClient({
    defaultOptions: {
      queries: {
        staleTime: 1000 * 60, // 1 minute
        gcTime: 1000 * 60 * 5, // 5 minutes
        retry: (failureCount, error) => {
          // Don't retry on client errors (4xx)
          if (isClientError(error)) return false;
          return failureCount < 2;
        },
        retryDelay: (attemptIndex) => Math.min(1000 * 2 ** attemptIndex, 30000),
      },
      mutations: {
        retry: false,
      },
    },
    queryCache: new QueryCache({
      onError: (error, query) => {
        console.error('[QueryCache Error]', {
          queryKey: query.queryKey,
          error: error instanceof Error ? error.message : 'Unknown error',
        });
        captureError(error as Error, { queryKey: JSON.stringify(query.queryKey) });
      },
    }),
    mutationCache: new MutationCache({
      onError: (error, variables, context, mutation) => {
        console.error('[MutationCache Error]', {
          mutationKey: mutation.options.mutationKey,
          error: error instanceof Error ? error.message : 'Unknown error',
        });
        captureError(error as Error, { mutationKey: JSON.stringify(mutation.options.mutationKey) });
      },
    }),
  });
}

function isClientError(error: unknown): boolean {
  if (error && typeof error === 'object' && 'statusCode' in error) {
    const statusCode = (error as { statusCode: number }).statusCode;
    return statusCode >= 400 && statusCode < 500;
  }
  return false;
}
```
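To make the retry policy concrete, the sketch below lifts the `retry` and `retryDelay` expressions out of the client configuration and evaluates them directly. It assumes, as in current TanStack Query releases, that `failureCount` is 0 on the first failure; the standalone `retry` and `retryDelay` constants exist only for this illustration.

```typescript
// Same logic as the `retry` and `retryDelay` options above, extracted for illustration.
function isClientError(error: unknown): boolean {
  if (error && typeof error === 'object' && 'statusCode' in error) {
    const statusCode = (error as { statusCode: number }).statusCode;
    return statusCode >= 400 && statusCode < 500;
  }
  return false;
}

const retry = (failureCount: number, error: unknown): boolean =>
  !isClientError(error) && failureCount < 2;

const retryDelay = (attemptIndex: number): number =>
  Math.min(1000 * 2 ** attemptIndex, 30000);

console.log(retry(0, { statusCode: 404 })); // false: 4xx is never retried
console.log(retry(0, { statusCode: 503 })); // true: first retry
console.log(retry(1, { statusCode: 503 })); // true: second retry
console.log(retry(2, { statusCode: 503 })); // false: give up
console.log([0, 1, 2, 5].map(retryDelay)); // [1000, 2000, 4000, 30000]
```

Under that assumption a failing query is attempted at most three times, with waits of 1 s and 2 s between attempts. The 30 s cap only comes into play if the retry limit is raised.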
### Client-Side API Error Type
```typescript
// app/resources/core/types.ts
export interface ApiErrorResponse {
  error: {
    code: string;
    message: string;
    requestId?: string;
    details?: Array<{ field: string; message: string }>;
  };
}

export class ApiError extends Error {
  public readonly code: string;
  public readonly statusCode: number;
  public readonly requestId?: string;
  public readonly details?: Array<{ field: string; message: string }>;

  constructor(response: ApiErrorResponse, statusCode: number) {
    super(response.error.message);
    this.name = 'ApiError';
    this.code = response.error.code;
    this.statusCode = statusCode;
    this.requestId = response.error.requestId;
    this.details = response.error.details;
  }

  isNotFound() { return this.statusCode === 404; }
  isUnauthorized() { return this.statusCode === 401; }
  isForbidden() { return this.statusCode === 403; }
  isValidation() { return this.code === 'VALIDATION_ERROR'; }
}
```
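One way a form might consume the `details` array is sketched below. `toFieldErrors` is a hypothetical helper, not part of the proposal, and the class is a trimmed copy of `ApiError` so that the snippet runs on its own.

```typescript
// Trimmed copy of the types above so the snippet is self-contained.
interface ApiErrorResponse {
  error: {
    code: string;
    message: string;
    requestId?: string;
    details?: Array<{ field: string; message: string }>;
  };
}

class ApiError extends Error {
  readonly code: string;
  readonly statusCode: number;
  readonly details?: Array<{ field: string; message: string }>;

  constructor(response: ApiErrorResponse, statusCode: number) {
    super(response.error.message);
    this.name = 'ApiError';
    this.code = response.error.code;
    this.statusCode = statusCode;
    this.details = response.error.details;
  }

  isValidation() { return this.code === 'VALIDATION_ERROR'; }
}

// Hypothetical helper: turns a validation ApiError into { field: message }
// so a form can show each message next to its input. Other errors yield {}.
function toFieldErrors(error: unknown): Record<string, string> {
  if (!(error instanceof ApiError) || !error.isValidation()) return {};
  const fieldErrors: Record<string, string> = {};
  for (const detail of error.details ?? []) {
    // Keep only the first message reported for each field.
    if (!(detail.field in fieldErrors)) fieldErrors[detail.field] = detail.message;
  }
  return fieldErrors;
}

const validationError = new ApiError(
  {
    error: {
      code: 'VALIDATION_ERROR',
      message: 'Validation failed',
      details: [
        { field: 'name', message: 'Name is required' },
        { field: 'name', message: 'Name is too short' },
      ],
    },
  },
  400,
);
const fieldErrors = toFieldErrors(validationError);
```

Returning an empty object for non-validation errors lets callers fall back to a generic error message without a separate type check.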
### Control Class Error Handling Pattern
```typescript
// app/resources/organizations/control.ts
import { ApiError, type ApiErrorResponse } from '../core/types';
import type { Organization, CreateOrganizationInput } from './types';

// HttpClient (import not shown) is assumed to resolve to a Fetch-style Response.
export class OrganizationControl {
  constructor(private client: HttpClient) {}

  async list(): Promise<Organization[]> {
    const response = await this.client.get('/api/organizations');
    if (!response.ok) {
      const error: ApiErrorResponse = await response.json();
      throw new ApiError(error, response.status);
    }
    return response.json();
  }

  async get(orgId: string): Promise<Organization> {
    const response = await this.client.get(`/api/organizations/${orgId}`);
    if (!response.ok) {
      const error: ApiErrorResponse = await response.json();
      throw new ApiError(error, response.status);
    }
    return response.json();
  }

  async create(input: CreateOrganizationInput): Promise<Organization> {
    const response = await this.client.post('/api/organizations', input);
    if (!response.ok) {
      const error: ApiErrorResponse = await response.json();
      throw new ApiError(error, response.status);
    }
    return response.json();
  }
}
```
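The three methods above repeat the same `response.ok` check, and `response.json()` itself throws when an error body is not JSON (for example an HTML page from a proxy). A shared helper could cover both cases. The sketch below is one possible shape, not the proposal's implementation: `parseJsonOrThrow`, `ResponseLike`, and the `UNKNOWN_ERROR` fallback code are all assumptions.

```typescript
// Minimal stand-ins for the types in app/resources/core/types.ts.
interface ApiErrorResponse {
  error: { code: string; message: string; requestId?: string };
}

class ApiError extends Error {
  readonly code: string;
  readonly statusCode: number;

  constructor(response: ApiErrorResponse, statusCode: number) {
    super(response.error.message);
    this.name = 'ApiError';
    this.code = response.error.code;
    this.statusCode = statusCode;
  }
}

// The subset of a Fetch Response the helper relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

function isApiErrorResponse(body: unknown): body is ApiErrorResponse {
  const error = (body as { error?: { code?: unknown; message?: unknown } } | null)?.error;
  return typeof error?.code === 'string' && typeof error?.message === 'string';
}

// Hypothetical helper: returns the parsed body on success, otherwise throws
// ApiError. Falls back to a generic payload when the error body is unusable.
async function parseJsonOrThrow<T>(response: ResponseLike): Promise<T> {
  if (response.ok) return (await response.json()) as T;

  let payload: ApiErrorResponse = {
    error: { code: 'UNKNOWN_ERROR', message: `Request failed with status ${response.status}` },
  };
  try {
    const body = await response.json();
    if (isApiErrorResponse(body)) payload = body;
  } catch {
    // Body was not JSON (for example an HTML error page); keep the fallback.
  }
  throw new ApiError(payload, response.status);
}
```

With such a helper, each control method would reduce to a single line such as `return parseJsonOrThrow<Organization[]>(await this.client.get('/api/organizations'))`.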
### React Error Boundaries
```typescript
// app/components/error-boundary.tsx
import { Component, type ErrorInfo, type ReactNode } from 'react';
import { ApiError } from '@/resources/core/types';
import { captureError } from '@/modules/monitoring';

interface Props {
  children: ReactNode;
  fallback?: ReactNode;
  onError?: (error: Error, errorInfo: ErrorInfo) => void;
}

interface State {
  error: Error | null;
}

export class ErrorBoundary extends Component<Props, State> {
  state: State = { error: null };

  static getDerivedStateFromError(error: Error): State {
    return { error };
  }

  componentDidCatch(error: Error, errorInfo: ErrorInfo) {
    console.error('[ErrorBoundary]', error, errorInfo);
    this.props.onError?.(error, errorInfo);
    captureError(error, { componentStack: errorInfo.componentStack });
  }

  render() {
    if (this.state.error) {
      if (this.props.fallback) return this.props.fallback;
      const error = this.state.error;
      const isApiError = error instanceof ApiError;
      return (
        <div className="flex flex-col items-center justify-center p-8 text-center">
          <h2 className="text-xl font-semibold text-destructive mb-2">
            {isApiError ? 'Request Failed' : 'Something went wrong'}
          </h2>
          <p className="text-muted-foreground mb-4">{error.message}</p>
          {isApiError && error.requestId && (
            <p className="text-xs text-muted-foreground font-mono">
              Request ID: {error.requestId}
            </p>
          )}
          <button
            onClick={() => this.setState({ error: null })}
            className="mt-4 px-4 py-2 bg-primary text-primary-foreground rounded"
          >
            Try Again
          </button>
        </div>
      );
    }
    return this.props.children;
  }
}

// app/components/query-error-boundary.tsx
import { useState, type ReactNode } from 'react';
import { useQueryErrorResetBoundary } from '@tanstack/react-query';
import { ErrorBoundary } from './error-boundary';

export function QueryErrorBoundary({ children }: { children: ReactNode }) {
  const { reset } = useQueryErrorResetBoundary();
  // Changing the key remounts ErrorBoundary, which clears its captured error.
  // Without this, the fallback would stay on screen after the user retries.
  const [boundaryKey, setBoundaryKey] = useState(0);

  const retry = () => {
    reset(); // allow the failed queries to run again on the next render
    setBoundaryKey((key) => key + 1);
  };

  return (
    <ErrorBoundary
      key={boundaryKey}
      fallback={
        <div className="p-4 border rounded bg-destructive/10">
          <p>Failed to load data. Please try again.</p>
          <button onClick={retry}>Retry</button>
        </div>
      }
    >
      {children}
    </ErrorBoundary>
  );
}
```
### Error Flow Summary
```
User Action
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│ UI Layer (React)                                            │
│ - ErrorBoundary catches render errors                       │
│ - useQuery/useMutation provide error state                  │
│ - ApiError class for type-safe error handling               │
└─────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│ Control Class (app/resources/{domain}/control.ts)           │
│ - Wraps API calls                                           │
│ - Transforms responses to typed errors                      │
│ - Logs request/response details in development              │
└─────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│ Hono Server Routes (app/server/routes/)                     │
│ - Throws typed AppError subclasses (from @/modules/errors)  │
│ - Catches external API errors (ControlPlaneError)           │
│ - Logs with request context                                 │
└─────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│ Global Error Handler (app.onError)                          │
│ - Catches ALL unhandled errors                              │
│ - Logs full details via @/modules/logger                    │
│ - Returns sanitized JSON response with requestId            │
│ - Reports to Sentry via @/modules/monitoring                │
└─────────────────────────────────────────────────────────────┘
     │
     ▼
Structured Log Output (JSON in production, pretty in dev)
```
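The last box, "returns sanitized JSON response with requestId", can be sketched as a framework-independent function. `AppError` here is a hypothetical stand-in for the base class in `@/modules/errors`, and `toErrorResult` and the `INTERNAL_ERROR` code are assumptions made for this illustration.

```typescript
// Hypothetical stand-in for the AppError base class in @/modules/errors.
class AppError extends Error {
  readonly code: string;
  readonly statusCode: number;

  constructor(code: string, message: string, statusCode: number) {
    super(message);
    this.name = 'AppError';
    this.code = code;
    this.statusCode = statusCode;
  }
}

interface ErrorResult {
  status: number;
  body: { error: { code: string; message: string; requestId: string } };
}

// Known AppErrors keep their code and message. Anything else is reported as a
// generic 500 so internal messages and stack traces never reach the client.
function toErrorResult(error: unknown, requestId: string): ErrorResult {
  if (error instanceof AppError) {
    return {
      status: error.statusCode,
      body: { error: { code: error.code, message: error.message, requestId } },
    };
  }
  return {
    status: 500,
    body: {
      error: { code: 'INTERNAL_ERROR', message: 'An unexpected error occurred', requestId },
    },
  };
}

const known = toErrorResult(new AppError('NOT_FOUND', 'Organization not found', 404), 'req_1');
const unknown = toErrorResult(new Error('connection string: postgres://secret'), 'req_2');
```

In a Hono app, a function like this would be called from the `app.onError` handler, which receives the thrown error and the request context; the full error would still go to the logger and to Sentry before the sanitized body is returned.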
---
## Comparison: Old vs New Architecture
### Structure Comparison
| Aspect | Old Architecture | New Architecture |
|--------|------------------|------------------|
| **Server** | Express (~150KB) | Hono (~14KB) |
| **Data Fetching** | Scattered in loaders/components | Centralized in `resources/` |
| **Caching** | None (server-only) | React Query (client + SSR) |
| **Types** | Scattered in interfaces/ | Colocated in domain folders |
| **Control Classes** | Centralized in modules/ | Colocated with domain |
| **Hooks** | Mixed in features/ | Pure data hooks in resources/ |
| **API Routes** | routes/api/ (Express) | server/routes/ (Hono) |
| **Error Handling** | Inconsistent | Typed errors in `modules/errors/` |
| **Logging** | Basic console.log | Structured via `modules/logger/` |
| **Monitoring** | Ad-hoc | Centralized in `modules/monitoring/` |
### Code Organization
**Old Structure:**
```
app/
├── modules/
│   └── control-plane/
│       ├── organization.control.ts
│       ├── project.control.ts
│       └── dns-zones.control.ts
├── resources/
│   ├── interfaces/
│   │   ├── organization.interface.ts
│   │   └── project.interface.ts
│   └── schemas/
│       ├── organization.schema.ts
│       └── project.schema.ts
├── features/
│   └── organization/
│       ├── hooks/
│       │   └── use-organization.ts     ❌ Mixed data + UI
│       └── components/
└── routes/
    └── api/
        └── organizations.ts            ❌ Express routes
```
**New Structure:**
```
app/
├── modules/                            ✅ Reusable infrastructure
│   ├── errors/                         ✅ Typed error classes
│   │   ├── index.ts
│   │   ├── base.ts
│   │   ├── http.ts
│   │   ├── api.ts
│   │   └── validation.ts
│   ├── logger/                         ✅ Structured logging
│   │   ├── index.ts
│   │   ├── logger.ts
│   │   └── request-logger.ts
│   ├── monitoring/                     ✅ Sentry integration
│   │   ├── index.ts
│   │   └── sentry.ts
│   └── control-plane/                  ✅ Existing (unchanged)
│       └── ...
├── server/                             ✅ Hono server
│   ├── entry.ts
│   ├── context.ts
│   ├── routes/
│   │   └── organizations.ts
│   └── middleware/
│       ├── request-context.ts          ✅ Request tracing
│       ├── request-logger.ts           ✅ Request logging
│       └── error-handler.ts            ✅ Global error handler
├── resources/                          ✅ Domain-driven data
│   ├── core/
│   │   ├── query-client.ts             ✅ Error-aware QueryClient
│   │   └── types.ts                    ✅ ApiError class
│   └── organizations/
│       ├── types.ts
│       ├── schemas.ts
│       ├── keys.ts
│       ├── api.ts
│       ├── api.server.ts
│       ├── hooks.ts
│       └── control.ts                  ✅ Error-aware control
├── features/
│   └── organization/
│       └── components/                 ✅ UI only
└── components/
    ├── error-boundary.tsx              ✅ Error boundaries
    └── query-error-boundary.tsx
```
### Pros and Cons
#### Old Architecture
**Pros:**
- Familiar Express patterns
- Simple mental model (loader → component)
- No additional dependency on React Query

**Cons:**
- No client-side caching (every navigation = server fetch)
- Bundle size overhead from Express
- Scattered code organization
- Difficult to test data logic in isolation
- No optimistic updates
- Flash of loading states on navigation
- Inconsistent error handling
- Hard to debug production issues
#### New Architecture
**Pros:**
- 40-60% smaller server bundle (Hono vs Express)
- Intelligent client-side caching
- Optimistic updates for better UX
- Background refetching keeps data fresh
- Clear separation of concerns
- Testable data layer
- Type-safe end-to-end
- SSR hydration eliminates loading flashes
- Scalable domain-driven organization
- **Structured error handling with typed errors**
- **Request tracing with requestId**
- **Production-ready logging (JSON format)**
- **Easy debugging via log search**
- **Reusable modules for other projects**

**Cons:**
- Learning curve for React Query patterns
- Additional complexity in cache invalidation
- Migration effort for existing code
- Need to understand hydration/dehydration
---
## Benchmarks
### Bundle Size Comparison
| Package | Size (minified) | Size (gzipped) |
|---------|-----------------|----------------|
| Express | ~150KB | ~50KB |
| Hono | ~14KB | ~5KB |
| **Reduction** | **~90%** | **~90%** |
### Performance Metrics (Expected)
| Metric | Old | New | Improvement |
|--------|-----|-----|-------------|
| Cold Start (server) | ~500ms | ~200ms | 60% faster |
| Time to Interactive | ~2.5s | ~1.8s | 28% faster |
| Subsequent Navigation | ~800ms | ~100ms* | 87% faster |
| API Calls per Session | ~50 | ~20 | 60% reduction |

*With cached data
### Developer Experience Metrics
| Metric | Old | New |
|--------|-----|-----|
| Files to create for new domain | 8-10 (scattered) | 8 (colocated) |
| Lines of boilerplate | ~200 | ~150 |
| Time to add CRUD operations | ~2 hours | ~45 minutes |
| Test isolation possible | Partial | Full |
| **Time to debug production error** | **~30 min** | **~5 min** |
---
## What Makes This Architecture Interesting
### 1. **True Domain-Driven Design**
Each domain is self-contained with all related code colocated:
- Types define the shape
- Schemas validate the data
- Keys organize cache entries
- APIs define how to fetch
- Hooks provide React integration
- Control handles API communication
### 2. **SSR + Client Caching Hybrid**
The architecture combines:
- Server-side rendering for SEO and initial load
- Client-side caching for fast navigation
- Background refetching for data freshness
- Optimistic updates for instant feedback
### 3. **Zero Loading States on Navigation**
With proper cache configuration:
```typescript
// User navigates to /org/my-org
// If data is cached and fresh → Instant render
// If data is stale → Show cached, refetch in background
// If no cache → Show loading (rare after initial load)
```
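The three cases above can be written as a small decision function. This is only a model of the behaviour for illustration: `renderMode` and `CacheEntry` are hypothetical names, and React Query makes the equivalent decision internally from `staleTime` and the time of the last successful fetch.

```typescript
type RenderMode = 'instant' | 'cached-then-refetch' | 'loading';

interface CacheEntry {
  dataUpdatedAt: number; // epoch ms of the last successful fetch
}

// Hypothetical helper mirroring the three navigation cases described above.
function renderMode(
  entry: CacheEntry | undefined,
  staleTimeMs: number,
  now: number,
): RenderMode {
  if (!entry) return 'loading'; // nothing cached yet
  const ageMs = now - entry.dataUpdatedAt;
  return ageMs < staleTimeMs ? 'instant' : 'cached-then-refetch';
}

const staleTimeMs = 1000 * 60; // same value as staleTime in the QueryClient defaults
const fetchedAt = 1_700_000_000_000;

const fresh = renderMode({ dataUpdatedAt: fetchedAt }, staleTimeMs, fetchedAt + 30_000);
const stale = renderMode({ dataUpdatedAt: fetchedAt }, staleTimeMs, fetchedAt + 90_000);
const missing = renderMode(undefined, staleTimeMs, fetchedAt);
```

With the one-minute `staleTime` used in this proposal, returning to a page within a minute renders from cache without a request; after that, the cached data is shown while a background refetch runs.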
### 4. **Type-Safe Data Flow**
```
Control Class → API Response → Types → Schemas → Hooks → Components
      ↑                                                      ↓
      └────────────── Validation at every step ──────────────┘
```
### 5. **Scalable Team Workflow**
- **Domain Teams**: Each team owns their domain folder
- **UI Teams**: Work on features/components without touching data
- **Platform Teams**: Maintain core/ and server/
- **Clear Boundaries**: Import rules prevent coupling
### 6. **Production-Ready Error Handling**
Every error is traceable:
```
User reports issue with requestId
        ↓
Search logs by requestId
        ↓
Full request trace: start → control plane calls → error point
        ↓
Sentry shows stack trace + context
        ↓
Root cause identified in minutes
```
### 7. **Reusable Infrastructure via Modules**
New modules (`errors/`, `logger/`, `monitoring/`) follow existing `app/modules/` patterns:
- Independent and self-contained
- Easy to extract for other projects
- Clear boundaries with the rest of the application
---
## Production Readiness Review
### Feature Enablement and Rollout
- [x] Feature can be disabled by reverting to Express setup
- [x] Incremental migration possible (domain by domain)
- [x] Rollback plan: Keep Express server alongside during migration
- [x] Feature flags not required (infrastructure change)
### Monitoring Requirements
- Request ID tracing throughout the stack
- Structured JSON logging for production
- Global error handler with sanitized responses
- Sentry integration for error tracking
- Add React Query devtools for development
- Monitor cache hit/miss rates
- Track API call reduction metrics
- Add error boundaries for query failures
### Dependencies
- `@tanstack/react-query`: ^5.x (well-maintained, large community)
- `hono`: ^4.x (actively maintained, Cloudflare-backed)
- `react-router-hono-server`: ^2.x (community-maintained React Router adapter for Hono)
- `nanoid`: ^5.x (request ID generation)
- `@sentry/react`: ^8.x (error monitoring)
- `zod`: ^3.x (validation, already in use)
### Scalability
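- A QueryClient per request keeps cached data from leaking between users on the server and lets the cache be released when the request ends
- Cache retention for unused queries is configurable via `gcTime`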
- Background refetch can be disabled for high-traffic scenarios
- Structured logs are easy to aggregate and search
### Environment Variables
```bash
# .env.example
# Logging
LOG_LEVEL=info # debug | info | warn | error | fatal
NODE_ENV=production # development | production
# Monitoring
SENTRY_DSN=https://xxx@sentry.io/xxx
DEPLOYMENT_ENV=production # staging | production
```
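These variables could be validated once at startup so that a typo in `LOG_LEVEL` fails fast instead of silently changing log output. The sketch below uses plain TypeScript; `loadConfig`, `AppConfig`, and the defaults (`info`, `staging`) are assumptions of this example rather than part of the proposal.

```typescript
const LOG_LEVELS = ['debug', 'info', 'warn', 'error', 'fatal'] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface AppConfig {
  logLevel: LogLevel;
  isProduction: boolean;
  sentryDsn?: string;
  deploymentEnv: string;
}

function isLogLevel(value: string): value is LogLevel {
  return (LOG_LEVELS as readonly string[]).includes(value);
}

// Hypothetical loader: validates the variables listed above. The defaults are
// assumptions of this sketch, not values taken from the proposal.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const logLevel = env.LOG_LEVEL ?? 'info';
  if (!isLogLevel(logLevel)) {
    throw new Error(`Invalid LOG_LEVEL "${logLevel}"; expected one of ${LOG_LEVELS.join(', ')}`);
  }
  return {
    logLevel,
    isProduction: env.NODE_ENV === 'production',
    sentryDsn: env.SENTRY_DSN || undefined, // treat an empty string as "not set"
    deploymentEnv: env.DEPLOYMENT_ENV ?? 'staging',
  };
}

const config = loadConfig({
  LOG_LEVEL: 'debug',
  NODE_ENV: 'production',
  SENTRY_DSN: '',
  DEPLOYMENT_ENV: 'production',
});
```

In the server entry this would typically be called with `process.env`; taking the environment as a parameter keeps the function easy to test.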
---
## Implementation History
| Phase | Milestone |
|------|-----------|
| Phase 1 | Hono server structure + middleware |
| Phase 2 | Modules setup (errors, logger, monitoring) |
| Phase 3 | Resources core (query client, options, error types) |
| Phase 4 | Organizations domain migration |
| Phase 5 | Projects domain migration |
| Phase 6 | DNS domain (nested: zones, records) |
| Phase 7 | IAM domain (nested: roles, groups, policies) |
| Phase 8 | Remaining domains |
| Phase 9 | Route updates (SSR prefetch) |
| Phase 10 | Cleanup old modules/control-plane |
---
## Drawbacks
1. **Migration Effort**: Significant upfront investment to migrate existing code
2. **Learning Curve**: Team needs to learn React Query patterns
3. **Cache Complexity**: Cache invalidation requires careful thought
4. **Bundle Size Trade-off**: React Query adds ~12KB to the client bundle (the Express removal shrinks the server bundle, not the client one)
5. **Error Class Overhead**: More code for error handling (offset by debugging benefits)
---
## Appendix A: Quick Reference
### Import Cheat Sheet
```typescript
// From modules (reusable infrastructure)
import { AppError, NotFoundError, ValidationError } from '@/modules/errors';
import { logger, createRequestLogger } from '@/modules/logger';
import { captureError, initMonitoring } from '@/modules/monitoring';
// From resources (data layer)
import {
  useOrganizations,     // Hook
  type Organization,    // Type
  organizationSchema,   // Schema
  organizationKeys,     // Keys
} from '@/resources/organizations';
// From resources core (shared data utilities)
import { ApiError } from '@/resources/core/types';
// From features (UI layer)
import {
  OrganizationCard,
  OrganizationList,
} from '@/features/organization';
// From components (shared UI)
import { Button } from '@/components/ui/button';
import { ErrorBoundary } from '@/components/error-boundary';
// In loaders only (server)
import * as orgServer from '@/resources/organizations/api.server';
```
### Adding a New Domain Checklist
- [ ] Create `resources/{domain}/` folder
- [ ] Define `types.ts` with interfaces
- [ ] Create `schemas.ts` with Zod schemas
- [ ] Set up `keys.ts` with query key factory
- [ ] Implement `api.ts` with query options
- [ ] Implement `api.server.ts` with prefetch helpers
- [ ] Create `hooks.ts` with useQuery/useMutation
- [ ] Move/create `control.ts` for API wrapper (with error handling)
- [ ] Export public API from `index.ts`
- [ ] Add server routes in `server/routes/` (with typed errors from `@/modules/errors`)
- [ ] Update any existing components to use new hooks
- [ ] Wrap feature components with ErrorBoundary
---
## Appendix B: Production Debugging Guide
### When User Reports an Error
1. **Get the Request ID** from error response or user screenshot
2. **Search logs** by `requestId`:
```bash
# Illustrative selector; adapt the syntax to your log backend (e.g. CloudWatch, Loki)
{ requestId="abc123xyz" }
```
3. **Check Sentry** for stack trace and context
4. **Review upstream logs** if error shows `CONTROL_PLANE_ERROR`
5. **Check React Query DevTools** state if UI-related
### Log Search Patterns
```bash
# Find all errors for a user
{ userId="user_123" level="error" }
# Find slow requests
{ duration>5000 }
# Find all requests for an organization
{ organizationId="org_abc" }
# Find specific error type
{ code="CONTROL_PLANE_ERROR" upstream="iam-api" }
```
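When logs are only available as a file of JSON lines, for example during local reproduction, the same searches can be run with a few lines of TypeScript. This is a sketch under assumptions: the field names follow the patterns above, and the exact log schema as well as the helper names are hypothetical.

```typescript
// Field names follow the search patterns above; the schema is an assumption.
interface LogEntry {
  level?: string;
  requestId?: string;
  userId?: string;
  duration?: number;
  [field: string]: unknown;
}

// Parses newline-delimited JSON logs, skipping lines that are not JSON objects.
function parseLogLines(text: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of text.split('\n')) {
    if (!line.trim()) continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
        entries.push(parsed as LogEntry);
      }
    } catch {
      // Ignore non-JSON lines such as stack traces printed by other tools.
    }
  }
  return entries;
}

const byRequestId = (entries: LogEntry[], requestId: string): LogEntry[] =>
  entries.filter((entry) => entry.requestId === requestId);

const slowerThan = (entries: LogEntry[], ms: number): LogEntry[] =>
  entries.filter((entry) => typeof entry.duration === 'number' && entry.duration > ms);

const sampleLogs = [
  '{"level":"info","requestId":"abc123xyz","duration":120}',
  'Error: stack trace line that is not JSON',
  '{"level":"error","requestId":"abc123xyz","duration":6200,"code":"CONTROL_PLANE_ERROR"}',
  '{"level":"info","requestId":"def456","duration":80}',
].join('\n');

const entries = parseLogLines(sampleLogs);
const trace = byRequestId(entries, 'abc123xyz');
const slow = slowerThan(entries, 5000);
```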
### Error Response Format
All API errors follow this format:
```json
{
  "error": {
    "code": "NOT_FOUND",
    "message": "Organization with identifier 'abc' not found",
    "requestId": "req_xyz123abc"
  }
}
```
For validation errors:
```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Validation failed",
    "requestId": "req_xyz123abc",
    "details": [
      { "field": "name", "message": "Name is required" },
      { "field": "email", "message": "Invalid email format" }
    ]
  }
}
```
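On the server, the `details` array could be derived from schema validation issues. The sketch below assumes issues shaped like Zod's (a `path` array plus a `message`) and joins nested paths with dots; `toValidationErrorBody` is a hypothetical helper name.

```typescript
// Structural subset of a Zod issue: a path into the input plus a message.
interface Issue {
  path: Array<string | number>;
  message: string;
}

interface ValidationErrorBody {
  error: {
    code: 'VALIDATION_ERROR';
    message: string;
    requestId: string;
    details: Array<{ field: string; message: string }>;
  };
}

// Hypothetical helper: builds the validation error format shown above.
// Nested paths such as ['owner', 'email'] become the field "owner.email".
function toValidationErrorBody(issues: Issue[], requestId: string): ValidationErrorBody {
  return {
    error: {
      code: 'VALIDATION_ERROR',
      message: 'Validation failed',
      requestId,
      details: issues.map((issue) => ({
        field: issue.path.join('.'),
        message: issue.message,
      })),
    },
  };
}

const body = toValidationErrorBody(
  [
    { path: ['name'], message: 'Name is required' },
    { path: ['owner', 'email'], message: 'Invalid email format' },
  ],
  'req_xyz123abc',
);
```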