A comprehensive list of functional features available in Solo Enterprise for Agentgateway, categorized by LLM and MCP capabilities.
- Route to OpenAI
- Route to Anthropic Claude
- Route to AWS Bedrock via IAM or API key auth
- Route to Azure OpenAI
- Route to Google Vertex AI via OAuth or service account
- Route to Google Gemini
- Route to any OpenAI-compatible provider
- Path-based routing to different providers/models
- Header-based routing to different providers/models
- Query parameter-based routing to different providers/models
- Model override per backend (force a specific model regardless of client request)
- Unified `/v1/chat/completions` interface across all providers
- Streaming responses supported across all providers
- Time to First Token (TTFT) metrics
- Time Per Output Token (TPOT) metrics
- API key masking
- JWT validation with JWKS
- Claims-based RBAC using CEL expressions
- TLS termination
- Frontend mTLS
- SNI matching
- Prompt enrichment
- Built-in guardrails: string matching patterns
- Built-in guardrails: regex patterns
- Built-in guardrails: built-in detectors
- Guardrail actions: reject requests
- Guardrail actions: mask sensitive content
- External moderation API integration
- Custom guardrails via webhook API
- Request-based rate limiting
- Per-user / per-key rate limiting via headers
- Local token-based rate limiting
- Global token-based rate limiting
- Configurable request timeouts
- Retry policies with error code matching
- Retry backoff configuration
- LLM failover with priority groups
- Health-based routing
- Intra-pool failover
- Failover from rate-limited backends to healthy backends
- Direct response
- CEL expression-based request transformations
- CEL expression-based response transformations
- Header enrichment
- Batches API support
- Embeddings API support
- Models API passthrough
- Function calling / tool use support
- Claude Code CLI integration
- Token usage metrics
- Cost tracking
- LLM request count and error rate metrics
- Request duration metrics
- Per-model and per-route metrics
- Access logs for LLM requests
- Pre-built Grafana dashboards for LLM monitoring
- Static MCP backend routing
- Dynamic MCP backend routing
- Multiplex MCP
- In-cluster MCP server routing
- Remote MCP server routing
- HTTPS connectivity to upstream MCP servers
- SSE protocol support
- Streamable HTTP protocol support
- Centralized MCP tool server registry
- MCP OAuth 2.0 authentication
- Dynamic MCP client registration with IdP
- JWT authentication for MCP servers
- Claims-based RBAC for MCP access using CEL expressions
- Tool-level access control
- Principle of least privilege
- On-behalf-of (OBO) token exchange
- Elicitations
- Full visibility into tool interactions
- MCP request metrics in Grafana dashboards
- Distributed tracing for MCP tool calls
- OpenTelemetry (OTel) integration
- Centralized log collection with Grafana Loki
- Distributed tracing with Grafana Tempo
- Metrics collection with Prometheus
- Unified telemetry collection with OTel Collector
- Grafana dashboards with pre-built panels
- Control plane metrics
- Dashboard
- Playground
- Tracing viewer
- Gateway management view
- Route management view
- Destination management view
- Policy management view
- Elicitations management view
- RBAC for UI access
- OIDC-secured UI access
- Built on Kubernetes Gateway API
- Gateway and GatewayClass resources
- HTTPRoute for traffic routing
- Custom CRDs for backends, policies, rate limiting, and auth
- Helm-based installation
- Air-gapped installation support
- OpenShift support
- Configurable proxy replicas and resources
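
The TTFT and TPOT metrics listed among the LLM features can be illustrated with a short calculation. The sketch below uses one common definition of each metric and is not taken from the product: TTFT is the delay between sending the request and receiving the first streamed token, and TPOT is the average gap between subsequent tokens. The `StreamTiming` class and both function names are hypothetical, and the gateway may compute these values differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamTiming:
    """Hypothetical record of one streamed LLM response (times in seconds)."""
    request_sent: float            # when the request left the client
    token_times: Sequence[float]   # arrival time of each output token, in order


def time_to_first_token(timing: StreamTiming) -> float:
    """TTFT: delay between sending the request and the first output token."""
    if not timing.token_times:
        raise ValueError("no tokens were received")
    return timing.token_times[0] - timing.request_sent


def time_per_output_token(timing: StreamTiming) -> float:
    """TPOT: mean gap between consecutive tokens after the first one."""
    count = len(timing.token_times)
    if count < 2:
        # With zero or one token there is no inter-token gap to average.
        return 0.0
    generation_time = timing.token_times[-1] - timing.token_times[0]
    return generation_time / (count - 1)


# Example: first token after 0.5 s, then one token every 0.1 s.
example = StreamTiming(request_sent=0.0, token_times=[0.5, 0.6, 0.7, 0.8])
example_ttft = time_to_first_token(example)
example_tpot = time_per_output_token(example)
```

Separating the two values is useful because TTFT mostly reflects queueing and prompt processing, while TPOT reflects steady-state generation speed.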