TravnikovDev/your-agent-isnt-slow-your-telemetry-is-a-hairball-most-teams-bolt-analytics.md

Your agent isn’t slow. Your telemetry is a hairball. Most teams bolt “analytics” onto agents like fairy lights. Cute in staging, smoke in production.
Here’s the boring truth that works: instrument the seams, not the guts. Put hooks at the agent step, the MCP client, and the MCP server adapters. Keep tools clean. No vendor SDKs sprinkled through business code.
My AI research agent pulled real patterns from MCP stacks, and the signal is clear: teams that trace the boundaries fix issues in minutes. The rest drown in logs and guesswork.
The blueprint:

- One tiny TelemetryPort. Two adapters behind it: OpenTelemetry for observability, your analytics sink for product events. Separate pipes, shared IDs.
- Interceptors do the work. Client side wraps every JSON-RPC call as a span. Server adapters wrap tools/resources/prompts. Agent host wraps each step.
- Name things like you mean it: spans as agent.step, mcp.rpc, mcp.tools.call. Events as agent_step_completed, tool_call_completed, outcome_emitted. Correlate with trace_id, span_id, session_id, user_id_hash, mcp.request_id, tool_name. No raw PII. Hash or fingerprint.
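
To make the port-and-adapters idea concrete, here is a minimal TypeScript sketch. Everything named in it (TelemetryPort, Ids, MemorySink, SplitTelemetry, fingerprint) is a hypothetical illustration, not an API from MCP or OpenTelemetry. The sinks are in-memory stand-ins: a real observability adapter would call an OpenTelemetry tracer, and a real analytics adapter would call whatever sink you use. The fingerprint function is a non-cryptographic placeholder; production code would more likely use a keyed hash.

```typescript
// Sketch only. TelemetryPort, Ids, and both sinks are hypothetical names,
// not an API from MCP or OpenTelemetry.
type Attrs = Record<string, string | number | boolean>;

interface Ids {
  trace_id: string;
  span_id: string;
  session_id: string;
  user_id_hash: string; // a fingerprint, never the raw user id
}

interface TelemetryPort {
  span(name: string, ids: Ids, attrs: Attrs): void;
  event(name: string, ids: Ids, attrs: Attrs): void;
}

interface Sink {
  write(name: string, ids: Ids, attrs: Attrs): void;
}

// In-memory stand-in for either pipe (assumption: real adapters would forward
// to an OpenTelemetry tracer or to your analytics sink instead).
class MemorySink implements Sink {
  readonly rows: Array<{ name: string; ids: Ids; attrs: Attrs }> = [];
  write(name: string, ids: Ids, attrs: Attrs): void {
    this.rows.push({ name, ids, attrs });
  }
}

// Separate pipes, shared IDs: spans go one way, product events the other,
// and both carry the same Ids object.
class SplitTelemetry implements TelemetryPort {
  private readonly observability: Sink;
  private readonly analytics: Sink;
  constructor(observability: Sink, analytics: Sink) {
    this.observability = observability;
    this.analytics = analytics;
  }
  span(name: string, ids: Ids, attrs: Attrs): void {
    this.observability.write(name, ids, attrs);
  }
  event(name: string, ids: Ids, attrs: Attrs): void {
    this.analytics.write(name, ids, attrs);
  }
}

// Placeholder fingerprint (32-bit FNV-1a). Not cryptographic; an assumption
// for the sketch. Real code would likely use a keyed hash such as HMAC-SHA-256.
function fingerprint(value: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < value.length; i++) {
    h ^= value.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return ("0000000" + (h >>> 0).toString(16)).slice(-8);
}

const observability = new MemorySink();
const analytics = new MemorySink();
const telemetry: TelemetryPort = new SplitTelemetry(observability, analytics);

const ids: Ids = {
  trace_id: "trace-1",
  span_id: "span-1",
  session_id: "sess-1",
  user_id_hash: fingerprint("user@example.com"),
};
telemetry.span("mcp.tools.call", ids, { tool_name: "get_docs", "mcp.request_id": "42" });
telemetry.event("tool_call_completed", ids, { tool_name: "get_docs" });
```

Business code only ever sees TelemetryPort, so swapping a vendor means rewriting one adapter, not every tool.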

What it feels like in practice:

- agent.step plan_search starts.
- It calls MCP tools/call get_docs. Client creates mcp.rpc. Server creates mcp.tools.call. Downstream HTTP spans appear automatically.
- Observability shows the full chain with timings and errors. Product analytics logs a tiny set of business events with the same trace_id. You can click from outcome_emitted to the exact failing HTTP call. No logging lines in tool code. 🛠️
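
The chain above can be sketched with three wrappers and zero telemetry inside the tool. This is a rough illustration under assumptions: Recorder, traceClient, traceServer, and the `trace` field on the request envelope are all hypothetical names, and the "transport" is a direct function call rather than a real MCP connection. In a real stack the trace context would travel in request metadata and the spans would go to an OpenTelemetry tracer.

```typescript
// Sketch only. Recorder, Trace, and the `trace` envelope field are hypothetical.
interface Span { name: string; traceId: string; spanId: string; parentId?: string; error?: string }
interface Trace { traceId: string; parentId?: string }

class Recorder {
  readonly spans: Span[] = [];
  private next = 0;
  // Records a span, then runs fn with a child context that points back at it.
  async run<T>(name: string, trace: Trace, fn: (child: Trace) => Promise<T>): Promise<T> {
    this.next += 1;
    const span: Span = { name, traceId: trace.traceId, spanId: `s${this.next}`, parentId: trace.parentId };
    this.spans.push(span);
    try {
      return await fn({ traceId: trace.traceId, parentId: span.spanId });
    } catch (err) {
      span.error = err instanceof Error ? err.message : String(err);
      throw err;
    }
  }
}

interface RpcRequest {
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
  trace?: Trace; // assumption: context rides along on the request envelope
}
type Transport = (req: RpcRequest) => Promise<unknown>;
type Tool = (args: Record<string, unknown>) => Promise<unknown>;

// Client-side interceptor: every outgoing call becomes an mcp.rpc span.
function traceClient(rec: Recorder, send: Transport) {
  return (req: RpcRequest, trace: Trace): Promise<unknown> =>
    rec.run("mcp.rpc", trace, (child) => send({ ...req, trace: child }));
}

// Server-side adapter: every tool invocation becomes an mcp.tools.call span.
function traceServer(rec: Recorder, registry: Record<string, Tool>): Transport {
  return (req: RpcRequest): Promise<unknown> => {
    const tool = registry[req.params.name];
    if (!tool) return Promise.reject(new Error(`unknown tool: ${req.params.name}`));
    return rec.run("mcp.tools.call", req.trace ?? { traceId: "untraced" }, () => tool(req.params.arguments));
  };
}

// Tool code stays clean: no spans, no logging, no vendor SDK.
const tools: Record<string, Tool> = {
  get_docs: async (args) => ({ docs: [`result for ${String(args.query)}`] }),
};

// Agent host: wraps the step, then calls the tool through the traced client.
async function planSearch(rec: Recorder, traceId: string): Promise<unknown> {
  const call = traceClient(rec, traceServer(rec, tools));
  return rec.run("agent.step", { traceId }, (step) =>
    call({ method: "tools/call", params: { name: "get_docs", arguments: { query: "otel" } } }, step));
}
```

Running planSearch records agent.step, mcp.rpc, and mcp.tools.call in that order, each pointing at its parent and all sharing one trace id. A failing tool marks its own span and every span above it with the error, which is what makes the click-through from an outcome to the failing call possible.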

The catch:

- Cost and noise explode if you trace everything. Sample aggressively. Summarize arguments, don’t store prompts by default.
- MCP and OTel AI conventions are still moving. Keep a thin translation layer for attributes so your charts don’t break next month.
- Double counting is real. Pick one side to emit product events. Use traces for the rest.
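
A rough sketch of the first two mitigations, assuming hypothetical helper names (shouldSample, summarizeArgs, translateAttrs) and a made-up attribute mapping. The sampler is a simple deterministic head sampler keyed on trace_id, one option among several; the target name "mcp.tool.name" is a placeholder, not an official convention.

```typescript
// Sketch only. Helper names and the attribute map are hypothetical.

// 32-bit FNV-1a, used here only to spread trace ids evenly. Not cryptographic.
function hash32(value: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < value.length; i++) {
    h ^= value.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Deterministic sampling: every span in a trace gets the same keep/drop
// verdict, so sampled traces stay complete instead of missing random spans.
function shouldSample(traceId: string, rate: number): boolean {
  return hash32(traceId) / 0x100000000 < rate;
}

// Record the shape of the arguments, not their content. A prompt becomes
// "string(N)", so nothing sensitive lands in the span by default.
function summarizeArgs(args: Record<string, unknown>): Record<string, string> {
  const summary: Record<string, string> = {};
  for (const key of Object.keys(args)) {
    const value = args[key];
    if (typeof value === "string") summary[key] = `string(${value.length})`;
    else if (Array.isArray(value)) summary[key] = `array(${value.length})`;
    else if (value === null) summary[key] = "null";
    else summary[key] = typeof value;
  }
  return summary;
}

// Thin translation layer: internal names on the left, whatever the current
// convention is on the right. The right-hand name is a placeholder.
const ATTRIBUTE_NAMES: Record<string, string> = {
  tool_name: "mcp.tool.name",
};

function translateAttrs(attrs: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(attrs)) {
    const value = attrs[key];
    if (value !== undefined) out[ATTRIBUTE_NAMES[key] ?? key] = value;
  }
  return out;
}
```

When a convention renames an attribute, only ATTRIBUTE_NAMES changes; interceptors and dashboards keep using the internal names.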

My take: do it once, do it at the seams, or pay the spaghetti tax forever. Build the adapters, wire the interceptors, ship the scrubber. That’s it.
Want a minimal TypeScript repo with this pattern wired up end to end? Comment “MCP BLUEPRINT” and I’ll share the skeleton. 🔍