@wiggitywhitney
Created February 26, 2026 20:44
Making GenAI Observable with OpenTelemetry

Associated Thunder episode: Making GenAI Observable with OpenTelemetry


What is OpenTelemetry?

OpenTelemetry is a model for how to translate system events into useful data for observability.

OpenTelemetry is a standard, a set of software, and a specification. (opentelemetry.io)

OTel is made up of:

  • APIs
  • SDKs
  • Tools (Collector, etc)
  • Protocol (OTLP) — a shared "glossary" for telemetry data
  • Semantic Conventions

OpenTelemetry provides an observability pattern you can use directly.


OTel and GenAI (so hot right now!)

3 Angles:

  1. Building/training Large Language Models (or any machine learning stuff) — not many doing this

  2. Building GenAI/LLM features in your software — e.g. seeing new furniture in your own home before buying

  3. Using GenAI/LLM in your coding workflow — e.g. Claude Code, GenAI agents (very helpful here!)


GenAI Semantic Conventions

For example, user-GenAI chat conversations have standardized trace data.

  1. Provide a consistent way to see GenAI data
  2. Describe GenAI actions with consistent patterns and rules
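To make the conventions concrete, here is a plain-Python sketch of the attributes a chat span might carry. Attribute names like `gen_ai.system` and `gen_ai.usage.input_tokens` come from the published OTel GenAI semantic conventions; the values (and the model name) are illustrative only.

```python
# A chat span shaped per the OTel GenAI semantic conventions. Keys such as
# "gen_ai.system" and "gen_ai.usage.input_tokens" are convention-defined;
# the specific values here are made up for illustration.
chat_span = {
    "name": "chat gpt-4o",  # convention: "{operation} {model}"
    "attributes": {
        "gen_ai.operation.name": "chat",
        "gen_ai.system": "openai",         # which provider served the call
        "gen_ai.request.model": "gpt-4o",  # model the caller asked for
        "gen_ai.usage.input_tokens": 214,  # prompt size
        "gen_ai.usage.output_tokens": 98,  # completion size
    },
}

# Because every instrumented app uses the same keys, a backend can aggregate
# token usage across services without per-vendor parsing logic.
total_tokens = (
    chat_span["attributes"]["gen_ai.usage.input_tokens"]
    + chat_span["attributes"]["gen_ai.usage.output_tokens"]
)
print(total_tokens)  # → 312
```

This is the payoff of semantic conventions: any backend that understands the keys can reason about any instrumented app.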

Building GenAI-Powered Features: How Can OTel Help?

OTel turns GenAI data into data you can understand and reason about (just like any other data).

Deterministic vs non-deterministic? Just an implementation detail!

GenAI features need user feedback alongside performance data.

OTel offers ways to store this feedback in metadata alongside system data.

GenAI success = happy user — as opposed to system success being something like a successful database write.
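One way to picture "feedback in metadata alongside system data" is a span event carrying a user rating. The event shape below mimics OTel's span events (a name, a timestamp, and attributes), but the `user_feedback` event name and its attribute keys are assumptions for illustration, not a published convention.

```python
import time

def record_feedback(span_events, rating, comment=""):
    """Attach a user-feedback event to an in-flight span's event list.
    The event name and attribute keys are hypothetical, not a standard."""
    span_events.append({
        "name": "user_feedback",
        "timestamp": time.time_ns(),
        "attributes": {
            "feedback.rating": rating,    # e.g. 1 = thumbs up, -1 = thumbs down
            "feedback.comment": comment,
        },
    })

events = []
record_feedback(events, rating=1, comment="answer was helpful")
# The feedback now travels in the same trace as latency and token counts,
# so "happy user" can be correlated with system behavior.
print(events[0]["attributes"]["feedback.rating"])  # → 1
```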

Key Definitions:

  • Tools: what LLMs use to access external systems — MCP servers, plug-ins
  • Agent: LLM using tools in a loop — Claude Code, Cursor, Windsurf, Handmade
  • LLM: Large Language Model
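The "LLM using tools in a loop" definition can be sketched in a few lines. `fake_llm` and the `get_time` tool are stand-ins so the sketch runs without a real model or provider SDK; a real agent would call an LLM API at that step.

```python
# Minimal agent loop: an LLM calling tools in a loop until it can answer.
def get_time(_args):
    return "14:32"  # stand-in tool; a real one would hit an external system

TOOLS = {"get_time": get_time}

def fake_llm(history):
    """Pretend LLM: asks for a tool once, then answers using its output."""
    if not any(m["role"] == "tool" for m in history):
        return {"tool_call": {"name": "get_time", "args": {}}}
    return {"answer": f"The time is {history[-1]['content']}."}

def agent(question, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):       # the loop is what makes it an "agent"
        reply = fake_llm(history)
        if "answer" in reply:
            return reply["answer"]
        call = reply["tool_call"]    # the LLM chose a tool; run it
        result = TOOLS[call["name"]](call["args"])
        history.append({"role": "tool", "content": result})
    return "gave up"

print(agent("What time is it?"))  # → The time is 14:32.
```

Each pass through that loop is a natural place for a span, which is why agents and OTel fit together so well.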

Using GenAI in Your Coding Workflow: How Can OTel Help?

Fundamental Challenge of Doing Observability on GenAI:

There is a lot of implicit knowledge LLMs don't know.

  1. OTel and semantic conventions have been around a long time and have wide adoption — LLMs were trained on this! They know it! It also makes instrumentation easier.

  2. More code and less-understood code in production — OMG! You really need OTel!

  3. Use an MCP server to feed OTel data back into coding agents — use a vendored one! Or write your own! Easier with OTel!

  4. OTel data piped back into the coding environment can help with code verification, debugging, and coding agents' "architectural blindness"
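Points 3 and 4 describe a feedback loop: condense recent trace data into text a coding agent can read. The span records and the summary format below are hypothetical; a real setup would pull spans from an observability backend, e.g. via an MCP server.

```python
# Sketch: turn recent spans into an agent-readable debugging summary.
# The record shape and field names are assumptions for illustration.
def summarize_spans(spans):
    """Return a short text report of failing spans for an agent's context."""
    lines = []
    for s in spans:
        if s["status"] == "ERROR":
            lines.append(f"{s['name']}: {s['status']} ({s['duration_ms']} ms)")
    return "\n".join(lines) or "no errors in window"

recent = [
    {"name": "GET /checkout", "status": "ERROR", "duration_ms": 1840},
    {"name": "GET /health", "status": "OK", "duration_ms": 3},
]
print(summarize_spans(recent))  # → GET /checkout: ERROR (1840 ms)
```

Handing the agent this summary instead of raw telemetry keeps its context small while grounding it in what production actually did.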

AND SO MUCH MORE — THIS IS THE BLEEDING EDGE
