GenAI Memory Operations — Non-Normative Implementation Spec (companion to open-telemetry/semantic-conventions#3250)
Why Memory Matters for Observability:
- Debugging: Understanding what an agent "remembers" during failures is crucial for root cause analysis.
- Performance: Memory retrieval latency directly impacts agent response times.
- Privacy/Compliance: Tracking what's stored helps with data retention compliance.
- Cost Optimization: Memory storage and retrieval operations have measurable costs.
Add the following values to the gen_ai.operation.name enum:
- id: search_memory
value: "search_memory"
brief: "Search/query memories"
stability: development
- id: update_memory
value: "update_memory"
brief: "Create or update (upsert) memory items"
stability: development
- id: delete_memory
value: "delete_memory"
brief: "Delete memory items"
stability: development
- id: create_memory_store
value: "create_memory_store"
brief: "Create/initialize a memory store"
stability: development
- id: delete_memory_store
value: "delete_memory_store"
brief: "Delete/deprovision a memory store"
stability: development
Notes:
- `store_memory` is removed; use `update_memory` for both create and update to avoid ambiguity. `update_memory` is an upsert. If the underlying system distinguishes create vs update, instrumentations MAY add a system-specific attribute to capture that outcome, but SHOULD keep `gen_ai.operation.name` as `update_memory`.
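The upsert rule above can be sketched as a small attribute builder (plain Python dicts rather than a specific OTel SDK; the `outcome` parameter and the `myvendor.memory.update.outcome` attribute name are illustrative, not part of the conventions):

```python
def update_memory_span_attributes(store_name, record_id, outcome=None):
    """Build the attribute set for an update_memory span.

    `outcome` ("created" vs "updated") is a hypothetical system-specific
    detail; the operation name stays "update_memory" either way, and
    "myvendor.memory.update.outcome" is an illustrative attribute name.
    """
    attrs = {
        "gen_ai.operation.name": "update_memory",  # always the upsert name
        "gen_ai.memory.store.name": store_name,
        "gen_ai.memory.record.id": record_id,
    }
    if outcome is not None:
        # Optional system-specific attribute capturing create-vs-update.
        attrs["myvendor.memory.update.outcome"] = outcome
    return attrs
```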
- id: gen_ai.memory.store.id
stability: development
type: string
brief: The unique identifier of the memory store.
examples: ["ms_abc123", "user-preferences-store"]
- id: gen_ai.memory.store.name
stability: development
type: string
brief: Human-readable name of the memory store.
examples: ["Customer Support Memory", "Shopping Preferences"]
- id: gen_ai.memory.record.id
stability: development
type: string
brief: The unique identifier of a memory item.
examples: ["mem_5j66UpCpwteGg4YSxUnt7lPY"]
- id: gen_ai.memory.scope
stability: development
type:
members:
- id: user
value: "user"
brief: "Scoped to a specific user"
stability: development
- id: conversation
value: "conversation"
brief: "Scoped to a conversation/thread. Context within a single conversation."
stability: development
- id: agent
value: "agent"
brief: "Scoped to a specific agent"
stability: development
- id: team
value: "team"
brief: "Shared across a team of agents"
stability: development
brief: The scope of the memory store or memory operation.
examples: ["user", "conversation", "agent"]
- id: gen_ai.memory.record.content
stability: development
type: any
brief: The content/value of the memory item.
note: |
> [!WARNING]
> This attribute may contain sensitive information including user/PII data.
>
> Instrumentations SHOULD NOT capture this by default.
examples:
- '{"preference": "dark_mode", "value": true}'
- id: gen_ai.memory.query
stability: development
type: string
brief: The search query used to retrieve memories.
note: |
> [!WARNING]
> This attribute may contain sensitive information.
examples: ["user dietary preferences", "past flight bookings"]
- id: gen_ai.memory.search.result.count
stability: development
type: int
brief: Number of memory items returned from a search operation.
examples: [3, 10]
- id: gen_ai.memory.search.similarity.threshold
stability: development
type: double
brief: Minimum similarity score threshold used for memory search.
examples: [0.7, 0.85]
- id: gen_ai.memory.expiration_date
stability: development
type: string
brief: Expiration date for the memory in ISO 8601 format.
examples: ["2025-12-31", "2026-01-15T00:00:00Z"]
- id: gen_ai.memory.importance
stability: development
type: double
brief: Importance score of the memory (0.0 to 1.0).
examples: [0.8, 0.95]
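Putting the attributes above together, an instrumentation might assemble them like this before attaching them to a span (a minimal sketch using a plain dict instead of an OTel SDK; the `capture_query` opt-in flag is a hypothetical configuration knob):

```python
def search_memory_attributes(provider, store_name, result_count,
                             threshold=None, query=None, capture_query=False):
    """Assemble search_memory span attributes.

    The query is sensitive, so it is only recorded when the
    (hypothetical) capture_query opt-in flag is enabled.
    """
    attrs = {
        "gen_ai.operation.name": "search_memory",
        "gen_ai.provider.name": provider,
        "gen_ai.memory.store.name": store_name,
        "gen_ai.memory.search.result.count": result_count,
    }
    if threshold is not None:  # conditionally required: only when filtering is used
        attrs["gen_ai.memory.search.similarity.threshold"] = threshold
    if query is not None and capture_query:  # opt-in only
        attrs["gen_ai.memory.query"] = query
    return attrs
```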
- id: span.gen_ai.create_memory_store.client
type: span
stability: development
span_kind: client
brief: >
Describes creation/initialization of a memory store.
note: |
The `gen_ai.operation.name` SHOULD be `create_memory_store`.
**Span name** SHOULD be `create_memory_store {gen_ai.memory.store.name}`
or `create_memory_store` if store name is not available.
attributes:
- ref: gen_ai.operation.name
requirement_level: required
- ref: gen_ai.provider.name
requirement_level: required
- ref: gen_ai.memory.store.id
requirement_level:
conditionally_required: when returned by the operation
- ref: gen_ai.memory.store.name
requirement_level: recommended
- ref: gen_ai.memory.scope
requirement_level: required
- ref: error.type
requirement_level:
conditionally_required: if the operation ended in an error
- id: span.gen_ai.search_memory.client
type: span
stability: development
span_kind: client
brief: >
Describes a memory search/retrieval operation - querying a memory store
for relevant memories.
note: |
The `gen_ai.operation.name` SHOULD be `search_memory`.
**Span name** SHOULD be `search_memory {gen_ai.memory.store.name}`
or `search_memory` if store name is not available.
attributes:
- ref: gen_ai.operation.name
requirement_level: required
- ref: gen_ai.provider.name
requirement_level: required
- ref: gen_ai.memory.store.id
requirement_level:
conditionally_required: if applicable
- ref: gen_ai.memory.store.name
requirement_level: recommended
- ref: gen_ai.memory.query
requirement_level: opt_in
- ref: gen_ai.memory.search.result.count
requirement_level: recommended
- ref: gen_ai.memory.search.similarity.threshold
requirement_level:
conditionally_required: when similarity filtering is used
- ref: gen_ai.agent.id
requirement_level:
conditionally_required: when searching agent-scoped memory
- ref: gen_ai.conversation.id
requirement_level:
conditionally_required: when searching conversation-scoped memory
- ref: error.type
requirement_level:
conditionally_required: if the operation ended in an error
- id: span.gen_ai.update_memory.client
type: span
stability: development
span_kind: client
brief: >
Describes a memory update operation (upsert) - creating or modifying
memory items in a memory store.
note: |
The `gen_ai.operation.name` SHOULD be `update_memory`.
This operation is an upsert to avoid ambiguity between create vs update.
**Span name** SHOULD be `update_memory {gen_ai.memory.store.name}`
or `update_memory` if store name is not available.
attributes:
- ref: gen_ai.operation.name
requirement_level: required
- ref: gen_ai.provider.name
requirement_level: required
- ref: gen_ai.memory.store.id
requirement_level:
conditionally_required: if applicable
- ref: gen_ai.memory.store.name
requirement_level: recommended
- ref: gen_ai.memory.record.id
requirement_level:
conditionally_required: when available (provided or returned)
- ref: gen_ai.memory.record.content
requirement_level: opt_in
- ref: gen_ai.memory.expiration_date
requirement_level:
conditionally_required: if expiration is set
- ref: gen_ai.memory.importance
requirement_level: recommended
- ref: gen_ai.agent.id
requirement_level:
conditionally_required: when operating on agent-scoped memory
- ref: gen_ai.conversation.id
requirement_level:
conditionally_required: when operating on conversation-scoped memory
- ref: error.type
requirement_level:
conditionally_required: if the operation ended in an error
- id: span.gen_ai.delete_memory.client
type: span
stability: development
span_kind: client
brief: >
Describes a memory deletion operation - removing one or more memory items.
note: |
The `gen_ai.operation.name` SHOULD be `delete_memory`.
**Span name** SHOULD be `delete_memory {gen_ai.memory.store.name}`
or `delete_memory` if store name is not available.
Deletion semantics SHOULD be interpreted as follows:
- If `gen_ai.memory.record.id` is set, delete a specific memory item.
- If `gen_ai.memory.record.id` is not set, delete all memory items in the specified scope.
attributes:
- ref: gen_ai.operation.name
requirement_level: required
- ref: gen_ai.provider.name
requirement_level: required
- ref: gen_ai.memory.store.id
requirement_level:
conditionally_required: if applicable
- ref: gen_ai.memory.store.name
requirement_level: recommended
- ref: gen_ai.memory.scope
requirement_level: required
- ref: gen_ai.memory.record.id
requirement_level:
conditionally_required: when deleting a specific memory item
- ref: gen_ai.agent.id
requirement_level:
conditionally_required: when deleting agent-scoped memory
- ref: gen_ai.conversation.id
requirement_level:
conditionally_required: when deleting conversation-scoped memory
- ref: error.type
requirement_level:
conditionally_required: if the operation ended in an error
- id: span.gen_ai.delete_memory_store.client
type: span
stability: development
span_kind: client
brief: >
Describes deletion/deprovisioning of a memory store.
note: |
The `gen_ai.operation.name` SHOULD be `delete_memory_store`.
**Span name** SHOULD be `delete_memory_store {gen_ai.memory.store.name}`
or `delete_memory_store` if store name is not available.
attributes:
- ref: gen_ai.operation.name
requirement_level: required
- ref: gen_ai.provider.name
requirement_level: required
- ref: gen_ai.memory.store.id
requirement_level:
conditionally_required: if applicable
- ref: gen_ai.memory.store.name
requirement_level: recommended
- ref: error.type
requirement_level:
conditionally_required: if the operation ended in an error
The following 6 stories demonstrate how the proposed memory spans and attributes work in real-world scenarios. Each story shows the exact trace hierarchy and span attributes that instrumentations would produce.
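All five span definitions above share the same naming rule; a minimal helper might look like this (the function name is illustrative):

```python
def memory_span_name(operation, store_name=None):
    """Build the span name per the convention:
    '{operation} {gen_ai.memory.store.name}', falling back to the bare
    operation name when the store name is unavailable."""
    return f"{operation} {store_name}" if store_name else operation
```

For example, `memory_span_name("search_memory", "user-history")` yields `search_memory user-history`, while `memory_span_name("delete_memory")` yields just `delete_memory`.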
Scenario: Sarah contacts TechCorp support about a billing issue. The support agent maintains conversation context within a session while retrieving relevant information from past interactions.
Trace:
invoke_agent "CustomerSupportBot" (1500ms)
├── create_memory_store "session-context" (15ms)
│ ├── gen_ai.operation.name: "create_memory_store"
│ ├── gen_ai.provider.name: "pinecone"
│ ├── gen_ai.memory.store.id: "store_session_abc123"
│ ├── gen_ai.memory.store.name: "session-context"
│ ├── gen_ai.memory.scope: "conversation"
│ └── gen_ai.conversation.id: "conv_session_abc123"
│
├── search_memory "user-history" (45ms)
│ ├── gen_ai.operation.name: "search_memory"
│ ├── gen_ai.provider.name: "pinecone"
│ ├── gen_ai.memory.store.id: "store_user_sarah_123_history"
│ ├── gen_ai.memory.store.name: "user-history"
│   ├── gen_ai.memory.query: "billing issue duplicate charge" [opt-in]
│ ├── gen_ai.memory.search.similarity.threshold: 0.7
│ ├── gen_ai.memory.search.result.count: 3
│ └── gen_ai.conversation.id: "conv_session_abc123"
│
├── chat "gpt-4" (1200ms)
│ ├── gen_ai.operation.name: "chat"
│ ├── gen_ai.usage.input_tokens: 1500
│ └── gen_ai.usage.output_tokens: 250
│
├── update_memory "session-context" (20ms)
│ ├── gen_ai.operation.name: "update_memory"
│ ├── gen_ai.provider.name: "pinecone"
│ ├── gen_ai.memory.store.name: "session-context"
│ ├── gen_ai.memory.record.id: "turn_001"
│ ├── gen_ai.memory.expiration_date: "2026-02-25T17:30:00Z" (24h TTL)
│ └── gen_ai.conversation.id: "conv_session_abc123"
│
│ ... (additional conversation turns: search → chat → update) ...
│
└── delete_memory "session-context" (25ms)
├── gen_ai.operation.name: "delete_memory"
├── gen_ai.provider.name: "pinecone"
├── gen_ai.memory.store.id: "store_session_abc123"
├── gen_ai.memory.store.name: "session-context"
├── gen_ai.memory.scope: "conversation"
└── gen_ai.conversation.id: "conv_session_abc123"
Key spans demonstrated: create_memory_store, search_memory, update_memory, delete_memory
Why observability matters: If Sarah says "I already told you my account number" and the agent asks again, engineers can check the search_memory span — was result_count = 0? Was similarity.threshold too high? Did the session memory expire (expiration_date)?
Scenario: Mike uses ShopSmart, which learns his preferences over time. He explicitly states he prefers sustainable products, the system infers he likes minimalist designs from browsing, and eventually he exercises GDPR deletion rights.
Trace (preference learning):
invoke_agent "ShoppingAssistant" (2000ms)
├── create_memory_store "user-preferences" (20ms)
│ ├── gen_ai.operation.name: "create_memory_store"
│ ├── gen_ai.provider.name: "pinecone"
│ ├── gen_ai.memory.store.id: "store_user_mike_456_prefs"
│ ├── gen_ai.memory.store.name: "user-preferences"
│   └── gen_ai.memory.scope: "user"
│
├── update_memory "user-preferences" (30ms)
│ ├── gen_ai.operation.name: "update_memory"
│ ├── gen_ai.memory.store.name: "user-preferences"
│ ├── gen_ai.memory.record.id: "pref_sustainable_001"
│ ├── gen_ai.memory.importance: 0.9 (explicit preference)
│ └── gen_ai.memory.record.content: '{"preference": "sustainable_products"}' [opt-in]
│
├── update_memory "user-preferences" (25ms)
│ ├── gen_ai.operation.name: "update_memory"
│ ├── gen_ai.memory.store.name: "user-preferences"
│ ├── gen_ai.memory.record.id: "pref_minimalist_002"
│ └── gen_ai.memory.importance: 0.75
│
├── search_memory "user-preferences" (40ms)
│ ├── gen_ai.operation.name: "search_memory"
│ ├── gen_ai.memory.store.name: "user-preferences"
│   ├── gen_ai.memory.query: "laptop recommendations" [opt-in]
│ ├── gen_ai.memory.search.similarity.threshold: 0.6
│ └── gen_ai.memory.search.result.count: 5
│
├── chat "gpt-4" (1100ms)
│
├── update_memory "user-preferences" (25ms)
│ ├── gen_ai.operation.name: "update_memory"
│ ├── gen_ai.memory.record.id: "pref_sustainable_001"
│ └── gen_ai.memory.importance: 0.1 (user downgraded)
│
└── delete_memory "user-preferences" (35ms)
├── gen_ai.operation.name: "delete_memory" (GDPR request)
├── gen_ai.memory.store.id: "store_user_mike_456_prefs"
├── gen_ai.memory.store.name: "user-preferences"
└── gen_ai.memory.scope: "user" (bulk delete all)
Key spans demonstrated: create_memory_store, update_memory (with importance), search_memory (with similarity threshold), delete_memory (GDPR bulk)
Why observability matters: When Mike says "Why did you recommend leather boots? I said I prefer sustainable products!" — engineers can trace the update_memory span that lowered importance to 0.1 for pref_sustainable_001, and the search_memory span to see if the threshold filtered it out.
Scenario: ResearchCo deploys a team of specialized agents — Researcher, Analyst, Writer — to produce an EV market research report. They share findings via team-scoped memory while maintaining private procedural memory.
Trace:
invoke_agent "ResearchCrew" (8500ms)
│
├── create_memory_store "ev-research-team" (20ms)
│ ├── gen_ai.operation.name: "create_memory_store"
│ ├── gen_ai.provider.name: "milvus"
│ ├── gen_ai.memory.store.id: "store_team_ev_research_2025"
│ ├── gen_ai.memory.store.name: "ev-research-team"
│   └── gen_ai.memory.scope: "team"
│
├── invoke_agent "Researcher" (3000ms)
│ ├── chat "gpt-4" (1800ms)
│ │
│ ├── update_memory "ev-research-team" (30ms)
│ │ ├── gen_ai.operation.name: "update_memory"
│ │ ├── gen_ai.memory.store.name: "ev-research-team"
│ │ ├── gen_ai.memory.record.id: "finding_market_size_001"
│ │ └── gen_ai.agent.id: "researcher_agent" (attribution)
│ │
│ └── update_memory "ev-research-team" (25ms)
│ ├── gen_ai.operation.name: "update_memory"
│ ├── gen_ai.memory.record.id: "finding_regional_data_002"
│ └── gen_ai.agent.id: "researcher_agent"
│
├── invoke_agent "Analyst" (2500ms)
│ ├── create_memory_store "analyst-procedures" (15ms)
│ │ ├── gen_ai.operation.name: "create_memory_store"
│ │ ├── gen_ai.memory.store.name: "analyst-procedures"
│   │   └── gen_ai.memory.scope: "agent" (private)
│ │
│ ├── update_memory "analyst-procedures" (20ms)
│ │ ├── gen_ai.operation.name: "update_memory"
│ │ ├── gen_ai.memory.store.name: "analyst-procedures"
│ │ ├── gen_ai.memory.scope: "agent"
│ │ └── gen_ai.agent.id: "analyst_agent"
│ │
│ ├── chat "gpt-4" (1500ms)
│ │
│ └── update_memory "ev-research-team" (30ms)
│ ├── gen_ai.operation.name: "update_memory"
│ ├── gen_ai.memory.store.name: "ev-research-team"
│ ├── gen_ai.memory.record.id: "analysis_growth_projection_003"
│ └── gen_ai.agent.id: "analyst_agent"
│
└── invoke_agent "Writer" (2800ms)
├── search_memory "ev-research-team" (50ms)
│ ├── gen_ai.operation.name: "search_memory"
│ ├── gen_ai.memory.store.name: "ev-research-team"
│   ├── gen_ai.memory.query: "EV market size growth projections" [opt-in]
│ ├── gen_ai.memory.search.result.count: 8
│ └── gen_ai.agent.id: "writer_agent"
│
└── chat "gpt-4" (2500ms)
Key spans demonstrated: create_memory_store (team + agent scope), update_memory (with gen_ai.agent.id attribution), search_memory (cross-agent retrieval)
Why observability matters: If the final report is missing the Analyst's growth projections, engineers can trace: Did analyst_agent actually call update_memory? Did writer_agent's search_memory return the expected 8 results? The gen_ai.agent.id attribute enables attribution — which agent contributed what.
Scenario: CloudAssist is a B2B SaaS platform providing AI assistants to enterprise customers. Each tenant (ACME Corp, TechCo) has isolated data via namespaces, but all share access to global product documentation.
Trace (tenant onboarding → usage → offboarding):
# Tenant Onboarding
create_memory_store "tenant-store" (50ms)
├── gen_ai.operation.name: "create_memory_store"
├── gen_ai.provider.name: "pinecone"
├── gen_ai.memory.store.id: "store_tenant_acme"
├── gen_ai.memory.store.name: "tenant-store"
└── gen_ai.memory.scope: "team" (tenant-wide isolation)
# Tenant employee stores data
update_memory "tenant-store" (25ms)
├── gen_ai.operation.name: "update_memory"
├── gen_ai.memory.store.name: "tenant-store"
└── gen_ai.memory.record.id: "acme_q4_projection_001"
# Tenant-scoped search (returns only ACME data)
search_memory "tenant-store" (40ms)
├── gen_ai.operation.name: "search_memory"
├── gen_ai.memory.store.name: "tenant-store"
├── gen_ai.memory.query: "Q4 revenue projections" [opt-in]
└── gen_ai.memory.search.result.count: 12
# Global search (shared product docs, no namespace)
search_memory "global-docs" (35ms)
├── gen_ai.operation.name: "search_memory"
├── gen_ai.memory.store.name: "global-docs"
├── gen_ai.memory.query: "CloudAssist API rate limits" [opt-in]
└── gen_ai.memory.search.result.count: 3 (no namespace = global)
# Tenant offboarding (complete removal)
delete_memory_store "tenant-store" (100ms)
├── gen_ai.operation.name: "delete_memory_store"
├── gen_ai.memory.store.id: "store_tenant_acme"
└── gen_ai.memory.store.name: "tenant-store"
Key spans demonstrated: create_memory_store (namespaced), update_memory, search_memory (namespaced vs global), delete_memory_store (tenant offboarding)
Why observability matters: To verify tenant data isolation, query traces for search_memory spans: a search with namespace: "tenant_acme" should never return results from tenant_techco. The search.result.count across namespaces provides isolation verification. During offboarding, the delete_memory_store span confirms complete data removal.
Scenario: Three real-world debugging and compliance scenarios that demonstrate WHY memory observability matters.
Scenario A — Agent "forgot" context (debugging):
invoke_agent "CustomerSupportBot" (2000ms)
└── search_memory "conversation-history" (45ms)
├── gen_ai.operation.name: "search_memory"
├── gen_ai.memory.store.name: "conversation-history"
├── gen_ai.memory.search.result.count: 0 ← BUG: No results!
├── gen_ai.memory.search.similarity.threshold: 0.95 ← Root cause: too strict
└── gen_ai.conversation.id: "conv_xyz789"
Fix: Lower similarity.threshold from 0.95 to 0.7. Relevant memories scored 0.7–0.85 but were filtered out.
Scenario B — Compliance audit (who accessed what):
-- Query traces by conversation_id to get full audit trail
SELECT timestamp, operation_name, agent_id, memory_store_id, namespace, result_count
FROM traces
WHERE conversation_id = 'conv_audit_12345'
AND namespace = 'ns_user_sarah_123'
ORDER BY timestamp;

Scenario C — Slow responses (performance):
invoke_agent "ShoppingAssistant" (5200ms)
├── search_memory "user-preferences" (850ms) ✓
├── search_memory "product-catalog" (3800ms) ✗ SLOW
│ ├── gen_ai.memory.search.result.count: 50000 ← Too many results!
│ └── (no similarity.threshold set) ← Missing filter!
└── chat "gpt-4" (450ms) ✓
Fix: Add similarity.threshold: 0.8 and limit results. Latency drops from 3800ms to 200ms.
Key insight: The search.result.count and similarity.threshold attributes on search_memory spans enable root-cause analysis for all three scenarios without any code changes.
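These checks can be run directly over exported telemetry. A sketch (spans modeled as plain dicts; the latency and result-count cutoffs are illustrative defaults, not normative values):

```python
def flag_memory_search_issues(spans, slow_ms=1000, max_results=1000):
    """Scan exported search_memory spans for the two failure modes above:
    zero results (threshold too strict) and slow, unfiltered searches."""
    issues = []
    for span in spans:
        attrs = span["attributes"]
        if attrs.get("gen_ai.operation.name") != "search_memory":
            continue  # only search spans carry result.count/threshold
        count = attrs.get("gen_ai.memory.search.result.count")
        if count == 0:
            issues.append((span["name"],
                           "zero results; similarity threshold may be too strict"))
        elif span["duration_ms"] > slow_ms and (count or 0) > max_results:
            issues.append((span["name"],
                           "slow search with huge result set; add a similarity threshold"))
    return issues
```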
Scenario: Alex is a DataAware user who exercises various GDPR deletion rights — from selective item deletion to complete account removal.
Trace (cascading deletion):
# Phase 1: Selective deletion (single item)
delete_memory "user-preferences" (20ms)
├── gen_ai.operation.name: "delete_memory"
├── gen_ai.memory.store.id: "store_user_alex_789_prefs"
├── gen_ai.memory.store.name: "user-preferences"
├── gen_ai.memory.record.id: "pref_embarrassing_001" (specific item)
└── gen_ai.memory.scope: "user"
# Phase 2: Delete by scope (all conversation history)
delete_memory "conversation-history" (35ms)
├── gen_ai.operation.name: "delete_memory"
├── gen_ai.memory.store.id: "store_user_alex_789_history"
├── gen_ai.memory.store.name: "conversation-history"
└── gen_ai.memory.scope: "user"
# Phase 3: Bulk delete all items (scope-based)
delete_memory "user-preferences" (50ms)
├── gen_ai.operation.name: "delete_memory"
├── gen_ai.memory.store.id: "store_user_alex_789_prefs"
└── gen_ai.memory.scope: "user" (all items in scope)
delete_memory "conversation-history" (45ms)
├── gen_ai.operation.name: "delete_memory"
├── gen_ai.memory.store.id: "store_user_alex_789_history"
└── gen_ai.memory.scope: "user"
# Phase 4: Delete the stores themselves (complete removal)
delete_memory_store "user-preferences" (30ms)
├── gen_ai.operation.name: "delete_memory_store"
├── gen_ai.memory.store.id: "store_user_alex_789_prefs"
└── gen_ai.memory.store.name: "user-preferences"
delete_memory_store "conversation-history" (25ms)
├── gen_ai.operation.name: "delete_memory_store"
├── gen_ai.memory.store.id: "store_user_alex_789_history"
└── gen_ai.memory.store.name: "conversation-history"
delete_memory_store "personal-data" (25ms)
├── gen_ai.operation.name: "delete_memory_store"
├── gen_ai.memory.store.id: "store_user_alex_789_personal"
└── gen_ai.memory.store.name: "personal-data"
Key spans demonstrated: delete_memory (by ID, by scope) and delete_memory_store (complete removal)
Why observability matters: GDPR requires proving data was actually deleted. The trace provides:
- Selective deletion: `gen_ai.memory.record.id` identifies the exact item removed
- Scope-based deletion: `gen_ai.memory.scope: "user"` confirms all user data was removed
- Store deletion: `delete_memory_store` spans confirm complete deprovisioning
- Audit trail: timestamps on all spans provide the compliance timeline
| Story | `create_memory_store` | `search_memory` | `update_memory` | `delete_memory` | `delete_memory_store` |
|---|---|---|---|---|---|
| 1. Customer Support | ✅ session scope | ✅ similarity threshold (long-term store) | ✅ expiration_date (short-term store) | ✅ session cleanup | |
| 2. Shopping Assistant | ✅ user scope | ✅ similarity threshold | ✅ importance | ✅ GDPR bulk (scope) | |
| 3. Multi-Agent Research | ✅ team + agent scope | ✅ cross-agent retrieval | ✅ agent attribution | | |
| 4. Multi-Tenant SaaS | ✅ namespace isolation | ✅ namespaced + global | ✅ namespaced | | ✅ tenant offboarding |
| 5. Compliance & Debug | | ✅ threshold debugging, audit trail | ✅ audit trail | | |
| 6. GDPR Lifecycle | | | | ✅ by ID, by scope | ✅ complete removal |
| Attribute | Stories | Example Values |
|---|---|---|
| `gen_ai.memory.store.id` | 1–6 | "store_session_abc123", "store_tenant_acme" |
| `gen_ai.memory.store.name` | 1–6 | "session-context", "user-preferences", "ev-research-team" |
| `gen_ai.memory.record.id` | 1, 2, 3, 5, 6 | "turn_001", "pref_sustainable_001", "finding_market_size_001" |
| `gen_ai.memory.scope` | 1–4, 6 | "conversation", "user", "agent", "team" |
| `gen_ai.memory.query` | 1–5 | "billing issue", "laptop recommendations" (opt-in) |
| `gen_ai.memory.record.content` | 2 | '{"preference": "sustainable_products"}' (opt-in) |
| `gen_ai.memory.importance` | 2 | 0.9, 0.75, 0.1 |
| `gen_ai.memory.expiration_date` | 1 | "2026-02-25T17:30:00Z" |
| `gen_ai.memory.search.result.count` | 1–5 | 3, 5, 8, 12, 50000 |
| `gen_ai.memory.search.similarity.threshold` | 1, 2, 5 | 0.7, 0.6, 0.95 |
| `gen_ai.agent.id` | 3, 5 | "researcher_agent", "analyst_agent", "writer_agent" |
| `gen_ai.conversation.id` | 1, 5 | "conv_session_abc123", "conv_xyz789" |
This section shows how popular LangChain memory classes map to the proposed semantic conventions, providing guidance for instrumentation authors.
| LangChain Class | `gen_ai.operation.name` | `gen_ai.memory.scope` | Notes |
|---|---|---|---|
| `ConversationBufferMemory.save_context()` | `update_memory` | `conversation` | Each turn stored with conversation context |
| `ConversationBufferMemory.load_memory_variables()` | `search_memory` | `conversation` | Retrieves conversation history |
| `ConversationBufferMemory.clear()` | `delete_memory` | `conversation` | Clears conversation-scoped memory |
| `VectorStoreRetrieverMemory` | `search_memory` | `user` | Similarity-based retrieval with `similarity.threshold` |
| `EntityMemory.save_context()` | `update_memory` | `user` | Extracts entities; maps to importance scoring |
| `EntityMemory.load_memory_variables()` | `search_memory` | `user` | Entity lookup with `similarity.threshold` |
| `ConversationSummaryMemory.save_context()` | `update_memory` | `conversation` | Running summary is upserted on each save |
| Shared VectorStore (multi-agent) | `search_memory` / `update_memory` | `team` | Use `gen_ai.agent.id` for attribution |
# What LangChain does internally:
memory.save_context(
inputs={"input": "What's my order status?"},
outputs={"output": "Your order #1234 shipped yesterday."}
)
# What the instrumentation emits:
# Span: update_memory "session-context"
# gen_ai.operation.name: "update_memory"
# gen_ai.memory.store.name: "session-context"
# gen_ai.memory.scope: "conversation"
# gen_ai.memory.record.id: "turn_001"
# gen_ai.conversation.id: "conv_abc123"
#   gen_ai.memory.expiration_date: "2026-02-25T18:00:00Z"

# EntityMemory extracts entities and scores them
# High importance = explicit user preference
# Medium importance = inferred from behavior
# Span: update_memory "user-preferences"
# gen_ai.operation.name: "update_memory"
# gen_ai.memory.store.name: "user-preferences"
# gen_ai.memory.scope: "user"
# gen_ai.memory.importance: 0.9
#   gen_ai.memory.record.content: '{"entity": "sustainability", "preference": true}' [opt-in]

# In CrewAI/LangGraph patterns, agents share a VectorStore backend.
# Each agent's writes are attributed via gen_ai.agent.id.
# Agent "researcher" writes to shared memory:
# Span: update_memory "team-knowledge"
# gen_ai.operation.name: "update_memory"
# gen_ai.memory.store.name: "team-knowledge"
# gen_ai.memory.scope: "team"
# gen_ai.agent.id: "researcher_agent"
# Agent "writer" reads from shared memory:
# Span: search_memory "team-knowledge"
# gen_ai.operation.name: "search_memory"
# gen_ai.memory.store.name: "team-knowledge"
# gen_ai.memory.scope: "team"
# gen_ai.memory.search.result.count: 8
#   gen_ai.agent.id: "writer_agent"

| Framework | Memory API | `gen_ai.operation.name` |
|---|---|---|
| Mem0 | `client.add()` | `update_memory` |
| Mem0 | `client.search()` | `search_memory` |
| Mem0 | `client.delete()` | `delete_memory` |
| CrewAI | `memory.remember()` | `update_memory` |
| CrewAI | `memory.recall()` | `search_memory` |
| CrewAI | `memory.forget()` | `delete_memory` |
| CrewAI | `memory.reset()` | `delete_memory` (scope-based) |
| AutoGen | `memory.add()` | `update_memory` |
| AutoGen | `memory.query()` | `search_memory` |
| AutoGen | `memory.clear()` | `delete_memory` |
| Letta/MemGPT | `blocks.create()` | `update_memory` |
| Letta/MemGPT | `blocks.retrieve()` | `search_memory` |
| Letta/MemGPT | `blocks.delete()` | `delete_memory` |
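For instrumentation authors, the table above is effectively a lookup keyed by (framework, method). A sketch mirroring it as data (method names copied from the table; the key shape is illustrative):

```python
# gen_ai.operation.name by (framework, memory API), mirroring the table above.
OPERATION_BY_METHOD = {
    ("Mem0", "client.add"): "update_memory",
    ("Mem0", "client.search"): "search_memory",
    ("Mem0", "client.delete"): "delete_memory",
    ("CrewAI", "memory.remember"): "update_memory",
    ("CrewAI", "memory.recall"): "search_memory",
    ("CrewAI", "memory.forget"): "delete_memory",
    ("CrewAI", "memory.reset"): "delete_memory",  # scope-based bulk delete
    ("AutoGen", "memory.add"): "update_memory",
    ("AutoGen", "memory.query"): "search_memory",
    ("AutoGen", "memory.clear"): "delete_memory",
    ("Letta/MemGPT", "blocks.create"): "update_memory",
    ("Letta/MemGPT", "blocks.retrieve"): "search_memory",
    ("Letta/MemGPT", "blocks.delete"): "delete_memory",
}
```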
This section documents how each major framework implements memory scoping and how their concepts map to the gen_ai.memory.scope enum values.
| Scope Value | Google ADK | AWS Bedrock AgentCore | Azure AI Foundry | Mem0 | CrewAI | Letta/MemGPT |
|---|---|---|---|---|---|---|
user |
✅ user_id param (required on all APIs) |
✅ actorId param (short-term) / namespace /actors/{id}/ (long-term) |
✅ scope="user_123" or scope="{{$userId}}" (auto-resolved from auth) |
✅ user_id param on add()/search() |
✅ scope("/user/alice") or source="user:alice" |
human block label convention (per-agent, no cross-agent user concept) |
conversation |
session_id exists but is ignored in search — memory is cross-conversation by design |
✅ sessionId param on create_event (short-term memory only) |
❌ Not a first-class concept — memory is explicitly cross-session | ✅ run_id param |
❌ No explicit session/conversation param | |
agent |
❌ No concept — all agents in same app_name share memory |
✅ memoryId (one memory resource per agent) or namespace segment |
✅ memory_store_name (one store per agent) |
✅ agent_id param |
✅ memory.scope("/agent/researcher") |
✅ Agent ID — blocks + archival scoped to agent (primary isolation unit) |
team |
❌ No concept (could encode in app_name) |
❌ Convention only (shared memoryId + broad namespace prefix) |
✅ scope="team_alpha" — docs mention "a team, or another identifier" |
❌ No built-in (filter composition across org_id) |
✅ memory.slice(scopes=["/agent/a", "/agent/b"]) |
✅ Shared memory block attached to N agents |
Scoping is via two required parameters on every API call: app_name (application-level isolation) and user_id (user-level isolation). There is no agent-level or team-level scoping — all agents within the same app_name share memory for a given user_id.
# BaseMemoryService contract — scoping is always (app_name, user_id)
await memory.search_memory(app_name="myapp", user_id="user_123", query="preferences")
# session_id is available but NOT used for search scoping
# InMemoryService stores by session_id but searches across ALL sessions for a user
# VertexAiMemoryBankService discards session_id entirely

Instrumentation guidance: Set `gen_ai.memory.scope` to `user` for all ADK memory operations, since `user_id` is the primary and only meaningful isolation boundary.
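Under that guidance, a wrapper around an ADK-style search might capture attributes like this (a synchronous sketch; `search_fn` stands in for the real async `BaseMemoryService.search_memory`, and using `app_name` as the store name is an assumption):

```python
def traced_adk_search(search_fn, app_name, user_id, query):
    """Call an ADK-style memory search and derive span attributes.

    ADK memory is always user-scoped, so gen_ai.memory.scope is fixed
    to "user"; the query itself is omitted here (it would be opt-in).
    """
    attrs = {
        "gen_ai.operation.name": "search_memory",
        "gen_ai.memory.scope": "user",         # user_id is the only boundary
        "gen_ai.memory.store.name": app_name,  # assumption: app-level store
    }
    results = search_fn(app_name=app_name, user_id=user_id, query=query)
    attrs["gen_ai.memory.search.result.count"] = len(results)
    return results, attrs
```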
Uses a hierarchical namespace-path system for long-term memory and explicit actorId/sessionId for short-term memory (events).
# Short-term: explicit actor + session scoping
client.create_event(memoryId='mem-abc', actorId='User1', sessionId='conv-001', ...)
# Long-term: namespace-path based scoping
client.retrieve_memory_records(
memoryId='mem-abc',
namespace='/strategies/semantic1/actors/User1/', # prefix-matched
searchCriteria={'searchQuery': 'preferences', 'topK': 5}
)
# Broad retrieval (all actors)
client.retrieve_memory_records(memoryId='mem-abc', namespace='/', ...)

Instrumentation guidance: Map `actorId` → `user` scope, `sessionId` → `conversation` scope. The `memoryId` implicitly provides agent-level isolation. Team scope would require convention-based namespace paths.
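That mapping can be sketched as a small helper (a heuristic, not a normative rule; when both `actorId` and `sessionId` are present, as on `create_event`, the narrower `conversation` scope is preferred here):

```python
def bedrock_scope(namespace=None, actor_id=None, session_id=None):
    """Map Bedrock AgentCore call parameters to gen_ai.memory.scope."""
    if session_id is not None:
        return "conversation"  # short-term event APIs
    if actor_id is not None:
        return "user"          # short-term actor scoping
    if namespace is not None and "/actors/" in namespace:
        return "user"          # long-term per-actor namespace prefix
    return None                # e.g. namespace='/' broad retrieval across actors
```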
Uses a single scope string parameter on every API call. The value is developer-defined — there is no predefined enum.
# User-scoped (manual)
client.memory_stores.search_memories(name="store", scope="user_123", ...)
# User-scoped (auto-resolved from auth token)
client.memory_stores.search_memories(name="store", scope="{{$userId}}", ...)
# Team-scoped (developer-defined)
client.memory_stores.search_memories(name="store", scope="team_alpha", ...)
# GDPR deletion by scope
client.memory_stores.delete_scope(name="store", scope="user_123")

Instrumentation guidance: The `scope` parameter value directly maps to `gen_ai.memory.scope`. When the value matches a well-known pattern (e.g. contains "user", matches `{{$userId}}`), use `user`. For team identifiers, use `team`. The `memory_store_name` provides agent-level isolation.
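Because Azure scope values are developer-defined, any classification is best-effort. A sketch of the pattern matching described above (the substring checks are illustrative heuristics):

```python
def azure_scope_kind(scope):
    """Classify a developer-defined Azure AI Foundry scope string into a
    gen_ai.memory.scope value, or None when no pattern matches."""
    if scope == "{{$userId}}" or "user" in scope.lower():
        return "user"   # explicit user id or the auto-resolved template
    if "team" in scope.lower():
        return "team"
    return None         # unrecognized developer-defined value
```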
Uses flat entity-ID tags as parameters on every API call. No hierarchy — isolation is filter-based.
```python
# User-scoped
client.add(messages, user_id="user_123")
client.search(query, user_id="user_123")

# Agent-scoped
client.add(messages, agent_id="support_bot")
client.search(query, agent_id="support_bot")

# Run/conversation-scoped
client.add(messages, run_id="conv_456")

# Combined scoping
client.search(query, user_id="user_123", agent_id="support_bot")
```

Instrumentation guidance: Map the primary scoping parameter used in the API call: `user_id` → `user`, `agent_id` → `agent`, `run_id` → `conversation`. When multiple are present, use the narrowest applicable scope.
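A minimal sketch of that mapping follows. The narrowness ordering (conversation, then user, then agent) is an assumption for illustration; the guidance only says "narrowest applicable":

```python
def resolve_memory_scope(user_id=None, agent_id=None, run_id=None):
    """Pick the narrowest applicable gen_ai.memory.scope value."""
    if run_id is not None:
        return "conversation"  # narrowest: a single run/conversation
    if user_id is not None:
        return "user"
    if agent_id is not None:
        return "agent"
    return None  # no scoping parameter supplied
```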
Uses hierarchical path-based scopes with an expressive tree model.
```python
# Agent-scoped
scope = memory.scope("/agent/researcher")
scope.save(content="Found 3 relevant papers")
results = scope.search("papers")

# User-scoped with privacy
memory.save(content="User prefers dark mode", source="user:alice", private=True)

# Team-scoped via slice (cross-scope view)
team_view = memory.slice(scopes=["/agent/researcher", "/agent/writer"], read_only=True)
```

Instrumentation guidance: Parse the scope path to determine the scope value: `/user/*` → `user`, `/agent/*` → `agent`. For slices spanning multiple agents, use `team`.
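The path parsing can be sketched as below. This is a hypothetical helper, not a framework API; unknown root segments deliberately map to nothing:

```python
def scope_from_paths(paths):
    """Derive gen_ai.memory.scope from hierarchical scope paths."""
    if len(paths) > 1:
        return "team"  # a slice spanning several scopes
    root = paths[0].strip("/").split("/")[0]  # first path segment
    return {"user": "user", "agent": "agent"}.get(root)
```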
Uses an agent-centric block model where each agent owns labeled memory blocks.
```python
# Agent-scoped (default — blocks belong to the agent)
agent = client.create_agent(memory_blocks=[
    {"label": "persona", "value": "I am a helpful assistant"},
    {"label": "human", "value": "User preferences: dark mode"},
])

# Team-scoped (shared block across agents)
shared_block = client.create_block(label="team_context", value="Project goals...")
agent_a = client.create_agent(block_ids=[shared_block.id])
agent_b = client.create_agent(block_ids=[shared_block.id])
```

Instrumentation guidance: Default to `agent` scope since blocks are agent-owned. When a shared block is detected (attached to multiple agents), use `team`.
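The shared-block detection can be sketched as a simple count over attachments. The `agent_blocks` mapping shape is a hypothetical stand-in for whatever attachment data the instrumentation can see:

```python
def scope_for_block(block_id, agent_blocks):
    """Return "team" when a block is attached to more than one agent.

    agent_blocks: mapping of agent id -> collection of attached block ids.
    """
    uses = sum(block_id in blocks for blocks in agent_blocks.values())
    return "team" if uses > 1 else "agent"
```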
The user scope is the most universally supported (5/6 frameworks have explicit support). The agent scope is well-supported (5/6). The conversation scope is partial (3/6 — Google ADK and Azure explicitly design memory as cross-session). The team scope is emergent (3/6 have native support, others can approximate via workarounds).
All four enum values (user, conversation, agent, team) have concrete mappings in at least 3 frameworks, validating the proposed scope enum.
This section addresses reviewer feedback about whether memory operations should be separate spans or an expansion of the agent span with gen_ai.operation.name: memory.
Memory operations are modeled as separate client spans rather than expanding the agent span for the following reasons:
OTel database conventions define separate client spans per operation with db.operation.name distinguishing them (e.g., SELECT, INSERT). Memory operations follow this same established pattern with gen_ai.operation.name values like search_memory, update_memory, etc.
The agent span (invoke_agent) represents the orchestration layer. Memory operations are I/O operations that happen within an agent invocation, similar to how database calls happen within a service request.
Separate spans enable clear trace hierarchy:
```
invoke_agent (agent span)
├── search_memory (memory span - retrieves context)
├── chat (inference span - LLM call)
└── update_memory (memory span - stores result)
```
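The parent-child relationships an instrumentation would produce can be mimicked with a toy context-manager sketch (plain Python, deliberately not the OpenTelemetry API):

```python
import contextlib

spans = []   # (name, parent) pairs, in start order
_stack = []  # currently open span names

@contextlib.contextmanager
def span(name):
    spans.append((name, _stack[-1] if _stack else None))
    _stack.append(name)
    try:
        yield
    finally:
        _stack.pop()

with span("invoke_agent"):
    with span("search_memory"):
        pass  # retrieve context
    with span("chat"):
        pass  # LLM call
    with span("update_memory"):
        pass  # store result
```

Each memory operation becomes its own child span under `invoke_agent`, which is exactly what makes per-operation duration and error metrics possible.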
Separate spans allow:
- Duration metrics per operation type
- Error rates per operation
- Performance analysis (which memory operation is slow?)
If we used gen_ai.operation.name: memory on the agent span:
- We would lose granularity (cannot distinguish search vs update vs delete)
- We would need sub-attributes like `gen_ai.memory.operation` anyway
- It would be inconsistent with database conventions
This section addresses why memory operations need dedicated gen_ai.memory.* attributes rather than reusing db.* conventions.
| Aspect | Database (`db.*`) | GenAI Memory (`gen_ai.memory.*`) | Unique to Memory? |
|---|---|---|---|
| System | `db.system.name` (postgresql) | `gen_ai.provider.name` (pinecone) | No |
| Operation | `db.operation.name` (SELECT) | `gen_ai.operation.name` (search_memory) | No |
| Target | `db.collection.name` (users) | `gen_ai.memory.store.name` | No |
| Query | `db.query.text` | `gen_ai.memory.query.text` | No |
| Scope | N/A | `gen_ai.memory.scope` (user, session, agent) | YES |
| Importance | N/A | `gen_ai.memory.importance` | YES |
| Agent Context | N/A | `gen_ai.agent.id`, `gen_ai.conversation.id` | YES |
| Similarity Search | N/A | `gen_ai.memory.search.similarity.threshold` | YES |
- Semantic vs Physical Scope: Memory uses semantic isolation (user, session, agent) vs database physical isolation (schema, namespace).
- AI Context: Memory carries `gen_ai.agent.id` and `gen_ai.conversation.id`, which are meaningless for databases.
- Importance Scoring: Memory items have `gen_ai.memory.importance` (0.0-1.0) affecting retrieval and retention.
- Similarity-Based Retrieval: `gen_ai.memory.search.similarity.threshold` is fundamental to vector-based memory retrieval.
- Memory is an Abstraction: Memory providers vary widely - some use vector databases (Pinecone, Chroma), others use in-memory stores, key-value caches, or custom backends. Not all memory providers use a database at all.
Instrumentations:
- SHOULD emit `gen_ai.memory.*` attributes for AI-specific observability
- MAY additionally emit `db.*` attributes when the underlying storage is a database, for infrastructure-level correlation
- Memory spans carry GenAI-specific semantic meaning that `db.*` alone cannot express
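As a concrete illustration, a `search_memory` span backed by a Pinecone index might carry both attribute families. The specific values below are hypothetical examples, not mandated values:

```python
attributes = {
    # AI-specific observability (SHOULD)
    "gen_ai.operation.name": "search_memory",
    "gen_ai.provider.name": "pinecone",
    "gen_ai.memory.store.name": "user-preferences-store",
    "gen_ai.memory.scope": "user",
    "gen_ai.memory.search.similarity.threshold": 0.75,
    # Infrastructure-level correlation (MAY) - only because the backing
    # store happens to be a database in this example
    "db.system.name": "pinecone",
    "db.operation.name": "query",
}
```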
This section clarifies the distinction between search_memory and the existing retrieval operation in GenAI semantic conventions.
| Aspect | Retrieval (`gen_ai.retrieval.*`) | Memory (`gen_ai.memory.*`) |
|---|---|---|
| Purpose | Fetch grounding context from external sources | Manage persistent agent state |
| Data Source | External documents, knowledge bases | Agent-owned context |
| Lifecycle | Read-only (fetch) | Full CRUD (create, read, update, delete) |
| Scope | Global knowledge | User/session/agent-specific |
| Persistence | External system manages | Agent manages lifecycle |
| Example | RAG from documentation | Remember user preferences |
```
# Retrieval: Only fetches
retrieval → documents

# Memory: Full lifecycle
create_memory_store → store created
search_memory → results (like retrieval)
update_memory → item stored
delete_memory → item removed
delete_memory_store → store removed
```
- Retrieval: "What does the documentation say about X?"
  - Source: External knowledge base
  - Agent does NOT modify the source
- Memory: "What did this user tell me before?"
  - Source: Agent-owned persistent state
  - Agent creates, updates, and deletes
- Retrieval: Same query → same results (assuming static docs)
- Memory: Results change based on prior agent interactions
| Scenario | Operation |
|---|---|
| Search product documentation | retrieval |
| Query external API for facts | retrieval |
| RAG from knowledge base | retrieval |
| Find user past preferences | search_memory |
| Recall conversation context | search_memory |
| Multi-agent shared state | search_memory (team scope) |
search_memory IS similar to retrieval:
- Both query for relevant context
- Both return results with scores
- Both use similarity thresholds
BUT search_memory operates on agent-managed memory, not external knowledge.
Not all frameworks distinguish memory retrieval from general retrieval at the API level. The correct operation depends on the framework's abstraction:
Use search_memory when the framework has explicit memory CRUD operations:
- Mem0: `client.search()` / `client.add()` / `client.delete()`
- LangMem (LangChain): `create_search_memory_tool()` / `create_manage_memory_tool()`
- CrewAI: `memory.recall()` / `memory.remember()` / `memory.forget()`
- AutoGen: `memory.query()` / `memory.add()` / `memory.clear()`
- Letta/MemGPT: `blocks.retrieve()` / `blocks.create()` / `blocks.delete()`
Use retrieval when the framework uses a generic retrieval interface that does not distinguish memory from external knowledge (e.g., LangChain's on_retriever_start callback). In this case, supplemental `gen_ai.memory.*` attributes MAY be added to the retrieval span when metadata indicates the retriever is backed by agent memory.
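A minimal sketch of that supplementation follows, assuming a hypothetical `metadata` dict supplied by the retriever callback (the `backed_by` and `memory_scope` keys and the `"user"` default are illustrative assumptions):

```python
def retrieval_span_attributes(metadata):
    """Build attributes for a generic retrieval span; add gen_ai.memory.*
    only when the retriever is known to be backed by agent memory."""
    attrs = {"gen_ai.operation.name": "retrieval"}
    if metadata.get("backed_by") == "agent_memory":
        attrs["gen_ai.memory.scope"] = metadata.get("memory_scope", "user")
    return attrs
```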
This section explains why memory operations belong in the gen_ai.* namespace rather than using generic db.* conventions.
| Database Operation | Memory Operation |
|---|---|
| Store customer record | Remember user preference |
| Query orders table | Recall relevant context |
| Update inventory count | Learn from interaction |
| Delete old logs | Forget outdated information |
Memory operations have semantic intent (remember, recall, learn, forget) that databases do not capture.
Memory spans carry AI context (conversation, agent, similarity) that is meaningless for databases:
- `gen_ai.agent.id` - Which agent accessed memory
- `gen_ai.conversation.id` - Links to conversation flow
- `gen_ai.memory.importance` - Semantic importance score
- `gen_ai.memory.search.similarity.threshold` - Vector similarity cutoff
A single "memory" might involve:
- Vector database (Pinecone) for semantic search
- Key-value store (Redis) for session state
- Document database (MongoDB) for user profiles
Memory is an abstraction over storage, not a storage system itself.
- Expiration: Memory items expire based on semantic rules (24h session, 30d preference)
- Importance: Items have importance scores affecting retention
- Scope propagation: Deleting user scope cascades to all related items
These are AI-specific lifecycle concerns not present in database conventions.
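These rules can be sketched as a toy retention check. The TTL values come from the examples above, but the importance threshold and the "double the TTL" bonus are illustrative assumptions:

```python
from datetime import datetime, timedelta

# TTLs from the examples above: 24h session items, 30d preference items
TTL = {
    "session": timedelta(hours=24),
    "preference": timedelta(days=30),
}

def is_expired(kind, created_at, importance, now):
    """Importance (0.0-1.0) extends retention; >= 0.8 doubles the TTL."""
    ttl = TTL[kind]
    if importance >= 0.8:
        ttl *= 2
    return now - created_at > ttl
```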
- Database metrics: Infrastructure teams, DBAs
- Memory metrics: AI engineers, ML ops
AI engineers need memory-specific dashboards, not generic database monitoring. Memory operations must correlate with gen_ai.* spans (chat, invoke_agent), not with generic service requests.