@manisnesan
Last active December 17, 2025 13:05
Agentic Product Lifecycle: Consolidated Report



1. The Key Shift: Traditional RAG → Naive Agent

Traditional RAG (Vending Machine)
─────────────────────────────────
Query → Retrieve → Generate → Answer
        (fixed pipeline, no decisions)

Naive Agentic RAG (Junior Librarian)
────────────────────────────────────
Question → [DECISION: search?] → Query → Retrieve → [DECISION: use docs?] → Answer
           (decision points introduced)

| Aspect        | Traditional RAG                | Naive Agent                           |
|---------------|--------------------------------|---------------------------------------|
| Control flow  | Fixed pipeline                 | Agent decides when/whether to search  |
| Failure mode  | Silent (bad docs → bad answer) | Same, but can be taught to notice     |
| Observability | Query → Docs → Answer          | Query → Tool Decision → Docs → Answer |
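The two decision points can be sketched as a minimal control loop. Every callable here (`should_search`, `run_query`, `docs_support`, `generate`) is a hypothetical stand-in for an LLM or search call, not an API from any specific framework:

```python
def answer(question, should_search, run_query, docs_support, generate):
    """Naive agentic RAG: two decision points wrap the fixed retrieval step."""
    docs = []
    if should_search(question):                 # DECISION 1: search at all?
        docs = run_query(question)              # Query -> Retrieve
    if docs and docs_support(question, docs):   # DECISION 2: use the docs?
        return generate(question, docs)
    return generate(question, [])               # answer (or refuse) ungrounded
```

Traditional RAG is the same function with both predicates hardwired to `True`, which is exactly why it cannot notice its own failures.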

2. Three Observability Surfaces

User Question
     │
     ▼
┌─────────────────────────────────────────────────────┐
│  DECISION QUALITY                                   │
│  • Did I call the tool only when needed?            │
│  • Did I translate the question into a good query?  │
└─────────────────────────────────────────────────────┘
     │
     ▼ query
┌─────────────────────────────────────────────────────┐
│  SEARCH RELEVANCE                                   │
│  • Did Solr return good docs?                       │
│  • Did the right pipeline get selected?             │
└─────────────────────────────────────────────────────┘
     │
     ▼ docs
┌─────────────────────────────────────────────────────┐
│  REASONING QUALITY                                  │
│  • Did I synthesize correctly from docs?            │
│  • Did I refuse when I should have?                 │
└─────────────────────────────────────────────────────┘
     │
     ▼
   Answer

Key insight: Failures can hide. A bad answer might be caused by any of the three surfaces, and they can mask each other.
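One way to make the three surfaces inspectable is to log a structured trace per question, with one group of fields per surface. The field names here are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AgentTrace:
    """One record per question, one slot per observability surface."""
    question: str
    # Surface 1: decision quality
    searched: bool = False
    query: Optional[str] = None
    # Surface 2: search relevance
    pipeline: Optional[str] = None
    doc_ids: List[str] = field(default_factory=list)
    # Surface 3: reasoning quality
    refused: bool = False
    answer: Optional[str] = None
```

A bad answer is then attributed by walking the trace top-down: a missing or mistranslated `query` points at the decision surface, empty or irrelevant `doc_ids` at the search surface, and only then at the reasoning surface.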


3. Your Architecture

┌──────────────────────────────────────────────────────────────┐
│  AGENT LAYER                                                 │
│  Decision: "Is this about our products?" → search / don't    │
└──────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────┐
│  SEARCH LAYER                                                │
│  Query Classification → {error, cve, errata, default}        │
│  Pipeline-specific: boosting, field selection, filters       │
└──────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────┐
│  CONTENT LAYER                                               │
│  Technical docs, KCS articles, CVEs, Errata                  │
└──────────────────────────────────────────────────────────────┘
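The search-layer classification step could be sketched as a keyword/pattern router. The CVE and advisory-ID regexes and the error keywords are assumptions for illustration; a production classifier would likely be a trained model:

```python
import re

def classify_query(query: str) -> str:
    """Route a query to one of the four pipelines: cve, errata, error, default."""
    if re.search(r"\bCVE-\d{4}-\d+\b", query, re.IGNORECASE):
        return "cve"
    if re.search(r"\bRH[BES]A-\d{4}:\d+\b", query, re.IGNORECASE):
        return "errata"  # assumed advisory-ID pattern
    if re.search(r"\b(error|exception|traceback|failed)\b", query, re.IGNORECASE):
        return "error"
    return "default"
```

The routing order matters: ID-bearing queries are checked first so that "CVE-2024-1234 causes an error" lands in the cve pipeline, not the error one.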

4. Policy Constraint

Grounding Rule: Agent must refuse or explicitly state when retrieved docs don't support an answer.
Rationale: the operational risk of a wrong answer outweighs the user frustration of hearing "I don't know."
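Enforced in code, the grounding rule is a guard in front of generation. Here `support_score` and `generate` are hypothetical callables (e.g. an entailment scorer and an LLM), and the threshold is a tunable assumption:

```python
REFUSAL = "I don't have enough supporting documentation to answer that."

def grounded_answer(question, docs, support_score, generate, threshold=0.5):
    """Refuse unless retrieved docs actually support an answer."""
    if not docs or support_score(question, docs) < threshold:
        return REFUSAL                  # prefer refusal over hallucination
    return generate(question, docs)
```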


5. Failure Taxonomy by Surface

| Surface           | Failure Type                 | Severity     | Example                                          |
|-------------------|------------------------------|--------------|--------------------------------------------------|
| Decision Quality  | False negative               | High         | Product question routed to "don't search"        |
| Decision Quality  | False positive               | Low          | Non-product question triggers unnecessary search |
| Search Relevance  | Misclassification            | High         | CVE question routed to default pipeline          |
| Search Relevance  | Ranking failure              | Medium       | Right pipeline, wrong docs in top_k              |
| Reasoning Quality | Grounding violation          | Critical     | Agent answers despite weak doc support           |
| Reasoning Quality | Over-refusal (false refusal) | Medium       | Agent refuses when docs did support answer       |
| Reasoning Quality | Correct refusal              | N/A (signal) | Docs genuinely missing (corpus gap indicator)    |

High-severity failures (given your policy):

  1. Grounding violation (agent hallucinates)
  2. Misclassification (CVE routed wrong → wrong docs → plausible but dangerous answer)
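For monitoring, the taxonomy can be operationalized as a severity lookup that drives alerting. The tuple keys and the paging policy below are illustrative assumptions:

```python
# Severity per (surface, failure type), taken from the taxonomy table above.
SEVERITY = {
    ("decision", "false_negative"): "high",
    ("decision", "false_positive"): "low",
    ("search", "misclassification"): "high",
    ("search", "ranking_failure"): "medium",
    ("reasoning", "grounding_violation"): "critical",
    ("reasoning", "over_refusal"): "medium",
}

def should_page(surface: str, failure: str) -> bool:
    """Alert on-call only for high/critical failures (a policy assumption)."""
    return SEVERITY.get((surface, failure)) in {"high", "critical"}
```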

6. Lifecycle Phases

Phase 1: Problem Framing (Pre-Agent)

| Input             | Output                                         |
|-------------------|------------------------------------------------|
| Query logs        | Intent taxonomy: {error, cve, errata, default} |
| Clickthrough logs | Definition of "answerability"                  |
| Corpus structure  | Policy: prefer refusal over hallucination      |

Phase 2: Content Corpus Lifecycle (Foundation)

| Phase         | Key Activities                                       | Failure if Neglected                |
|---------------|------------------------------------------------------|-------------------------------------|
| Onboarding    | Ingest new docs, assign metadata, validate structure | New product features undiscoverable |
| Maintenance   | Update for product changes, deprecations             | Agent cites outdated procedures     |
| Retirement    | Remove/archive unsupported product docs              | Agent retrieves irrelevant content  |
| Quality Audit | Check formatting, metadata completeness              | Parsing failures, misclassification |

Your mitigation: Content from last 3 years only + freshness boosting

Signals to track:

  • Staleness score (last-modified vs product-EOL)
  • Metadata completeness
  • Formatting violations
  • Orphan docs
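The staleness signal could be computed per document roughly as follows. The EOL check and 3-year window mirror the mitigation described above, but the linear scoring itself is an illustrative assumption:

```python
from datetime import date
from typing import Optional

def staleness_score(last_modified: date, product_eol: Optional[date],
                    today: date, window_years: int = 3) -> float:
    """Return 0.0 (fresh) .. 1.0 (candidate for retirement).

    A doc past its product's EOL, or older than the content window,
    scores 1.0; otherwise the score scales linearly with age.
    """
    if product_eol is not None and today >= product_eol:
        return 1.0
    age_days = (today - last_modified).days
    window_days = window_years * 365
    return min(1.0, max(0.0, age_days / window_days))
```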

Phase 3: Search Relevance Lifecycle (Agent-Aware)

Traditional vs Agent-Aware Relevance:

| Concern             | Human Consumer      | Agent Consumer                    |
|---------------------|---------------------|-----------------------------------|
| Top result quality  | Critical            | Less critical (agent sees top_k)  |
| Recall@K            | Nice to have        | Critical                          |
| Snippet/formatting  | For readability     | For parseability                  |
| Redundancy in top_k | Minor annoyance     | Wastes context window             |
| Missing docs        | User might rephrase | Agent may confidently hallucinate |

Solr Readiness Checklist:

| Area                   | Concerns                                 | Lifecycle Artifact           |
|------------------------|------------------------------------------|------------------------------|
| Schema                 | Right fields indexed? Copyfields?        | Field inventory document     |
| Analyzers              | Stemming, synonyms, domain jargon        | Analyzer config + known gaps |
| Embeddings (if hybrid) | Model choice, chunk boundaries           | Embedding eval report        |
| Query-time             | Boosts, filters, per-pipeline tuning     | Pipeline config registry     |
| Content quality        | Formatting, stale docs, missing metadata | Content audit log            |

Key Signals:

  • Recall@K
  • Missing doc rate (corpus gap indicator)
  • Blind spots (acronyms, synonyms, jargon failures)
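As a quick reference for the Recall@K signal: the fraction of known-relevant doc IDs that make it into the top-k results, given a labeled relevance set.

```python
def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of known-relevant doc IDs present in the top-k results."""
    if not relevant:
        return 0.0  # no labeled relevant docs: treat as a corpus-gap signal
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)
```

An empty `relevant` set is itself the missing-doc-rate signal from the list above, so it is worth counting separately rather than silently scoring it.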

Phase 4: Relevance Factor Decision Records (RFDRs)

Traceability artifact linking relevance changes to agent behavior.

| Field                   | Purpose                                                |
|-------------------------|--------------------------------------------------------|
| ID / Version            | e.g., RFDR-2025-007                                    |
| Date                    | When the change was deployed                           |
| Change Type             | Analyzer, boost, synonym, new field, embedding model   |
| Rationale               | Why the change was made                                |
| Baseline Metrics Before | Recall@K, precision, known blind spots                 |
| Baseline Metrics After  | Same metrics post-change                               |
| Downstream Agent Impact | Observed/expected effect on decision/reasoning quality |
| Rollback Plan           | How to revert if agent behavior degrades               |

Correlation Pattern:

Agent behavior regression detected
        ↓
Check: recent RFDRs?
        ↓
   RFDR-2025-007: synonym expansion on 2025-06-01
        ↓
Compare agent metrics before/after that date
        ↓
Root cause identified
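The correlation step amounts to a window lookup: which RFDRs shipped shortly before the regression was observed? The record fields and the 14-day lookback are illustrative assumptions:

```python
from datetime import date
from typing import List, NamedTuple

class RFDR(NamedTuple):
    rfdr_id: str
    deployed: date
    change_type: str

def suspect_rfdrs(rfdrs: List[RFDR], regression_start: date,
                  lookback_days: int = 14) -> List[RFDR]:
    """Return RFDRs deployed within the lookback window before a regression."""
    return [r for r in rfdrs
            if 0 <= (regression_start - r.deployed).days <= lookback_days]
```

Each suspect then gets its before/after baseline metrics compared against the agent-eval metrics around the same date.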

7. End-to-End Lifecycle Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                        AGENTIC PRODUCT LIFECYCLE                            │
└─────────────────────────────────────────────────────────────────────────────┘

 CONTENT CORPUS          SEARCH RELEVANCE           AGENT BEHAVIOR
 LIFECYCLE               LIFECYCLE                  LIFECYCLE
 ─────────────           ────────────────           ──────────────
 
 ┌───────────┐           ┌───────────────┐          ┌──────────────┐
 │ Onboard   │           │ Schema/       │          │ Decision     │
 │ New Docs  │──────────▶│ Analyzers     │          │ Policy       │
 └───────────┘           └───────────────┘          │ (search Y/N) │
       │                        │                   └──────────────┘
       ▼                        ▼                          │
 ┌───────────┐           ┌───────────────┐                 │
 │ Maintain  │           │ Query         │                 ▼
 │ & Update  │──────────▶│ Pipelines     │──────────▶┌──────────────┐
 └───────────┘           │ (cve,errata,  │          │ Tool Call    │
       │                 │ error,default)│          │ (Solr query) │
       ▼                 └───────────────┘          └──────────────┘
 ┌───────────┐                  │                          │
 │ Retire    │                  ▼                          ▼
 │ Stale     │           ┌───────────────┐          ┌──────────────┐
 │ Content   │           │ Relevance     │          │ Reasoning    │
 └───────────┘           │ Eval/Tune     │          │ & Grounding  │
       │                 └───────────────┘          └──────────────┘
       ▼                        │                          │
 ┌───────────┐                  ▼                          ▼
 │ Quality   │           ┌───────────────┐          ┌──────────────┐
 │ Audit     │◀──────────│ RFDR          │◀─────────│ Agent Eval   │
 └───────────┘           │ (versioned)   │          │ (3 surfaces) │
                         └───────────────┘          └──────────────┘
       │                        │                          │
       └────────────────────────┴──────────────────────────┘
                                │
                         FEEDBACK LOOPS
                    (attribute failures, iterate)

8. Summary: What Your Lifecycle Manages

| What                | How                                      | Artifact                  |
|---------------------|------------------------------------------|---------------------------|
| User Intents        | Query classification → pipeline routing  | Intent taxonomy           |
| Content Health      | 3-year window + freshness boost + audits | Content audit log         |
| Search Relevance    | Recall@K, blind spots, pipeline tuning   | Baseline relevance report |
| Change Traceability | RFDR linking relevance → agent behavior  | Decision record registry  |
| Agent Policy        | Grounding rule: refuse > hallucinate     | Policy constraint doc     |
| Failure Attribution | 3-surface observability                  | Failure taxonomy          |
