@euforic
Created February 21, 2026 03:10
picoclaw agent loop parity notes vs codex

Agent Loop Parity Update (vs /tmp/codex)

Date: 2026-02-21

Scope

  • pkg/agent/loop.go
  • pkg/agent/context.go
  • pkg/agent/loop_latest_code_test.go
  • docs/review/agent-loop-vs-codex-2026-02-20.md

What changed

Core loop parity moves

  • Session-aware intake and follow-up queueing
    • Session workers drain per-session pending input from the worker queue before each turn.
    • runTurn consumes queued follow-up input on iteration 1 and runs in follow-up-only mode on iterations >1.
  • Per-iteration context rebuild
    • Each iteration rebuilds prompt context from current session history.
    • contextLookupMessage is split from turnContextMessage so compaction/rebuilds can use stable lookup text while iteration 1 can send overridden queued content.
  • Compaction parity points
    • Added pre-sampling compaction with failure propagation (compactContextForIteration).
    • Added post-sampling follow-up compaction gate (shouldCompactForFollowUp) and immediate follow-up return when compacting is needed.
    • Added compact threshold calculation via autoCompactLimitFromUsage.
  • Retry and model fallback behavior
    • Added bounded sampling retry loop with retryability filtering (maxSamplingRetries, isSamplingErrorRetryable, isClientError).
    • Model fallback path preserved through callLLMWithModelFallback.
  • Follow-up and tool handling parity
    • runTurn now cleanly bifurcates:
      • no tool call: return final response unless follow-up input exists;
      • tool call: execute tool set unless compaction-for-follow-up preempts.
  • Prompt/message dedupe
    • pkg/agent/context.go avoids appending duplicate trailing user turns and empty user turns.
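
The tool/no-tool bifurcation described above can be sketched as a small decision helper. The names (`turnAction`, `decideTurnAction`) are illustrative only, not picoclaw's actual API; the real logic lives in pkg/agent/loop.go.

```go
package main

import "fmt"

// turnAction enumerates the branches runTurn can take after sampling.
// These names are hypothetical, for illustration of the notes above.
type turnAction int

const (
	actionReturnFinal         turnAction = iota // no tool call, no follow-up: return response
	actionContinueFollowUp                      // no tool call but queued input: loop again
	actionCompactThenFollowUp                   // compaction gate preempts tool execution
	actionRunTools                              // execute the requested tool set
)

// decideTurnAction mirrors the bifurcation: with no tool call the turn
// ends unless follow-up input is queued; with a tool call, tools run
// unless compaction-for-follow-up preempts them.
func decideTurnAction(hasToolCall, hasPendingInput, mustCompact bool) turnAction {
	if !hasToolCall {
		if hasPendingInput {
			return actionContinueFollowUp
		}
		return actionReturnFinal
	}
	if mustCompact {
		return actionCompactThenFollowUp
	}
	return actionRunTools
}

func main() {
	fmt.Println(decideTurnAction(false, false, false) == actionReturnFinal)
	fmt.Println(decideTurnAction(false, true, false) == actionContinueFollowUp)
	fmt.Println(decideTurnAction(true, false, true) == actionCompactThenFollowUp)
	fmt.Println(decideTurnAction(true, false, false) == actionRunTools)
}
```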

New follow-up context behavior

  • Iteration 1: pending input (text/media) is used as turn prompt, while contextLookupMessage remains from latest user input source for compaction/context-building continuity.
  • Iteration >1: no queued input is used as prompt; context lookup falls back to latest history user message when opts.UserMessage is empty.
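
The iteration-dependent split between the turn prompt and the stable context-lookup message can be sketched as follows. `selectTurnInputs` and its parameters are hypothetical names chosen for this sketch, not picoclaw's actual signatures.

```go
package main

import "fmt"

// selectTurnInputs sketches the split described above: the context
// lookup stays stable across iterations, while only iteration 1 uses
// queued pending input as the turn prompt.
func selectTurnInputs(iteration int, pendingInput, optsUserMessage, latestHistoryUser string) (prompt, contextLookup string) {
	// Context lookup: the original request, falling back to the latest
	// persisted user history message when it is empty.
	contextLookup = optsUserMessage
	if contextLookup == "" {
		contextLookup = latestHistoryUser
	}
	if iteration == 1 {
		// Iteration 1: queued pending input becomes the turn prompt.
		return pendingInput, contextLookup
	}
	// Iteration >1: queued input is never used as the primary prompt.
	return "", contextLookup
}

func main() {
	p, c := selectTurnInputs(1, "follow-up text", "", "original ask")
	fmt.Println(p, "/", c) // follow-up text / original ask
	p, c = selectTurnInputs(2, "follow-up text", "", "original ask")
	fmt.Println(p == "", "/", c) // true / original ask
}
```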

Tests added/updated

  • pkg/agent/loop_latest_code_test.go
    • TestRunTurnIteration2IgnoresPendingAsPrimaryInput
      • Verifies follow-up iterations do not incorrectly use queued input text as main model prompt.
    • TestRunTurnIteration2FallsBackToSessionHistoryForContextLookup
      • Verifies lookup falls back to latest persisted user history when original request is empty.
    • TestRunTurnUsesLatestPendingInputForFirstIteration
      • Confirms first-iteration prompt uses latest pending turn text.

Mermaid flow 1: session intake and turn loop

```mermaid
flowchart TD
    A[Run receives inbound message] --> B[Resolve session key]
    B --> C[Enqueue into per-session worker queue]
    C --> D[processSessionMessage]
    D --> E{Any queued follow-up input?}
    E -->|yes| F[collect pending input]
    E -->|no| G[runAgentLoop]
    F --> G
    G --> H[runAgentLoop saves inbound user message]
    H --> I[submissionLoop]
    I --> J[runTurn]
    J --> K[collect pending input]
    K --> L[buildTurnMessages with history + context lookup]
    L --> M[pre-sampling compaction]
    M --> N[runSamplingRequest]
    N --> O{needs follow-up?}
    O -->|no| P[Handle final response path]
    O -->|yes| Q[Tool calls or pending input detected]
    Q --> R{Tool calls requested?}
    R -->|yes| S["run tools (unless post-compact forces follow-up)"]
    R -->|no| T[If pending input exists, continue loop]
    S --> U[append tool messages]
    T --> J
    U --> J
    P --> V[Persist assistant response]
    V --> W[Return final message]
```
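
The intake side of this flow can be sketched as a per-session pending-input queue that the session worker drains before each turn. This is a minimal sketch; the real worker-queue plumbing in pkg/agent/loop.go may differ in shape, and `pendingInputQueue` is a name invented here.

```go
package main

import (
	"fmt"
	"sync"
)

// pendingInputQueue is an illustrative per-session follow-up queue:
// inbound messages are enqueued under a resolved session key, and the
// session worker drains everything pending before running a turn.
type pendingInputQueue struct {
	mu    sync.Mutex
	items map[string][]string // session key -> queued follow-up inputs
}

func newPendingInputQueue() *pendingInputQueue {
	return &pendingInputQueue{items: make(map[string][]string)}
}

// Enqueue appends inbound input under the resolved session key.
func (q *pendingInputQueue) Enqueue(session, input string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items[session] = append(q.items[session], input)
}

// Drain removes and returns all pending input for the session.
func (q *pendingInputQueue) Drain(session string) []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	pending := q.items[session]
	delete(q.items, session)
	return pending
}

func main() {
	q := newPendingInputQueue()
	q.Enqueue("s1", "first")
	q.Enqueue("s1", "second")
	fmt.Println(q.Drain("s1"))      // [first second]
	fmt.Println(len(q.Drain("s1"))) // 0
}
```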

Mermaid flow 2: sampling + compaction retry decisions

```mermaid
flowchart TD
    A[tryRunSamplingRequest] --> B[call provider + model fallbacks]
    B --> C{sampling succeeded?}
    C -->|yes| D[parse needs_follow_up + content]
    C -->|no| E{is error retryable?}
    E -->|no| F[return error]
    E -->|yes| G[increment retry and wait backoff]
    G --> B
    D --> H{needs follow-up?}
    H -->|no| I[final response]
    H -->|yes| J[estimate usage]
    J --> K{"utilization >= hard ceiling or limit reached?"}
    K -->|yes| L[compact + continue follow-up]
    K -->|no| M[execute tools]
    M --> N[loop continues]
```

Transport parity status

  • Codex runs a session-scoped ModelClientSession with transport preference:
    • primary responses_websocket
    • when retries are exhausted or websocket eligibility conditions fail, it switches permanently to HTTP for the session via try_switch_fallback_transport.
  • Current picoclaw provider abstraction is HTTP-only (/chat/completions) and has no websocket transport/session-sticky transport state.
  • Current parity gap:
    • No websocket transport path.
    • No transport fallback state machine.
    • No websocket reconnect/retry telemetry yet.
  • Implication:
    • Model fallback behavior exists, but transport fallback parity is blocked by interface/runtime limits.
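
What the missing transport-fallback state machine would roughly look like can be sketched as session-sticky state: once a session falls back to HTTP it never returns to websocket, matching the codex try_switch_fallback_transport behavior described above. Everything here (`transport`, `sessionTransport`) is a hypothetical parity sketch, since picoclaw currently has only the HTTP path.

```go
package main

import "fmt"

// transport is an illustrative enum for the two codex transports.
type transport int

const (
	transportWebsocket transport = iota
	transportHTTP
)

// sessionTransport holds session-sticky transport state: after one
// fallback the session stays on HTTP for its lifetime.
type sessionTransport struct {
	current  transport
	switched bool
}

func newSessionTransport() *sessionTransport {
	return &sessionTransport{current: transportWebsocket}
}

// trySwitchFallback permanently switches the session to HTTP and
// reports whether this call performed the switch.
func (s *sessionTransport) trySwitchFallback() bool {
	if s.switched {
		return false
	}
	s.current = transportHTTP
	s.switched = true
	return true
}

func main() {
	st := newSessionTransport()
	fmt.Println(st.current == transportWebsocket) // true: sessions start on websocket
	fmt.Println(st.trySwitchFallback())           // true: first fallback switches
	fmt.Println(st.trySwitchFallback())           // false: already sticky on HTTP
	fmt.Println(st.current == transportHTTP)      // true
}
```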

Remaining gaps (unchanged from the 2026-02-20 review)

  1. Transport-state parity (session sticky transport, websocket fallback, reconnect warnings).
  2. Context compaction semantics (Codex auto_compact_token_limit + model-window handoff logic vs current usage-pressure heuristic).
  3. Pending-input semantics (Codex async channel + stream interruption vs worker-local queue).