Persistent Generative UI POC Research Brief

Executive Summary

A persistent generative UI POC is feasible in 2026. There is already strong evidence that models can generate rich interfaces in production contexts (Google Generative UI blog, Google Generative UI paper) and that modern agent UI stacks can persist state across runs and sessions when backed by durable storage (openai-apps-state, agui-state, ai-sdk-message-persistence, copilotkit-loading-agent-state).

The open opportunity is the combination: user-influenced whole-interface generation plus robust cross-session memory and controllable evolution over time, instead of only form/schema generation or single-session adaptive rendering (a2ui-announcement, a2ui-repo, ai-sdk-gen-ui, arxiv-malleable-ui, arxiv-alignui).

Scope and Working Definition

For this brief, "persistent generative UI" means:

  1. The system generates or refines meaningful parts of an entire task UI (layout, controls, flow, and display logic), not just static text or one-off widget outputs (google-generative-ui-blog, ai-sdk-gen-ui).
  2. End users can influence that UI through conversation and direct manipulation, and those preferences survive future logins (arxiv-malleable-ui, openai-apps-state).
  3. The app rehydrates both task context and UI intent from durable storage, then continues evolving safely under policy constraints (agui-state, copilotkit-threads, ai-sdk-rsc-saving-state).

Landscape: Closest Prior Art

Product Systems

Google describes model-generated visual/interactive experiences for arbitrary prompts in Gemini/Search contexts (google-generative-ui-blog). Vercel provides prompt-to-project pipelines that can generate full-stack app code and deployment artifacts (vercel-v0-platform-api). OpenAI Apps SDK documents state boundaries for business state, widget state, and cross-session durable state (openai-apps-state).

These systems show that dynamic, generated interfaces and persistence are each tractable, but they do not fully define a canonical pattern for long-horizon, user-specific, continuously evolving "living interfaces" across arbitrary task domains.

Protocols and Frameworks

A2UI is explicitly designed as declarative, portable UI intent across trust boundaries (a2ui-announcement, a2ui-repo). AG-UI provides shared state mechanics with snapshot/delta synchronization (agui-state). CopilotKit persistence docs describe thread-based restoration of message and agent state when backend persistence exists (copilotkit-loading-agent-state, copilotkit-threads).

This stack strongly suggests a viable POC approach: declarative UI intent + state transport + durable backend profile/state store.
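
The snapshot/delta part of that approach can be sketched roughly as follows. This is an illustration only, not the AG-UI wire format: the `apply_delta` helper and the op shapes (a small JSON Patch-style subset of `add`, `replace`, and `remove` over slash-separated dict paths) are assumptions made for the example.

```python
import copy
from typing import Any


def apply_delta(snapshot: dict[str, Any], ops: list[dict[str, Any]]) -> dict[str, Any]:
    """Return a new state with JSON Patch-style ops applied (dict paths only)."""
    state = copy.deepcopy(snapshot)  # never mutate the stored snapshot
    for op in ops:
        keys = op["path"].strip("/").split("/")
        parent = state
        for key in keys[:-1]:
            parent = parent[key]  # raises KeyError if the path does not exist
        leaf = keys[-1]
        if op["op"] == "add":
            parent[leaf] = op["value"]
        elif op["op"] == "replace":
            if leaf not in parent:
                raise KeyError(f"cannot replace missing path {op['path']}")
            parent[leaf] = op["value"]
        elif op["op"] == "remove":
            del parent[leaf]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return state


# Hypothetical shared state: the last durable snapshot plus one model-proposed delta.
snapshot = {"layout": {"columns": 2}, "filters": {"status": "open"}}
delta = [
    {"op": "replace", "path": "/layout/columns", "value": 3},
    {"op": "add", "path": "/filters/owner", "value": "me"},
    {"op": "remove", "path": "/filters/status"},
]
new_state = apply_delta(snapshot, delta)
```

Keeping the snapshot immutable and applying deltas to a copy means a rejected or invalid delta never corrupts the last known-good state.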

Research Signals

Recent work on malleable UIs proposes model-driven interface generation with user customization and context continuity (arxiv-malleable-ui). AlignUI shows methods to align generated UIs with user preferences using preference datasets (arxiv-alignui). Conceptual framing work also reinforces generative UI as iterative co-creation rather than one-shot generation (arxiv-working-definition-gen-ui).

Together, these papers indicate clear momentum toward preference-aware and persistent generated interfaces, while leaving substantial productization room.

What Seems New in This POC Direction

The highest-leverage novelty is not raw generation quality; it is the persistence contract:

  1. Store and version user-visible UI intent (schema/layout/flow), not only chat text.
  2. Store preference priors that influence future generation.
  3. Rehydrate deterministically and apply controlled model deltas.
  4. Keep rollback, auditability, and policy validation first-class.

This "durable UI memory" framing is consistent with platform state guidance (openai-apps-state) and with shared-state protocol models (agui-state), while extending them to whole-UI evolution.
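
A minimal in-memory sketch of the version, rehydrate, and rollback parts of that contract is shown below. The `SchemaVersion` and `SchemaHistory` names, their fields, and the integer version IDs are hypothetical; a real implementation would sit on durable storage and validate each schema before committing it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SchemaVersion:
    version_id: int
    parent_id: Optional[int]  # parent pointer; None for the first version
    schema: dict
    changelog: str


@dataclass
class SchemaHistory:
    versions: dict[int, SchemaVersion] = field(default_factory=dict)
    head: Optional[int] = None

    def commit(self, schema: dict, changelog: str) -> int:
        """Record an accepted schema as a child of the current head."""
        version_id = len(self.versions) + 1
        self.versions[version_id] = SchemaVersion(version_id, self.head, schema, changelog)
        self.head = version_id
        return version_id

    def rehydrate(self) -> dict:
        """Return the schema at head; the same history always yields the same UI."""
        if self.head is None:
            raise LookupError("no schema committed yet")
        return self.versions[self.head].schema

    def rollback(self) -> int:
        """Move head back to its parent; old versions are kept for audit."""
        if self.head is None:
            raise LookupError("no schema committed yet")
        parent_id = self.versions[self.head].parent_id
        if parent_id is None:
            raise LookupError("already at the first version")
        self.head = parent_id
        return parent_id


history = SchemaHistory()
v1 = history.commit({"panels": ["inbox"]}, "initial layout")
v2 = history.commit({"panels": ["inbox", "calendar"]}, "user asked for a calendar panel")
history.rollback()  # the user rejects the calendar change
```

Because rollback only moves the head pointer, the rejected version stays in the history, which keeps the audit trail intact.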

POC Architecture Recommendation

Use a declarative schema-first pipeline rather than free-form generated executable UI code.

Data Model for Persistence

Persist at least these entities per user/workspace:

  1. ui_schema_versions (versioned schema, parent pointers, changelog summary).
  2. ui_runtime_state (current values, expanded panels, selections, filters, in-progress form data).
  3. preference_profile (explicit likes/dislikes, inferred interaction priors, confidence).
  4. conversation_threads (message history and tool events) (ai-sdk-message-persistence, copilotkit-threads).
  5. safety_audit_log (validation failures, blocked components, policy reasons).

Treat schema and profile as durable; treat transient widget state as recoverable but lower criticality unless product UX requires strict continuity (openai-apps-state).
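
One possible relational layout for these entities, sketched with Python's built-in `sqlite3` module. The table names come from the list above; every column name, the JSON-in-text encoding, and the one-row-per-user runtime state are assumptions for illustration, not a recommended final schema.

```python
import json
import sqlite3

DDL = """
CREATE TABLE ui_schema_versions (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    parent_id INTEGER REFERENCES ui_schema_versions(id),
    schema_json TEXT NOT NULL,
    changelog TEXT NOT NULL
);
CREATE TABLE ui_runtime_state (
    user_id TEXT PRIMARY KEY,
    schema_version_id INTEGER NOT NULL REFERENCES ui_schema_versions(id),
    state_json TEXT NOT NULL
);
CREATE TABLE preference_profile (
    user_id TEXT NOT NULL,
    pref_key TEXT NOT NULL,
    value_json TEXT NOT NULL,
    source TEXT NOT NULL CHECK (source IN ('explicit', 'inferred')),
    confidence REAL NOT NULL,
    PRIMARY KEY (user_id, pref_key)
);
CREATE TABLE conversation_threads (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    messages_json TEXT NOT NULL
);
CREATE TABLE safety_audit_log (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    schema_version_id INTEGER REFERENCES ui_schema_versions(id),
    reason TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(DDL)

# Save a first schema version and the runtime state that belongs to it.
cur = conn.execute(
    "INSERT INTO ui_schema_versions (user_id, parent_id, schema_json, changelog) "
    "VALUES (?, ?, ?, ?)",
    ("u1", None, json.dumps({"panels": ["inbox"]}), "initial layout"),
)
conn.execute(
    "INSERT INTO ui_runtime_state (user_id, schema_version_id, state_json) VALUES (?, ?, ?)",
    ("u1", cur.lastrowid, json.dumps({"expanded": ["inbox"]})),
)

# Rehydrate on the next login: load the schema and runtime state together.
row = conn.execute(
    "SELECT v.schema_json, s.state_json FROM ui_runtime_state s "
    "JOIN ui_schema_versions v ON v.id = s.schema_version_id WHERE s.user_id = ?",
    ("u1",),
).fetchone()
schema, state = json.loads(row[0]), json.loads(row[1])
```

Tying `ui_runtime_state` to a specific schema version makes it possible to detect runtime state that was saved against a schema the user has since rolled back.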

Safety and Reliability Guardrails

  • Never render arbitrary generated code directly in host UI.
  • Require strict schema validation before render.
  • Enforce per-component capability policy (e.g., no network-capable component unless approved).
  • Add rollback controls for every accepted schema change.
  • Add deterministic fallback UI when generation fails.

These guardrails align with declarative protocol motivations and trust-boundary concerns (a2ui-announcement, openai-apps-state).
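
The validation, capability policy, and fallback guardrails could be combined along these lines. The component names, the capability labels, and the schema shape (an object with a `components` list) are all hypothetical; the point is only that nothing reaches the renderer without passing the allowlist.

```python
from typing import Any

# Hypothetical registry: component type -> capabilities that component needs.
COMPONENT_CAPABILITIES: dict[str, set[str]] = {
    "text": set(),
    "table": set(),
    "form": {"submit"},
    "embed": {"network"},
}
GRANTED_CAPABILITIES = {"submit"}  # "network" has not been approved

FALLBACK_SCHEMA = {
    "components": [{"type": "text", "props": {"value": "Showing the default view."}}]
}


def validate_schema(schema: Any) -> list[str]:
    """Return a list of policy violations; an empty list means safe to render."""
    components = schema.get("components") if isinstance(schema, dict) else None
    if not isinstance(components, list):
        return ["schema must be an object with a 'components' list"]
    errors = []
    for i, comp in enumerate(components):
        ctype = comp.get("type") if isinstance(comp, dict) else None
        if not isinstance(ctype, str) or ctype not in COMPONENT_CAPABILITIES:
            errors.append(f"component {i}: type {ctype!r} is not allowed")
            continue
        missing = COMPONENT_CAPABILITIES[ctype] - GRANTED_CAPABILITIES
        if missing:
            errors.append(f"component {i}: {ctype!r} needs unapproved {sorted(missing)}")
    return errors


def schema_to_render(candidate: Any, audit_log: list[str]) -> dict:
    """Validate before render; log violations and fall back deterministically."""
    errors = validate_schema(candidate)
    if errors:
        audit_log.extend(errors)
        return FALLBACK_SCHEMA
    return candidate


audit_log: list[str] = []
good = {"components": [{"type": "form", "props": {}}]}
bad = {"components": [{"type": "embed", "props": {"src": "https://example.com"}}]}
rendered_good = schema_to_render(good, audit_log)
rendered_bad = schema_to_render(bad, audit_log)
```

Here the `form` schema renders as generated, while the `embed` schema is replaced by the fallback and the reason lands in the audit log.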

Suggested Build Plan (2 Weeks)

  1. Week 1: Implement schema format, renderer mapping, persistence tables, rehydration flow, and manual preference controls.
  2. Week 1: Add LLM delta generation endpoint and validation pipeline.
  3. Week 2: Add session resume logic, rollback/version UI, and audit logging.
  4. Week 2: Run usability sessions on continuity, controllability, and trust.

Use an existing stack for acceleration: AI SDK persistence primitives (ai-sdk-message-persistence, ai-sdk-rsc-saving-state) or AG-UI/CopilotKit thread-state patterns (agui-state, copilotkit-loading-agent-state).

Evaluation Criteria

Primary:

  1. Rehydration fidelity after logout/login.
  2. User-rated alignment of regenerated UI with prior preferences.
  3. Time-to-task versus baseline static UI.
  4. Rate of invalid or unsafe UI generations, and the share of those blocked before render.

Secondary:

  1. Percentage of turns requiring manual override.
  2. Rollback frequency and recovery success.
  3. Stability of the preference profile across sessions (low unintended drift).
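
As one way to make the first primary criterion measurable, rehydration fidelity could be scored as the fraction of saved state values that come back unchanged after logout/login. This definition, and the `flatten` and `rehydration_fidelity` helpers, are assumptions for illustration; the brief does not prescribe a formula.

```python
from typing import Any


def flatten(state: Any, prefix: str = "") -> dict[str, Any]:
    """Map nested dicts to {path: leaf value}; lists are treated as single leaves."""
    if isinstance(state, dict):
        flat: dict[str, Any] = {}
        for key, value in state.items():
            flat.update(flatten(value, f"{prefix}/{key}"))
        return flat
    return {prefix: state}


def rehydration_fidelity(before: dict, after: dict) -> float:
    """Fraction of pre-logout leaf values restored exactly after login."""
    expected = flatten(before)
    actual = flatten(after)
    if not expected:
        return 1.0  # nothing to restore counts as perfect fidelity
    matches = sum(
        1 for path, value in expected.items()
        if path in actual and actual[path] == value
    )
    return matches / len(expected)


# Hypothetical session: the sort order was lost on rehydration, the rest survived.
before = {"filters": {"status": "open", "owner": "me"}, "expanded": ["inbox"], "sort": "date"}
after = {"filters": {"status": "open", "owner": "me"}, "expanded": ["inbox"], "sort": "name"}
score = rehydration_fidelity(before, after)  # 3 of 4 leaf values match
```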

References
