- Executive Summary
- Scope and Working Definition
- Landscape: Closest Prior Art
- What Seems New in This POC Direction
- POC Architecture Recommendation
- Data Model for Persistence
- Safety and Reliability Guardrails
- Suggested Build Plan (2 Weeks)
- Evaluation Criteria
- References
A persistent generative UI POC is feasible in 2026. There is already strong evidence that models can generate rich interfaces in production contexts (google-generative-ui-blog, google-generative-ui-paper) and that modern agent UI stacks can persist state across runs and sessions when backed by durable storage (openai-apps-state, agui-state, ai-sdk-message-persistence, copilotkit-loading-agent-state).
The open opportunity is the combination: user-influenced whole-interface generation plus robust cross-session memory and controllable evolution over time, instead of only form/schema generation or single-session adaptive rendering (a2ui-announcement, a2ui-repo, ai-sdk-gen-ui, arxiv-malleable-ui, arxiv-alignui).
For this brief, "persistent generative UI" means:
- The system generates or refines meaningful parts of an entire task UI (layout, controls, flow, and display logic), not just static text or one-off widget outputs (google-generative-ui-blog, ai-sdk-gen-ui).
- End users can influence that UI through conversation and direct manipulation, and those preferences survive future logins (arxiv-malleable-ui, openai-apps-state).
- The app rehydrates both task context and UI intent from durable storage, then continues evolving safely under policy constraints (agui-state, copilotkit-threads, ai-sdk-rsc-saving-state).
Google describes model-generated visual/interactive experiences for arbitrary prompts in Gemini/Search contexts (google-generative-ui-blog). Vercel provides prompt-to-project pipelines that can generate full-stack app code and deployment artifacts (vercel-v0-platform-api). OpenAI Apps SDK documents state boundaries for business state, widget state, and cross-session durable state (openai-apps-state).
These systems show that dynamic, generated interfaces and persistence are each tractable, but they do not fully define a canonical pattern for long-horizon, user-specific, continuously evolving "living interfaces" across arbitrary task domains.
A2UI is explicitly designed as declarative, portable UI intent across trust boundaries (a2ui-announcement, a2ui-repo). AG-UI provides shared state mechanics with snapshot/delta synchronization (agui-state). CopilotKit persistence docs describe thread-based restoration of message and agent state when backend persistence exists (copilotkit-loading-agent-state, copilotkit-threads).
This stack strongly suggests a viable POC approach: declarative UI intent + state transport + durable backend profile/state store.
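The snapshot/delta idea can be illustrated with a minimal sketch. It assumes state is a nested dictionary and that deltas are a small subset of JSON-Patch-style operations (`add`, `replace`, `remove`) over object keys only; the function name `apply_delta` and the example state are hypothetical, and the wire format of any specific protocol should be checked against its own documentation.

```python
import copy
from typing import Any


def apply_delta(snapshot: dict[str, Any], ops: list[dict[str, Any]]) -> dict[str, Any]:
    """Return a new state with a small subset of JSON-Patch-style ops applied."""
    state = copy.deepcopy(snapshot)  # never mutate the stored snapshot
    for op in ops:
        keys = [k for k in op["path"].split("/") if k]
        if not keys:
            raise ValueError("path must name at least one key")
        parent = state
        for key in keys[:-1]:
            parent = parent[key]  # raises KeyError on an unknown path
        leaf = keys[-1]
        if op["op"] == "add":
            parent[leaf] = op["value"]
        elif op["op"] == "replace":
            if leaf not in parent:
                raise KeyError(leaf)  # replace must target an existing key
            parent[leaf] = op["value"]
        elif op["op"] == "remove":
            del parent[leaf]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return state


# Assumed example: the last durable snapshot plus deltas recorded after it.
snapshot = {"filters": {"status": "open"}, "panels": {"details": False}}
delta = [
    {"op": "replace", "path": "/filters/status", "value": "closed"},
    {"op": "add", "path": "/panels/history", "value": True},
]
resumed = apply_delta(snapshot, delta)
```

Copying before applying keeps the stored snapshot intact, so a failed or rejected delta leaves a known-good state to fall back to.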
Recent work on malleable UIs proposes model-driven interface generation with user customization and context continuity (arxiv-malleable-ui). AlignUI shows methods to align generated UIs with user preferences using preference datasets (arxiv-alignui). Conceptual framing work also reinforces generative UI as iterative co-creation rather than one-shot generation (arxiv-working-definition-gen-ui).
Together, these papers indicate clear momentum toward preference-aware and persistent generated interfaces, while leaving substantial productization room.
The highest-leverage novelty is not raw generation quality; it is the persistence contract:
- Store and version user-visible UI intent (schema/layout/flow), not only chat text.
- Store preference priors that influence future generation.
- Rehydrate deterministically and apply controlled model deltas.
- Keep rollback, auditability, and policy validation first-class.
This "durable UI memory" framing is consistent with platform state guidance (openai-apps-state) and with shared-state protocol models (agui-state), while extending them to whole-UI evolution.
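One way to picture the persistence contract is as an append-only chain of schema versions with a movable head. The sketch below is an in-memory illustration only; `SchemaVersion`, `SchemaHistory`, and their fields are hypothetical names, and a real POC would back this with durable storage.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SchemaVersion:
    version_id: int
    parent_id: Optional[int]  # parent pointer; None for the first version
    schema: dict
    changelog: str


@dataclass
class SchemaHistory:
    versions: dict = field(default_factory=dict)
    head: Optional[int] = None

    def commit(self, schema: dict, changelog: str) -> int:
        """Append a new version whose parent is the current head."""
        version_id = len(self.versions) + 1
        self.versions[version_id] = SchemaVersion(
            version_id, self.head, copy.deepcopy(schema), changelog
        )
        self.head = version_id
        return version_id

    def rollback(self) -> int:
        """Move head to the parent; old versions stay stored for audit."""
        if self.head is None:
            raise ValueError("no versions committed")
        parent = self.versions[self.head].parent_id
        if parent is None:
            raise ValueError("already at the first version")
        self.head = parent
        return parent

    def current(self) -> dict:
        """Rehydrate the schema at head (same result on every call)."""
        if self.head is None:
            raise ValueError("no versions committed")
        return copy.deepcopy(self.versions[self.head].schema)


history = SchemaHistory()
v1 = history.commit({"layout": "table"}, "initial layout")
v2 = history.commit({"layout": "kanban"}, "user asked for a board view")
history.rollback()  # the user undoes the board view
v3 = history.commit({"layout": "table", "density": "compact"}, "compact rows")
```

Because rollback only moves the head, the rejected version remains in the history, which keeps the audit trail complete.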
Use a declarative schema-first pipeline rather than free-form generated executable UI code.
- Generation layer: LLM proposes UI schema/delta from current task + stored preferences + current UI state (a2ui-repo, google-generative-ui-paper).
- Validation layer: enforce component allowlist, schema typing, policy checks, and migration/version checks (openai-apps-state, a2ui-announcement).
- Render layer: map approved schema to trusted local component catalog (a2ui-repo).
- Persistence layer: save conversation, state snapshots/deltas, profile vectors, schema versions (agui-state, ai-sdk-message-persistence).
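The validation layer is the step most worth prototyping early. The following is a minimal sketch, assuming a schema node is a dictionary with `type`, `props`, and `children` keys; the `CATALOG` contents and the `validate_node` function are hypothetical and do not reflect the A2UI format or any other published schema.

```python
from typing import Any

# Hypothetical component catalog: allowed types and their required prop types.
CATALOG: dict[str, dict[str, type]] = {
    "Stack": {},
    "Text": {"value": str},
    "Select": {"label": str, "options": list},
}


def validate_node(node: Any, path: str = "root") -> list[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    if not isinstance(node, dict):
        return [f"{path}: node must be an object"]
    kind = node.get("type")
    if not isinstance(kind, str) or kind not in CATALOG:
        return [f"{path}: component {kind!r} is not in the allowlist"]
    props = node.get("props", {})
    if not isinstance(props, dict):
        return [f"{path}.props: must be an object"]
    errors = []
    spec = CATALOG[kind]
    for name in sorted(set(props) - set(spec)):
        errors.append(f"{path}.props.{name}: unknown prop")
    for name, expected in spec.items():
        if not isinstance(props.get(name), expected):
            errors.append(f"{path}.props.{name}: expected {expected.__name__}")
    children = node.get("children", [])
    if not isinstance(children, list):
        return errors + [f"{path}.children: must be a list"]
    for index, child in enumerate(children):
        errors.extend(validate_node(child, f"{path}.children[{index}]"))
    return errors


good = {"type": "Stack", "children": [{"type": "Text", "props": {"value": "Hi"}}]}
bad = {
    "type": "Stack",
    "children": [
        {"type": "Script", "props": {"src": "x"}},  # not in the catalog
        {"type": "Text", "props": {"value": 3}},    # wrong prop type
    ],
}
good_errors = validate_node(good)
bad_errors = validate_node(bad)
```

Only a proposal with an empty error list would be passed to the render layer; anything else would be rejected and logged.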
Persist at least these entities per user/workspace:
- ui_schema_versions (versioned schema, parent pointers, changelog summary).
- ui_runtime_state (current values, expanded panels, selections, filters, in-progress form data).
- preference_profile (explicit likes/dislikes, inferred interaction priors, confidence).
- conversation_threads (message history and tool events) (ai-sdk-message-persistence, copilotkit-threads).
- safety_audit_log (validation failures, blocked components, policy reasons).
Treat schema and profile as durable; treat transient widget state as recoverable but less critical, unless the product UX requires strict continuity (openai-apps-state).
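A minimal sketch of these entities as SQLite tables follows, using Python's standard `sqlite3` module. The table names come from the list above; every column name, the JSON-in-text storage choice, and the `rehydrate` helper are assumptions made for illustration.

```python
import json
import sqlite3

DDL = """
CREATE TABLE ui_schema_versions (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    parent_id INTEGER REFERENCES ui_schema_versions(id),
    schema_json TEXT NOT NULL,
    changelog TEXT NOT NULL
);
CREATE TABLE ui_runtime_state (
    user_id TEXT PRIMARY KEY,
    schema_version_id INTEGER NOT NULL REFERENCES ui_schema_versions(id),
    state_json TEXT NOT NULL
);
CREATE TABLE preference_profile (
    user_id TEXT NOT NULL,
    pref_key TEXT NOT NULL,
    pref_value TEXT NOT NULL,
    source TEXT NOT NULL CHECK (source IN ('explicit', 'inferred')),
    confidence REAL NOT NULL,
    PRIMARY KEY (user_id, pref_key)
);
CREATE TABLE conversation_threads (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    events_json TEXT NOT NULL
);
CREATE TABLE safety_audit_log (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    reason TEXT NOT NULL,
    detail_json TEXT NOT NULL
);
"""


def rehydrate(conn: sqlite3.Connection, user_id: str):
    """Load the schema and runtime state a user last saw, or None."""
    row = conn.execute(
        "SELECT v.schema_json, s.state_json "
        "FROM ui_runtime_state s "
        "JOIN ui_schema_versions v ON v.id = s.schema_version_id "
        "WHERE s.user_id = ?",
        (user_id,),
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0]), json.loads(row[1])


conn = sqlite3.connect(":memory:")  # in-memory database for the example only
conn.executescript(DDL)
cursor = conn.execute(
    "INSERT INTO ui_schema_versions (user_id, parent_id, schema_json, changelog) "
    "VALUES (?, ?, ?, ?)",
    ("u1", None, json.dumps({"layout": "table"}), "initial layout"),
)
conn.execute(
    "INSERT INTO ui_runtime_state (user_id, schema_version_id, state_json) "
    "VALUES (?, ?, ?)",
    ("u1", cursor.lastrowid, json.dumps({"filters": {"status": "open"}})),
)
restored = rehydrate(conn, "u1")
```

Pinning the runtime state to a specific schema version means a rollback or migration can detect state that was saved against an older layout.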
- Never render arbitrary generated code directly in host UI.
- Require strict schema validation before render.
- Enforce per-component capability policy (e.g., no network-capable component unless approved).
- Add rollback controls for every accepted schema change.
- Add deterministic fallback UI when generation fails.
These guardrails align with declarative protocol motivations and trust-boundary concerns (a2ui-announcement, openai-apps-state).
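The capability policy and the deterministic fallback can be combined in one guard, sketched below. The capability map, the fallback schema, and the `guarded_schema` function are hypothetical; the sketch assumes the same `type`/`children` node shape used for illustration elsewhere in this brief and treats the audit log as a plain list.

```python
from typing import Any, Callable, Iterator

# Hypothetical capability map: what each catalog component is able to do.
COMPONENT_CAPABILITIES: dict[str, set[str]] = {
    "Stack": set(),
    "Text": set(),
    "RemoteImage": {"network"},
}

# Deterministic fallback: a fixed, hand-written schema, never model output.
FALLBACK_UI: dict[str, Any] = {"type": "Text", "props": {"value": "Default view"}}


def iter_nodes(node: Any) -> Iterator[Any]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    if isinstance(node, dict):
        children = node.get("children", [])
        if isinstance(children, list):
            for child in children:
                yield from iter_nodes(child)


def guarded_schema(
    generate: Callable[[], Any], approved: set[str], audit_log: list[str]
) -> Any:
    """Return the generated schema only if every node passes policy."""
    try:
        schema = generate()
    except Exception as exc:  # generation failed: fall back instead of crashing
        audit_log.append(f"generation failed: {exc}")
        return FALLBACK_UI
    for node in iter_nodes(schema):
        kind = node.get("type") if isinstance(node, dict) else None
        if not isinstance(kind, str) or kind not in COMPONENT_CAPABILITIES:
            audit_log.append(f"blocked unknown component: {kind!r}")
            return FALLBACK_UI
        missing = COMPONENT_CAPABILITIES[kind] - approved
        if missing:
            audit_log.append(f"blocked {kind}: needs {sorted(missing)}")
            return FALLBACK_UI
    return schema


def failing_generator() -> Any:
    raise RuntimeError("model timeout")


audit_log: list[str] = []
networked = {"type": "Stack", "children": [{"type": "RemoteImage", "props": {}}]}
after_failure = guarded_schema(failing_generator, set(), audit_log)
without_approval = guarded_schema(lambda: networked, set(), audit_log)
with_approval = guarded_schema(lambda: networked, {"network"}, audit_log)
```

Every blocked or failed generation leaves an audit entry, which also provides the raw data for the blocked-generation metric in the evaluation.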
- Week 1: Implement schema format, renderer mapping, persistence tables, rehydration flow, and manual preference controls.
- Week 1: Add LLM delta generation endpoint and validation pipeline.
- Week 2: Add session resume logic, rollback/version UI, and audit logging.
- Week 2: Run usability sessions on continuity, controllability, and trust.
To accelerate the build, start from an existing stack: AI SDK persistence primitives (ai-sdk-message-persistence, ai-sdk-rsc-saving-state) or AG-UI/CopilotKit thread-state patterns (agui-state, copilotkit-loading-agent-state).
Primary:
- Rehydration fidelity after logout/login.
- User-rated alignment of regenerated UI with prior preferences.
- Time-to-task versus baseline static UI.
- Rate of invalid or unsafe UI generations blocked before render.
Secondary:
- Percentage of turns requiring manual override.
- Rollback frequency and recovery success.
- Stability of the preference profile across sessions (preference drift).
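Rehydration fidelity needs an operational definition before it can be measured. One possible definition, offered here as an assumption rather than an established metric, is the fraction of leaf values in the saved state that reappear unchanged in the restored state. The function names below are hypothetical.

```python
from typing import Any


def flatten(state: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Map each leaf value to a slash-separated path, e.g. '/filters/status'."""
    flat: dict[str, Any] = {}
    for key, value in state.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value  # scalars, lists, and empty dicts count as leaves
    return flat


def rehydration_fidelity(saved: dict[str, Any], restored: dict[str, Any]) -> float:
    """Fraction of saved leaf values that are identical after restore."""
    expected = flatten(saved)
    if not expected:
        return 1.0  # nothing to restore counts as perfect fidelity
    actual = flatten(restored)
    matches = sum(
        1 for path, value in expected.items()
        if path in actual and actual[path] == value
    )
    return matches / len(expected)


# Assumed example: the in-progress draft was lost across logout/login.
saved = {"filters": {"status": "open", "owner": "me"}, "panel": True, "draft": "hello"}
restored = {"filters": {"status": "open", "owner": "me"}, "panel": True}
score = rehydration_fidelity(saved, restored)
```

In this example three of the four saved leaf values survive, so the score is 0.75. A weighted variant could count lost form data more heavily than a collapsed panel.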