Created: November 28, 2025 10:03
# Antigravity Prompt: Build an In-Browser PDF Chat App
## Project Goal

Build a **privacy-first PDF question-answering application** that runs 100% in the browser with **no backend**.

- All documents and embeddings are processed locally.
- No data is sent to any server.
- All AI runs client-side.

Your output must include:

- Complete `index.html` file
- Complete `app.js` file
- Any minimal inline CSS (in `index.html`) needed for a clean UI

---

## Core Features

Implement the following:

1. **PDF Upload**
   - User can upload a PDF file.
   - For performance, limit processing to the **first 3 pages**.
2. **Chat Interface**
   - Input box to ask questions about the uploaded PDF.
   - Display chat history (user messages + model responses).
3. **Context Visualization**
   - Show which text chunks were used to answer the question (e.g., as a list or panel).
4. **Live PDF Preview**
   - Display the uploaded PDF pages in a scrollable panel using PDF.js.
5. **100% Client-Side**
   - All processing is done in the browser.
   - No network calls for inference or vector search once models are downloaded.

---
## Technical Stack

Use exactly this stack:

**Database:**
- **PGLite** (PostgreSQL compiled to WebAssembly, running in the browser)
- **pgvector** extension for vector storage and similarity search

**AI Models (via transformers.js + ONNX Runtime):**
- **Embeddings**: `Xenova/all-MiniLM-L6-v2` (384-dimensional vectors)
- **QA Model**: `Xenova/LaMini-Flan-T5-783M` (generative question answering)

**Libraries:**
- `transformers.js` (ONNX Runtime backend for in-browser ML)
- `PDF.js` (for parsing and rendering PDFs)

**Frontend:**
- Pure HTML, CSS, and JavaScript
- No frontend frameworks (no React, Vue, etc.)
- Use WebGPU acceleration when available (through transformers.js config)
---

## UI Layout

Create a **3-column responsive layout**:

- **Left (25%)**
  - PDF upload controls
  - Chunking strategy selection (dropdown)
  - Status messages (e.g., “Embedding…”, “Loading model…”)
  - Visualization of retrieved context chunks
- **Middle (25%)**
  - Chat interface:
    - Scrollable history of messages
    - Input box for user questions
    - Send button
    - Typing indicator during answer generation
- **Right (50%)**
  - PDF preview rendered via PDF.js
  - Vertical scrolling
  - Prevent aggressive auto-scrolling during rendering

Use simple but clean styling (e.g., light theme, clear borders, enough spacing).
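The 3-column shell above can be sketched with CSS grid; the class names and exact spacing here are illustrative assumptions, not part of the spec:

```css
/* Sketch of the 3-column layout; class names are placeholders. */
.layout {
  display: grid;
  grid-template-columns: 25% 25% 50%;
  gap: 12px;
  height: 100vh;
}
/* Each column scrolls independently, which also keeps the PDF preview contained. */
.panel {
  overflow-y: auto;
  border: 1px solid #ddd;
  padding: 12px;
}
```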
---

## Chunking Strategies

Implement **6 user-selectable chunking strategies** in JavaScript.
Provide a dropdown in the UI to choose the strategy before processing.

1. **Standard**
   - Fixed-size chunks of ~400 characters
   - 50-character overlap between chunks
2. **Semantic**
   - Group sentences by embedding similarity
   - Use sentence-level segmentation before grouping
3. **Outline-based**
   - Use the model (LLM) to infer document structure (e.g., headings, sections)
   - Chunk text by sections (e.g., heading + following paragraphs)
4. **Atomic Facts**
   - Use the model to extract independent factual statements
   - Each fact becomes a separate chunk
5. **Q&A-oriented**
   - Generate one or more questions for each passage
   - Store: `{ question, answer_context }`
   - Retrieval is based on similarity to the generated questions
6. **Hybrid**
   - Combine outline-based structure with atomic facts
   - Use sections as containers and then extract facts per section

**Dynamic sizing:**
Implement logic to target approximately **3 chunks per page** regardless of strategy.
For example, adjust chunk size or number of facts/questions based on total text length per page.
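The Standard strategy and the dynamic-sizing rule can be sketched in plain JavaScript; the function names `chunkStandard` and `dynamicChunkSize` are illustrative, not prescribed:

```javascript
// "Standard" strategy: fixed-size chunks with a fixed character overlap.
function chunkStandard(text, size = 400, overlap = 50) {
  const chunks = [];
  const step = size - overlap; // advance less than `size` so chunks overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Dynamic sizing: derive the chunk size from page length so that a page
// yields roughly `targetChunksPerPage` chunks regardless of its length.
function dynamicChunkSize(pageTextLength, targetChunksPerPage = 3, overlap = 50) {
  const size = Math.ceil(pageTextLength / targetChunksPerPage) + overlap;
  return Math.max(size, overlap + 1); // keep the step positive
}
```

For a 900-character page this targets ~350-character chunks, which `chunkStandard` then splits into three overlapping pieces.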
---

## Implementation Details

### Database Setup (PGLite + pgvector)

On page load:

- Initialize PGLite with pgvector enabled.
- Create a table `chunks` if it does not exist:

```sql
CREATE TABLE IF NOT EXISTS chunks (
  id SERIAL PRIMARY KEY,
  text TEXT NOT NULL,
  embedding vector(384) NOT NULL,
  metadata JSONB
);
```

- Ensure you can clear the `chunks` table when a new document is uploaded.

### Processing Pipeline

When a PDF is uploaded:

1. Extract text from the **first 3 pages** using PDF.js.
2. Apply the selected chunking strategy to produce a list of chunks.
3. Normalize chunks into a consistent format (see “Data Normalization”).
4. Embed chunks in **batches of 10** using `all-MiniLM-L6-v2`.
5. Insert chunks into PGLite with their embeddings and metadata.
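Step 4 (batched embedding with per-batch error handling, as required later under Performance Requirements) can be sketched as follows; `embedBatch` is a placeholder for the real `all-MiniLM-L6-v2` call via transformers.js, not an actual API:

```javascript
// Split an array into fixed-size batches.
function toBatches(items, batchSize = 10) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Embed all chunks batch-by-batch; a failed batch is logged and skipped
// so one bad batch does not break the whole pipeline.
async function embedAll(chunks, embedBatch, batchSize = 10) {
  const embedded = [];
  for (const batch of toBatches(chunks, batchSize)) {
    try {
      const vectors = await embedBatch(batch.map((c) => c.text));
      batch.forEach((chunk, i) => embedded.push({ ...chunk, embedding: vectors[i] }));
    } catch (err) {
      console.warn('Embedding batch failed, skipping:', err);
    }
  }
  return embedded;
}
```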
### Query Flow

When the user asks a question:

1. Embed the user question using `all-MiniLM-L6-v2`.
2. Run a vector similarity search in PGLite using SQL similar to:

```sql
SELECT id, text, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM chunks
ORDER BY similarity DESC
LIMIT 3;
```

3. Retrieve the top-3 chunks.
4. Construct a prompt for `LaMini-Flan-T5-783M` that includes:
   - The user question
   - The retrieved chunks as context
5. Generate an answer with streaming token-by-token output.
6. Update the chat UI as tokens arrive.
7. Show the retrieved chunks in the context visualization area.

For the **Q&A-oriented** strategy:

- Store and retrieve based on the generated questions.
- Embed the stored questions for similarity, but still display the original passage as context.
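Step 4 of the flow above (prompt assembly from the retrieved rows) can be sketched as below; the exact prompt wording is an assumption, chosen because Flan-T5-style models respond well to plain instruction prompts:

```javascript
// Build a generation prompt from the user question and the retrieved chunks.
// Chunks are numbered so the context panel can cross-reference them.
function buildPrompt(question, chunks) {
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.text}`)
    .join('\n');
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}\nAnswer:`;
}
```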
---

## Performance Requirements

Implement the following optimizations:

- **Batch embedding**
  - Process chunks in batches of 10.
  - Handle errors per batch without breaking the whole pipeline.
- **Model quantization**
  - Use quantized models (4-bit or 8-bit) where supported by transformers.js to reduce download sizes.
- **Browser caching**
  - Ensure models are cached locally so subsequent loads are significantly faster.
- **Progress indicators**
  - Show download progress for each model (percentage or clear textual updates).
  - Show status messages during each stage: model loading, embedding, querying, answer generation.

---

## UX and Diagnostics

Add the following UX and diagnostic elements:

- Modern, minimal light theme (blue/white is fine).
- Indicator showing whether **WebGPU** is being used or if the app is falling back to CPU.
- Basic memory/usage indicator (even if approximate, like a count of chunks).
- Custom scroll styling for the PDF preview panel (simple but visible).
- Typing indicator while the model is generating an answer.
- Avoid jarring auto-scroll when new PDF pages are rendered.
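The WebGPU indicator above can be driven by a simple feature check; `navigator.gpu` is the standard WebGPU entry point and is `undefined` in browsers (or runtimes) without WebGPU support. The label strings are illustrative:

```javascript
// Report which backend the app should use for the diagnostics indicator.
// Falls back to the CPU/WASM path whenever WebGPU is unavailable.
function detectBackend() {
  const hasWebGPU =
    typeof navigator !== 'undefined' && typeof navigator.gpu !== 'undefined';
  return hasWebGPU ? 'webgpu' : 'cpu (wasm fallback)';
}
```

Note that this only checks for the API's presence; adapter requests can still fail at runtime, so the error-handling path described under Edge Cases is still needed.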
---

## Data Normalization and Edge Cases

Implement a helper function, for example:

```javascript
function normalizeChunks(rawChunks) {
  // Input: array of strings or objects.
  // Output: array of objects with at least: { text, metadata }.
  return rawChunks
    .map((c) => (typeof c === 'string' ? { text: c } : c))
    .filter((c) => c && typeof c.text === 'string' && c.text.trim() !== '')
    .map((c) => ({ ...c, text: c.text.trim(), metadata: c.metadata ?? {} }));
}
```

Requirements:

1. **Chunk Format Consistency**
   - Some strategies may produce plain strings.
   - Others produce objects (e.g., `{ text, question, facts, sectionTitle }`).
   - Normalize everything into a common structure:
     `{ text, question?, sectionTitle?, metadata }`.
2. **Embedding Validation**
   - Filter out null/undefined/empty `text` before embedding.
   - Log (to the console) any chunks that are skipped and why.
3. **Q&A Strategy**
   - Embed the generated **question**, not the full passage.
   - Store both the question and the passage in the `metadata` column.
   - Retrieval uses the question embedding; display the passage as context.
4. **Database Cleanup**
   - Clear the `chunks` table when a new document is uploaded.
   - Avoid inserting duplicate embeddings if the same document is reprocessed.
5. **Model Loading Errors**
   - Handle failures (e.g., network issues, 401 for gated models).
   - Show clear error messages in the UI.
   - Fall back gracefully to CPU if WebGPU fails or is unsupported.
---

## File Structure

Generate code assuming the following structure:

```
in-browser-rag/
├── index.html    # Main UI and basic styles
├── app.js        # All JavaScript logic
└── favicon.png   # App icon (you can reference a placeholder path)
```

Assume the user will run the app locally using:

```bash
python3 -m http.server 8002
```

and open `http://localhost:8002` in the browser.

---

## Development Approach for the Agent

As the agent, follow this approach:

1. **Plan**
   - Briefly outline the architecture in comments at the top of `app.js`.
2. **Implement**
   - Generate full, ready-to-run contents of `index.html` and `app.js`.
   - Include all necessary `<script>` tags for external libraries (with CDN URLs) in `index.html`.
3. **Wire Up**
   - Ensure all UI controls are connected to their handlers in `app.js`.
   - Verify the full flow:
     upload → chunk → embed → store → ask question → retrieve → answer → display.
4. **Do Not Omit Code**
   - Do not describe code in prose.
   - Output complete code listings so the user can save them as files and run immediately.

---

Generate the final answer as:

1. Full `index.html` content.
2. Full `app.js` content.