Implementation notes for the VSCode Copilot chat session parser in agentsview.
The parser ingests VSCode Copilot chat sessions from local disk, covering:
- VSCode Stable (~/Library/Application Support/Code/User/)
- VSCode Insiders (~/Library/Application Support/Code - Insiders/User/)
- Both .json (pre-v1.109) and .jsonl (v1.109+ default) formats
- Workspace sessions (workspaceStorage/<hash>/chatSessions/)
- Global sessions (globalStorage/emptyWindowChatSessions/)
VSCode uses a two-layer storage model:
- SQLite index (state.vscdb) -- metadata only (title, timestamps, isEmpty). Key: chat.ChatSessionStore.index in the ItemTable.
- Session files (chatSessions/<uuid>.{json,jsonl}) -- full conversation content.
Each opened workspace gets an MD5-hashed directory under workspaceStorage/.
A workspace.json manifest inside maps the hash back to the human-readable project path.
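A minimal sketch of that reverse mapping follows. It assumes the manifest stores the workspace as a file:// URI under a "folder" key; that key name is an assumption, not something these notes confirm, and projectPathFromManifest is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// workspaceManifest models the single field this sketch needs. The "folder"
// key holding a file:// URI is an ASSUMPTION about workspace.json.
type workspaceManifest struct {
	Folder string `json:"folder"`
}

// projectPathFromManifest (hypothetical helper) decodes workspace.json bytes
// and returns the filesystem path of the workspace folder.
func projectPathFromManifest(data []byte) (string, error) {
	var m workspaceManifest
	if err := json.Unmarshal(data, &m); err != nil {
		return "", err
	}
	if m.Folder == "" {
		return "", fmt.Errorf("manifest has no folder entry")
	}
	u, err := url.Parse(m.Folder)
	if err != nil {
		return "", err
	}
	if u.Scheme != "file" {
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	// url.Parse stores the percent-decoded form in Path.
	return u.Path, nil
}

func main() {
	data := []byte(`{"folder": "file:///Users/me/code/my%20project"}`)
	path, err := projectPathFromManifest(data)
	if err != nil {
		fmt.Println("cannot resolve project:", err)
		return
	}
	fmt.Println(path)
}
```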
| Format | Era | Strategy |
|---|---|---|
| .json | Pre v1.109 | Full JSON rewrite on every save |
| .jsonl | v1.109+ | Append-only operation log (kind=0 initial, kind=1 set, kind=2 push, kind=3 delete) |
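The four operation kinds in the table can be replayed roughly as sketched below. Only the kind values come from these notes. The per-line shape used here (a "k" path array, a "v" value, and push treating "v" as a list of items to append) is an assumption for illustration. The sketch also rebuilds a generic tree rather than a json.RawMessage, so it should not be read as the parser's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// op is an ASSUMED line shape. "kind" comes from the notes; the "k" (path)
// and "v" (value) field names are hypothetical.
type op struct {
	Kind  int   `json:"kind"`
	Path  []any `json:"k"`
	Value any   `json:"v"`
}

// tombstone tells the parent container to remove the addressed entry.
type tombstone struct{}

// mutate replaces the value at path with f(old) and returns the updated node.
// String path elements index objects; numeric ones index arrays.
func mutate(node any, path []any, f func(old any) any) (any, error) {
	if len(path) == 0 {
		return f(node), nil
	}
	switch key := path[0].(type) {
	case string:
		m, ok := node.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("expected object at key %q", key)
		}
		child, err := mutate(m[key], path[1:], f)
		if err != nil {
			return nil, err
		}
		if _, gone := child.(tombstone); gone {
			delete(m, key)
		} else {
			m[key] = child
		}
		return m, nil
	case float64: // encoding/json decodes numbers inside []any as float64
		s, ok := node.([]any)
		i := int(key)
		if !ok || i < 0 || i >= len(s) {
			return nil, fmt.Errorf("bad array index %v", key)
		}
		child, err := mutate(s[i], path[1:], f)
		if err != nil {
			return nil, err
		}
		if _, gone := child.(tombstone); gone {
			return append(s[:i], s[i+1:]...), nil
		}
		s[i] = child
		return s, nil
	}
	return nil, fmt.Errorf("unsupported path element %v", path[0])
}

// replay folds every log line into one generic JSON tree.
func replay(lines []string) (state any, err error) {
	for n, line := range lines {
		var o op
		if err = json.Unmarshal([]byte(line), &o); err == nil {
			switch o.Kind {
			case 0: // initial snapshot
				state = o.Value
			case 1: // set
				state, err = mutate(state, o.Path, func(any) any { return o.Value })
			case 2: // push: v is assumed to be a list of items to append
				state, err = mutate(state, o.Path, func(old any) any {
					arr, _ := old.([]any)
					items, _ := o.Value.([]any)
					return append(arr, items...)
				})
			case 3: // delete (removing the root itself is not handled)
				state, err = mutate(state, o.Path, func(any) any { return tombstone{} })
			default:
				err = fmt.Errorf("unknown kind %d", o.Kind)
			}
		}
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", n+1, err)
		}
	}
	return state, nil
}

func main() {
	state, err := replay([]string{
		`{"kind":0,"v":{"sessionId":"s1","requests":[]}}`,
		`{"kind":2,"k":["requests"],"v":[{"modelId":"m1"}]}`,
		`{"kind":1,"k":["requests",0,"agent"],"v":"a1"}`,
	})
	out, _ := json.Marshal(state)
	fmt.Println(string(out), err)
}
```

Marshalling the replayed tree back to JSON yields a flat document that can go through the same decoding path as a .json session file.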
Both formats produce the same session structure after parsing. When both exist
for the same UUID, the .jsonl version takes priority. After reconstruction,
the top-level schema includes sessionId, creationDate, customTitle,
requests[] with nested message, response[], agent, modelId, and result.
VSCode's "coding agent" mode stores sessions differently -- only in state.vscdb
under agentSessions.model.cache, not as separate files. Parsing these would
require SQLite extraction and is outside the current scope.
VSCode Copilot response items include toolInvocationSerialized entries with:
- toolId -- the raw tool identifier (e.g., copilot_readFile, copilot_runInTerminal)
- invocationMessage -- human-readable description (string or {value: "..."} object)
- pastTenseMessage -- past-tense version, preferred for display
- toolSpecificData -- structured data (e.g., {kind: "terminal", command: "npm test"})
The parser extracts these into InputJSON for frontend display, and normalizes
the raw toolId to a standard category via two mapping steps:
copilot_readFile -> read_file -> Read
copilot_runInTerminal -> shell -> Bash
copilot_replaceString -> edit_file -> Edit (was Write)
copilot_findTextInFiles -> grep -> Grep
copilot_listDirectory -> glob -> Glob
copilot_createFile -> create_file -> Write
copilot_fetchWebPage -> read_web_page -> Read
runSubagent -> Task -> Task
There are 60+ unique tool IDs observed in practice, including MCP tools
(mcp_dart_sdk_*, mcp_microsoft_pla_*, pgsql_*), GitHub PR tools,
Python environment tools, and various extensions. Unmapped tools fall
through to the Other category.
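The two mapping steps and the fallthrough can be illustrated with a pair of lookup tables. This is only a sketch built from the examples listed above. The function normalizeTool and both map names are hypothetical, and the shared NormalizeToolCategory() in taxonomy.go may be organized differently.

```go
package main

import "fmt"

// rawToGeneric is step one: raw VSCode tool ID -> generic tool name.
// Entries are limited to the examples given in these notes.
var rawToGeneric = map[string]string{
	"copilot_readFile":        "read_file",
	"copilot_runInTerminal":   "shell",
	"copilot_replaceString":   "edit_file",
	"copilot_findTextInFiles": "grep",
	"copilot_listDirectory":   "glob",
	"copilot_createFile":      "create_file",
	"copilot_fetchWebPage":    "read_web_page",
	"runSubagent":             "Task",
}

// genericToCategory is step two: generic tool name -> display category.
var genericToCategory = map[string]string{
	"read_file":     "Read",
	"shell":         "Bash",
	"edit_file":     "Edit",
	"grep":          "Grep",
	"glob":          "Glob",
	"create_file":   "Write",
	"read_web_page": "Read",
	"Task":          "Task",
}

// normalizeTool (hypothetical) chains both steps; anything unmapped at
// either step falls through to "Other".
func normalizeTool(toolID string) string {
	generic, ok := rawToGeneric[toolID]
	if !ok {
		return "Other"
	}
	if category, ok := genericToCategory[generic]; ok {
		return category
	}
	return "Other"
}

func main() {
	for _, id := range []string{"copilot_readFile", "copilot_replaceString", "pgsql_query"} {
		fmt.Printf("%s -> %s\n", id, normalizeTool(id))
	}
}
```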
When a response contains both tool calls and markdown text, the parser
always includes tool markers in the content (e.g., [Read: copilot_readFile])
followed by the text. This ensures the frontend can detect and render tool
blocks regardless of whether text accompanies them.
agentsview already has a separate Cursor parser (internal/parser/cursor.go).
The two are architecturally different:
| Aspect | VSCode Copilot | Cursor |
|---|---|---|
| Storage location | ~/Library/Application Support/Code/User/workspaceStorage/<hash>/chatSessions/ | ~/.cursor/projects/<project>/agent-transcripts/ |
| File format | JSON or JSONL (VSCode mutation log) | Plain text (role markers) or JSONL (Anthropic API format) |
| Project mapping | Via workspace.json manifest in hash directory | Via parent directory name |
| Tool calls | toolInvocationSerialized items in response array | [Tool call] name markers in text |
| Session index | state.vscdb SQLite | None (files are self-contained) |
| Config env var | COPILOT_DIR | CURSOR_PROJECTS_DIR |
| Default path | ~/Library/Application Support/Code/User/ | ~/.cursor/projects |
Cursor stores transcripts as plain text files or Anthropic API JSONL, making them simpler to parse but containing less structured metadata. VSCode Copilot's structured JSON/JSONL format preserves richer data (tool invocation details, timing info, model IDs) but requires more complex reconstruction logic.
| Edition | JSON | JSONL | Total |
|---|---|---|---|
| Code (stable) | 959 | 71 | ~1030 |
| Code - Insiders | 24 | 140 | ~164 |
| Total | 983 | 211 | ~1194 |
Before JSONL support, only 920 sessions were discoverable (JSON only).
| File | Purpose |
|---|---|
| internal/parser/vscode_copilot.go | JSON + JSONL parser, tool normalization |
| internal/parser/vscode_copilot_test.go | Tests (tool extraction, mixed content, JSONL) |
| internal/parser/discovery.go | DiscoverVSCodeCopilotSessions() |
| internal/parser/taxonomy.go | NormalizeToolCategory() shared mapping |
| internal/parser/types.go | AgentVSCodeCopilot constant |
| internal/sync/engine.go | processVSCodeCopilot(), file watcher integration |
| frontend/src/lib/utils/agents.ts | vscode-copilot in KNOWN_AGENTS |
| frontend/src/lib/utils/content-parser.ts | Tool regex includes Other category |
- Parse files, not state.vscdb -- the SQLite database is volatile and undocumented. The JSON/JSONL files are the canonical source.
- JSONL reconstruction -- replay mutations into json.RawMessage, then unmarshal into the same structs used for flat JSON.
- Deduplication -- .jsonl wins over .json for the same UUID.
- Tool normalization -- two-step mapping (raw ID -> generic name -> category) using the same taxonomy as all other agents.
- Content format -- tool markers always included in message content so the frontend's regex-based parser can detect them.