cdeil/agentsview-vscode.md

## agentsview-vscode.md

      
VSCode Copilot Parser - Technical Summary

Implementation notes for the VSCode Copilot chat session parser in agentsview.

What's Implemented

The parser ingests VSCode Copilot chat sessions from local disk, covering:

- VSCode Stable (~/Library/Application Support/Code/User/)
- VSCode Insiders (~/Library/Application Support/Code - Insiders/User/)
- Both .json (pre-v1.109) and .jsonl (v1.109+ default) formats
- Workspace sessions (workspaceStorage/<hash>/chatSessions/)
- Global sessions (globalStorage/emptyWindowChatSessions/)
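
The locations above can be turned into glob patterns. The following is a minimal sketch, not the project's DiscoverVSCodeCopilotSessions(); the helper names sessionPatterns and discoverSessionFiles are hypothetical, and only the directory layout is taken from the list above.

```go
package main

import (
	"path/filepath"
	"sort"
)

// sessionPatterns returns glob patterns for workspace and global sessions,
// in both .json and .jsonl form, rooted at a VSCode User directory.
func sessionPatterns(userDir string) []string {
	var patterns []string
	for _, ext := range []string{"*.json", "*.jsonl"} {
		patterns = append(patterns,
			filepath.Join(userDir, "workspaceStorage", "*", "chatSessions", ext),
			filepath.Join(userDir, "globalStorage", "emptyWindowChatSessions", ext),
		)
	}
	return patterns
}

// discoverSessionFiles is a hypothetical helper that lists every file
// matching the patterns above, sorted for stable output.
func discoverSessionFiles(userDir string) ([]string, error) {
	var found []string
	for _, p := range sessionPatterns(userDir) {
		matches, err := filepath.Glob(p)
		if err != nil {
			return nil, err // Glob only fails on a malformed pattern
		}
		found = append(found, matches...)
	}
	sort.Strings(found)
	return found, nil
}
```

Calling discoverSessionFiles once per edition (Stable and Insiders) would cover both install locations; a missing directory simply yields no matches.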

How VSCode Stores Sessions

VSCode uses a two-layer storage model:

1. SQLite index (state.vscdb) -- metadata only (title, timestamps, isEmpty).
   Key: chat.ChatSessionStore.index in the ItemTable.
2. Session files (chatSessions/<uuid>.{json,jsonl}) -- full conversation content.

Each opened workspace gets an MD5-hashed directory under workspaceStorage/.
A workspace.json manifest inside maps the hash back to the human-readable project path.
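
Resolving a hash directory back to its project can be sketched as below. This assumes the manifest stores the project location as a file:// URI under a "folder" key; that field name is an assumption of this sketch, and projectPathFromManifest is a hypothetical helper, not a function from the parser.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// workspaceManifest mirrors the single field this sketch reads.
// Assumption: workspace.json holds a file:// URI under "folder".
type workspaceManifest struct {
	Folder string `json:"folder"`
}

// projectPathFromManifest decodes a workspace.json payload and returns
// the filesystem path of the project it refers to.
func projectPathFromManifest(data []byte) (string, error) {
	var m workspaceManifest
	if err := json.Unmarshal(data, &m); err != nil {
		return "", fmt.Errorf("decode workspace.json: %w", err)
	}
	if m.Folder == "" {
		return "", fmt.Errorf("workspace.json has no folder entry")
	}
	u, err := url.Parse(m.Folder)
	if err != nil {
		return "", fmt.Errorf("parse folder URI: %w", err)
	}
	if u.Scheme != "file" {
		return "", fmt.Errorf("unsupported folder scheme %q", u.Scheme)
	}
	return u.Path, nil // url.Parse has already percent-decoded Path
}
```

Returning an error for non-file schemes keeps remote workspaces from being mistaken for local paths; a real parser might prefer to fall back to the raw URI instead.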

JSON vs JSONL

| Format | Era | Strategy |
|---|---|---|
| `.json` | Pre v1.109 | Full JSON rewrite on every save |
| `.jsonl` | v1.109+ | Append-only operation log (kind=0 initial, kind=1 set, kind=2 push, kind=3 delete) |

Both formats produce the same session structure after parsing. When both exist
for the same UUID, the .jsonl version takes priority. After reconstruction,
the top-level schema includes sessionId, creationDate, customTitle,
requests[] with nested message, response[], agent, modelId, and result.
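
Replaying the operation log can be sketched as follows. Only the four kind values come from these notes; the per-line field names k (a path of object keys and array indexes) and v (the value, or the list of items for a push) are assumptions of this sketch. It also rebuilds the session as generic Go values for brevity, whereas the parser replays into json.RawMessage.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// logEntry is one JSONL line. Field names k and v are assumed, not confirmed.
type logEntry struct {
	Kind int             `json:"kind"`
	K    []any           `json:"k"`
	V    json.RawMessage `json:"v"`
}

// update walks path inside node, replaces the value found there with fn(old),
// and returns the updated node so grown slices are re-attached to their parent.
func update(node any, path []any, fn func(old any) any) (any, error) {
	if len(path) == 0 {
		return fn(node), nil
	}
	switch key := path[0].(type) {
	case string:
		m, ok := node.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("key %q: parent is not an object", key)
		}
		child, err := update(m[key], path[1:], fn)
		if err != nil {
			return nil, err
		}
		m[key] = child
		return m, nil
	case float64: // JSON numbers decode to float64
		s, ok := node.([]any)
		i := int(key)
		if !ok || i < 0 || i >= len(s) {
			return nil, fmt.Errorf("index %d: out of range or parent is not an array", i)
		}
		child, err := update(s[i], path[1:], fn)
		if err != nil {
			return nil, err
		}
		s[i] = child
		return s, nil
	default:
		return nil, fmt.Errorf("unsupported path element %v", key)
	}
}

// replay applies every log line in order and returns the final session value.
func replay(data []byte) (any, error) {
	var state any
	sc := bufio.NewScanner(bytes.NewReader(data))
	sc.Buffer(make([]byte, 0, 64*1024), 64<<20) // allow very long lines
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue
		}
		var e logEntry
		if err := json.Unmarshal(line, &e); err != nil {
			return nil, fmt.Errorf("decode entry: %w", err)
		}
		var v any
		if len(e.V) > 0 {
			if err := json.Unmarshal(e.V, &v); err != nil {
				return nil, fmt.Errorf("decode value: %w", err)
			}
		}
		var err error
		switch e.Kind {
		case 0: // initial snapshot
			state = v
		case 1: // set the value at the path
			state, err = update(state, e.K, func(any) any { return v })
		case 2: // push: append the items in v to the array at the path
			state, err = update(state, e.K, func(old any) any {
				arr, _ := old.([]any)
				items, _ := v.([]any)
				return append(arr, items...)
			})
		case 3: // delete the key named by the last path element
			if len(e.K) == 0 {
				err = fmt.Errorf("delete needs a path")
				break
			}
			last, _ := e.K[len(e.K)-1].(string)
			state, err = update(state, e.K[:len(e.K)-1], func(old any) any {
				if m, ok := old.(map[string]any); ok {
					delete(m, last)
				}
				return old
			})
		default:
			err = fmt.Errorf("unknown kind %d", e.Kind)
		}
		if err != nil {
			return nil, err
		}
	}
	return state, sc.Err()
}
```

Once replay finishes, re-marshalling the result and unmarshalling it into the flat-JSON structs is one way to share a single code path for both formats.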

Agent Sessions (Not Implemented)

VSCode's "coding agent" mode stores sessions differently -- only in state.vscdb
under agentSessions.model.cache, not as separate files. Parsing these would
require SQLite extraction and is outside the current scope.

Tool Call Extraction

VSCode Copilot response items include toolInvocationSerialized entries with:

- toolId -- the raw tool identifier (e.g., copilot_readFile, copilot_runInTerminal)
- invocationMessage -- human-readable description (string or {value: "..."} object)
- pastTenseMessage -- past-tense version, preferred for display
- toolSpecificData -- structured data (e.g., {kind: "terminal", command: "npm test"})

The parser extracts these into InputJSON for frontend display, and normalizes
the raw toolId to a standard category via two mapping steps:

    copilot_readFile        -> read_file     -> Read
    copilot_runInTerminal   -> shell         -> Bash
    copilot_replaceString   -> edit_file     -> Edit (was Write)
    copilot_findTextInFiles -> grep          -> Grep
    copilot_listDirectory   -> glob          -> Glob
    copilot_createFile      -> create_file   -> Write
    copilot_fetchWebPage    -> read_web_page -> Read
    runSubagent             -> Task          -> Task

There are 60+ unique tool IDs observed in practice, including MCP tools
(mcp_dart_sdk_*, mcp_microsoft_pla_*, pgsql_*), GitHub PR tools,
Python environment tools, and various extensions. Unmapped tools fall
through to the Other category.
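
The two-step lookup with its Other fallback can be sketched with two maps. The entries are the ones listed above; the names rawToGeneric, genericToCategory, and categoryFor are hypothetical and do not reflect the actual signature of NormalizeToolCategory().

```go
package main

// rawToGeneric is step one: VSCode Copilot tool ID -> generic tool name.
var rawToGeneric = map[string]string{
	"copilot_readFile":        "read_file",
	"copilot_runInTerminal":   "shell",
	"copilot_replaceString":   "edit_file",
	"copilot_findTextInFiles": "grep",
	"copilot_listDirectory":   "glob",
	"copilot_createFile":      "create_file",
	"copilot_fetchWebPage":    "read_web_page",
	"runSubagent":             "Task",
}

// genericToCategory is step two: generic tool name -> display category.
var genericToCategory = map[string]string{
	"read_file":     "Read",
	"shell":         "Bash",
	"edit_file":     "Edit",
	"grep":          "Grep",
	"glob":          "Glob",
	"create_file":   "Write",
	"read_web_page": "Read",
	"Task":          "Task",
}

// categoryFor chains both lookups; anything unmapped becomes "Other".
func categoryFor(toolID string) string {
	generic, ok := rawToGeneric[toolID]
	if !ok {
		return "Other"
	}
	if category, ok := genericToCategory[generic]; ok {
		return category
	}
	return "Other"
}
```

Splitting the mapping in two keeps the second table agent-independent, so only the first table has to know about Copilot-specific IDs.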

Mixed Content Handling

When a response contains both tool calls and markdown text, the parser
always includes tool markers in the content (e.g., [Read: copilot_readFile])
followed by the text. This ensures the frontend can detect and render tool
blocks regardless of whether text accompanies them.
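
A minimal sketch of that assembly step, using the marker shape from the example above. The toolCall type, the buildContent helper, and the choice of a newline as separator are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// toolCall is a hypothetical pairing of a normalized category with the
// raw tool ID it came from.
type toolCall struct {
	Category string
	ToolID   string
}

// buildContent emits one marker line per tool call, followed by the
// markdown text when there is any. Markers are written even if text is empty.
func buildContent(calls []toolCall, text string) string {
	parts := make([]string, 0, len(calls)+1)
	for _, c := range calls {
		parts = append(parts, fmt.Sprintf("[%s: %s]", c.Category, c.ToolID))
	}
	if text != "" {
		parts = append(parts, text)
	}
	return strings.Join(parts, "\n")
}
```

Emitting markers unconditionally means a tool-only response and a mixed response go through the same frontend detection path.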

How This Differs from Cursor

agentsview already has a separate Cursor parser (internal/parser/cursor.go).
The two are architecturally different:

| Aspect | VSCode Copilot | Cursor |
|---|---|---|
| Storage location | `~/Library/Application Support/Code/User/workspaceStorage/<hash>/chatSessions/` | `~/.cursor/projects/<project>/agent-transcripts/` |
| File format | JSON or JSONL (VSCode mutation log) | Plain text (role markers) or JSONL (Anthropic API format) |
| Project mapping | Via `workspace.json` manifest in hash directory | Via parent directory name |
| Tool calls | `toolInvocationSerialized` items in response array | `[Tool call] name` markers in text |
| Session index | `state.vscdb` SQLite | None (files are self-contained) |
| Config env var | `COPILOT_DIR` | `CURSOR_PROJECTS_DIR` |
| Default path | `~/Library/Application Support/Code/User/` | `~/.cursor/projects` |

Cursor stores transcripts as plain text files or Anthropic API JSONL,
making them simpler to parse but containing less structured metadata.
VSCode Copilot's structured JSON/JSONL format preserves richer data
(tool invocation details, timing info, model IDs) but requires more
complex reconstruction logic.

Session Inventory (User's Machine)

| Edition | JSON | JSONL | Total |
|---|---|---|---|
| Code (stable) | 959 | 71 | ~1030 |
| Code - Insiders | 24 | 140 | ~164 |
| Total | 983 | 211 | ~1194 |

Before JSONL support, only 920 sessions were discoverable (JSON only).

Implementation Files

| File | Purpose |
|---|---|
| `internal/parser/vscode_copilot.go` | JSON + JSONL parser, tool normalization |
| `internal/parser/vscode_copilot_test.go` | Tests (tool extraction, mixed content, JSONL) |
| `internal/parser/discovery.go` | `DiscoverVSCodeCopilotSessions()` |
| `internal/parser/taxonomy.go` | `NormalizeToolCategory()` shared mapping |
| `internal/parser/types.go` | `AgentVSCodeCopilot` constant |
| `internal/sync/engine.go` | `processVSCodeCopilot()`, file watcher integration |
| `frontend/src/lib/utils/agents.ts` | `vscode-copilot` in KNOWN_AGENTS |
| `frontend/src/lib/utils/content-parser.ts` | Tool regex includes `Other` category |

Key Design Decisions


1. Parse files, not state.vscdb -- the SQLite database is volatile and
   undocumented. The JSON/JSONL files are the canonical source.
2. JSONL reconstruction -- replay mutations into json.RawMessage,
   then unmarshal into the same structs used for flat JSON.
3. Deduplication -- .jsonl wins over .json for the same UUID.
4. Tool normalization -- two-step mapping (raw ID -> generic name -> category)
   using the same taxonomy as all other agents.
5. Content format -- tool markers always included in message content
   so the frontend's regex-based parser can detect them.
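
The deduplication rule can be sketched as a single pass over discovered paths. This assumes the .json and .jsonl files for one session sit in the same directory and share a file stem (the UUID); dedupeSessions is a hypothetical helper name.

```go
package main

import (
	"path/filepath"
	"sort"
	"strings"
)

// dedupeSessions keeps one path per session, preferring .jsonl over .json
// when both exist for the same stem. Other extensions are ignored.
func dedupeSessions(paths []string) []string {
	best := make(map[string]string) // path without extension -> chosen path
	for _, p := range paths {
		ext := filepath.Ext(p)
		if ext != ".json" && ext != ".jsonl" {
			continue
		}
		stem := strings.TrimSuffix(p, ext)
		cur, seen := best[stem]
		if !seen || (ext == ".jsonl" && filepath.Ext(cur) == ".json") {
			best[stem] = p
		}
	}
	out := make([]string, 0, len(best))
	for _, p := range best {
		out = append(out, p)
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}
```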