Created
February 24, 2026 23:55
SKILL: Chrome Extension LevelDB Recovery — rusty_leveldb pitfalls, double-encoded JSON, hot-copy corruption, Edge WAL extraction
# Skill: Chrome Extension LevelDB Recovery

> Recovering data from Chrome/Edge/Brave extension LevelDB stores (e.g., OneTab, 1Password, or any other browser extension that uses `chrome.storage.local`).

---

## Extension Storage Paths

| Browser | Extension LevelDB path |
|---------|----------------------|
| Chrome (Windows) | `%LOCALAPPDATA%\Google\Chrome\User Data\{Profile}\Local Extension Settings\{ext_id}\` |
| Edge (Windows) | `%LOCALAPPDATA%\Microsoft\Edge\User Data\{Profile}\Local Extension Settings\{ext_id}\` |
| Brave (Windows) | `%LOCALAPPDATA%\BraveSoftware\Brave-Browser\User Data\{Profile}\Local Extension Settings\{ext_id}\` |
| Chrome (Linux) | `~/.config/google-chrome/{Profile}/Local Extension Settings/{ext_id}/` |
| Edge (Linux) | `~/.config/microsoft-edge/{Profile}/Local Extension Settings/{ext_id}/` |

Profile is usually `Default`. Check `chrome://version` (or `edge://version`, `brave://version`) → Profile Path to confirm.

**Note:** Edge uses a different extension ID than Chrome/Brave for the same extension (e.g., OneTab: Chrome = `chphlpgkkbolifaimnlloiipkdnihall`, Edge = `hoimpamkkoehapgenciaoajfkfkpgfop`).
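
The path table can also be assembled programmatically. The following is only a sketch: `extension_db_path` and its parameters are hypothetical names, the per-browser roots are copied from the table, and browsers or platforms the table does not list are not covered. It uses pure path classes so that a Windows path can be built on any OS.

```python
from pathlib import PurePosixPath, PureWindowsPath

# User-data roots from the table, relative to %LOCALAPPDATA% (Windows)
# or ~/.config (Linux).
WINDOWS_ROOTS = {
    "chrome": r"Google\Chrome\User Data",
    "edge": r"Microsoft\Edge\User Data",
    "brave": r"BraveSoftware\Brave-Browser\User Data",
}
LINUX_ROOTS = {
    "chrome": "google-chrome",
    "edge": "microsoft-edge",
}


def extension_db_path(browser, ext_id, platform, base, profile="Default"):
    """Build the LevelDB directory path for one extension.

    `base` is the value of %LOCALAPPDATA% on Windows or ~/.config on Linux.
    """
    if platform == "windows":
        return PureWindowsPath(
            base, WINDOWS_ROOTS[browser], profile, "Local Extension Settings", ext_id
        )
    if platform == "linux":
        return PurePosixPath(
            base, LINUX_ROOTS[browser], profile, "Local Extension Settings", ext_id
        )
    raise ValueError(f"unsupported platform: {platform}")
```

Remember to pass the Edge-specific ID when `browser` is `"edge"`.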

---

## Critical: Values Can Be Double-Encoded JSON Strings

Chrome serializes every `chrome.storage.local` value to JSON text before writing it to LevelDB. When the extension stores an already-stringified object (as OneTab does with its state), the value on disk is a **JSON-encoded string**, not a raw JSON object: `{"key":"val"}` is stored as the string `"{\"key\":\"val\"}"`, outer quotes included.

In raw bytes, such a value starts with `"` (0x22), then `{` (0x7b), then `\` (0x5c).

**In Rust with rusty_leveldb:**

```rust
// WRONG: returns Err for every double-encoded value, because the
// top-level JSON is a string rather than an object
serde_json::from_str::<MyStruct>(value_str)

// CORRECT: unwrap the outer JSON string first
let json_to_parse = if value_str.starts_with('"') {
    serde_json::from_str::<String>(value_str)
        .unwrap_or_else(|_| value_str.to_string())
} else {
    value_str.to_string()
};
let data: MyStruct = serde_json::from_str(&json_to_parse)?;
```
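
The same unwrapping can be sketched in Python with only the standard library. `decode_storage_value` is a hypothetical helper name. It assumes the value is UTF-8 JSON text and peels off at most one extra layer of string encoding. A stored string that merely looks like JSON (for example `"123"`) would be unwrapped too, so treat this as a recovery aid rather than a faithful decoder.

```python
import json


def decode_storage_value(raw: bytes):
    """Decode a LevelDB value, unwrapping one extra layer of JSON string
    encoding when the top-level value is itself a JSON document in a string."""
    value = json.loads(raw.decode("utf-8"))
    if isinstance(value, str):
        try:
            return json.loads(value)  # inner JSON document
        except json.JSONDecodeError:
            return value  # an ordinary string, not nested JSON
    return value
```

Values that were stored as plain objects pass through unchanged, so the helper can be applied to every entry without first checking the leading byte.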

---

## Critical: Iterator Must Call `seek_to_first()` Before Loop

`rusty_leveldb` iterators start **unpositioned**. Calling `advance()` before checking `valid()` means the loop body never executes. Position the iterator explicitly with `seek_to_first()` instead.

```rust
use rusty_leveldb::LdbIterator; // trait that provides the iterator methods

// WRONG: loop never runs
let mut iter = db.new_iter()?;
iter.advance();
while iter.valid() { /* ... */ }

// CORRECT
let mut iter = db.new_iter()?;
let (mut key, mut value) = (Vec::new(), Vec::new());
iter.seek_to_first();
while iter.valid() {
    iter.current(&mut key, &mut value);
    // process...
    iter.advance();
}
```

---

## Critical: Copy the LevelDB BEFORE Reading (Browser Must Be Closed)

If the browser is running during `scp` or any other file copy:

- The MANIFEST file may be mid-write → `Corruption: no meta-lognumber entry in descriptor`
- SSTable block CRCs may not match → `Corruption: checksum mismatch`
- Both `rusty_leveldb` and `plyvel` (Python) will fail to open the copy
- Raw binary extraction yields only fragments (SSTable blocks split data across boundaries)

**Fix:** Fully close the browser (`File > Exit`, not just closing the window; check Task Manager for leftover processes), then copy. On Windows via SCP:

```powershell
# Close Chrome/Edge/Brave first, then:
scp -r "C:\Users\user\AppData\Local\Google\Chrome\User Data\Default\Local Extension Settings\chphlpgkkbolifaimnlloiipkdnihall" user@host:/tmp/onetab-chrome
```

If the `scp` destination already exists as a directory, the browser folder is placed *inside* it as a subdirectory named after the extension ID. Adjust the `--db-path` accordingly.
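
For a local copy, a small helper can take the snapshot and run a basic sanity check before any reader touches it. This is a sketch under assumptions: `snapshot_db` is a hypothetical name, and the check only confirms that `CURRENT` names a MANIFEST file that made it into the copy. It cannot detect a MANIFEST that was copied mid-write, so closing the browser first is still required.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_db(src, dest_parent=None):
    """Copy a LevelDB directory to a scratch location and sanity-check it.

    Returns the path of the copy, which keeps the source directory's name.
    """
    src = Path(src)
    if dest_parent is None:
        dest_parent = tempfile.mkdtemp(prefix="leveldb-snapshot-")
    dest = Path(dest_parent) / src.name
    shutil.copytree(src, dest)

    # CURRENT holds the file name of the active MANIFEST, followed by a newline.
    current = dest / "CURRENT"
    if not current.is_file():
        raise FileNotFoundError(f"{current} is missing; not a LevelDB directory?")
    manifest = current.read_text().strip()
    if not (dest / manifest).is_file():
        raise FileNotFoundError(
            f"CURRENT points at {manifest}, which is missing from the copy"
        )
    return dest
```

Always open the returned copy rather than the live profile directory, so that a failed recovery attempt cannot modify the browser's own files.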

---

## Edge Case: No `.ldb` Files (Data Only in WAL)

Edge (and sometimes Chrome on first run) may not have compacted the write-ahead log (WAL) yet. The extension directory then contains a `.log` file (possibly 100KB+) alongside `CURRENT` and the other metadata files, with **no `.ldb` files**.

`rusty_leveldb` handles this transparently: the WAL is replayed into the in-memory DB on open.

If you need to extract the data directly from the WAL (e.g., when tablitz's recover falls back to raw extraction):

```python
import json
import re

with open('000003.log', 'rb') as f:
    data = f.read()

# Find the last "state" entry (most complete snapshot).
# The value in the WAL has the same double encoding as in LevelDB.
pattern = b'"{\\"tabGroups'
idx = data.rfind(pattern)
if idx == -1:
    raise SystemExit('pattern not found in WAL')
text = data[idx:].decode('utf-8', errors='replace')

# Walk forward to the unescaped closing quote of the outer JSON string.
i = 1
esc = False
while i < len(text):
    c = text[i]
    if esc:
        esc = False
    elif c == '\\':
        esc = True
    elif c == '"':
        outer_str = text[:i + 1]
        # Strip WAL block-boundary noise (control and replacement chars)
        cleaned = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f\x7f\ufffd]', '', outer_str)
        inner = json.loads(cleaned)  # unwrap the outer string
        data2 = json.loads(inner)    # parse the inner JSON
        print(f"Groups: {len(data2['tabGroups'])}")
        break
    i += 1
```

At WAL block boundaries (every 32 KiB), LevelDB writes a 7-byte record header (checksum, length, type) in the middle of any value that spans blocks. After decoding, these headers show up as control characters (`\x0f`, etc.) inside string values, so strip them before parsing. The stripping is heuristic: header bytes that happen to be printable are left behind and can still break the parse.
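
A more robust alternative to pattern matching is to parse the log's record framing, so that block-boundary headers are removed exactly instead of being filtered out as noise. The sketch below assumes the standard LevelDB log layout (32 KiB blocks, 7-byte record headers, `WriteBatch` payloads) and skips checksum verification. All function names are illustrative and not part of any library.

```python
import struct

BLOCK_SIZE = 32768   # LevelDB log block size
HEADER_SIZE = 7      # crc32c (4 bytes) + length (2) + record type (1)
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def read_log_records(data):
    """Yield reassembled logical records from a .log file (CRCs not checked)."""
    pos, pending = 0, bytearray()
    while pos < len(data):
        left_in_block = BLOCK_SIZE - (pos % BLOCK_SIZE)
        if left_in_block < HEADER_SIZE:
            pos += left_in_block  # zero-padded block trailer
            continue
        if pos + HEADER_SIZE > len(data):
            break  # truncated header at end of file
        length, rtype = struct.unpack_from("<HB", data, pos + 4)
        payload = data[pos + HEADER_SIZE:pos + HEADER_SIZE + length]
        pos += HEADER_SIZE + length
        if rtype == FULL:
            yield bytes(payload)
        elif rtype == FIRST:
            pending = bytearray(payload)
        elif rtype == MIDDLE:
            pending += payload
        elif rtype == LAST:
            pending += payload
            yield bytes(pending)
            pending = bytearray()
        else:
            break  # type 0 (zero-filled space) or unknown: stop reading


def read_varint(buf, pos):
    """Decode a little-endian base-128 varint; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_batch(record):
    """Yield (key, value) for puts and (key, None) for deletions."""
    if len(record) < 12:
        return
    count = struct.unpack_from("<I", record, 8)[0]  # bytes 0-7: sequence number
    pos = 12
    for _ in range(count):
        tag = record[pos]
        pos += 1
        klen, pos = read_varint(record, pos)
        key = record[pos:pos + klen]
        pos += klen
        if tag == 1:  # put
            vlen, pos = read_varint(record, pos)
            yield key, record[pos:pos + vlen]
            pos += vlen
        else:         # deletion
            yield key, None


def latest_values(data):
    """Replay the whole log and keep the last write for each key."""
    latest = {}
    for record in read_log_records(data):
        for key, value in iter_batch(record):
            latest[key] = value
    return latest
```

With the log file loaded as `data`, `latest_values(data).get(b'state')` would return the raw bytes of the most recent `state` value, assuming the extension keeps its data under that key as in the snippet above. Those bytes can then be unwrapped with two `json.loads` calls.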

---

## Rust Crate Recommendation

Use `rusty_leveldb` (pure Rust, no C deps) over `plyvel` (Python C bindings) or `leveldb` (C++ FFI). It handles WAL replay and compaction transparently and works on any platform without LevelDB system libraries.

```toml
[dependencies]
rusty-leveldb = "1"
```