@bkataru
Created February 24, 2026 23:56
SKILL: Rust CLI/Storage QA Gotchas — TOML arrays, format parser header vs label, MCP stdout, LevelDB iterator
# Skill: Rust CLI & Storage QA Gotchas
> Concrete bugs found during dogfood QA of a Rust CLI tool with SQLite, TOML, LevelDB, and MCP. Each pattern recurs in other projects.
---
## 1. TOML Does Not Support Top-Level Arrays
`toml::to_string(&Vec<T>)` fails with an error such as `unsupported rust type` (the exact message depends on the `toml` crate version), because a TOML document must be a table at the top level, not an array.
```rust
// WRONG
let output = toml::to_string(&groups)?; // Err!
// CORRECT — wrap in a table struct
#[derive(serde::Serialize)]
struct TomlRoot<'a> { groups: &'a [TabGroup] }
let output = toml::to_string(&TomlRoot { groups: &groups })?;
```
The resulting TOML:
```toml
[[groups]]
id = "abc"
# ...
```
---
## 2. Distinguish UI Header Text From User Data in Format Parsers
When parsing a structured text format (Markdown, HTML, etc.), UI-generated header text and user-entered data can look identical. Read the **generating code**, not just the output.
**Example (OneTab markdown export):**
```markdown
## 8 tabs ← generated by JS: `## ${tabCount}` — NOT a user label
> Created 3/20/2025, 10:08:46 PM
> My Research ← user-entered label (second > line)
[Tab Title](https://url)
```
A naive parser that reads `## text` as the group label produces labels like `"8 tabs"`, `"2 tabs"` — valid text, wrong semantics. Always check: does this format have a spec or generating script?
**Fix pattern:** Parse the header for its structural role (count, sentinel) but don't use its text as user-facing data unless the spec says it is.
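A minimal sketch of this fix pattern, assuming the layout shown in the example above: a `## ...` line opens a group, the first `>` line is the timestamp, and an optional second `>` line is the user label. The `ParsedGroup` type and the `parse_groups` and `parse_link` functions are hypothetical names for illustration, not part of any real exporter or parser.

```rust
/// One group parsed from a OneTab-style markdown export (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct ParsedGroup {
    /// User-entered label, if the export contained one.
    label: Option<String>,
    /// (title, url) pairs in document order.
    tabs: Vec<(String, String)>,
}

/// Parses groups, using the `## ...` header only as a group boundary.
fn parse_groups(input: &str) -> Vec<ParsedGroup> {
    let mut groups: Vec<ParsedGroup> = Vec::new();
    let mut quote_lines = 0; // number of `>` lines seen in the current group

    for line in input.lines().map(str::trim) {
        if line.starts_with("## ") {
            // Structural role only: the header text ("8 tabs") is UI-generated,
            // so it is discarded rather than stored as a label.
            groups.push(ParsedGroup::default());
            quote_lines = 0;
        } else if let Some(rest) = line.strip_prefix('>') {
            quote_lines += 1;
            // Assumption: line 1 is the timestamp, line 2 is the user label.
            if quote_lines == 2 {
                let text = rest.trim();
                if let (Some(group), false) = (groups.last_mut(), text.is_empty()) {
                    group.label = Some(text.to_string());
                }
            }
        } else if let Some(tab) = parse_link(line) {
            if let Some(group) = groups.last_mut() {
                group.tabs.push(tab);
            }
        }
    }
    groups
}

/// Parses a single `[title](url)` line; returns None for anything else.
fn parse_link(line: &str) -> Option<(String, String)> {
    let rest = line.strip_prefix('[')?;
    let (title, rest) = rest.rsplit_once("](")?;
    let url = rest.strip_suffix(')')?;
    Some((title.to_string(), url.to_string()))
}
```

With this shape, a group that has no second `>` line ends up with `label: None` instead of a misleading `"8 tabs"`.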
---
## 3. MCP Servers Must Not Write to stdout
Any MCP server using stdio transport (`rmcp`, `mcp-rs`, or manual JSON-RPC) will break if the server process writes non-JSON-RPC content to stdout. This includes startup messages, debug prints, and `println!` calls in library code.
```rust
// WRONG — breaks MCP JSON-RPC protocol
println!("Starting tablitz MCP server...");
// CORRECT
eprintln!("Starting tablitz MCP server...");
```
Rule: in any binary that serves a stdio protocol, **grep for `println!` and replace with `eprintln!`** before shipping.
Same applies to Python (`print()` → `sys.stderr.write()`), Node.js (`console.log()` → `console.error()`), etc.
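One way to make the rule harder to break is to pass the two streams explicitly, so protocol output and diagnostics can never share a writer by accident. The sketch below assumes newline-delimited JSON-RPC messages; `send_message` is a hypothetical helper, not an API of `rmcp` or any other MCP crate.

```rust
use std::io::{self, Write};

/// Writes one JSON-RPC message to the protocol stream and a diagnostic
/// line to the log stream. Hypothetical helper for illustration.
fn send_message<P: Write, L: Write>(
    protocol: &mut P,
    log: &mut L,
    json: &str,
) -> io::Result<()> {
    // Diagnostics go to the log stream only (stderr in a real server).
    writeln!(log, "sending {} bytes", json.len())?;
    // The protocol stream (stdout in a real server) carries nothing but the message.
    writeln!(protocol, "{}", json)?;
    protocol.flush()
}
```

In a real binary the call would look like `send_message(&mut io::stdout().lock(), &mut io::stderr().lock(), msg)`. In tests, two `Vec<u8>` buffers can stand in for the streams, which lets you assert that nothing but JSON reaches the protocol side.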
---
## 4. rusty_leveldb Iterator: `seek_to_first()` Not `advance()`
See: SKILL: Chrome Extension LevelDB Recovery.
Short version: iterators start unpositioned. `advance()` before `valid()` → loop never runs. Use `seek_to_first()`.
---
## 5. LevelDB Values in Chrome Extensions Are Double-Encoded
`chrome.storage.local` stores JSON values as JSON-encoded strings. The raw bytes start with `"{\` not `{`. Parse as `String` first, then parse the inner JSON. See: SKILL: Chrome Extension LevelDB Recovery.
---
## 6. Test Isolation: In-Proc Tests Sharing Global State
When unit tests in the same crate run in parallel (default), tests that write to a shared file path or `~/.local/share/...` will race.
**Fix:** Use `tempfile::tempdir()` for every test that touches the filesystem. Never hard-code paths in tests.
```rust
let dir = tempfile::tempdir().unwrap();
let db = Store::open(&dir.path().join("test.db")).await.unwrap();
// dir is dropped (and deleted) at end of test
```
---
## 7. Async Test Runtime
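`#[tokio::test]` defaults to a current-thread runtime, where spawned tasks only make progress on the test's own thread while the test body is awaiting. For tests that need tasks to run in parallel on several threads, or that call `tokio::task::block_in_place` (which panics on a current-thread runtime), select the multi-thread flavor: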
```rust
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn test_concurrent_inserts() { ... }
```
For simple async tests, the default is fine:
```rust
#[tokio::test]
async fn test_basic() { ... }
```
In `Cargo.toml` dev-dependencies, you need both the `rt-multi-thread` and `macros` features:
```toml
[dev-dependencies]
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
```
---
## 8. `INSERT OR IGNORE` Is Your Friend for Idempotent Imports
For any system that imports data that may already exist, use `INSERT OR IGNORE` with a stable primary key (content-derived ID or natural key) rather than `INSERT OR REPLACE` (which silently overwrites) or bare `INSERT` (which errors on duplicate).
```sql
INSERT OR IGNORE INTO tab_groups (id, label, created_at, ...) VALUES (?, ?, ?, ...)
```
Pair with FNV-1a or SHA-256 hashing of stable content (file path + position + content) to generate deterministic IDs for data that lacks a natural key.
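As an illustration, here is a minimal sketch of a 64-bit FNV-1a hash used to derive such an ID. The `stable_id` helper and its field layout (path, position, content joined with NUL separators) are assumptions for this example, not the original tool's scheme.

```rust
/// 64-bit FNV-1a offset basis and prime (standard constants).
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Computes the 64-bit FNV-1a hash of a byte slice.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        // FNV-1a: XOR the byte in first, then multiply by the prime.
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Derives a deterministic hex ID from stable fields (hypothetical helper).
fn stable_id(path: &str, position: usize, content: &str) -> String {
    // NUL separators keep field boundaries unambiguous, so that
    // ("ab", "c") and ("a", "bc") do not feed identical bytes to the hash.
    let key = format!("{}\0{}\0{}", path, position, content);
    format!("{:016x}", fnv1a_64(key.as_bytes()))
}
```

Re-importing the same file then yields the same IDs, so `INSERT OR IGNORE` skips rows that already exist. FNV-1a is not collision-resistant; if IDs must hold up against adversarial or very large inputs, SHA-256 is the safer of the two options named above.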