@alltheseas
Created January 5, 2026 22:10
NostrDB Snapshot Storage Optimization - Technical Review (Updated 2026-01-05)

NostrDB Snapshot Storage Optimization Analysis

Issue: GitHub #3502 | Beads: optimize-storage-ndb-snapshots-ct6 | Date: 2026-01-05

Executive Summary

PR #3468 fixed critical 0xdead10cc crashes by moving the main NostrDB database to the app's private container and creating periodic snapshots to a shared container for extension access. The tradeoff is doubled storage since the entire database is copied.

This document analyzes 5 approaches to reduce storage overhead while maintaining extension functionality.


Current State

What Extensions Actually Need

| Extension | Data Required | Access Pattern |
| --- | --- | --- |
| Notification Service | Profiles, Mutelists, Contacts, Settings | Read-only lookup by pubkey |
| Share Extension | Keypair, Settings, Relay info | Minimal reads, posts to relays |
| Highlighter Extension | Keypair, Settings | Minimal reads, posts to relays |

Current Database Structure (17 Sub-databases)

REQUIRED BY EXTENSIONS:
├── NDB_DB_PROFILE          ← Profiles (names, pictures, nip05)
├── NDB_DB_PROFILE_PK       ← Profile pubkey index
├── NDB_DB_NOTE             ← Events (for mute list items: kind 10000)
├── NDB_DB_NOTE_PUBKEY_KIND ← Author+kind index (to find mute lists)

NOT REQUIRED BY EXTENSIONS:
├── NDB_DB_META             ← Note metadata (reply counts, etc.)
├── NDB_DB_NOTE_ID          ← Event ID index
├── NDB_DB_NDB_META         ← System metadata
├── NDB_DB_PROFILE_SEARCH   ← Fulltext search (large)
├── NDB_DB_PROFILE_LAST_FETCH
├── NDB_DB_NOTE_KIND        ← Kind index
├── NDB_DB_NOTE_TEXT        ← Fulltext search (very large)
├── NDB_DB_NOTE_BLOCKS      ← Parsed blocks
├── NDB_DB_NOTE_TAGS        ← Tags index
├── NDB_DB_NOTE_PUBKEY      ← Author index
├── NDB_DB_NOTE_RELAY_KIND  ← Relay+kind index
├── NDB_DB_NOTE_RELAYS      ← Note-relay mappings

Estimated Storage Breakdown:

  • Fulltext search indexes (NOTE_TEXT, PROFILE_SEARCH): ~40-60% of total
  • Notes and metadata: ~30-40%
  • Profiles and indexes: ~10-20%

Approach Analysis

Approach 1: Selective LMDB Sub-database Copy

Beads: optimize-storage-ndb-snapshots-hf9

Copy only required LMDB sub-databases instead of all 17 tables.

Aspect Analysis
Benefits • 70-80% storage reduction
• Native LMDB format retained
• Existing extension code unchanged
• ACID guarantees preserved
Risks • Requires C-level nostrdb modifications
• Must maintain DB consistency across selected tables
• Foreign key-like relationships may break
• More complex testing
Complexity HIGH - New C function ndb_selective_snapshot() needed
Storage Savings ~70-80%
Main Thread Safe - already runs on background queue

Implementation Sketch:

int ndb_selective_snapshot(struct ndb *ndb, const char *path,
                           enum ndb_dbs *dbs, int num_dbs);

Code Quality Concerns:

  • Adds complexity to nostrdb C layer
  • Requires careful transaction handling
  • May need nostrdb version coordination

Approach 2: Separate Lightweight Extension Database

Beads: optimize-storage-ndb-snapshots-i6a

Create a dedicated small database that only stores extension-needed data. Main app writes to both.

Aspect Analysis
Benefits • Clear separation of concerns
• Extensions get purpose-built data store
• Can use simpler format (SQLite/JSON)
• No nostrdb C changes needed
Risks • Dual-write complexity in main app
• Data consistency between databases
• Additional codebase to maintain
• Schema evolution challenges
Complexity MEDIUM-HIGH
Storage Savings ~80-90%
Main Thread Must ensure writes are background-dispatched

Implementation Approach:

actor ExtensionDataStore {
    /// Writes profile to extension database (background)
    func updateProfile(_ profile: Profile, pubkey: Pubkey) async

    /// Writes mute list (background)
    func updateMuteList(_ items: [MuteItem], pubkey: Pubkey) async
}

Code Quality Concerns:

  • Adds new subsystem to maintain
  • Dual-write logic must be bulletproof
  • Good: Swift-native, testable, reviewable

Approach 3: JSON/Plist Data Export

Beads: optimize-storage-ndb-snapshots-w5i

Export needed data to JSON files in the shared container.

Aspect Analysis
Benefits • SIMPLEST implementation
• No database dependencies in extensions
• Human-readable/debuggable
• Easy to version and migrate
• Pure Swift implementation
Risks • O(n) lookup performance (no indexes)
• Memory pressure with large datasets
• Must load entire file to access
• No ACID guarantees
Complexity LOW
Storage Savings ~85-95%
Main Thread Must use background encoding/writing

Data Files:

group.com.damus/extension_data/
├── profiles.json      (~500KB-2MB for active users)
├── mute_list.json     (~10KB-100KB)
├── contacts.json      (~50KB-500KB)
└── metadata.json      (~1KB)

Implementation:

struct ExtensionData: Codable {
    let profiles: [String: ProfileData]  // pubkey -> profile
    let muteList: MuteListData
    let contacts: [String]  // followed pubkeys
    let lastUpdated: Date
}

actor ExtensionDataExporter {
    func exportToSharedContainer() async throws {
        let data = await gatherExtensionData()
        let encoder = JSONEncoder()
        let jsonData = try encoder.encode(data)
        try jsonData.write(to: sharedContainerURL)
    }
}

Code Quality Assessment:

  • ✅ Simple, readable, reviewable
  • ✅ Follows nevernesting (early returns)
  • ✅ Easy to test
  • ⚠️ O(n) lookups may be slow for large profile sets

Approach 4: Incremental/Delta Snapshots

Beads: optimize-storage-ndb-snapshots-ibb

Only copy changed data since last snapshot.

Aspect Analysis
Benefits • Minimal I/O per snapshot
• Reduced battery/CPU impact
• Could enable more frequent updates
Risks • LMDB doesn't natively support deltas
• Complex change tracking needed
• Corruption recovery harder
• Snapshot consistency harder to verify
Complexity VERY HIGH
Storage Savings Variable (0-80% depending on churn)
Main Thread Safe if implemented correctly

Implementation Challenges:

  • LMDB uses copy-on-write but doesn't expose page-level change tracking
  • Would need to track changes at application level
  • Merge logic for applying deltas is complex

Code Quality Concerns:

  • Very complex implementation
  • Hard to review and verify correctness
  • Against "simplicity" principle

Recommendation: NOT RECOMMENDED due to complexity vs. benefit ratio.


Approach 5: Compressed Snapshots

Beads: optimize-storage-ndb-snapshots-i3j

Compress snapshot data before writing, decompress on read.

Aspect Analysis
Benefits • 50-70% storage reduction
• Minimal code changes
• Works with current full-copy approach
Risks • CPU overhead on compress/decompress
• Extensions must decompress before use
• LMDB can't memory-map compressed files
• Startup latency in extensions
Complexity MEDIUM
Storage Savings ~50-70%
Main Thread Compression MUST be background

Implementation:

func snapshot(path: String) throws {
    let tempPath = path + ".tmp"
    try ndb.snapshot(path: tempPath)

    // Compress on a background queue.
    // Note: errors thrown inside a detached task are silently dropped unless
    // the task handle is awaited; production code should surface these.
    Task.detached(priority: .utility) {
        let data = try Data(contentsOf: URL(fileURLWithPath: tempPath))
        let compressed = try (data as NSData).compressed(using: .lzfse)
        try compressed.write(to: URL(fileURLWithPath: path + ".lzfse"))
        try FileManager.default.removeItem(atPath: tempPath)
    }
}

Code Quality Concerns:

  • Adds complexity without addressing root issue (still copying everything)
  • Extensions need decompress step before LMDB can open
  • May not work well with LMDB's memory-mapping
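For completeness, the extension-side read path would look roughly like this before LMDB can open anything (a sketch; the `.lzfse` suffix convention follows the snippet above):

```swift
import Foundation

/// Decompress a snapshot into a plain file that LMDB can memory-map.
/// Returns the path to the decompressed database file.
func decompressSnapshot(at compressedPath: String) throws -> String {
    let compressed = try Data(contentsOf: URL(fileURLWithPath: compressedPath))
    // NSData.decompressed(using:) is the inverse of compressed(using:)
    let raw = try (compressed as NSData).decompressed(using: .lzfse) as Data
    // Strip the ".lzfse" suffix to get the LMDB file path
    let dbPath = (compressedPath as NSString).deletingPathExtension
    try raw.write(to: URL(fileURLWithPath: dbPath), options: .atomic)
    return dbPath
}
```

This pays the decompression cost, plus the full uncompressed size on disk, at every extension launch, which is exactly the startup-latency risk noted in the table.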

Comparison Matrix

| Approach | Storage Savings | Complexity | nostrdb Changes | Risk Level | Recommendation |
| --- | --- | --- | --- | --- | --- |
| 1. Selective LMDB | 70-80% | High | Yes (C) | Medium | Consider if C changes acceptable |
| 2. Separate DB | 80-90% | Medium-High | No | Medium | Good option, more maintenance |
| 3. JSON Export | 85-95% | Low | No | Low | RECOMMENDED |
| 4. Incremental | Variable | Very High | Yes | High | Not recommended |
| 5. Compressed | 50-70% | Medium | No | Low | Doesn't solve core issue |

Recommendation: Approach 3 (JSON Export)

Rationale:

  1. Simplicity - Follows the "code should tend toward simplicity" principle
  2. No nostrdb changes - Avoids C-level modifications
  3. Human readable - Easy to debug and review
  4. Testable - Pure Swift, straightforward unit tests
  5. Maximum savings - 85-95% reduction
  6. Low risk - JSON is well-understood, no complex edge cases

Performance Mitigation for O(n) Lookups:

For notification service (the primary extension):

  • Profile lookups are typically for ~1-10 profiles per notification
  • With 5000 profiles, dictionary lookup is O(1) after initial load
  • Initial load happens once when extension starts
  • Memory: ~2MB for 5000 profiles is acceptable for extensions
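The load-once pattern behind these numbers is straightforward (a self-contained sketch; `ProfileData` is trimmed to two fields for brevity):

```swift
import Foundation

struct ProfileData: Codable {
    let name: String?
    let picture: String?
}

/// Decodes the export once at init; every later lookup is an O(1)
/// dictionary hit against the in-memory copy.
struct ProfileCache {
    private let profiles: [String: ProfileData]

    init(jsonData: Data) throws {
        profiles = try JSONDecoder().decode([String: ProfileData].self, from: jsonData)
    }

    func profile(for pubkey: String) -> ProfileData? {
        profiles[pubkey]
    }
}
```

The decode cost is paid once per extension launch; since notification extensions are short-lived, there is no cache-invalidation problem to solve.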

Implementation Plan:

  1. Create ExtensionDataExporter actor
  2. Export on each snapshot interval (1 hour)
  3. Export data:
    • Profiles dictionary (pubkey → {name, display_name, picture, nip05})
    • Mute list (users, hashtags, words, threads)
    • Following list (pubkey set)
  4. Extensions load JSON on startup
  5. Remove current full database snapshot

Alternative: Hybrid Approach (2 + 3)

If JSON performance becomes an issue:

  1. Use SQLite for extension data (better than JSON for large datasets)
  2. Simple schema: profiles, mute_items, contacts tables
  3. Main app writes to SQLite on profile/mute/contact changes
  4. Extensions read from SQLite

This adds some complexity but provides indexed lookups if needed.
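With that schema, the indexed lookup that JSON lacks is a single prepared statement. A sketch using the raw SQLite3 C API (the `profiles` table and column names follow the schema idea above and are assumptions, not existing code):

```swift
import Foundation
import SQLite3

/// Look up a profile name by pubkey; hits the PRIMARY KEY index.
func profileName(db: OpaquePointer?, pubkey: String) -> String? {
    var stmt: OpaquePointer?
    let sql = "SELECT name FROM profiles WHERE pubkey = ? LIMIT 1;"
    guard sqlite3_prepare_v2(db, sql, -1, &stmt, nil) == SQLITE_OK else { return nil }
    defer { sqlite3_finalize(stmt) }

    // SQLITE_TRANSIENT asks SQLite to copy the bound string
    let transient = unsafeBitCast(-1, to: sqlite3_destructor_type.self)
    sqlite3_bind_text(stmt, 1, pubkey, -1, transient)

    guard sqlite3_step(stmt) == SQLITE_ROW,
          let text = sqlite3_column_text(stmt, 0) else { return nil }
    return String(cString: text)
}
```

Unlike the JSON reader, this never loads the whole data set into memory, which is the main argument for the hybrid if profile counts grow large.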


Next Steps

  1. Decide on approach (recommend: 3 - JSON Export)
  2. Create detailed implementation plan
  3. Implement and test
  4. Measure actual storage savings
  5. Performance test with realistic data sizes

Technical Review: NostrDB Snapshot Storage Optimization

Issue: GitHub #3502 | Beads: optimize-storage-ndb-snapshots-ct6


Current State Analysis

Problem Statement

PR #3468 fixed 0xdead10cc crashes by copying the entire NostrDB (17 sub-databases) to the shared container every hour. This doubles storage usage even though extensions need only a small fraction of the data (estimated below at ~10-15%).

What Extensions Actually Access

| Extension | Data Accessed | Access Pattern |
| --- | --- | --- |
| NotificationService | Profiles, MuteList, Contacts | Read-only lookup by pubkey |
| ShareExtension | Keypair, Settings | Minimal DB reads |
| HighlighterExtension | Keypair, Settings | Minimal DB reads |

Exact Data Requirements

1. Profiles (for NotificationService)

// Fields used from NdbProfile:
struct ProfileData {
    let name: String?
    let display_name: String?
    let picture: String?       // For notification avatar
    let nip05: String?         // For verification badge
}
// Lookup: profiles.lookup(id: pubkey) -> Profile?
// Typical access: 1-10 profiles per notification

2. Mute List (for NotificationService)

// Four categories stored in MutelistManager:
var users: Set<MuteItem>     // ["p", pubkey_hex, expiration?]
var hashtags: Set<MuteItem>  // ["t", hashtag, expiration?]
var threads: Set<MuteItem>   // ["e", note_id_hex, expiration?]
var words: Set<MuteItem>     // ["word", phrase, expiration?]

// MuteItem enum:
enum MuteItem: Hashable {
    case user(Pubkey, Date?)
    case hashtag(Hashtag, Date?)
    case word(String, Date?)
    case thread(NoteId, Date?)
}

3. Contacts/Follows (for NotificationService)

// Core data structure:
class Contacts {
    private var friends: Set<Pubkey>  // All followed pubkeys
    // Used for: notification_only_from_following setting
}
// Source: Kind 3 event with ["p", pubkey] tags

Current Database Size Breakdown (Estimated)

| Sub-database | Purpose | Est. Size | Needed by Extensions |
| --- | --- | --- | --- |
| NDB_DB_NOTE_TEXT | Fulltext search | 30-40% | No |
| NDB_DB_PROFILE_SEARCH | Profile search | 10-15% | No |
| NDB_DB_NOTE | All notes | 20-30% | No (except mute list event) |
| NDB_DB_NOTE_BLOCKS | Parsed content | 5-10% | No |
| NDB_DB_META | Note metadata | 5-10% | No |
| NDB_DB_PROFILE | Profile data | 5-10% | Yes |
| NDB_DB_PROFILE_PK | Profile index | 1-2% | Yes |
| Other indexes | Various | 5-10% | No |

Conclusion: Extensions need ~10-15% of the database, but we copy 100%.


Approach 1: Selective LMDB Sub-database Copy

Beads: optimize-storage-ndb-snapshots-hf9

Concept

Modify nostrdb at the C level to copy only specific LMDB sub-databases instead of the entire environment.

Implementation

C-Level Changes (nostrdb.c)

/// Copy only selected sub-databases to a new LMDB environment
/// @param ndb The source nostrdb instance
/// @param path Destination path for the selective snapshot
/// @param dbs Array of database IDs to copy
/// @param num_dbs Number of databases in the array
/// @return 0 on success, error code on failure
int ndb_selective_snapshot(struct ndb *ndb, const char *path,
                           enum ndb_dbs *dbs, int num_dbs) {
    MDB_env *dst_env;
    MDB_txn *src_txn, *dst_txn;
    int rc;

    // Create destination environment
    rc = mdb_env_create(&dst_env);
    if (rc != 0) return rc;

    rc = mdb_env_set_maxdbs(dst_env, num_dbs);
    if (rc != 0) { mdb_env_close(dst_env); return rc; }

    // Set mapsize large enough for selective data (default 10MB is too small)
    // 512MB should be sufficient for profiles + indexes
    rc = mdb_env_set_mapsize(dst_env, 512ULL * 1024 * 1024);
    if (rc != 0) { mdb_env_close(dst_env); return rc; }

    rc = mdb_env_open(dst_env, path, MDB_NOSUBDIR, 0644);
    if (rc != 0) { mdb_env_close(dst_env); return rc; }

    // Begin transactions
    rc = mdb_txn_begin(ndb->lmdb.env, NULL, MDB_RDONLY, &src_txn);
    if (rc != 0) { mdb_env_close(dst_env); return rc; }

    rc = mdb_txn_begin(dst_env, NULL, 0, &dst_txn);
    if (rc != 0) {
        mdb_txn_abort(src_txn);
        mdb_env_close(dst_env);
        return rc;
    }

    // Copy each selected database. copy_single_db (not shown here) would
    // open the dbi in both transactions and cursor every key/value pair
    // from source to destination.
    for (int i = 0; i < num_dbs; i++) {
        rc = copy_single_db(src_txn, dst_txn, ndb, dbs[i]);
        if (rc != 0) break;
    }

    if (rc == 0) {
        mdb_txn_commit(dst_txn);
    } else {
        mdb_txn_abort(dst_txn);
    }

    mdb_txn_abort(src_txn);
    mdb_env_close(dst_env);
    return rc;
}

Swift Interface (Ndb.swift)

extension Ndb {
    /// Creates a selective snapshot containing only extension-required data
    func selectiveSnapshot(path: String) throws {
        // Databases needed by extensions
        let requiredDbs: [ndb_dbs] = [
            NDB_DB_PROFILE,
            NDB_DB_PROFILE_PK,
            NDB_DB_NOTE,        // For mute list events only
            NDB_DB_NOTE_ID,
            NDB_DB_NOTE_PUBKEY_KIND  // To query mute lists by kind
        ]

        try withNdb {
            var dbs = requiredDbs  // pass ndb_dbs values directly; mapping to rawValue would mismatch the C parameter type
            let rc = ndb_selective_snapshot(
                self.ndb.ndb,
                path,
                &dbs,
                Int32(dbs.count)
            )
            guard rc == 0 else {
                throw SnapshotError.selectiveCopyFailed(errno: rc)
            }
        }
    }
}

Tradeoffs

Aspect Assessment
Storage Savings 70-80% reduction
Implementation Complexity HIGH - Requires C expertise
Code Maintainability Medium - New C code to maintain
Risk Level Medium - Must ensure DB consistency
Performance Impact Good - Native LMDB operations
Extension Code Changes None - Same LMDB format
Testing Complexity High - Need to verify all edge cases

Benefits

  • Maximum storage efficiency while keeping LMDB format
  • Extensions use identical code paths
  • ACID guarantees preserved
  • No serialization/deserialization overhead

Risks

  • Requires deep LMDB knowledge
  • Must handle cross-database references correctly
  • nostrdb version coordination needed
  • Harder to review (C code)

Complexity Breakdown

  • New C function: ~150-200 lines
  • Swift wrapper: ~30 lines
  • Tests: ~200 lines
  • Documentation: Required

Verdict

Consider if C changes are acceptable and team has LMDB expertise.


Approach 2: Separate Lightweight Extension Database

Beads: optimize-storage-ndb-snapshots-i6a

Concept

Create a dedicated small database (SQLite recommended) that only stores extension-needed data. Main app writes to both databases.

Implementation

Data Model (ExtensionData.swift)

/// Lightweight data store for extension access
actor ExtensionDataStore {
    private let dbPath: URL
    private var db: OpaquePointer?

    init() throws {
        guard let containerURL = FileManager.default
            .containerURL(forSecurityApplicationGroupIdentifier: "group.com.damus")
        else {
            throw ExtensionDataError.containerUnavailable
        }
        self.dbPath = containerURL.appendingPathComponent("extension_data.sqlite")
        try openDatabase()
        try createSchema()
    }

    private func createSchema() throws {
        let schema = """
            CREATE TABLE IF NOT EXISTS profiles (
                pubkey TEXT PRIMARY KEY,
                name TEXT,
                display_name TEXT,
                picture TEXT,
                nip05 TEXT,
                updated_at INTEGER
            );

            CREATE TABLE IF NOT EXISTS mute_items (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                type TEXT NOT NULL,  -- 'user', 'hashtag', 'word', 'thread'
                value TEXT NOT NULL,
                expiration INTEGER,
                UNIQUE(type, value)
            );

            CREATE TABLE IF NOT EXISTS contacts (
                pubkey TEXT PRIMARY KEY
            );

            CREATE TABLE IF NOT EXISTS metadata (
                key TEXT PRIMARY KEY,
                value TEXT
            );
        """
        try execute(schema)
    }
}

Profile Sync (ProfileSyncService.swift)

/// Syncs profile updates to extension database
class ProfileSyncService {
    private let extensionStore: ExtensionDataStore
    private let ndb: Ndb

    /// Called when a profile is updated in the main database
    func syncProfile(_ pubkey: Pubkey) async {
        guard let profile = try? ndb.lookup_profile_and_copy(pubkey) else { return }

        await extensionStore.upsertProfile(
            pubkey: pubkey.hex(),
            name: profile.name,
            displayName: profile.display_name,
            picture: profile.picture,
            nip05: profile.nip05
        )
    }

    /// Batch sync all profiles (for initial setup or recovery)
    func syncAllProfiles() async {
        // Query all profile keys from ndb
        // Batch insert into extension database
    }
}

Extension Access (NotificationExtensionState.swift changes)

struct NotificationExtensionState {
    let extensionData: ExtensionDataStore  // New: lightweight store
    // Remove: let ndb: Ndb

    init?() {
        guard let store = try? ExtensionDataStore() else { return nil }
        self.extensionData = store
        // ... rest of init
    }

    func lookupProfile(_ pubkey: Pubkey) async -> ExtensionProfile? {
        return await extensionData.getProfile(pubkey.hex())
    }

    func isMuted(_ pubkey: Pubkey) async -> Bool {
        return await extensionData.isMutedUser(pubkey.hex())
    }
}

Tradeoffs

Aspect Assessment
Storage Savings 85-95% reduction
Implementation Complexity MEDIUM-HIGH
Code Maintainability Medium - New subsystem
Risk Level Medium - Dual-write consistency
Performance Impact Good - SQLite is fast for small data
Extension Code Changes Significant - New data access layer
Testing Complexity Medium - Standard SQLite testing

Benefits

  • Clear separation of concerns
  • Extensions get purpose-built data store
  • Can optimize schema for extension queries
  • No nostrdb C changes needed
  • SQLite is well-understood

Risks

  • Dual-write complexity (must keep both DBs in sync)
  • Data consistency between databases
  • Additional codebase to maintain
  • Schema evolution challenges
  • Must handle sync failures gracefully

Complexity Breakdown

  • ExtensionDataStore: ~300 lines
  • Sync services: ~200 lines
  • Extension changes: ~100 lines
  • Tests: ~300 lines
  • Migration: ~50 lines

Verdict

Good option if willing to maintain separate data layer.


Approach 3: JSON/Plist Data Export

Beads: optimize-storage-ndb-snapshots-w5i

Concept

Export needed data to JSON files in the shared container. Extensions load JSON on startup. Simplest possible implementation.

Implementation

Data Structures (ExtensionData.swift)

/// Codable structure for extension data export
/// Note: contacts uses [String] instead of Set<String> for deterministic JSON ordering.
/// The reader converts this back to Set<String> for O(1) lookup performance.
struct ExtensionDataExport: Codable {
    let profiles: [String: ProfileExport]  // pubkey -> profile
    let muteList: MuteListExport
    let contacts: [String]                 // followed pubkeys (sorted for deterministic output)
    let exportedAt: Date
    var version: Int = 1  // var, not let: a `let` with a default value is skipped by Codable decoding
}

struct ProfileExport: Codable {
    let name: String?
    let displayName: String?
    let picture: String?
    let nip05: String?
}

struct MuteListExport: Codable {
    let users: [MuteItemExport]
    let hashtags: [MuteItemExport]
    let words: [MuteItemExport]
    let threads: [MuteItemExport]
}

struct MuteItemExport: Codable {
    let value: String
    let expiration: Date?
}

Exporter (ExtensionDataExporter.swift)

/// Exports extension data to JSON in shared container
actor ExtensionDataExporter {
    private let ndb: Ndb
    private let contacts: Contacts
    private let mutelistManager: MutelistManager
    private let keypair: Keypair  // our identity; needed to query mentions of us

    private static let exportPath: URL? = {
        FileManager.default
            .containerURL(forSecurityApplicationGroupIdentifier: "group.com.damus")?
            .appendingPathComponent("extension_data.json")
    }()

    /// Export all extension data to JSON
    /// Called on snapshot interval (1 hour) or on significant changes
    func exportData() async throws {
        guard let exportPath = Self.exportPath else {
            throw ExportError.containerUnavailable
        }

        let data = ExtensionDataExport(
            profiles: await gatherProfiles(),
            muteList: gatherMuteList(),
            contacts: gatherContacts(),
            exportedAt: Date()
        )

        let encoder = JSONEncoder()
        encoder.dateEncodingStrategy = .iso8601
        encoder.outputFormatting = [.sortedKeys]  // Deterministic output

        let jsonData = try encoder.encode(data)
        try jsonData.write(to: exportPath, options: .atomic)

        Log.info("Exported extension data: %d profiles, %d contacts",
                 for: .storage, data.profiles.count, data.contacts.count)
    }

    private func gatherProfiles() async -> [String: ProfileExport] {
        var profiles: [String: ProfileExport] = [:]

        // Export profiles for:
        // 1. All contacts (followed users)
        // 2. Muted users (need name for "Muted user" display)
        // 3. Recent notification senders (mentions from non-followed users)
        //    - Query recent kind 1/4/7/9735 events that tag our pubkey
        //    - Extract sender pubkeys from last N events or last 24h
        //    - This prevents raw pubkey display in notifications
        // 4. Users who have interacted with us recently (replies, zaps)

        let relevantPubkeys = gatherRelevantPubkeys()

        for pubkey in relevantPubkeys {
            guard let profile = try? ndb.lookup_profile_and_copy(pubkey) else {
                continue
            }
            profiles[pubkey.hex()] = ProfileExport(
                name: profile.name,
                displayName: profile.display_name,
                picture: profile.picture,
                nip05: profile.nip05
            )
        }

        return profiles
    }

    /// Gather all pubkeys whose profiles should be exported
    private func gatherRelevantPubkeys() -> Set<Pubkey> {
        var pubkeys = Set<Pubkey>()

        // 1. All followed users
        pubkeys.formUnion(contacts.get_friend_list())

        // 2. Muted users (for display purposes)
        for item in mutelistManager.users {
            if case .user(let pk, _) = item {
                pubkeys.insert(pk)
            }
        }

        // 3. Recent notification senders (prevents raw pubkey in notifications)
        // Query events that mention our pubkey from the last 24 hours
        let recentMentionSenders = queryRecentMentionSenders(
            ourPubkey: keypair.pubkey,
            since: Date().addingTimeInterval(-24 * 60 * 60)
        )
        pubkeys.formUnion(recentMentionSenders)

        return pubkeys
    }

    /// Query pubkeys of users who mentioned us recently
    ///
    /// Performance expectations:
    /// - Uses NDB_DB_NOTE_TAGS index (indexed by tag type + value)
    /// - Query: filter by "p" tag = ourPubkey, created_at > since
    /// - Expected: O(log n) index lookup + O(k) scan where k = matching events
    /// - Typical k: 100-1000 events in 24h for active users
    /// - Estimated time: <100ms for most users
    ///
    /// Data source: On-disk LMDB index, not in-memory cache.
    /// This runs during export (background thread), not in extension.
    private func queryRecentMentionSenders(ourPubkey: Pubkey, since: Date) -> [Pubkey] {
        let filter = NostrFilter(
            pubkeys: [ourPubkey],  // Events tagging us
            since: UInt32(since.timeIntervalSince1970),
            limit: 1000  // Cap to prevent runaway queries
        )

        guard let noteKeys = try? ndb.query(filters: [filter], maxResults: 1000) else {
            return []
        }

        var senders = Set<Pubkey>()
        for noteKey in noteKeys {
            try? ndb.lookup_note_by_key(noteKey) { note in
                if let note {
                    senders.insert(note.pubkey)
                }
            }
        }

        return Array(senders)
    }

    private func gatherMuteList() -> MuteListExport {
        // Use compactMap to gracefully skip malformed items instead of crashing
        return MuteListExport(
            users: mutelistManager.users.compactMap { item in
                guard case .user(let pk, let exp) = item else {
                    Log.warning("Unexpected mute item type in users set", for: .storage)
                    return nil
                }
                return MuteItemExport(value: pk.hex(), expiration: exp)
            },
            hashtags: mutelistManager.hashtags.compactMap { item in
                guard case .hashtag(let tag, let exp) = item else {
                    Log.warning("Unexpected mute item type in hashtags set", for: .storage)
                    return nil
                }
                return MuteItemExport(value: tag.hashtag, expiration: exp)
            },
            words: mutelistManager.words.compactMap { item in
                guard case .word(let word, let exp) = item else {
                    Log.warning("Unexpected mute item type in words set", for: .storage)
                    return nil
                }
                return MuteItemExport(value: word, expiration: exp)
            },
            threads: mutelistManager.threads.compactMap { item in
                guard case .thread(let noteId, let exp) = item else {
                    Log.warning("Unexpected mute item type in threads set", for: .storage)
                    return nil
                }
                return MuteItemExport(value: noteId.hex(), expiration: exp)
            }
        )
    }

    private func gatherContacts() -> [String] {
        // Return sorted array for deterministic JSON output
        return contacts.get_friend_list().map { $0.hex() }.sorted()
    }
}

Extension Reader (ExtensionDataReader.swift)

/// Reads exported extension data in notification service
struct ExtensionDataReader {
    private let data: ExtensionDataExport?
    private let contactsSet: Set<String>  // built from the exported array for O(1) lookups

    init() {
        guard let url = Self.dataURL,
              let jsonData = try? Data(contentsOf: url)
        else {
            self.data = nil
            self.contactsSet = []
            return
        }

        let decoder = JSONDecoder()
        decoder.dateDecodingStrategy = .iso8601  // Must match encoder

        guard let decoded = try? decoder.decode(ExtensionDataExport.self, from: jsonData) else {
            self.data = nil
            self.contactsSet = []
            return
        }
        self.data = decoded
        self.contactsSet = Set(decoded.contacts)  // Convert to Set for O(1) lookup
    }

    private static let dataURL: URL? = {
        FileManager.default
            .containerURL(forSecurityApplicationGroupIdentifier: "group.com.damus")?
            .appendingPathComponent("extension_data.json")
    }()

    func profile(for pubkey: String) -> ProfileExport? {
        return data?.profiles[pubkey]
    }

    func isUserMuted(_ pubkey: String) -> Bool {
        guard let muteList = data?.muteList else { return false }
        return muteList.users.contains { item in
            item.value == pubkey && !item.isExpired
        }
    }

    func isFollowing(_ pubkey: String) -> Bool {
        return contactsSet.contains(pubkey)
    }

    func isWordMuted(_ content: String) -> Bool {
        guard let muteList = data?.muteList else { return false }
        let lowercased = content.lowercased()
        return muteList.words.contains { item in
            !item.isExpired && lowercased.contains(item.value.lowercased())
        }
    }
}

extension MuteItemExport {
    var isExpired: Bool {
        guard let exp = expiration else { return false }
        return exp < Date()
    }
}

Integration (DatabaseSnapshotManager.swift changes)

actor DatabaseSnapshotManager {
    private let ndb: Ndb
    private let exporter: ExtensionDataExporter

    func performSnapshot() async throws {
        // Replace full DB snapshot with JSON export
        try await exporter.exportData()

        UserDefaults.standard.set(Date(), forKey: Self.lastSnapshotDateKey)
        Log.info("Extension data export completed successfully", for: .storage)
    }
}

Tradeoffs

Aspect Assessment
Storage Savings 90-95% reduction
Implementation Complexity LOW
Code Maintainability Easy - Pure Swift, Codable
Risk Level Low - Well-understood format
Performance Impact Good for typical sizes
Extension Code Changes Moderate - New reader class
Testing Complexity Low - Easy to unit test

Benefits

  • Simplest implementation - Pure Swift, Codable
  • Human-readable for debugging
  • Easy to version and migrate
  • No database dependencies in extensions
  • Atomic writes with .atomic option
  • Easy to test (just JSON files)

Risks

  • O(n) word mute checking (mitigated by small list size)
  • Memory pressure with very large datasets (unlikely)
  • Must reload file if it changes (extensions are short-lived anyway)
  • No query capability (but not needed)

Design Consideration: Notification Sender Profiles

Problem: If a non-followed user mentions you, the notification would show a raw pubkey instead of their display name if we only export contacts + muted users.

Solution: Include recent notification senders in the profile export:

  1. Query events that tag our pubkey from the last 24 hours
  2. Extract unique sender pubkeys
  3. Include their profiles in the export

Tradeoff:

  • Adds ~100-500 more profiles to export (estimated)
  • Increases export time slightly
  • Prevents ugly raw pubkey display in notifications

Alternative: If querying recent mentions is expensive:

  • Fall back to truncated pubkey display (e.g., npub1abc...xyz)
  • This is acceptable UX for rare cases of mentions from unknown users
  • Can be implemented as progressive enhancement later
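The fallback in the first bullet is cheap to implement (a sketch; the prefix/suffix lengths are arbitrary choices):

```swift
/// Truncate a bech32 npub for display, e.g. "npub1abc...xyz".
func truncatedNpub(_ npub: String, head: Int = 8, tail: Int = 3) -> String {
    guard npub.count > head + tail + 3 else { return npub }
    return String(npub.prefix(head)) + "..." + String(npub.suffix(tail))
}
```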

Performance Analysis

Typical data sizes:
- Profiles: 1,000-5,000 users × ~200 bytes = 200KB-1MB
- Contacts: 500-2,000 pubkeys × 64 bytes = 32KB-128KB
- Mute list: <1,000 items × ~100 bytes = <100KB
- Total: ~500KB-1.5MB (vs. 50-200MB full database)

Load time: <50ms for 1MB JSON
Lookup time: O(1) dictionary access for profiles
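The O(1) lookup figure follows from decoding the profiles once at extension startup into a dictionary keyed by pubkey. A minimal sketch of that reader side (type and property names are assumptions, not the final API):

```swift
import Foundation

struct ExportedProfile: Codable {
    let name: String
    let picture: String?
}

struct ExtensionProfileStore {
    private let profiles: [String: ExportedProfile]  // pubkey (hex) -> profile

    init(jsonData: Data) throws {
        profiles = try JSONDecoder().decode([String: ExportedProfile].self, from: jsonData)
    }

    // Plain dictionary access: O(1) average, no database handle required.
    func displayName(for pubkey: String) -> String? {
        profiles[pubkey]?.name
    }
}
```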

Complexity Breakdown

  • Data structures: ~60 lines
  • Exporter: ~100 lines
  • Reader: ~80 lines
  • Integration: ~20 lines
  • Tests: ~150 lines
  • Total: ~400 lines

Verdict

RECOMMENDED - Best complexity/benefit ratio.


Approach 4: Incremental/Delta Snapshots

Beads: optimize-storage-ndb-snapshots-ibb

Concept

Only copy data that changed since the last snapshot, reducing I/O and storage churn.

Implementation Challenges

The Core Problem

LMDB doesn't expose page-level change tracking. To implement deltas:

```swift
// Would need to track changes at application level
class ChangeTracker {
    var modifiedProfileKeys: Set<ProfileKey> = []
    var modifiedNoteKeys: Set<NoteKey> = []
    var lastSnapshotVersion: UInt64 = 0

    // Hook into every write operation
    func onProfileWrite(_ key: ProfileKey) {
        modifiedProfileKeys.insert(key)
    }
}
```

Delta Application Logic

```swift
func applyDelta(basePath: String, deltaPath: String) throws {
    // 1. Read delta file
    // 2. Open base database
    // 3. Apply each change
    // 4. Handle deletions
    // 5. Verify consistency
    // This is complex and error-prone
}
```
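To make the complexity tangible: even the delta file format itself must carry versioning, ordering, and explicit deletions before any application logic exists. A hypothetical record shape (not an existing nostrdb format):

```swift
import Foundation

// Every change since the last snapshot must be captured explicitly,
// including deletions, and replayed in order against the base copy.
enum DeltaOp: String, Codable {
    case upsert, delete
}

struct DeltaRecord: Codable {
    let op: DeltaOp
    let table: String   // e.g. "profile", "note"
    let key: Data       // raw LMDB key
    let value: Data?    // nil for deletions
}

struct DeltaFile: Codable {
    let baseVersion: UInt64   // snapshot version this delta applies to
    let records: [DeltaRecord]
}
```

If `baseVersion` ever drifts out of sync with the base copy in the shared container, the only safe recovery is a full snapshot, which erodes the approach's savings.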

Tradeoffs

| Aspect | Assessment |
| --- | --- |
| Storage Savings | Variable (0-80% depending on churn) |
| Implementation Complexity | VERY HIGH |
| Code Maintainability | Hard - Complex state management |
| Risk Level | High - Consistency hard to verify |
| Performance Impact | Variable |
| Extension Code Changes | Significant |
| Testing Complexity | Very High |

Benefits

  • Minimal I/O per snapshot
  • Reduced battery impact
  • Could enable more frequent updates

Risks

  • Extremely complex to implement correctly
  • Corruption recovery is difficult
  • LMDB doesn't support this natively
  • Must track every write operation
  • Merge conflicts possible
  • Hard to verify correctness

Complexity Breakdown

  • Change tracking: ~500 lines
  • Delta generation: ~400 lines
  • Delta application: ~400 lines
  • Recovery logic: ~300 lines
  • Tests: ~800 lines
  • Total: ~2,400+ lines

Verdict

NOT RECOMMENDED - Complexity far exceeds benefits.


Approach 5: Compressed Snapshots

Beads: optimize-storage-ndb-snapshots-i3j

Concept

Compress the snapshot data before writing to shared container. Extensions decompress before use.

Implementation

Compression (DatabaseSnapshotManager.swift)

```swift
func performCompressedSnapshot() async throws {
    guard let snapshotPath = Ndb.snapshot_db_path,
          let compressedPath = Ndb.compressed_snapshot_path else {
        throw SnapshotError.pathsUnavailable
    }

    // 1. Create regular snapshot to temp location
    let tempPath = snapshotPath + ".tmp"
    try ndb.snapshot(path: tempPath)

    // 2. Compress the data.mdb file
    let dataPath = tempPath + "/data.mdb"
    let data = try Data(contentsOf: URL(fileURLWithPath: dataPath))
    let compressed = try (data as NSData).compressed(using: .lzfse)

    // 3. Write compressed data
    try compressed.write(to: URL(fileURLWithPath: compressedPath))

    // 4. Cleanup temp
    try FileManager.default.removeItem(atPath: tempPath)
}
```

Decompression in Extension

```swift
struct NotificationExtensionState {
    let ndb: Ndb

    init?() {
        // Decompress before opening
        guard let compressedPath = Ndb.compressed_snapshot_path,
              let snapshotPath = Ndb.snapshot_db_path else {
            return nil
        }

        guard let compressedData = try? Data(contentsOf: URL(fileURLWithPath: compressedPath)),
              let decompressed = try? (compressedData as NSData).decompressed(using: .lzfse) else {
            return nil
        }
        do {
            try (decompressed as Data).write(to: URL(fileURLWithPath: snapshotPath + "/data.mdb"))
        } catch {
            return nil
        }

        // Now open LMDB
        guard let ndb = Ndb(owns_db_file: false) else { return nil }
        self.ndb = ndb
    }
}
```

Tradeoffs

| Aspect | Assessment |
| --- | --- |
| Storage Savings | 50-70% |
| Implementation Complexity | MEDIUM |
| Code Maintainability | Easy |
| Risk Level | Low |
| Performance Impact | Negative - CPU overhead |
| Extension Code Changes | Moderate |
| Testing Complexity | Low |

Benefits

  • Works with existing full-copy approach
  • Minimal code changes
  • Well-understood compression algorithms

Risks

  • Still copying entire database (just compressed)
  • CPU overhead on both compress and decompress
  • Extensions must decompress before LMDB can open
  • Adds latency to extension startup
  • LMDB can't memory-map compressed files

Performance Impact

Compression (LZFSE):
- 100MB database → ~40MB compressed
- Compression time: ~2-5 seconds
- Decompression time: ~1-2 seconds

Extension startup impact: +1-2 seconds latency

Verdict

Not ideal - Doesn't address root cause, adds CPU overhead.


Comparison Summary

| Approach | Storage | Complexity | Risk | nostrdb Changes | Recommended |
| --- | --- | --- | --- | --- | --- |
| 1. Selective LMDB | 70-80% | High | Medium | Yes (C) | Consider |
| 2. Separate DB | 85-95% | Medium-High | Medium | No | Good option |
| 3. JSON Export | 90-95% | Low | Low | No | YES |
| 4. Incremental | Variable | Very High | High | Yes | No |
| 5. Compressed | 50-70% | Medium | Low | No | No |

Recommendation

Primary: Approach 3 (JSON Export)

Rationale:

  1. Simplicity - ~400 lines of pure Swift vs 2000+ for other approaches
  2. Safety - Codable, atomic writes, easy to debug
  3. Maximum savings - 90-95% storage reduction
  4. No nostrdb changes - Lower risk, easier review
  5. Testability - Simple unit tests, no database mocking needed

Fallback: Approach 2 (Separate SQLite DB)

If JSON performance becomes an issue with very large datasets (>10,000 profiles), upgrade to SQLite while keeping the same data model.
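Because the fallback keeps the same data model, the migration is mostly a schema definition. A sketch of what that schema could look like (table and column names are assumptions, not a committed design):

```swift
// Same data model as the JSON export, expressed as SQLite DDL.
// Indexed pubkey lookups stay fast well past 10,000 profiles.
let extensionSchema = """
CREATE TABLE IF NOT EXISTS profiles (
    pubkey  TEXT PRIMARY KEY,   -- hex-encoded
    name    TEXT,
    picture TEXT
);
CREATE TABLE IF NOT EXISTS contacts (
    pubkey TEXT PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS mute_items (
    kind  TEXT NOT NULL,        -- 'pubkey' | 'word' | 'hashtag' | 'thread'
    value TEXT NOT NULL,
    PRIMARY KEY (kind, value)
);
"""
```

Extensions would open the database read-only, so the same single-writer (main app) / many-readers split as the JSON approach applies.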

Not Recommended

  • Approach 4 (Incremental) - Complexity not justified
  • Approach 5 (Compressed) - Doesn't solve root problem

Next Steps

  1. Review and approve approach
  2. Implement ExtensionDataExporter
  3. Implement ExtensionDataReader
  4. Update NotificationExtensionState
  5. Remove LMDB snapshot code
  6. Add tests
  7. Measure actual storage savings
  8. Performance test with realistic data