This guide explains four powerful mathematical techniques that will differentiate ruvector from every other vector database on the market. Each solves a real problem that current databases can’t handle well.
Imagine you have two piles of sand and need to reshape one pile to match the other. Optimal transport calculates the minimum effort required to move the sand—how much mass moves, how far it travels.
For vectors, this means: how much “work” does it take to transform one distribution into another?
Traditional vector similarity (cosine, Euclidean) compares point-to-point. But what if you’re comparing:
- A document to another document (bags of word embeddings)
- An image region to another region (sets of feature vectors)
- A user’s behavior pattern to another user’s pattern
You need to compare sets of vectors, not individual vectors. Optimal transport does this naturally.
| Feature | What It Does |
|---|---|
| Wasserstein Distance | Measures “cost” to transform distribution A into B |
| Sinkhorn Algorithm | Fast entropic approximation (~1000x faster than exact EMD) |
| Sliced Wasserstein | Ultra-fast 1D projections, scales to millions |
| Gromov-Wasserstein | Compares structures without shared space |
Cross-Lingual Search
“Find French documents similar to this English document”
Problem: French and English embeddings live in different spaces. Solution: Gromov-Wasserstein compares structure, not coordinates.
Image Retrieval
“Find product images similar to this photo”
Each image = set of regional features. OT finds minimum-cost matching between regions.
Time Series Matching
“Find similar user behavior patterns”
User sessions = sequences of action embeddings. OT handles variable-length, unordered comparisons.
Document Similarity
“Compare these two legal contracts”
Documents as bags of sentence embeddings. OT captures semantic overlap better than averaging.
- Pinecone: Point vectors only
- Weaviate: Point vectors only
- Milvus: Point vectors only
- ruvector: Set-to-set similarity via optimal transport ✓
- Exact EMD: O(n³) → 1M points = hours
- Sinkhorn: O(n²) → 1M points = minutes
- Sliced Wasserstein: O(n log n) → 1M points = seconds ✓ (sketched below)
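To make the scaling argument concrete, the sliced estimator is short enough to show in full. A minimal sketch, assuming the rand and rand_distr crates, equal-size point sets, and illustrative names (this is not ruvector's API):

```rust
use rand::Rng;
use rand_distr::StandardNormal;

// Sliced Wasserstein-1 between two equal-size point sets: project onto
// random directions, sort, and average |x_(i) - y_(i)|, the closed-form
// optimal transport on a line.
fn sliced_w1(a: &[Vec<f32>], b: &[Vec<f32>], n_proj: usize) -> f32 {
    fn project(set: &[Vec<f32>], theta: &[f32]) -> Vec<f32> {
        let mut p: Vec<f32> = set.iter()
            .map(|v| v.iter().zip(theta).map(|(x, t)| x * t).sum())
            .collect();
        p.sort_by(|x, y| x.partial_cmp(y).unwrap());
        p
    }
    assert_eq!(a.len(), b.len(), "this sketch assumes equal-size sets");
    let dim = a[0].len();
    let mut rng = rand::thread_rng();
    let mut total = 0.0;
    for _ in 0..n_proj {
        // Uniform random direction = normalized Gaussian sample.
        let mut theta: Vec<f32> = (0..dim).map(|_| rng.sample(StandardNormal)).collect();
        let norm = theta.iter().map(|x| x * x).sum::<f32>().sqrt();
        theta.iter_mut().for_each(|x| *x /= norm);
        // Project, sort, pair order statistics: O(n log n) per slice.
        let (pa, pb) = (project(a, &theta), project(b, &theta));
        total += pa.iter().zip(&pb).map(|(x, y)| (x - y).abs()).sum::<f32>() / pa.len() as f32;
    }
    total / n_proj as f32
}
```

Each projection costs one sort; squaring the pairwise differences instead of taking absolute values gives sliced W₂ at the same complexity.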
The world isn’t flat. Neither is your data.
- Euclidean space (flat): Good for grid-like data, images, general embeddings
- Hyperbolic space (saddle-shaped): Perfect for hierarchies, trees, taxonomies
- Spherical space (ball-shaped): Ideal for cyclical data, periodic patterns
Mixed-curvature combines all three: H^n × S^m × E^p
Real data has mixed structure:
- Organizational charts (hierarchical) + skills (flat) + quarterly cycles (periodic)
- Product categories (tree) + features (flat) + seasonal trends (cyclical)
- Knowledge graphs (hierarchical) + entity types (flat) + temporal patterns (cyclical)
Forcing everything into Euclidean space wastes dimensions and distorts relationships.
| Feature | What It Does |
|---|---|
| Poincaré Ball | Hyperbolic space for tree structures |
| Lorentz Model | Numerically stable hyperbolic (better for deep hierarchies) |
| Spherical Embeddings | Cyclical/periodic pattern representation |
| Product Manifolds | Combine spaces: 10D hyperbolic + 5D spherical + 100D Euclidean |
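For intuition, the Poincaré-ball distance is a single closed-form expression: d(x, y) = arcosh(1 + 2‖x−y‖² / ((1−‖x‖²)(1−‖y‖²))). A minimal sketch over plain slices (illustrative; production kernels clamp more carefully near the boundary):

```rust
// Poincaré ball distance (unit ball, curvature -1). Assumes both points
// lie strictly inside the ball, so the arcosh argument is >= 1.
fn poincare_distance(x: &[f64], y: &[f64]) -> f64 {
    let sq = |v: &[f64]| v.iter().map(|a| a * a).sum::<f64>();
    let diff: Vec<f64> = x.iter().zip(y).map(|(a, b)| a - b).collect();
    let num = 2.0 * sq(&diff);
    let den = (1.0 - sq(x)) * (1.0 - sq(y));
    (1.0 + num / den.max(1e-15)).acosh()
}
```

The shrinking denominator near the boundary is exactly what gives trees exponential room: distances blow up as points approach the rim, so leaves can fan out without crowding.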
E-commerce Product Search
Product taxonomy (categories → subcategories → items) is a tree. Hyperbolic embeddings need 5 dimensions where Euclidean needs 100+ for the same quality. Result: 20x memory reduction and better semantic accuracy.
Organizational Knowledge Graphs
Company hierarchy + skills database + project timelines. Mixed-curvature captures all three structures in a unified space.
Biomedical Ontologies
Gene ontology, disease taxonomies, drug hierarchies. Hyperbolic space preserves parent-child relationships with 32% less distortion.
Social Network Analysis
Follower hierarchies (hyperbolic) + interest communities (Euclidean) + activity cycles (spherical)
ruvector v0.1.96 already has:
- ✅ Poincaré ball operations
- ✅ Lorentz model
- ✅ Hyperbolic attention mechanisms
- ✅ Dual-space attention (Euclidean + Hyperbolic)
Gap to fill: Product manifolds (H^n × S^m × E^p) for heterogeneous data.
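Most of that gap is bookkeeping once per-component distances exist: slice each embedding into its hyperbolic, spherical, and Euclidean blocks and combine the per-block geodesic distances. A hedged sketch (block layout and names are assumptions; `poincare_distance` is the function from the earlier sketch):

```rust
// Distance on H^h × S^s × E^e: slice each embedding into blocks, compute
// the geodesic distance per block, combine with an L2 norm.
fn spherical_distance(x: &[f64], y: &[f64]) -> f64 {
    // Geodesic on the unit sphere = angle between unit vectors.
    let dot: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    dot.clamp(-1.0, 1.0).acos()
}

fn euclidean_distance(x: &[f64], y: &[f64]) -> f64 {
    x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum::<f64>().sqrt()
}

fn product_distance(x: &[f64], y: &[f64], h: usize, s: usize) -> f64 {
    let dh = poincare_distance(&x[..h], &y[..h]);   // hyperbolic block
    let ds = spherical_distance(&x[h..h + s], &y[h..h + s]); // spherical block
    let de = euclidean_distance(&x[h + s..], &y[h + s..]);   // Euclidean block
    (dh * dh + ds * ds + de * de).sqrt()
}
```

Learnable curvature would add a per-block scale factor; the combination rule stays the same.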
| Data Type | Euclidean Dims | Hyperbolic Dims | Memory Savings |
|---|---|---|---|
| WordNet hierarchy | 200 | 10 | 20x |
| Organizational tree | 128 | 8 | 16x |
| Product taxonomy | 256 | 16 | 16x |
TDA answers: What is the “shape” of my data?
- How many clusters are there? (0-dimensional holes)
- Are there loops or cycles? (1-dimensional holes)
- Are there voids or cavities? (2-dimensional holes)
Unlike clustering algorithms that require you to specify k, TDA discovers structure automatically.
Traditional clustering:
- Requires specifying number of clusters upfront
- Sensitive to noise and outliers
- Misses complex shapes (rings, spirals, Swiss rolls)
TDA provides parameter-free shape analysis that’s robust to noise.
| Feature | What It Does |
|---|---|
| Persistent Homology | Finds features that “persist” across scales |
| Betti Numbers | Counts holes in each dimension (β₀=clusters, β₁=loops, β₂=voids) |
| Persistence Diagrams | Visualizes feature birth/death across scales |
| Mapper Algorithm | Creates interpretable network summary of high-D data |
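Zero-dimensional persistence is compact enough to sketch outright: every point is born as its own component at scale 0, and a component dies when a distance-sorted edge merges it into another (essentially Kruskal's MST algorithm). Illustrative only, not phlite's or Ripser's API:

```rust
// 0-dimensional persistence of a point cloud: each merge event in a
// union-find over distance-sorted edges is the "death" of a component.
// Long-lived bars suggest real clusters; short bars are noise.
fn persistence_0d(points: &[Vec<f32>]) -> Vec<f32> {
    fn find(parent: &mut Vec<usize>, i: usize) -> usize {
        if parent[i] != i {
            let p = parent[i];
            let root = find(parent, p);
            parent[i] = root;
        }
        parent[i]
    }
    let n = points.len();
    let mut edges: Vec<(f32, usize, usize)> = Vec::new();
    for i in 0..n {
        for j in (i + 1)..n {
            let d = points[i].iter().zip(&points[j])
                .map(|(a, b)| (a - b) * (a - b)).sum::<f32>().sqrt();
            edges.push((d, i, j));
        }
    }
    edges.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());

    let mut parent: Vec<usize> = (0..n).collect();
    let mut deaths = Vec::new(); // every component is born at scale 0
    for (d, i, j) in edges {
        let (ri, rj) = (find(&mut parent, i), find(&mut parent, j));
        if ri != rj {
            parent[ri] = rj;
            deaths.push(d); // one component dies at this scale
        }
    }
    deaths
}
```

β₀ at scale t is then n minus the number of deaths ≤ t, and a wide gap between long-lived and short-lived bars is what separates "real" clusters from threshold artifacts.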
Embedding Quality Validation
“Are my embeddings actually forming meaningful clusters?”
TDA reveals if your embedding space has the right topology—before you deploy.
Anomaly Detection
Normal data forms specific shapes. Anomalies break the shape. TDA detects structural anomalies invisible to distance-based methods.
Coverage Analysis
“Does my training data cover the problem space?”
TDA identifies holes in coverage—regions where you have no training examples.
Semantic Cluster Discovery
Without specifying k, discover natural groupings in document embeddings. Persistence tells you which clusters are “real” vs artifacts of threshold choice.
Vectors → Filter Function → Covering → Clustering → Network Graph
↓ ↓ ↓ ↓ ↓
Your data PCA/density Overlapping Per-bin Interpretable
projection intervals clusters visualization
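A toy end-to-end version of that pipeline fits in one function. Everything here is deliberately simplified and hypothetical: the filter is the first coordinate (standing in for a PCA or density projection), and per-bin clustering is single-linkage at a fixed threshold:

```rust
use std::collections::HashMap;

// Toy Mapper: 1D filter -> overlapping intervals -> per-bin single-linkage
// clusters -> graph nodes, with edges wherever two clusters share a point.
fn mapper_graph(
    points: &[Vec<f32>],
    n_bins: usize,
    overlap: f32, // fraction of bin width added on each side, e.g. 0.25
    eps: f32,     // single-linkage distance threshold inside a bin
) -> (Vec<Vec<usize>>, Vec<(usize, usize)>) {
    fn find(parent: &mut Vec<usize>, i: usize) -> usize {
        if parent[i] != i {
            let p = parent[i];
            let root = find(parent, p);
            parent[i] = root;
        }
        parent[i]
    }
    let dist = |i: usize, j: usize| -> f32 {
        points[i].iter().zip(&points[j])
            .map(|(a, b)| (a - b) * (a - b)).sum::<f32>().sqrt()
    };
    let filter: Vec<f32> = points.iter().map(|p| p[0]).collect();
    let lo = filter.iter().cloned().fold(f32::INFINITY, f32::min);
    let hi = filter.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let w = (hi - lo) / n_bins as f32;

    let mut nodes: Vec<Vec<usize>> = Vec::new(); // node = member point indices
    for b in 0..n_bins {
        let start = lo + (b as f32 - overlap) * w;
        let end = lo + (b as f32 + 1.0 + overlap) * w;
        let members: Vec<usize> =
            (0..points.len()).filter(|&i| filter[i] >= start && filter[i] <= end).collect();
        // Union-find single linkage inside the bin.
        let mut parent: Vec<usize> = (0..members.len()).collect();
        for i in 0..members.len() {
            for j in (i + 1)..members.len() {
                if dist(members[i], members[j]) <= eps {
                    let (ri, rj) = (find(&mut parent, i), find(&mut parent, j));
                    parent[ri] = rj;
                }
            }
        }
        // Materialize one graph node per within-bin cluster.
        let mut clusters: HashMap<usize, Vec<usize>> = HashMap::new();
        for i in 0..members.len() {
            let r = find(&mut parent, i);
            clusters.entry(r).or_default().push(members[i]);
        }
        nodes.extend(clusters.into_values());
    }
    // Edge between nodes that share a point (possible because bins overlap).
    let mut edges = Vec::new();
    for a in 0..nodes.len() {
        for b in (a + 1)..nodes.len() {
            if nodes[a].iter().any(|i| nodes[b].contains(i)) {
                edges.push((a, b));
            }
        }
    }
    (nodes, edges)
}
```

Production Mapper implementations swap in better filters, adaptive covers, and real clustering, but the skeleton (cover, cluster, glue on overlaps) is exactly this.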
Zilliz (Milvus creators) published research showing TDA improves embedding model selection:
- Identifies which embedding model best preserves data structure
- Reveals training gaps before production deployment
- Quantifies embedding quality beyond simple benchmarks
No vector database offers built-in TDA. This enables:
- `npx ruvector tda analyze ./my-vectors.db` → Automatic shape report
- `npx ruvector tda validate` → Embedding quality score
- `npx ruvector tda coverage` → Training gap identification
Information geometry treats probability distributions as points on a curved surface.
The key insight: when optimizing ML models, the “natural” direction isn’t always the steepest descent—it’s the direction that accounts for the curvature of the probability landscape.
Standard gradient descent treats all parameter directions equally. But parameters aren’t equal:
- Some directions are “steep” (small changes = big effects)
- Some directions are “flat” (large changes = small effects)
Natural gradient rescales updates based on information geometry, converging faster and more stably.
| Feature | What It Does |
|---|---|
| Fisher Information Matrix | Measures “curvature” of probability space |
| Natural Gradient Descent | Gradient rescaled by inverse Fisher matrix |
| K-FAC | Efficient approximation (makes natural gradient practical) |
| α-Divergences | Generalized family including KL divergence |
Faster Index Training
HNSW and GNN-enhanced indexes require optimization. Natural gradient converges in 3-5x fewer iterations than Adam.
Embedding Fine-Tuning
Fine-tuning embedding models with natural gradient:
- Faster convergence
- Better generalization
- More stable training
Online Learning
As your vector DB learns from queries, natural gradient provides:
- Efficient incremental updates
- Stable continual learning
- Better catastrophic forgetting resistance
Probabilistic Embeddings
Instead of point vectors, represent uncertainty:
- Each embedding = Gaussian distribution
- Fisher-Rao distance measures distributional similarity (closed form sketched below)
- Captures “I’m not sure about this entity” naturally
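For univariate Gaussian embeddings this is fully concrete: under the Fisher metric, (μ, σ) space is a hyperbolic half-plane, and the Fisher-Rao distance has a closed form (conventions differ by constant factors across references):

```rust
// Fisher-Rao distance between N(mu1, s1²) and N(mu2, s2²). Under the
// Fisher metric, (μ/√2, σ) lives in the hyperbolic half-plane, so the
// distance is a scaled half-plane geodesic length.
fn fisher_rao_gaussian(mu1: f64, s1: f64, mu2: f64, s2: f64) -> f64 {
    let num = (mu1 - mu2).powi(2) / 2.0 + (s1 - s2).powi(2);
    (2.0_f64).sqrt() * (1.0 + num / (2.0 * s1 * s2)).acosh()
}
```

Two entities with the same mean but very different variances land far apart; uncertainty becomes part of the geometry rather than a separate score.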
Information geometry is mathematically sophisticated but practically powerful. No Rust implementation exists. Building it into ruvector means:
- Faster self-learning index optimization
- Better embedding fine-tuning for RAG
- Unique capability no competitor offers
| Capability | Pinecone | Weaviate | Milvus | Chroma | ruvector |
|---|---|---|---|---|---|
| Euclidean/Cosine | ✅ | ✅ | ✅ | ✅ | ✅ |
| Hyperbolic Space | ❌ | ❌ | ❌ | ❌ | ✅ |
| Mixed-Curvature | ❌ | ❌ | ❌ | ❌ | 🔜 |
| Optimal Transport | ❌ | ❌ | ❌ | ❌ | 🔜 |
| Persistent Homology | ❌ | ❌ | ❌ | ❌ | 🔜 |
| Natural Gradient | ❌ | ❌ | ❌ | ❌ | 🔜 |
| GNN-Enhanced Index | ❌ | ❌ | ❌ | ❌ | ✅ |
| WASM Browser | ❌ | ❌ | ❌ | ✅ | ✅ |
- Sliced Wasserstein for fast set-to-set similarity
- Log-stabilized Sinkhorn for accuracy when needed
- CLI: `npx ruvector transport distance`
- Extend existing hyperbolic to full product spaces
- Learnable curvature parameters
- CLI: `npx ruvector create --geometry "hyperbolic:10,spherical:5,euclidean:100"`
- Persistent homology with apparent pairs optimization
- Mapper algorithm for exploration
- CLI: `npx ruvector tda analyze`, `npx ruvector tda validate`
- Fisher information computation via autodiff
- Natural gradient optimizer for index training
- CLI: `npx ruvector train --optimizer natural-gradient`
These four algorithms transform ruvector from “another vector database” to “the vector database that understands data shape”:
| Algorithm | One-Line Value Prop |
|---|---|
| Optimal Transport | Compare sets of vectors, not just points |
| Mixed-Curvature | 10-20x memory savings on hierarchical data |
| TDA | Parameter-free shape discovery and validation |
| Information Geometry | 3-5x faster index optimization |
Together, they enable use cases no other vector database can touch: cross-lingual retrieval, topology-aware clustering, hierarchical knowledge graphs, and self-improving indexes.
The Rust/WASM implementation means these capabilities run everywhere—server, edge, browser—with native performance.
The Rust ecosystem for advanced mathematical algorithms required by modern vector databases reveals a fragmented but maturing landscape.
Optimal transport, persistent homology, and information geometry implementations are nascent, while non-Euclidean geometry shows the most promise through production-ready implementations like ruvector. Custom development will be required for most production deployments, but WASM-compatible foundations exist across all domains.
| Domain | Maturity | Production-Ready? | WASM-Compatible | Recommended Path |
|---|---|---|---|---|
| Optimal Transport | ★☆☆☆☆ | No | Blocked (BLAS deps) | Custom implementation |
| Manifold/Hyperbolic | ★★★☆☆ | Partial (ruvector) | Yes | Extend ruvector |
| TDA/Persistent Homology | ★★☆☆☆ | No | Possible (phlite) | Port Ripser or extend phlite |
| Information Geometry | ★☆☆☆☆ | No | Yes (via burn) | Build new crate |
The critical insight: no single Rust crate provides production-grade implementations for these algorithms with WASM compatibility. However, foundational linear algebra (nalgebra, ndarray) and autodiff (burn) are mature and WASM-ready, enabling custom implementations.
The Rust optimal transport ecosystem centers on `rust-optimal-transport`, a stale crate (last updated 3+ years ago) with fundamental WASM blockers. Only basic Sinkhorn and EMD are implemented—**no Gromov-Wasserstein, Sliced Wasserstein, or differentiable variants exist**.
| Crate | Algorithms | WASM? | Status |
|---|---|---|---|
| `rust-optimal-transport` | EMD, Sinkhorn, Unbalanced | ❌ Blocked | Inactive (3+ years) |
| `optimal-transport-rs` | Minimal | Unknown | Not recommended |
| `rust-emd` | EMD only | ❌ C++ wrapper | Limited |
The primary blocker for WASM compilation is dependency on ndarray-linalg and C++ FFI for the network simplex algorithm. The EMD solver wraps C++ code via cxx, which cannot compile to wasm32-unknown-unknown.
Python’s POT library provides 40+ algorithms absent from Rust: Gromov-Wasserstein (critical for cross-domain similarity), Sliced Wasserstein (near-linear scaling), Low-rank Sinkhorn (memory-efficient), Wasserstein barycenters, and partial optimal transport. For vector database integration, Sliced Wasserstein is particularly important—it provides O(n log n) complexity versus O(n³) for exact EMD.
A pure-Rust implementation should use nalgebra (fully WASM-compatible) instead of ndarray-linalg, with this module structure:
```rust
pub mod optimal_transport {
    pub mod sinkhorn {
        /// Log-domain Sinkhorn with ε-regularization (stable for small ε).
        pub struct LogStabilizedSinkhorn { pub reg: f32, pub max_iter: usize }
        /// Greedy coordinate-update variant (Greenkhorn-style).
        pub struct GreedySinkhorn;
    }
    pub mod sliced {
        /// Monte Carlo sliced Wasserstein over `n_proj` random projections.
        pub fn sliced_wasserstein(a: &[f32], b: &[f32], n_proj: usize) -> f32 { todo!() }
    }
    pub mod gromov {
        use ndarray::Array2;
        /// Entropic Gromov-Wasserstein between intra-domain cost matrices.
        pub fn entropic_gw(c_a: &Array2<f32>, c_b: &Array2<f32>, reg: f32) -> Array2<f32> { todo!() }
    }
}
```

GPU acceleration via wgpu is viable for browser deployment—Sinkhorn’s matrix-vector operations map well to WebGPU compute shaders.
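The log-stabilized Sinkhorn named in the module sketch above is small enough to write out in full. A hedged reference implementation over nalgebra, assuming uniform marginals and a fixed iteration count (illustrative, not the proposed crate's API):

```rust
use nalgebra::DMatrix;

// Log-domain Sinkhorn: iterate the dual potentials f, g with log-sum-exp
// so exp((g_j - C_ij)/ε) never overflows, even for small ε. Returns the
// entropic OT cost <P, C> for uniform marginals.
fn sinkhorn_log(cost: &DMatrix<f64>, eps: f64, iters: usize) -> f64 {
    fn lse(xs: impl Iterator<Item = f64>) -> f64 {
        let v: Vec<f64> = xs.collect();
        let max = v.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        max + v.iter().map(|x| (x - max).exp()).sum::<f64>().ln()
    }
    let (n, m) = cost.shape();
    let (log_a, log_b) = (-(n as f64).ln(), -(m as f64).ln()); // ln(1/n), ln(1/m)
    let (mut f, mut g) = (vec![0.0; n], vec![0.0; m]);
    for _ in 0..iters {
        for i in 0..n {
            let s = lse((0..m).map(|j| (g[j] - cost[(i, j)]) / eps));
            f[i] = eps * (log_a - s);
        }
        for j in 0..m {
            let s = lse((0..n).map(|i| (f[i] - cost[(i, j)]) / eps));
            g[j] = eps * (log_b - s);
        }
    }
    // P_ij = exp((f_i + g_j - C_ij)/ε); accumulate <P, C>.
    let mut total = 0.0;
    for i in 0..n {
        for j in 0..m {
            total += ((f[i] + g[j] - cost[(i, j)]) / eps).exp() * cost[(i, j)];
        }
    }
    total
}
```

Vanilla Sinkhorn stores u = exp(f/ε), which underflows as ε shrinks; keeping the dual potentials f and g themselves is the entire stabilization trick.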
The ruvector project provides the most comprehensive Rust implementation for non-Euclidean geometry, including Poincaré ball and Lorentz/hyperboloid models with full WASM support. Performance benchmarks show 143ns for cosine distance on 1536-dimensional vectors and 61µs for HNSW k=10 search.
| Crate | Models | Operations | WASM |
|---|---|---|---|
| ruvector-core | Poincaré, Lorentz | expMap, logMap, Möbius addition | ✅ |
| ruvector-attention | Hyperbolic attention | 39 attention mechanisms | ✅ |
| manopt-rs | Generic Riemannian | RiemannianAdam/SGD | Likely |
No Rust equivalent exists for geoopt’s ProductManifold, which enables mixed-curvature representations (H^n × S^m × E^p). Research shows mixed-curvature embeddings achieve 32.55% distortion reduction on social network data versus single-space embeddings. This capability is essential for heterogeneous data combining hierarchical (hyperbolic), cyclical (spherical), and grid-like (Euclidean) structures.
The Poincaré ball model has a hard numerical boundary at radius ~38 in float64—points beyond this collapse to the boundary. The Lorentz/hyperboloid model provides better numerical properties for deep hierarchies. Recommendation: train in Lorentz, project to Poincaré for visualization.
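Both halves of that recommendation are nearly one-liners. A minimal sketch over plain slices, assuming points stored as (x₀, spatial…) on the hyperboloid with ⟨x,x⟩_L = −1 (real kernels batch and clamp more aggressively):

```rust
// Minkowski inner product on the Lorentz/hyperboloid model.
fn lorentz_inner(x: &[f64], y: &[f64]) -> f64 {
    -x[0] * y[0] + x[1..].iter().zip(&y[1..]).map(|(a, b)| a * b).sum::<f64>()
}

// Geodesic distance: d(x, y) = arcosh(-<x, y>_L). The clamp guards
// against inner products drifting slightly above -1 from rounding.
fn lorentz_distance(x: &[f64], y: &[f64]) -> f64 {
    (-lorentz_inner(x, y)).max(1.0).acosh()
}

// Projection to the Poincaré ball for visualization:
// (x0, spatial) on the hyperboloid maps to spatial / (1 + x0).
fn to_poincare(x: &[f64]) -> Vec<f64> {
    x[1..].iter().map(|xi| xi / (1.0 + x[0])).collect()
}
```

ruvector-core exposes richer versions of these primitives; unifying them behind crate-level traits could look like the following.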
```rust
use nalgebra::DVector;

pub trait Manifold: Send + Sync {
    fn dimension(&self) -> usize;
    fn curvature(&self) -> f64;
}

pub trait ExpLogMaps: Manifold {
    fn exp_map(&self, base: &DVector<f64>, tangent: &DVector<f64>) -> DVector<f64>;
    fn log_map(&self, base: &DVector<f64>, point: &DVector<f64>) -> DVector<f64>;
}

pub trait RiemannianOptimizable: ExpLogMaps {
    fn egrad_to_rgrad(&self, point: &DVector<f64>, egrad: &DVector<f64>) -> DVector<f64>;
    fn parallel_transport(&self, from: &DVector<f64>, to: &DVector<f64>, v: &DVector<f64>) -> DVector<f64>;
}

pub struct ProductManifold {
    components: Vec<Box<dyn RiemannianOptimizable>>,
    signature: Vec<usize>, // dimensions per component block
}
```

Rust TDA crates exist but none match Ripser’s 40x speedup over competitors. The phlite crate is most promising, implementing Ripser’s R=DV decomposition and clearing optimization, but lacks apparent pairs and emergent pairs shortcuts that provide Ripser’s dramatic performance gains.
Computing persistence up to dimension 2 on 192 points sampled from S² (roughly 56M simplices):
| Implementation | Time | Memory |
|---|---|---|
| Dionysus (C++) | 533s | 3.4 GB |
| GUDHI (C++) | 75s | 2.9 GB |
| Ripser (C++) | 1.2s | 152 MB |
| phlite (Rust) | ~10-20s* | ~500 MB* |
*Estimated based on algorithm description; no published benchmarks
| Crate | Key Features | Gaps |
|---|---|---|
| phlite | VR filtration, R=DV decomposition, clearing | No apparent/emergent pairs |
| tda | Rips, Betti numbers, Mapper, Bottleneck distance | Limited optimization |
| lophat | Parallel computation (rayon) | Beta status, no VR-specific |
| teia | VR, lower-star complex | Basic implementation |
None implement Alpha complexes, Čech complexes, Wasserstein distance for persistence diagrams, or Z/pZ coefficient fields beyond Z/2Z.
Ripser compiles to WASM via Emscripten and runs in browsers at live.ripser.org. This proves TDA is computationally feasible in WASM. For Rust, phlite should compile to wasm32-unknown-unknown given its pure-Rust implementation with nalgebra.
TDA enables topological clustering for vector spaces—Betti numbers validate cluster counts, persistence diagrams reveal shape structure, and Mapper provides interpretable dimension reduction. Integration architecture:
Vector DB → Point Cloud → Distance Matrix → Rips Complex → Persistence
↓ ↓
Nearest Neighbors (ANN) Topological Features
↓
Betti Numbers | Persistence Diagrams
↓ ↓
Cluster Validation Shape Analysis
No dedicated information geometry library exists in Rust. This is a significant gap—Fisher-Rao metrics, natural gradient descent, and α-divergences are absent from the ecosystem. Building blocks exist but are fragmented across multiple crates.
| Component | Crate | Status |
|---|---|---|
| KL divergence | ndarray-stats | ✅ Production-ready |
| Cross-entropy | ndarray-stats | ✅ Production-ready |
| Shannon entropy | ndarray-stats | ✅ Production-ready |
| Automatic differentiation | burn, candle | ✅ Excellent |
| Second-order optimization | argmin (BFGS, L-BFGS) | ✅ Mature |
| Fisher information | — | ❌ Missing |
| Natural gradient | — | ❌ Missing |
| K-FAC | — | ❌ Missing |
| α-divergences | — | ❌ Missing |
Natural Gradient Descent (NGD) uses the Fisher information matrix to precondition gradients, providing curvature-aware updates. K-FAC (Kronecker-Factored Approximate Curvature) approximates Fisher information as Kronecker products, dramatically reducing computational cost from O(n²) to O(n).
Neither exists in Rust. Implementation requires:
- Fisher information computation via autodiff (burn provides Hessians)
- Natural gradient update: θ_{t+1} = θ_t - η F^{-1} ∇L(θ_t) (see the sketch after this list)
- K-FAC block-diagonal approximation for scalability
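Before any K-FAC machinery, the core update fits in a few lines of nalgebra: estimate the empirical Fisher from score outer products, add damping, and solve F·Δ = ∇L rather than forming F⁻¹. A hedged sketch with illustrative names:

```rust
use nalgebra::{DMatrix, DVector};

// One damped natural-gradient step. `scores` are per-sample score vectors
// ∇_θ log p(x_i; θ) supplied by the caller; everything here is illustrative.
fn natural_gradient_step(
    theta: &DVector<f64>,
    grad: &DVector<f64>,     // ∇L(θ)
    scores: &[DVector<f64>], // for the empirical Fisher estimate
    lr: f64,
    damping: f64,
) -> DVector<f64> {
    let n = theta.len();
    // Empirical Fisher F ≈ (1/N) Σ s sᵀ, damped so the solve is well-posed.
    let mut fisher = DMatrix::<f64>::identity(n, n) * damping;
    for s in scores {
        fisher += s * s.transpose() / scores.len() as f64;
    }
    // Solve F Δ = ∇L instead of inverting F explicitly.
    let delta = fisher
        .cholesky()
        .expect("damped Fisher is symmetric positive definite")
        .solve(grad);
    theta - lr * delta
}
```

A crate-level API would wrap this behind traits rather than a free function; the sketch below is one possible surface.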
```rust
use burn::tensor::{backend::Backend, Tensor};
use ndarray::Array2;

pub trait Divergence<T> {
    fn divergence(&self, p: &T, q: &T) -> f64;
    fn is_symmetric(&self) -> bool { false }
}

pub trait AlphaDivergence<T>: Divergence<T> {
    fn alpha(&self) -> f64; // α=1 → KL(p||q), α=-1 → KL(q||p)
}

pub trait FisherRaoMetric<P> {
    fn fisher_information(&self, p: &P) -> Array2<f64>;
    fn geodesic_distance(&self, p: &P, q: &P) -> f64;
}

pub trait NaturalGradient<B: Backend> {
    fn compute_fisher(&self, params: &Tensor<B, 1>) -> Tensor<B, 2>;
    fn natural_step(&mut self, params: &Tensor<B, 1>, grad: &Tensor<B, 1>, lr: f64) -> Tensor<B, 1>;
}
```

WASM deployment faces consistent challenges across all domains: BLAS/LAPACK dependencies block compilation, C++ FFI is incompatible, and memory is constrained to 4GB. The solution is pure-Rust implementations using WASM-compatible linear algebra.
| Crate | WASM Support | Use Case |
|---|---|---|
| nalgebra | ✅ Full | Statically-sized matrices, decompositions |
| ndarray (no blas feature) | ✅ Full | Dynamic arrays, no decompositions |
| linfa-linalg | ✅ Full | Pure-Rust LAPACK replacement |
| burn (wgpu backend) | ✅ Full | GPU via WebGPU |
The burn framework with wgpu backend enables GPU-accelerated tensor operations in browsers via WebGPU. This is viable for:
- Sinkhorn iterations (matrix-vector products)
- Hyperbolic distance computations (batch operations)
- Fisher information estimation (gradient outer products) (see the sketch below)
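As a sketch of how little ceremony that involves, assuming a recent burn with the burn-wgpu backend (API details shift between burn releases, so treat names as approximate):

```rust
use burn::tensor::Tensor;
use burn_wgpu::Wgpu;

// Backend selection is a type parameter; the same code compiles natively
// and to wasm32, where the Wgpu backend dispatches WebGPU compute shaders.
type B = Wgpu;

// One Sinkhorn-style kernel: K v, the matrix-vector product at the heart
// of each Sinkhorn sweep, batched on the GPU.
fn scaled_coupling(kernel: Tensor<B, 2>, v: Tensor<B, 1>) -> Tensor<B, 1> {
    kernel.matmul(v.unsqueeze_dim(1)).squeeze(1)
}
```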
| Consideration | Impact | Mitigation |
|---|---|---|
| f64 vs f32 | Larger WASM, better precision | f64 for hyperbolic, f32 elsewhere |
| Threading | SharedArrayBuffer restrictions | Web Workers for parallelism |
| SIMD | WebAssembly SIMD now available | Use 128-bit SIMD intrinsics |
| Memory growth | Large filtrations may exceed limits | Streaming/chunked APIs |
┌─────────────────────────────────────────────────────────────────┐
│ SIMILARITY CONTEXT │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Euclidean │ │ Hyperbolic │ │ Optimal Transport │ │
│ │ Distance │ │ Distance │ │ (Wasserstein/Sinkhorn) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ TOPOLOGY CONTEXT │
│ ┌─────────────────────┐ ┌────────────────────────────────┐ │
│ │ Persistent Homology │ │ Topological Clustering (Mapper)│ │
│ └─────────────────────┘ └────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ GEOMETRY CONTEXT │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Poincaré │ │ Lorentz │ │ Product Manifold │ │
│ │ Ball Model │ │ Model │ │ (H^n × S^m × E^p) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ OPTIMIZATION CONTEXT │
│ ┌─────────────────────┐ ┌────────────────────────────────┐ │
│ │ Natural Gradient │ │ Riemannian Optimization │ │
│ └─────────────────────┘ └────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
| Algorithm | Priority | Complexity | Dependencies | Timeline |
|---|---|---|---|---|
| Sliced Wasserstein | 🔴 Critical | Medium | nalgebra | 2-3 weeks |
| Log-stabilized Sinkhorn | 🔴 Critical | Low | nalgebra | 1 week |
| Product Manifold | 🔴 Critical | Medium | ruvector-core | 3-4 weeks |
| Apparent pairs (TDA) | 🟡 High | High | phlite fork | 4-6 weeks |
| Natural Gradient | 🟡 High | Medium | burn | 3-4 weeks |
| Gromov-Wasserstein | 🟢 Medium | High | nalgebra | 6-8 weeks |
| K-FAC optimizer | 🟢 Medium | High | burn | 6-8 weeks |
ruvector-math/
├── Cargo.toml
├── src/
│ ├── lib.rs
│ ├── manifold/
│ │ ├── mod.rs
│ │ ├── poincare.rs # From ruvector-core
│ │ ├── lorentz.rs # From ruvector-core
│ │ ├── spherical.rs # New implementation
│ │ └── product.rs # New: H^n × S^m × E^p
│ ├── transport/
│ │ ├── mod.rs
│ │ ├── sinkhorn.rs # Log-stabilized, greedy
│ │ ├── sliced.rs # Sliced Wasserstein
│ │ └── gromov.rs # Gromov-Wasserstein
│ ├── topology/
│ │ ├── mod.rs
│ │ ├── rips.rs # VR complex construction
│ │ ├── persistence.rs # R=DV with apparent pairs
│ │ └── mapper.rs # Topological clustering
│ └── info_geom/
│ ├── mod.rs
│ ├── divergence.rs # KL, α-divergences
│ ├── fisher.rs # Fisher information
│ └── natural_grad.rs # NGD optimizer
The Rust ecosystem provides strong foundations (nalgebra, burn, ndarray) but lacks algorithm-specific implementations for advanced mathematical methods. The recommended strategy:
- Extend ruvector for hyperbolic/manifold operations—it’s the most mature implementation
- Fork phlite and implement apparent/emergent pairs for competitive TDA performance
- Build new optimal transport crate using nalgebra (pure Rust, WASM-compatible)
- Create information geometry crate leveraging burn’s autodiff for Fisher computation
For WASM deployment, avoid any crate with BLAS/LAPACK or C++ FFI dependencies. The pure-Rust path (nalgebra + burn + custom algorithms) enables browser deployment with WebGPU acceleration via wgpu.
The 40x performance gap between Rust TDA and Ripser, and the complete absence of natural gradient optimizers, represent the most significant technical debt. Addressing these gaps positions ruvector as the first production-grade vector database with native support for non-Euclidean geometry and topological analysis.