@hoffmang9
Created February 24, 2026 07:39
Upstream chiavdf fork plan (PR1/PR2)
**name:** Upstream chiavdf fork plan
**overview:** A phased plan to upstream the Ealrann/chiavdf fork's bluebox compaction optimizations into Chia-Network/chiavdf, enabling WesoForge to drop its fork dependency and adopt upstream directly.
**isProject:** false

**todos:**

| id | content | status |
| --- | --- | --- |
| pr1-streaming-prover | PR 1: Streaming one-weso prover for bluebox compaction. Includes library-mode infra (vdf.h `quiet_mode` + pairindex, threading.h 512 slots, Makefile `fastlib` target) plus new fast_wrapper.h/cpp. Do NOT change `enable_threads` in parameters.h. | pending |
| pr2-batch-proving | PR 2 (future): Batch proving with discriminant reuse (Trick 2). Refactor StreamingOneWesolowskiCallback into StreamingWesolowskiBuckets, add BatchOneWesolowskiCallback and batch C API. Depends on PR 1. | pending |
| wesoforge-migration | After PR 1 merge: Switch WesoForge submodule from Ealrann/chiavdf to Chia-Network/chiavdf. No Rust code changes needed — FFI API is identical. | pending |

Upstream chiavdf Fork Into Chia-Network/chiavdf

Current State of the Fork

The fork (Ealrann/chiavdf, branch bbr) diverges from upstream at commit 7d1f1d6 (Update license #295) with 7 commits on the main bbr branch and 2 additional commits on an unreleased "trick 2" side branch.

What the fork changes

The diff is remarkably clean — only 7 files changed total, with ~1,030 lines added:

Modifications to existing upstream files (very small, ~35 lines of real change):

| File | Change |
| --- | --- |
| [src/vdf.h](chiavdf/src/vdf.h) | Added `quiet_mode` flag and `vdf_fast_pairindex()`; changed hardcoded `pairindex=0` to `vdf_fast_pairindex()`; wrapped a `cout` in a `quiet_mode` check; removed one print |
| [src/threading.h](chiavdf/src/threading.h) | Increased `master_counter[100]` and `slave_counter[100]` to `[512]` |
| [src/parameters.h](chiavdf/src/parameters.h) | Changed `enable_threads=true` to `enable_threads=false` |
| [src/Makefile.vdf-client](chiavdf/src/Makefile.vdf-client) | Added `fastlib` target and PIC/PIE flags |

Purely additive new files (~1,000 lines):

| File | Purpose |
| --- | --- |
| [src/c_bindings/fast_wrapper.h](chiavdf/src/c_bindings/fast_wrapper.h) | C FFI header (146 lines) |
| [src/c_bindings/fast_wrapper.cpp](chiavdf/src/c_bindings/fast_wrapper.cpp) | Streaming prover implementation (796 lines) |

The Three Optimizations

Trick 1: Streaming One-Wesolowski (known y_ref)

For bluebox compaction, y_ref is already known from the block. This means B = GetB(D, x, y_ref) can be computed before squaring starts. Instead of storing O(ceil(T/kl)) checkpoint forms and scanning them post-squaring, the prover updates proof buckets inline during the squaring loop via StreamingOneWesolowskiCallback. Result: ~3x memory reduction.
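
As a sketch of the streaming idea (not the fork's API): below, integers modulo a prime stand in for class-group forms, `l` is fixed to 1 for brevity, and a hypothetical `prove_streaming` updates one bucket per base-2^k digit of ⌊2^T/B⌋ inline during the squaring loop, then folds the buckets into the proof.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using u64 = uint64_t;
static const u64 N = 1000000007ULL;  // stand-in group: (Z/N)*, N prime

static u64 mulmod(u64 a, u64 b) { return (u64)((unsigned __int128)a * b % N); }

static u64 powmod(u64 base, u64 e, u64 mod) {
    u64 r = 1; base %= mod;
    while (e) {
        if (e & 1) r = (u64)((unsigned __int128)r * base % mod);
        base = (u64)((unsigned __int128)base * base % mod);
        e >>= 1;
    }
    return r;
}

// Streaming one-weso: y_ref is known up front, so the challenge prime B is
// fixed before squaring starts and proof buckets can be updated inline.
static u64 prove_streaming(u64 x, unsigned T, u64 B, unsigned k) {
    // q = floor(2^T / B), written in base 2^k; digit i multiplies x^(2^(k*i)).
    // (T <= 126 here so 2^T fits an __int128; the real prover uses big ints.)
    unsigned __int128 q = ((unsigned __int128)1 << T) / B;
    std::vector<u64> bucket(1u << k, 1);   // one bucket per digit value
    u64 cur = x % N;                       // cur = x^(2^(k*i)) at step i
    for (unsigned i = 0; k * i < T; ++i) {
        unsigned d = (unsigned)((q >> (k * i)) & ((1u << k) - 1));
        bucket[d] = mulmod(bucket[d], cur);          // inline bucket update
        for (unsigned s = 0; s < k; ++s) cur = mulmod(cur, cur);
    }
    u64 pi = 1;  // fold: pi = prod_d bucket[d]^d = x^q
    for (unsigned d = 1; d < (unsigned)bucket.size(); ++d)
        pi = mulmod(pi, powmod(bucket[d], d, N));
    return pi;
}
```

The fold identity `pi = prod_d bucket[d]^d` is what lets the running buckets replace the stored checkpoint forms.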

GetBlock Optimization: Incremental Mapping

The naive GetBlock(p, k, T, B) does a full modular exponentiation per call. The fork observes that r_{p+1} = r_p * inv(2^k) mod B, so it maintains rolling state and computes each successive GetBlock with just a multiply+mod+div instead of an exponentiation. No lookup table needed.
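
A minimal check of that recurrence (illustrative code, not the fork's `get_block_opt`): since B is prime in Wesolowski proofs, 2^k is invertible mod B, and the rolling state needs only one modular multiply per block.

```cpp
#include <cassert>
#include <cstdint>

using u64 = uint64_t;

static u64 powmod_b(u64 base, u64 e, u64 mod) {
    u64 r = 1; base %= mod;
    while (e) {
        if (e & 1) r = (u64)((unsigned __int128)r * base % mod);
        base = (u64)((unsigned __int128)base * base % mod);
        e >>= 1;
    }
    return r;
}

// Naive GetBlock: a full modular exponentiation on every call.
static u64 get_block_direct(unsigned i, unsigned k, unsigned T, u64 B) {
    u64 r = powmod_b(2, T - k * (i + 1), B);       // r_i = 2^(T-k(i+1)) mod B
    return (u64)(((unsigned __int128)r << k) / B); // floor(2^k * r_i / B)
}

// Rolling version: r_{i+1} = r_i * inv(2^k) mod B, one multiply per block.
struct GetBlockState {
    u64 B, inv2k, r; unsigned k;
    GetBlockState(unsigned k_, unsigned T, u64 B_) : B(B_), k(k_) {
        // inverse via Fermat: valid because B is prime in Wesolowski proofs
        inv2k = powmod_b(powmod_b(2, k, B), B - 2, B);
        r = powmod_b(2, T - k, B);                 // state for block i = 0
    }
    u64 next() {                                   // emit block i, advance
        u64 b = (u64)(((unsigned __int128)r << k) / B);
        r = (u64)((unsigned __int128)r * inv2k % B);
        return b;
    }
};
```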

(k, l) Tuner: Memory-Budgeted Parameter Selection

Upstream's ApproximateParameters() is a fixed heuristic. The fork adds a grid search over (k, l) space constrained by a configurable per-worker memory budget, picking the minimum-cost parameters. This matters when running many workers in parallel with limited RAM per worker.
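
A sketch of what such a tuner can look like; the memory and cost formulas below are generic placeholders for the bucket method, not the fork's actual models.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Memory: the prover keeps roughly l * 2^k proof buckets of form_bytes each.
static double mem_bytes(unsigned k, unsigned l, double form_bytes) {
    return (double)l * (double)(1ULL << k) * form_bytes;
}

// Cost: ~T/(k*l) bucket updates during squaring plus ~l*2^k fold work,
// a generic shape for bucket-method provers (placeholder model).
static double cost(unsigned k, unsigned l, double T) {
    return T / (double)(k * l) + (double)l * (double)(1ULL << k);
}

// Grid search: cheapest (k, l) whose buckets fit the per-worker budget.
static std::pair<unsigned, unsigned>
tune(double T, double budget_bytes, double form_bytes) {
    std::pair<unsigned, unsigned> best{1, 1};
    double best_cost = cost(1, 1, T);
    for (unsigned k = 1; k <= 16; ++k)
        for (unsigned l = 1; l <= 64; ++l) {
            if (mem_bytes(k, l, form_bytes) > budget_bytes) continue;
            double c = cost(k, l, T);
            if (c < best_cost) { best_cost = c; best = {k, l}; }
        }
    return best;
}
```

Shrinking the budget forces the search toward smaller `l * 2^k`, trading fold work for more frequent bucket updates, which is exactly the lever needed when many workers share limited RAM.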

Trick 2 (unreleased, on side branch): Discriminant Reuse / Batch Proving

Jobs sharing the same (challenge, size_bits, x0) have an identical squaring trajectory f(t). The "trick 2" branch refactors StreamingOneWesolowskiCallback into a reusable StreamingWesolowskiBuckets class and adds a BatchOneWesolowskiCallback that runs repeated_square once for T_max while updating per-job bucket state at each job's checkpoint times. This reduces squaring work from sum(T_j) to max(T_j) across grouped jobs. This branch also includes a thorough design doc at BBR_BLUEBOX_COMPACTION_OVERVIEW.md.
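
The batching idea in miniature (integers mod a prime again standing in for forms; `BatchJob` and `prove_batch` are illustrative names, not the branch's API): one squaring pass to T_max feeds every job's bucket state.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using u64 = uint64_t;
static const u64 P = 1000000007ULL;           // stand-in prime group

static u64 mm(u64 a, u64 b) { return (u64)((unsigned __int128)a * b % P); }
static u64 pm(u64 b, u64 e) {
    u64 r = 1; b %= P;
    while (e) { if (e & 1) r = mm(r, b); b = mm(b, b); e >>= 1; }
    return r;
}

struct BatchJob {
    unsigned T;               // this job's iteration count (T_j <= T_max)
    u64 B;                    // this job's challenge prime
    unsigned k;               // digit width
    std::vector<u64> bucket;  // per-job bucket state
    unsigned __int128 q;      // floor(2^T / B), demo-sized
};

// One repeated-squaring pass; each job consumes checkpoints up to its own T_j,
// so total squaring work is max(T_j) instead of sum(T_j).
static std::vector<u64> prove_batch(u64 x, std::vector<BatchJob>& jobs) {
    unsigned T_max = 0;
    for (auto& j : jobs) {
        j.q = ((unsigned __int128)1 << j.T) / j.B;
        j.bucket.assign(1u << j.k, 1);
        if (j.T > T_max) T_max = j.T;
    }
    u64 cur = x % P;                           // cur = x^(2^t)
    for (unsigned t = 0; t < T_max; ++t) {     // squared ONCE for all jobs
        for (auto& j : jobs)
            if (t < j.T && t % j.k == 0) {
                unsigned i = t / j.k;
                unsigned d = (unsigned)((j.q >> (j.k * i)) & ((1u << j.k) - 1));
                j.bucket[d] = mm(j.bucket[d], cur);
            }
        cur = mm(cur, cur);
    }
    std::vector<u64> proofs;                   // fold each job's buckets
    for (auto& j : jobs) {
        u64 pi = 1;
        for (unsigned d = 1; d < (unsigned)j.bucket.size(); ++d)
            pi = mm(pi, pm(j.bucket[d], d));
        proofs.push_back(pi);
    }
    return proofs;
}
```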

Why Upstreaming is Feasible

The fork is well-structured for upstreaming because:

  1. Changes to existing files are minimal (~35 lines across 4 files)
  2. The bulk of the work is purely additive (new files that don't conflict with anything)
  3. No behavioral changes for existing callers — the streaming prover is an entirely new code path behind a new API
  4. The existing c_bindings/c_wrapper.h API is untouched
  5. The fork is only 7 commits ahead of a recent upstream commit

Recommended Upstreaming Strategy: 2 PRs

PR 1: "Streaming one-weso prover for bluebox compaction"

A single PR containing both the library-mode infrastructure changes and the new streaming prover. This is a natural unit because the infrastructure changes (~35 lines across 3 existing files) exist solely to support the streaming prover, and reviewers benefit from seeing the "why" alongside the "what".

Modifications to existing files:

  • **src/vdf.h**: Add quiet_mode global (default false, no behavior change). Add vdf_fast_pairindex(). Replace hardcoded pairindex=0 with vdf_fast_pairindex(). Gate the "VDF loop finished" cout on !quiet_mode.
  • **src/threading.h**: Increase counter arrays from [100] to [512]. (Upstream only uses slot 0, so this is a zero-risk expansion.)
  • **src/Makefile.vdf-client**: Add PIC/PIE build flags and the fastlib static library target.
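
A hypothetical sketch of what a `vdf_fast_pairindex()`-style allocator could look like given the 512-slot counter arrays; the fork's actual implementation may differ, and `acquire_pairindex`/`release_pairindex` are invented names for illustration.

```cpp
#include <atomic>
#include <cassert>

static const int kMaxSlots = 512;            // matches the expanded arrays
static std::atomic<bool> slot_used[kMaxSlots];

// Claim the lowest free slot; slot 0 stays "first choice", so a single
// in-process VDF instance behaves exactly like the old hardcoded pairindex=0.
static int acquire_pairindex() {
    for (int i = 0; i < kMaxSlots; ++i) {
        bool expected = false;
        if (slot_used[i].compare_exchange_strong(expected, true))
            return i;                        // atomically claimed slot i
    }
    return -1;                               // all slots busy
}

static void release_pairindex(int i) {
    if (i >= 0 && i < kMaxSlots) slot_used[i].store(false);
}
```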

New files (purely additive):

  • **src/c_bindings/fast_wrapper.h**: New C FFI header declaring the streaming prover API, memory budget setter, stats/parameter introspection.
  • **src/c_bindings/fast_wrapper.cpp**: Complete implementation including:
    • StreamingOneWesolowskiCallback (Trick 1: streaming bucket accumulation)
    • Incremental get_block_opt() (GetBlock optimization)
    • tune_streaming_parameters() ((k,l) tuner)
    • All chiavdf_prove_one_weso_fast* entry points
    • Memory budget, stats, and parameter introspection APIs

This PR should include the fork's README content as documentation (perhaps as docs/bluebox_compaction.md rather than replacing the main README).

Critical note on enable_threads: The fork flips enable_threads from true to false globally in parameters.h. This would break timelord operation if applied upstream. Instead:

  • Do NOT change the parameters.h default.
  • The streaming prover in fast_wrapper.cpp already sets fast_algorithm=false and two_weso=false, which bypasses the threaded proof paths. The enable_threads flip in the fork is belt-and-suspenders safety, not functionally required.
  • If a runtime flag is truly needed, propose a chiavdf_set_enable_threads(bool) setter, but this can be deferred.

PR 2 (future): "Batch proving with discriminant reuse" (Trick 2)

Based on the unreleased side branch (b6cc20a). Refactors StreamingOneWesolowskiCallback into StreamingWesolowskiBuckets (reusable per-job bucket state) and adds:

  • BatchOneWesolowskiCallback with event queue
  • ChiavdfBatchJob struct and chiavdf_prove_one_weso_fast_streaming_getblock_opt_batch() C API
  • Finalization offloading to background threads
  • BBR_BLUEBOX_COMPACTION_OVERVIEW.md design document

This PR depends on PR 1 being merged first.

WesoForge Migration Path

After PR 1 is merged upstream:

  1. Switch submodule in [.gitmodules](.gitmodules) from https://github.com/Ealrann/chiavdf.git (branch bbr) to https://github.com/Chia-Network/chiavdf.git (branch main)
  2. No changes needed to [crates/chiavdf-fast/](crates/chiavdf-fast/) — the Rust FFI layer calls the same C API (fast_wrapper.h)
  3. No changes needed to [crates/client-engine/](crates/client-engine/) or [crates/client/](crates/client/) — they call the Rust API which wraps the same C API
  4. Build system ([crates/chiavdf-fast/build.rs](crates/chiavdf-fast/build.rs)) already uses make -f Makefile.vdf-client fastlib, which will work once the fastlib target is upstream
  5. Remove the patches/ directory if the GMP 6.3 patch gets merged upstream or becomes unnecessary
  6. Test that all platforms (Linux x86, macOS Intel, macOS ARM, Windows) still build and produce valid proofs

After PR 2:

  • WesoForge can adopt the batch API to replace its current per-job-per-worker model with grouped discriminant reuse, further improving throughput

Risk Assessment

  • PR 1 risk: LOW — The infrastructure changes are minimal and backward-compatible. Counter array expansion is safe (only slot 0 is used upstream). quiet_mode defaults to false. vdf_fast_pairindex() is backward-compatible (slot 0 is still used first). The new fast_wrapper files are entirely additive and don't change any existing behavior.
  • PR 2 risk: MEDIUM — Refactors the streaming callback, but only within fast_wrapper.cpp (no upstream code changes). The batch API is more complex and may need iteration.
  • **enable_threads risk: NONE if we don't touch it** — The fork's global flip is unnecessary for correctness and should not be upstreamed.

Diagram: Data Flow with Streaming Prover

```mermaid
flowchart TD
    subgraph inputs [Job Inputs]
        challenge[challenge bytes]
        y_ref[y_ref from block]
        T[iterations T]
    end

    subgraph setup [Setup Phase]
        D["D = CreateDiscriminant(challenge)"]
        B["B = GetB(D, x, y_ref)"]
        kl["(k,l) = tune or ApproximateParameters"]
        buckets["Allocate l * 2^k buckets"]
    end

    subgraph squaring [Squaring Loop]
        loop["repeated_square(T, x, D, ...)"]
        checkpoint{"iteration % kl == 0?"}
        getblock["b = get_block_opt(p)"]
        update["bucket[j][b] *= checkpoint"]
    end

    subgraph finalize [Finalization]
        fold["Fold buckets into proof form"]
        serialize["Serialize y || proof"]
    end

    challenge --> D
    y_ref --> B
    D --> B
    T --> kl
    kl --> buckets
    B --> getblock

    buckets --> loop
    loop --> checkpoint
    checkpoint -->|yes| getblock
    getblock --> update
    update --> loop
    checkpoint -->|no| loop

    loop -->|"t == T"| fold
    fold --> serialize
```