| name | overview | todos | isProject |
|---|---|---|---|
| Upstream chiavdf fork plan | A phased plan to upstream the Ealrann/chiavdf fork's bluebox compaction optimizations into Chia-Network/chiavdf, enabling WesoForge to drop its fork dependency and adopt upstream directly. | | false |
The fork (Ealrann/chiavdf, branch bbr) diverges from upstream at commit 7d1f1d6 (Update license #295) with 7 commits on the main bbr branch and 2 additional commits on an unreleased "trick 2" side branch.
The diff is remarkably clean — only 7 files changed total, with ~1,030 lines added:
Modifications to existing upstream files (very small, ~35 lines of real change):
| File | Change |
|---|---|
| [src/vdf.h](chiavdf/src/vdf.h) | Added `quiet_mode` flag, `vdf_fast_pairindex()` function, changed `pairindex=0` to `=vdf_fast_pairindex()`, wrapped a `cout` in a `quiet_mode` check, removed one print |
| [src/threading.h](chiavdf/src/threading.h) | Increased `master_counter[100]` and `slave_counter[100]` to `[512]` |
| [src/parameters.h](chiavdf/src/parameters.h) | Changed `enable_threads=true` to `enable_threads=false` |
| [src/Makefile.vdf-client](chiavdf/src/Makefile.vdf-client) | Added `fastlib` target, PIC/PIE flags |
Purely additive new files (~1,000 lines):
| File | Purpose |
|---|---|
| [src/c_bindings/fast_wrapper.h](chiavdf/src/c_bindings/fast_wrapper.h) | C FFI header (146 lines) |
| [src/c_bindings/fast_wrapper.cpp](chiavdf/src/c_bindings/fast_wrapper.cpp) | Streaming prover implementation (796 lines) |
Trick 1 (streaming bucket accumulation): for bluebox compaction, `y_ref` is already known from the block, so `B = GetB(D, x, y_ref)` can be computed before squaring starts. Instead of storing `O(ceil(T/(k*l)))` checkpoint forms and scanning them after squaring, the prover updates the proof buckets inline during the squaring loop via `StreamingOneWesolowskiCallback`. Result: ~3x memory reduction.
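The inline bucket updates can be illustrated with a toy model. The sketch below is not the fork's code: it replaces class-group forms with integers modulo a hypothetical modulus `N`, and the names `get_block` and `streaming_proof` are invented for illustration. It assumes `B > 2^k`, which any realistically sized `B` satisfies.

```python
def get_block(p, k, T, B):
    """Digit of floor(2^T / B) in base 2^k at weight 2^(k*p)."""
    e = T - k * (p + 1)
    if e < 0:
        return 0  # past the top digit (valid while B > 2^k)
    return (pow(2, e, B) << k) // B


def streaming_proof(x, T, B, N, k, l):
    """Square x T times mod N while filling l * 2^k proof buckets inline."""
    buckets = [[1] * (1 << k) for _ in range(l)]
    step = k * l
    y = x
    for t in range(T):
        if t % step == 0:
            i = t // step  # y currently equals x^(2^(i*k*l))
            for j in range(l):
                b = get_block(i * l + j, k, T, B)
                buckets[j][b] = buckets[j][b] * y % N
        y = y * y % N
    # Fold: proof = prod_j (prod_b buckets[j][b]^b)^(2^(k*j)), Horner-style in j.
    proof = 1
    for j in reversed(range(l)):
        proof = pow(proof, 1 << k, N)
        for b in range(1, 1 << k):
            proof = proof * pow(buckets[j][b], b, N) % N
    return y, proof


# Toy parameters (all hypothetical): the proof must equal x^floor(2^T / B).
x, T, B, N = 3, 1003, 1000003, (1 << 127) - 1
y, proof = streaming_proof(x, T, B, N, k=4, l=3)
assert proof == pow(x, (1 << T) // B, N)
```

Only the `l * 2^k` buckets and the running value `y` are alive at any time; no list of checkpoints is kept, which is the source of the memory saving described above.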
GetBlock optimization: the naive `GetBlock(p, k, T, B)` does a full modular exponentiation per call, computing `r_p = 2^(T - k(p+1)) mod B` and returning `floor(2^k * r_p / B)`. The fork observes that `r_{p+1} = r_p * inv(2^k) mod B`, so it maintains rolling state and computes each successive `GetBlock` with just a multiply, a mod, and a div instead of an exponentiation. No lookup table is needed.
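A minimal sketch of the rolling recurrence, assuming `B` is odd (so that `2^k` is invertible mod `B`) and `T >= k`. The class name `RollingGetBlock` is hypothetical and is not the fork's `get_block_opt()`; `pow` with a negative exponent requires Python 3.8 or later.

```python
def get_block_naive(p, k, T, B):
    """Reference version: one modular exponentiation per call."""
    return (pow(2, T - k * (p + 1), B) << k) // B


class RollingGetBlock:
    """Produces GetBlock(0), GetBlock(1), ... using one exponentiation total."""

    def __init__(self, k, T, B):
        self.k, self.B = k, B
        self.inv = pow(2, -k, B)   # inverse of 2^k mod B
        self.r = pow(2, T - k, B)  # r_0, the only full exponentiation

    def next_block(self):
        block = (self.r << self.k) // self.B   # floor(2^k * r_p / B)
        self.r = self.r * self.inv % self.B    # r_{p+1} = r_p * inv(2^k) mod B
        return block


# Toy parameters (hypothetical): both versions agree for every valid p.
k, T, B = 4, 1000, 1000003
rolling = RollingGetBlock(k, T, B)
assert all(rolling.next_block() == get_block_naive(p, k, T, B)
           for p in range(T // k))
```

The rolling version only works when blocks are requested in increasing `p`, which matches the order in which the streaming loop consumes them.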
(k,l) tuner: upstream's `ApproximateParameters()` is a fixed heuristic. The fork adds a grid search over `(k, l)` space constrained by a configurable per-worker memory budget, picking the minimum-cost parameters. This matters when running many workers in parallel with limited RAM per worker.
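The shape of such a budget-constrained search might look like the sketch below. The cost model, the `FORM_BYTES` constant, the search bounds, and every function name are placeholders chosen for illustration; the fork's actual `tune_streaming_parameters()` cost function is not reproduced here.

```python
import math

FORM_BYTES = 512  # hypothetical size of one stored form


def bucket_memory(k, l, form_bytes=FORM_BYTES):
    """Memory for l * 2^k buckets."""
    return l * (1 << k) * form_bytes


def estimated_cost(T, k, l, checkpoint_overhead=8.0):
    """Placeholder cost in 'form multiplications'; not the fork's model."""
    updates = math.ceil(T / k)   # one bucket multiply per base-2^k digit
    fold = l * (1 << (k + 1))    # rough cost of folding l * 2^k buckets
    interruptions = math.ceil(T / (k * l)) * checkpoint_overhead
    return updates + fold + interruptions


def tune_parameters(T, budget_bytes, k_max=24, l_max=64):
    """Return the cheapest (k, l) whose buckets fit in budget_bytes."""
    best = None
    for k in range(1, k_max + 1):
        for l in range(1, l_max + 1):
            if bucket_memory(k, l) > budget_bytes:
                break  # memory grows with l, so larger l cannot fit either
            cost = estimated_cost(T, k, l)
            if best is None or cost < best[0]:
                best = (cost, k, l)
    if best is None:
        raise ValueError("memory budget too small for any (k, l)")
    return best[1], best[2]


k, l = tune_parameters(T=1 << 24, budget_bytes=64 << 20)
assert bucket_memory(k, l) <= 64 << 20
```

Because the feasible set shrinks with the budget, a tighter per-worker budget can only keep or raise the estimated cost, which is the trade-off the tuner exposes.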
Trick 2 (batching jobs that share a trajectory): jobs sharing the same `(challenge, size_bits, x0)` have an identical squaring trajectory `f(t)`. The "trick 2" branch refactors `StreamingOneWesolowskiCallback` into a reusable `StreamingWesolowskiBuckets` class and adds a `BatchOneWesolowskiCallback` that runs `repeated_square` once for `T_max` while updating per-job bucket state at each job's checkpoint times. This reduces squaring work from `sum(T_j)` to `max(T_j)` across grouped jobs. This branch also includes a thorough design doc at `BBR_BLUEBOX_COMPACTION_OVERVIEW.md`.
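The saving from sharing one trajectory can be sketched with the same toy model: integers modulo a hypothetical `N` stand in for forms, and `JobBuckets` and `batch_prove` are invented names, not the fork's classes. For simplicity the sketch polls every job at every iteration; the side branch is described as using an event queue instead.

```python
def get_block(p, k, T, B):
    """Digit of floor(2^T / B) in base 2^k at weight 2^(k*p); assumes B > 2^k."""
    e = T - k * (p + 1)
    return (pow(2, e, B) << k) // B if e >= 0 else 0


class JobBuckets:
    """Per-job bucket state for one (T, B, k, l) job."""

    def __init__(self, T, B, k, l):
        self.T, self.B, self.k, self.l = T, B, k, l
        self.buckets = [[1] * (1 << k) for _ in range(l)]
        self.y = None

    def observe(self, t, f_t, N):
        """Feed f(t) = x^(2^t); only this job's own checkpoint times matter."""
        if t == self.T:
            self.y = f_t  # this job's output
        elif t < self.T and t % (self.k * self.l) == 0:
            i = t // (self.k * self.l)
            for j in range(self.l):
                b = get_block(i * self.l + j, self.k, self.T, self.B)
                self.buckets[j][b] = self.buckets[j][b] * f_t % N

    def fold(self, N):
        proof = 1
        for j in reversed(range(self.l)):
            proof = pow(proof, 1 << self.k, N)
            for b in range(1, 1 << self.k):
                proof = proof * pow(self.buckets[j][b], b, N) % N
        return proof


def batch_prove(x, N, jobs):
    """jobs: list of (T, B, k, l). Returns ([(y, proof), ...], squarings)."""
    states = [JobBuckets(*job) for job in jobs]
    t_max = max(s.T for s in states)
    f_t, squarings = x, 0
    for t in range(t_max + 1):
        for s in states:
            s.observe(t, f_t, N)
        if t < t_max:
            f_t = f_t * f_t % N
            squarings += 1
    return [(s.y, s.fold(N)) for s in states], squarings


# Three toy jobs on one trajectory: squarings equal max(T_j), not sum(T_j).
jobs = [(1000, 1000003, 4, 3), (640, 999983, 3, 2), (1003, 1000003, 4, 1)]
results, squarings = batch_prove(3, (1 << 127) - 1, jobs)
assert squarings == max(T for T, _, _, _ in jobs)
```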
The fork is well-structured for upstreaming because:
- Changes to existing files are minimal (~35 lines across 4 files)
- The bulk of the work is purely additive (new files that don't conflict with anything)
- No behavioral changes for existing callers — the streaming prover is an entirely new code path behind a new API
- The existing `c_bindings/c_wrapper.h` API is untouched
- The fork is only 7 commits ahead of a recent upstream commit
PR 1 is a single PR containing both the library-mode infrastructure changes and the new streaming prover. This is a natural unit because the infrastructure changes (~35 lines across 3 existing files) exist solely to support the streaming prover, and reviewers benefit from seeing the "why" alongside the "what".
Modifications to existing files:
- **src/vdf.h**: Add `quiet_mode` global (default `false`, no behavior change). Add `vdf_fast_pairindex()`. Replace hardcoded `pairindex=0` with `vdf_fast_pairindex()`. Gate the "VDF loop finished" `cout` on `!quiet_mode`.
- **src/threading.h**: Increase counter arrays from `[100]` to `[512]`. (Upstream only uses slot 0, so this is a zero-risk expansion.)
- **src/Makefile.vdf-client**: Add PIC/PIE build flags and the `fastlib` static library target.
New files (purely additive):
- **src/c_bindings/fast_wrapper.h**: New C FFI header declaring the streaming prover API, memory budget setter, stats/parameter introspection.
- **src/c_bindings/fast_wrapper.cpp**: Complete implementation including:
  - `StreamingOneWesolowskiCallback` (Trick 1: streaming bucket accumulation)
  - Incremental `get_block_opt()` (GetBlock optimization)
  - `tune_streaming_parameters()` ((k,l) tuner)
  - All `chiavdf_prove_one_weso_fast*` entry points
  - Memory budget, stats, and parameter introspection APIs
This PR should include the fork's README content as documentation (perhaps as docs/bluebox_compaction.md rather than replacing the main README).
Critical note on `enable_threads`: The fork flips `enable_threads` from `true` to `false` globally in `parameters.h`. This would break timelord operation if applied upstream. Instead:
- Do NOT change the `parameters.h` default.
- The streaming prover in `fast_wrapper.cpp` already sets `fast_algorithm=false` and `two_weso=false`, which bypasses the threaded proof paths. The `enable_threads` flip in the fork is belt-and-suspenders safety, not functionally required.
- If a runtime flag is truly needed, propose a `chiavdf_set_enable_threads(bool)` setter, but this can be deferred.
PR 2 is based on the unreleased side branch (b6cc20a). It refactors `StreamingOneWesolowskiCallback` into `StreamingWesolowskiBuckets` (reusable per-job bucket state) and adds:
- `BatchOneWesolowskiCallback` with event queue
- `ChiavdfBatchJob` struct and `chiavdf_prove_one_weso_fast_streaming_getblock_opt_batch()` C API
- Finalization offloading to background threads
- `BBR_BLUEBOX_COMPACTION_OVERVIEW.md` design document
This PR depends on PR 1 being merged first.
After PR 1 is merged upstream:
- Switch submodule in [.gitmodules](.gitmodules) from `https://github.com/Ealrann/chiavdf.git` (branch `bbr`) to `https://github.com/Chia-Network/chiavdf.git` (branch `main`)
- No changes needed to [crates/chiavdf-fast/](crates/chiavdf-fast/): the Rust FFI layer calls the same C API (`fast_wrapper.h`)
- No changes needed to [crates/client-engine/](crates/client-engine/) or [crates/client/](crates/client/): they call the Rust API which wraps the same C API
- Build system ([crates/chiavdf-fast/build.rs](crates/chiavdf-fast/build.rs)) already uses `make -f Makefile.vdf-client fastlib`, which will work once the `fastlib` target is upstream
- Remove the `patches/` directory if the GMP 6.3 patch gets merged upstream or becomes unnecessary
- Test that all platforms (Linux x86, macOS Intel, macOS ARM, Windows) still build and produce valid proofs
After PR 2:
- WesoForge can adopt the batch API to replace its current per-job-per-worker model with grouped discriminant reuse, further improving throughput
- PR 1 risk: LOW. The infrastructure changes are minimal and backward-compatible. Counter array expansion is safe (only slot 0 is used upstream). `quiet_mode` defaults to `false`. `vdf_fast_pairindex()` is backward-compatible (slot 0 is still used first). The new `fast_wrapper` files are entirely additive and don't change any existing behavior.
- PR 2 risk: MEDIUM. Refactors the streaming callback, but only within `fast_wrapper.cpp` (no upstream code changes). The batch API is more complex and may need iteration.
- **`enable_threads` risk: NONE if we don't touch it.** The fork's global flip is unnecessary for correctness and should not be upstreamed.
```mermaid
flowchart TD
    subgraph inputs [Job Inputs]
        challenge[challenge bytes]
        y_ref[y_ref from block]
        T[iterations T]
    end
    subgraph setup [Setup Phase]
        D["D = CreateDiscriminant(challenge)"]
        B["B = GetB(D, x, y_ref)"]
        kl["(k,l) = tune or ApproximateParameters"]
        buckets["Allocate l * 2^k buckets"]
    end
    subgraph squaring [Squaring Loop]
        loop["repeated_square(T, x, D, ...)"]
        checkpoint{"iteration % kl == 0?"}
        getblock["b = get_block_opt(p)"]
        update["bucket[j][b] *= checkpoint"]
    end
    subgraph finalize [Finalization]
        fold["Fold buckets into proof form"]
        serialize["Serialize y || proof"]
    end
    challenge --> D
    y_ref --> B
    D --> B
    T --> kl
    kl --> buckets
    B --> getblock
    buckets --> loop
    loop --> checkpoint
    checkpoint -->|yes| getblock
    getblock --> update
    update --> loop
    checkpoint -->|no| loop
    loop -->|"t == T"| fold
    fold --> serialize
```