Implementing New Range Proof Technique Bulletproofs+ in Monero
Overview
Hello everyone! This CCS proposal is for implementing Bulletproofs+ for range proofs in Monero, where Bulletproofs is currently employed. Using Bulletproofs+, we aim to save about 10% of the data being pushed to the Monero blockchain every day. We already have a proof-of-concept implementation of Bulletproofs+ in Rust. Our simulations show that Bulletproofs+ could speed up proof generation and verification by 21% and 17% respectively (note: these timing improvements will vary depending on how efficient the group operations over the Ed25519 curve used in Monero are). In blogs 1 and 2 we cover in depth the techniques used in Bulletproofs+ and how they differ from Bulletproofs. Read below for more details on the applicability of Bulletproofs+ to Monero.
Team and Scope
I am Suyash Bagad, and I just graduated with B.Tech and M.Tech degrees in Electrical Engineering from IIT Bombay (India). My thesis was primarily on designing and implementing novel proof-of-reserves protocols for crypto exchanges, mainly for Monero and Grin. I have also worked on a few other topics in cryptocurrency research.
This will be a joint project with @omershlo, who is experienced in implementing complex cryptographic systems. Our goal is to provide a full implementation of the protocol (aggregated version) from the Bulletproofs+ paper while re-using as much as possible of the existing code base used for Bulletproofs.
Funding Note
We estimate completing the project in about 3 months, in the 3 steps below:
- Working Proof of Concept
- Optimizations and Code quality
- Benchmarking
We request funding equivalent to a standard researcher's salary of $10k per month, amounting to ~450 XMR (for 3 months) according to the moving average on Kraken. The funding is proposed in 3 payouts of 50%, 30% and 20% respectively for the above 3 steps. This project will include both me and @omershlo working on it, as well as academic advising from Claudio Orlandi.
Details on how Bulletproofs+ can improve Monero
Security-wise, Bulletproofs+ does not introduce any new assumptions on top of Bulletproofs. It uses the weighted inner product protocol while Bulletproofs uses the inner product protocol. Each Monero transaction contains 2.5 outputs on average. This means that each transaction is accompanied by 2.5 range proofs (Bulletproofs as of now). As Monero uses the Ed25519 curve, the size of a single Bulletproofs proof is 676 bytes [1]. This could be reduced by 96 bytes by using Bulletproofs+. In effect, about 240 bytes per transaction could be saved. The average number of transactions on July 16th was 11,417 [2]. This implies that about 2.7 MB of data can be saved every day [3]. From the start of 2020 until this submission date, an average of 28 MB of data has been added daily to the blockchain. Therefore, Bulletproofs+ can save about 10% of the data being added to the Monero blockchain!
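The arithmetic behind this estimate can be reproduced in a few lines; the constants below are exactly the ones quoted in the paragraph above.

```rust
// Reproduces the savings estimate from the proposal's quoted figures.
fn main() {
    let saving_per_proof = 96.0_f64; // bytes saved per range proof (BP vs BP+)
    let proofs_per_tx = 2.5_f64;     // average outputs (hence range proofs) per tx
    let txs_per_day = 11_417.0_f64;  // average daily transactions (July 16th)
    let daily_data_mb = 28.0_f64;    // average MB added to the chain per day in 2020

    let saved_mb = saving_per_proof * proofs_per_tx * txs_per_day / 1e6;
    let percent = 100.0 * saved_mb / daily_data_mb;
    // ~2.74 MB saved per day, i.e. roughly 10% of the daily chain growth
    println!("saved per day: {:.2} MB ({:.1}%)", saved_mb, percent);
}
```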
The running times of Bulletproofs and Bulletproofs+ over the Ed25519 curve are noted below in Table 1. Note that the underlying Edwards curve operations come from the cryptoxide library, which is slower than the optimized operations in the more recent curve25519-dalek implementation. As we are interested only in the comparison for now, unoptimized operations in the underlying curve library do not matter much. Bulletproofs+ shows 21% faster proof generation and 17% faster proof verification than Bulletproofs. Although these numbers would vary slightly when the underlying library for group operations changes, they still look quite promising from a user (prover) as well as miner (verifier) point of view!
[1]: Monero uses the Edwards curve Ed25519, in which public and private keys are 32 bytes long.
[2]: Monero Block Explorer, https://moneroblocks.info/stats/
[3]: If we assume that the range proofs for outputs per block are aggregated (i.e. they are owned by a single entity), then Bulletproofs+ can save about
Table 1:

| m | Proof size (B): BP | BP+ | Generation time (ms): BP | BP+ | Verification time (ms): BP | BP+ |
|---|---|---|---|---|---|---|
| 1 | 672 | 576 | 140.5 | 109.8 (21.8% faster) | 56.5 | 46.8 (17.2% faster) |
| 4 | 800 | 704 | 550.3 | 431.6 (21.5% faster) | 214.3 | 179.4 (16.3% faster) |
| 8 | 864 | 768 | 1095.8 | 864.8 (21.0% faster) | 423.5 | 359.0 (15.2% faster) |
| 16 | 928 | 832 | 2190.3 | 1772.0 (19.0% faster) | 839.8 | 730.0 (13.0% faster) |
| 32 | 992 | 896 | 4392.0 | 3687.4 (14.8% faster) | 1702.4 | 1528.3 (10.2% faster) |
Activity
Hi, this is a well written proposal and worth thinking about. Three points:
- There doesn't appear to be a request for funds. Are you intending to get paid for this project? If not, the proposal may be better placed in the Monero project github issues.
- Bulletproofs+ was published only recently (barely one month ago), and afaik has not been peer reviewed. Even if the implementation was ready to go today, it is unclear when it would be plugged in. Note that the original Bulletproofs implementation was professionally audited.
- You may want to connect with the Monero Research Lab, which can be found on IRC in #monero-research-lab.
Thanks @koe,
- There is a request for funds, but it was left out of the overview; we will update it. To save some time: the total funds required amount to a 3 man-month researcher salary (XMR value based on the Kraken moving average).
- I cannot argue here; however:
  - Bulletproofs+ was written by world-renowned researchers.
  - I think that the amount of attention @suyash67 and I gave to the manuscript is well behind a standard peer review. IMHO (I reviewed papers on similar topics for CRYPTO, CCS and a few other conferences in the past year), I estimate that future versions of the paper will not present major deviations. That said, I agree that more eyes are always better, and having the paper accepted at a major conference would put it under more public scrutiny.
  - Even after peer review, applied papers may still have flaws. In Bulletproofs+'s case the protocol is elegant and simple, which minimises such risk.
  - About audits: we could definitely manage the audit process. However, we felt it is healthier to separate the two processes. First we will write the code, test it, and bring it to the community to further evaluate its added value and overall quality. Afterwards, the community can commission a cryptographic audit prior to plugging it in.
- We will, thanks
Re-iterating the response of @omershlo on MRL IRC channel to answer this:
Mapping the concerns about BP+ project:
- BP+ cost effectiveness compared to BP is unclear,
- We feel uneasy about outsiders writing cryptography code,
- Sarang can do it better (please let me know if I missed anything).
I completely agree with the above statements. That said: THIS is the way to get talented researchers involved in the community and in the weeds. Knowing Suyash's skill and dedication, this project might end up taking much less than 3 months. However, we didn't see any reason to rush it; after all, this code is a cryptographic component of critical low-level code.
By code re-use, we mean that we want to keep the same interface as much as possible and use the same functions used in BP as much as possible (to make audit and review easier), ideally keeping similar naming conventions and code structure.
Continued:
To be more specific on the timeline: we figured that a quick-and-dirty PoC should be the first step. Our goal is to get to that point as fast as we can. We didn't go into week-by-week resolution, but we assumed this can be done within roughly one month, including getting familiar with the existing code (it's been two years since I last read the code).
After reaching this milestone we will dedicate time to making the code "production ready": working on readability, structure, security (removing side channels, for example), putting emphasis on code re-use, and perhaps some low-hanging optimisations. This is where we will do "deep integration". Again, we didn't go into weekly resolution, but we assumed that one month will be enough (this step could go on forever, but we want to put a hard stop).
The final phase is for measurements, profiling, unit testing and fine-grained optimisations. I believe we will be able to produce a report with results, and maybe even make recommendations for some other parts of the code. We are not familiar enough with the frameworks for benchmarking and testing already in place; we might be overshooting this step by giving it 1 month, but we consider it an important part of the project and prefer to overshoot rather than undershoot.
The entire conversation can be found here.
Additionally, can you provide efficiency estimates based on multiscalar multiplication operation counts instead? Given that the bulk of computation comes from a single such operation, this might be better for understanding the time savings, since the codebase already includes performance tests for it.
The differences in the number of multi-scalar multiplications (and scalar multiplications too) involved in each step of WIP and IP are noted in the figure below.
Similarly, the computation differences in BP and BP+ are shown below.
Note: `cm` denotes multiplication of a scalar and a curve point, while `sm` denotes scalar multiplication in the underlying prime field. Also, a multi-scalar multiplication of size `n` in a group `G` implies the operation `a * g` in `G`, where `a` is a scalar vector and `g` is a vector of curve points.

Key Differences
- In each pre-final round of a WIP prover, the size of the multi-scalar multiplication is greater by 1 than that for an IP prover. In the last round, 6 (4 and 2) additional multi-scalar multiplications are required in WIP compared to IP. Further, `5n + 2log(n) - 2` more scalar multiplications are needed for WIP than for IP.
- For verification, WIP needs 3 additional multiplications (in the multi-scalar multiplication equation) compared to IP.
- For BP+, the prover needs two multi-scalar multiplications of sizes `2n+1` and `2n+3`, while in BP, 4 multi-scalar multiplications of sizes `2n+1, 2n+1, 2, 2` are needed. We cannot, however, directly compute the total difference in the number of group operations because Pippenger's algorithm scales as `O(n/log(n))`. The number of scalar multiplications in BP is almost three times that in BP+.
- In BP verification, two more multi-scalar multiplications, of sizes `n` and `6`, are needed than in BP+. The scalar multiplications are almost twice as many for BP as for BP+.
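For reference against the counts above, a size-`n` multi-scalar multiplication `a * g` can be sketched naively as below. This is only a toy illustration: integers mod a small prime stand in for the additive group (real code works over Ed25519 points and uses a Pippenger-style algorithm), and all names are illustrative.

```rust
// Toy size-n multi-scalar multiplication a * g.
// Stand-in group: integers mod P under addition, so "scalar times point"
// is just modular multiplication. Real implementations use Ed25519 points.
const P: u64 = 2_147_483_647; // small prime, stand-in for the group order

fn msm(scalars: &[u64], points: &[u64]) -> u64 {
    assert_eq!(scalars.len(), points.len());
    scalars
        .iter()
        .zip(points)
        .fold(0u64, |acc, (a, g)| (acc + a * g % P) % P)
}

fn main() {
    let a = [3, 5, 7];
    let g = [11, 13, 17];
    // 3*11 + 5*13 + 7*17 = 217 (mod P)
    println!("{}", msm(&a, &g));
}
```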
@suyash67 and @omershlo: Since you're claiming real-world identities for this request, can you please provide verification through other channels that this request does in fact originate from those identities? There was a situation a while back where someone wrote a proposal falsely claiming to be someone they were not, so it seems prudent to confirm this here.
@SarangNoether please let us know if this is good enough: https://github.com/KZen-networks/bulletproofs/pull/23
@suyash67 @omershlo: I revisited some of my earlier estimates comparing verification complexity in Bulletproofs and Bulletproofs+ to see what we might expect for improvements. Since the entirety of a batch verification can be reduced to a single multiscalar multiplication operation, and since scalar-only operations are orders of magnitude faster than scalar-group operations, it seems reasonable to assume that verification time effectively depends only on the size of the multiscalar multiplication. Further, any group element appearing in multiple equations in the batch need only be included once.
The only difference between Bulletproofs and Bulletproofs+ in terms of overall multiscalar multiplication is the three elements $T_1, T_2, S$ that appear in each proof in the batch. Is there another reason why you would expect the time improvement estimates you provided?

I wrote out a rolled-out version of BP+ similar to what is given in the BP paper (both described below) for easier comparison. As far as verification is concerned, you are right that the speed-up BP+ could provide is small. A single BP+ verification is a multi-scalar multiplication check of size $2n + 2\log_2(n) + 5$, while the same for BP is $2n + 2\log_2(n) + 7$.

BP+ verifier algorithm (rolled-out):

Input: Proof $\pi = \left\{ A, \{ L_j, R_j \}_{j=1}^{\log_2 n}, A', B', (r', s', \delta') \in \mathbb{Z}_p^3 \right\}$

- Compute scalar vectors $\textbf{s}, \textbf{s}' \in \mathbb{Z}_p^{n}$ such that the generators $\textsf{g}, \textsf{h}$ used in the final round of the protocol can be written as
$$\textsf{g} = \prod_{i=1}^{n} g_i^{s_i} \in \mathbb{G}, \quad \textsf{h} = \prod_{i=1}^{n} h_i^{s_i^{\prime}} \in \mathbb{G}$$
- Compute
$$\hat{A} = A \cdot \textbf{g}^{-z \cdot \textbf{1}^n} \cdot \textbf{h}^{z \cdot \textbf{1}^n + \textbf{2}^n \circ \overleftarrow{y}^n} \cdot V^{y^{n+1}} \cdot g^{-\zeta(y,z)}$$
where $\zeta(y,z) = zy^{n+1} \cdot \langle \textbf{2}^n, \textbf{1}^n \rangle + (z^2 - z) \cdot \langle \textbf{1}^n, \overrightarrow{y}^n \rangle$.
- Check the following equality:
$$\textbf{g}^{e \cdot r' \cdot \textbf{s}} \cdot \textbf{h}^{e \cdot s' \cdot \textbf{s}'} \cdot g^{r' \odot s'} \cdot h^{\delta'} = \hat{A}^{e^2} \cdot \left( \prod_{j=1}^{\log_2 n} L_j^{e^2 x_j^{2}} \cdot R_j^{e^2 x_j^{-2}} \right) \cdot A'^{e} \cdot B'$$

Substituting for $\hat{A}$, we can write the last check as a single equation of the form
$$\textbf{g}^{e \cdot r' \cdot \textbf{s} + e^2 \cdot z \cdot \textbf{1}^n} \cdot \textbf{h}^{e \cdot s' \cdot \textbf{s}' - e^2 \cdot z \cdot \textbf{1}^n - e^2 \cdot \textbf{2}^n \circ \overleftarrow{y}^n} \cdot g^{r' \odot s' + e^2 \cdot \zeta(y,z)} \cdot h^{\delta'} \cdot A^{-e^2} \cdot V^{-e^2 \cdot y^{n+1}} \cdot \left( \prod_{j=1}^{\log_2 n} L_j^{-e^2 x_j^{2}} \cdot R_j^{-e^2 x_j^{-2}} \right) \cdot A'^{-e} \cdot B'^{-1} = 1$$

BP verifier algorithm (rolled-out, from page 29 of the BP paper):
Note: Verifying aggregated proofs would be very similar to the above and the difference in sizes of multi-scalar multiplications would remain the same.
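The step of computing the scalar vector $\textbf{s}$ from the round challenges can be sketched as follows, using the standard Bulletproofs convention $s_i = \prod_j x_j^{\pm 1}$ (sign chosen by bit $j$ of $i$). This is a toy sketch over a small prime field rather than the Ed25519 scalar field, and all names are illustrative rather than taken from our PoC.

```rust
// Sketch: deriving the scalar vector s of length n = 2^k from the k round
// challenges. Toy field: integers mod a small prime P (real code would use
// the Ed25519 scalar field).
const P: u64 = 1_000_000_007;

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut r = 1u64;
    b %= P;
    while e > 0 {
        if e & 1 == 1 { r = r * b % P; }
        b = b * b % P;
        e >>= 1;
    }
    r
}

// Modular inverse via Fermat's little theorem (P is prime).
fn inv_mod(x: u64) -> u64 { pow_mod(x, P - 2) }

// challenges[j] is the challenge of round j (most significant bit first).
fn s_vector(challenges: &[u64]) -> Vec<u64> {
    let k = challenges.len();
    let n = 1usize << k;
    (0..n)
        .map(|i| {
            challenges.iter().enumerate().fold(1u64, |acc, (j, &x)| {
                // bit (k-1-j) of i decides whether x or x^{-1} enters s_i
                let f = if (i >> (k - 1 - j)) & 1 == 1 { x } else { inv_mod(x) };
                acc * f % P
            })
        })
        .collect()
}

fn main() {
    let s = s_vector(&[3, 5]); // n = 4; s[3] = 3 * 5 = 15
    println!("{:?}", s);
}
```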
The kind of improvements we see in our implementation might be due to the following two reasons:
- The underlying library and core elliptic curve operations greatly affect the running times.
- The multi-scalar multiplication in our implementation does not leverage Pippenger's algorithm to speed up the running times.
Do your implementations of both constructions use random weighting to produce a single multiscalar multiplication for verification across all equations? Even without the use of Pippenger (or another related algorithm), the small difference in operation size seems like it would not make that much of a difference.
I implemented aggregated BP and BP+ verification using a single multi-scalar multiplication here. Single multi-scalar multiplication based verification is much faster than our earlier version. I also wrote a brief blog explaining the math behind single multi-scalar multiplication verification of BP and BP+.
The difference in verification times between BP+ and BP is marginal (as also discussed in the above thread). Here's a summary of verification timings (simulated on an Intel Core i7-5500U CPU running at 2.40GHz):
| m | BP verification (ms) | BP+ verification (ms) |
|---|---|---|
| 1 | 14.15 | 14.36 |
| 2 | 27.26 | 26.91 |
| 4 | 62.81 | 53.51 |
| 8 | 121.19 | 109.23 |
| 16 | 279.70 | 235.26 |
| 32 | 718.34 | 556.90 |

The verification time is almost the same for $m = 1, 2$ for BP and BP+. We see about a 10% improvement in verification for $m \ge 4$. However, this gap would shrink on using Pippenger's exponentiation algorithm (which we don't use in the current proof of concept).
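The random-weighting idea discussed above (folding all verification equations of a batch into a single multi-scalar multiplication) can be sketched as follows, again over a toy additive group of integers mod a prime rather than Ed25519; function names and sample values are illustrative.

```rust
// Sketch of random-weighted batch verification: several checks of the form
// <a_i, g> == 0 are combined by sampling weights w_i and checking
// <sum_i w_i * a_i, g> == 0 with ONE multi-scalar multiplication.
// Toy group: integers mod P under addition (real code uses Ed25519 and
// samples w_i uniformly from the scalar field).
const P: u64 = 1_000_000_007;

fn msm(scalars: &[u64], points: &[u64]) -> u64 {
    scalars.iter().zip(points).fold(0u64, |acc, (a, g)| (acc + a * g % P) % P)
}

// rows[i] holds the scalars of equation i over the shared point vector.
fn batch_verify(rows: &[Vec<u64>], points: &[u64], weights: &[u64]) -> bool {
    let n = points.len();
    let mut combined = vec![0u64; n];
    for (row, &w) in rows.iter().zip(weights) {
        for j in 0..n {
            combined[j] = (combined[j] + w * row[j] % P) % P;
        }
    }
    // A single size-n multi-scalar multiplication decides the whole batch.
    msm(&combined, points) == 0
}

fn main() {
    // Two equations over points g = [2, 3]; both hold mod P:
    // 3*2 + (P-2)*3 == 3P == 0, and 6*2 + (P-4)*3 == 3P == 0.
    let points = [2, 3];
    let rows = vec![vec![3, P - 2], vec![6, P - 4]];
    println!("{}", batch_verify(&rows, &points, &[101, 202]));
}
```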
mentioned in merge request !148 (merged)
Hi Sarang, we want to submit a new proposal for evaluating, reviewing and auditing Bulletproofs+. To make the new proposal concrete, we figured we should wait until the Bulletproofs+ code is implemented. It will also be helpful to learn the timeline for implementation. If you think there is something we can already do now, please advise.
mentioned in merge request !197 (merged)