Unfortunately, Geometry just doesn’t have enough Q2 bandwidth to commit to completing the entire project at this time.
I just plan to turn my upcoming hobby time towards helping with data cleaning (by identifying and filtering non-wallet2 transactions, as described above).
Of course, that is only a portion of the originally planned scope. But I’m hoping that chipping away at my part on the weekends will allow me to make a small but meaningful personal contribution to the OSPEAD effort.
Hello everyone, just a brief update. I've decided to carry out much of this work on my own time, at no cost. To avoid confusion, I'll be closing this CCS for now, but I might reopen it or create a new one if circumstances change. Thank you all again for your valuable input and support :)
The OSPEAD project led by Rucknium will substantially improve user privacy by reducing Monero's statistical attack surface. The project's goal is to overhaul the decoy selection algorithm (DSA) to remedy weaknesses in Monero's privacy model.
To achieve the best fit for OSPEAD parametrization, the analysis code developed by Rucknium needs to be computed over a much larger portion of recent Monero transactions. The number of rings that can be processed is the “sample size” for the statistical analysis, so we desire as many data points as possible to enable Monero's OSPEAD-informed DSA to match the true spend distribution as closely as possible.
The first step to achieving a large sample size is porting the current OSPEAD code written in R (a slow interpreted language) to a fast compiled language like C++ or Rust. However, even with faster code, analyzing the rings one at a time in series will limit the practical sample size. To unlock analysis of 10^5 - 10^7 rings, it is also necessary to parallelize the code and run the analysis on a cluster or high-performance workstation with dozens of CPU cores. Without these changes, the current R code would take years to run. With these changes, the analysis can be run in several days to weeks of compute time. This removes a critical bottleneck for OSPEAD parametrization, which itself is critical for Monero user privacy and Monero's future as a whole.
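The parallelization step described above can be sketched roughly as follows. This is only an illustration of the chunk-and-map pattern, not the OSPEAD code: `analyze_ring` is a hypothetical stand-in for the real per-ring analysis, and a thread pool is used purely to keep the sketch self-contained. For CPU-bound work, the workers would need to be separate processes or compiled code.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_ring(ring):
    """Hypothetical stand-in for the per-ring analysis (assumption):
    returns the spread between the largest and smallest ring member."""
    return max(ring) - min(ring)


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def analyze_chunk(chunk):
    """Analyze every ring in one chunk, in order."""
    return [analyze_ring(ring) for ring in chunk]


def analyze_parallel(rings, workers=4, chunk_size=1000):
    """Analyze rings chunk by chunk on a pool of workers, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = pool.map(analyze_chunk, chunked(rings, chunk_size))
        return [value for chunk in results for value in chunk]
```

Because each ring is analyzed independently in this sketch, the chunks can be handed to any number of workers without coordination, which is what makes the workload a good fit for a many-core machine.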
In addition to the efficiency improvements described above, we will increase precision in the final output by pre-filtering transactions with heuristics based on transaction uniformity defects. For Rucknium's analysis, the quality of the results is degraded if the analysis includes rings that were not generated by wallet2 software (the standard reference implementation). To avoid this, we will filter out non-wallet2 transactions by applying a number of fingerprinting techniques developed by Isthmus and Neptune over the last 4 years of Noncesense Research Lab R&D. These include features such as unlock times and fees that would not be produced by standard wallets.
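As a rough illustration of this kind of pre-filtering, the sketch below keeps only transactions that pass every check in a list of heuristics. The record fields, the "unlock time is 0" rule, and the set of expected fee values are all assumptions made for the example; the actual fingerprinting heuristics are not specified here.

```python
from typing import NamedTuple


class Tx(NamedTuple):
    # Hypothetical minimal transaction record; field names are illustrative.
    tx_hash: str
    unlock_time: int
    fee: int


def has_default_unlock_time(tx):
    # Assumption for this sketch: standard transfers leave unlock_time at 0.
    return tx.unlock_time == 0


def make_fee_check(expected_fees):
    """Build a check that passes only fees in `expected_fees` (hypothetical set)."""
    def check(tx):
        return tx.fee in expected_fees
    return check


def filter_wallet2_like(txs, checks):
    """Keep transactions that pass every heuristic check."""
    return [tx for tx in txs if all(check(tx) for check in checks)]
```

For example, with `checks = [has_default_unlock_time, make_fee_check({100, 400})]`, a transaction with a nonzero unlock time or an unexpected fee would be dropped before the analysis runs.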
The OSPEAD project has made exciting progress toward its goal of securing the decoy selection algorithm (DSA) against statistical attack, as outlined in Rucknium's original proposal. Given the statistical nature of the work involved, all algorithms and analyses are written in R, one of the primary programming languages used for statistics work. The benefits of R for statistical analysis are manifold, and it is certainly the right language for the job given the statistical focus of this project.
However, the OSPEAD project is currently bottlenecked by computation, as there is an enormous number of rings to process from the last 6 months alone. Certain computationally heavy components of the code base, when executed in R, result in the overall analysis taking 3 months to compute 1 week of historical data. While R is a best-in-class language for statistical programming, it is a higher-level language and thus quite slow in execution. Furthermore, the current code is single-threaded and only runs on one CPU core. All of this creates a prohibitively expensive computational barrier: it would take years of computation to complete the months of transaction analysis now needed to finish the OSPEAD project.
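To make the "years of computation" figure concrete, here is a back-of-the-envelope extrapolation from the numbers quoted above (3 months of compute per week of data, 6 months of data). It assumes runtime scales linearly with the amount of data, which is a simplification.

```python
MONTHS_OF_COMPUTE_PER_WEEK_OF_DATA = 3  # figure quoted in the text
WEEKS_PER_YEAR = 52
MONTHS_PER_YEAR = 12


def compute_years(months_of_data):
    """Estimated single-core R runtime in years, assuming linear scaling."""
    weeks_of_data = months_of_data * WEEKS_PER_YEAR / MONTHS_PER_YEAR
    compute_months = weeks_of_data * MONTHS_OF_COMPUTE_PER_WEEK_OF_DATA
    return compute_months / MONTHS_PER_YEAR


print(round(compute_years(6), 1))  # 6.5
```

Under that assumption, six months of data is about 26 weeks, or roughly 78 months (6.5 years) of single-core compute.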
This proposal is designed to solve the challenge posed by the computational barriers described above. Specifically, it allows much faster processing of more data, in order to achieve the most robust OSPEAD results possible. If this project is approved, the following will be delivered to the OSPEAD team and the Monero community.
The overall project to speed up OSPEAD computation will take 5+ full-time engineering weeks and require significant investment in computational resources. The breakdown is as follows:
wallet2 transactions: 1.0 weeks FTE
Total time: 5.5 weeks FTE + 23,000 CPU hours. We expect the engineering work to be complete by Mar 30, 2023.
Total budget: 190 XMR, paid at the start of work to avoid exchange rate volatility risk.
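For a sense of scale, the 23,000 CPU hours can be converted to wall-clock time for a given machine. The core counts below are assumptions (the proposal only says "dozens of CPU cores"), and the `efficiency` parameter is a hypothetical knob for imperfect parallel scaling.

```python
def wall_clock_days(cpu_hours, cores, efficiency=1.0):
    """Wall-clock days needed to spend `cpu_hours` across `cores` cores.

    `efficiency` (0 < efficiency <= 1) is a hypothetical scaling factor;
    1.0 means perfect parallel speedup.
    """
    return cpu_hours / (cores * efficiency) / 24


# Assumed core counts, for illustration only.
for cores in (24, 48, 96):
    print(cores, "cores:", round(wall_clock_days(23_000, cores), 1), "days")
```

With perfect scaling, 48 cores would need roughly 20 days and 96 cores roughly 10 days, which is in line with the "several days to weeks" estimate given earlier.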
Mitchell Krawiec-Thayer (Isthmus) and the Geometry Labs engineering team
Dr. Mitchell Krawiec-Thayer is a privacy tech researcher whose Monero contributions have largely focused on empirical transaction tree analysis leveraging statistical heuristics and transaction uniformity defects. Other past Monero work includes fingerprinting the mid-2021 transaction volume anomaly, a statistical analysis of nonce value distribution, an opportunistic study of miner equipment types, and identification of the heuristics that will be used to filter non-wallet2 transactions in this project. Mitchell is the President and Chief Scientist at Geometry Labs and will coordinate this CCS process. The lead engineer on this project will be Cheyenne Atapour, a blockchain research engineer with experience in developing performant systems. He has the experience with compiled languages such as C++ and Rust necessary to optimize the OSPEAD code. Geometry Labs is a blockchain and cryptography research & development team, whose specializations include: blockchain infrastructure, scientific tooling, analytics and observatories, and development of novel cryptographic mechanisms.
Hi everybody, thanks so much for the thoughtful comments and questions. I’ve been sharing brief updates on IRC for MRL meetings, and wanted to drop an update here as well.
Rucknium provided a toy function in R with corresponding tests, and Geometry Labs produced (at no cost) a little demo of the workflow for using C++ in an R setting. You can check it out here: https://github.com/geometry-labs/workflow_demo#readme
This week we plan to study the most severe OSPEAD bottlenecks, to sketch out the optimization & parallelization plans with enough detail to (1) put an accurate time estimate on the final scope, and (2) confirm that we have enough engineering time and bandwidth blocked out to execute the project before moving the CCS forward for funding.
If the final plan looks good in terms of scope and timing, I’ll update the proposal accordingly and add another comment. (I’ll also circle back and answer the questions from the comments above.)
Mitchell P. Krawiec-Thayer (aab78c21) at 11 Feb 02:04
Upload CCS proposal: computational-work-for-ospead.md
I love this idea, and support the CCS. I have known Anhdres through Monero-related collaborations since 2018, and we worked together on Mastering Monero, so I can attest to both the quality of the work and professionalism. Additionally, Anhdres has demonstrated excellent ability to illustrate Monero concepts very clearly, e.g. I believe this article is easily the most approachable explanation of Monero's accounts/wallets/etc structure: https://anhdres.medium.com/how-moneros-accounts-and-subaddresses-work-in-monerujo-4fa7df0a58e4
Thanks @anhdres, looking forward to the garden :)
Isthmus here, confirming that I have received from Rucknium the documents for the scientific review panel.