Modularize Monero

Closed Thorsten Kaiser requested to merge DiosDelRayo/ccs-proposals:Modularize_Monero into master
1 unresolved thread

Modularize Monero

About

This proposal aims to modularize the Monero codebase, transforming it from its current monolithic structure into a more modular codebase that can be cherry-picked and taken apart for different use cases.

Objectives:

(Ordered by importance)

  1. 100% backward compatibility with the existing codebase. The source will be transformed so that a complete compile still yields the same outcome and functionality. Where necessary, wrapper code will be created to stay compatible.
  2. Separation of concerns. All functionality should be split into separate modules, so you can cherry-pick what to use and/or compile. E.g. if you build a hardware wallet, you don't need a huge part of the code, but at the moment it is very difficult to separate that part out. Maybe it even becomes possible to run a hardware wallet on a microcontroller, which at the moment looks out of scope: XmrSigner on a Pi Zero (ARMv6 @ 800 MHz, 512 MB RAM) needs more than 3 minutes to load or restore a wallet. But there is hope that separating everything into modules will reveal better ways to do signing and to handle outputs and key images.
  3. Minimize external dependencies; all cryptographic code should be included in the source tree. External dependencies often break things later: maybe the domain vanishes, or the author wants to make a statement and takes the project offline, etc. Cryptographic code almost never changes once established anyway, and the license normally allows including that source code in your own. To make sure that vulnerability fixes get picked up as soon as possible, the upstream source should be monitored where possible. Other dependencies like boost should be minimized as far as possible, and where a dependency cannot be removed, it should at least be possible to compile without it if the functionality it provides is not needed. All code that can be replaced with Qt6/C++ will be replaced, in the hope of eliminating things like epee.
  4. Improved readability and understandability. Since almost all of this code touches real-world value, a new developer should have an easier time understanding what is going on than by browsing through a source file with 15K LOC and then jumping through external helper functions or complicated constructs. Source should be as simple as possible, but not simpler.
  5. Flexible compilation. The source should be restructured so that you can cherry-pick which code to import into your project and easily update modules when they change. It should also be easy to generate a library containing only the modules/functionality you need.
  6. Modular architecture. Redesign the architecture so that modules can be exchanged easily. For example, to test a new blockchain back-end, simply inherit from the blockchain interface and write the new module using another database or protocol to retrieve the new blocks.
  7. Comprehensive documentation. Every developer loves up-to-date and complete documentation, and every developer hates writing it. During the transformation, both the current monolithic source and the new modular source should get top-notch documentation. Before transforming the source one has to understand it anyway, so about 70% of the documentation work is already done by then; what remains is to write it down and phrase it as well as possible. After the transformation, the differences and what the code does are also clear, so that is the best moment to write. This effort should benefit the current and the new implementation equally.
  8. Enhanced auditability. With a more broken-down, easier-to-read codebase and improved structure and documentation, both the current and the new source should be easier to audit by entities not yet familiar with it. (If you audit source you already know, you become blind to some issues simply because they are familiar.)
  9. Facilitate future translations. The modularized structure, with clear interfaces and comprehensive documentation, should in the end also enable future translations into other languages like Rust, Kotlin, Erlang or Python (the current monero-python library, for example, is almost only a wrapper for the wallet RPC).
  10. Cleaner code. Somewhat redundant with points 2, 3 and 4; the only addition is to adhere to modern C++ practices and clean-code principles.
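
The exchangeable back-end idea from point 6 could be sketched roughly as follows. This is only an illustration under assumptions: `Block`, `BlockchainBackend`, `InMemoryBackend` and `tip_hash` are hypothetical names that do not exist in the Monero source, and a real block carries far more than a height and a hash. The sketch assumes C++17 (for `std::optional`).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified block type (not Monero's real one).
struct Block {
    std::uint64_t height;
    std::string hash;
};

// Hypothetical interface: consumers depend only on this abstract class.
class BlockchainBackend {
public:
    virtual ~BlockchainBackend() = default;
    // Number of blocks currently stored.
    virtual std::uint64_t height() const = 0;
    // Block at height h, or std::nullopt if it is not stored.
    virtual std::optional<Block> block_at(std::uint64_t h) const = 0;
};

// One exchangeable module: keeps blocks in memory, e.g. for tests.
// Another module could implement the same interface on top of a
// different database or a network protocol.
class InMemoryBackend final : public BlockchainBackend {
public:
    void append(Block b) {
        // This sketch only accepts blocks in order, starting at height 0.
        assert(b.height == blocks_.size());
        blocks_.push_back(std::move(b));
    }
    std::uint64_t height() const override { return blocks_.size(); }
    std::optional<Block> block_at(std::uint64_t h) const override {
        if (h >= blocks_.size()) return std::nullopt;
        return blocks_[h];
    }
private:
    std::vector<Block> blocks_;
};

// Consumer code that works with any back-end implementing the interface.
std::string tip_hash(const BlockchainBackend& backend) {
    const std::uint64_t n = backend.height();
    if (n == 0) return {};  // empty chain: no tip yet
    const std::optional<Block> tip = backend.block_at(n - 1);
    return tip ? tip->hash : std::string{};
}
```

Because `tip_hash` takes only the interface, swapping the in-memory module for another back-end would not require touching the consumer code, which is the property point 6 is after.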

Non-objectives:

  1. Performance. Of course the code should be as performant as possible, but no objective will be sacrificed for performance: clarity above performance.
  2. Replacing the work of the original authors or diminishing its worth. Their work is highly respected, and cooperation is sought to tame the monster. It is only natural that the monster grows as long as you feed it, and it is logical that you feed the monster if you love it and it lives in your house. That this source has become such a monolithic beast was almost unavoidable. Of course it would be cool if one day this effort led to core development moving to the more modularized codebase, and for that I will also seek the current developers' opinions while walking the path, but the transformation is not done for that purpose. The current developers have light-years more experience and knowledge of that source, but (I assume) as usual no time to refactor the whole monster, even less so while actively working on that very source.
  3. Another language, or introducing new libraries or toolkits. The very purpose is less of everything with the same functionality. Sticking with Qt6/C++ is very intentional: first there needs to be modularization and comprehensive documentation, and only after that can one think about going a step further, IMO.

Challenges

  1. Transforming source that is under active development. Somewhat like performing surgery on a person who is out for a walk, while they are walking... But I hope to mitigate this challenge by working on modules one at a time and having something like a recipe from A to B, so as long as there are no huge changes it should not derail me.
  2. Qt6/C++, the newest C++ standards and best practices, since I'm a novice in C++. I have been developing for more than 20 years now, but during those 20 years I only touched C, once modified C++ code for a hack, and only recently started with Qt/C++. So, as usual, I need to get up to speed while doing. In the beginning it is always easier to modify source in a language you are not yet fluent in than to write from the ground up, until it flips and writing from the ground up becomes easier. I'm still in the phase where I will have to read some books in parallel.
  3. The time effort is impossible to estimate as long as the source is not completely understood, and even then it is hard. But I assume that as soon as the complete source is understood, almost all of the work is done as well.

Who

Me, Thor a.k.a. vThor a.k.a. DiosDelRayo. I'm about to finish the XmrSigner Monero Signer Resurrection and encountered various difficulties getting XmrSigner production ready. The main points: monero-python is almost a mere wrapper for the wallet RPC, and the Monero source is (almost) take all and digest, or take nothing. So finishing this proposal will directly benefit XmrSigner and will probably even lead to an XmrSigner NG that drops a lot of ballast, going directly bare metal and using not an interpreted language like Python but Qt/C++, Rust or Zig. I have been in the IT field for more than 20 years, from development (where I started) through network administration to secure communication systems. I have written code in a lot of different languages and always prefer to learn what is needed along the way instead of bending what I already know to fit the job. The hammer I know well, but it is not always the best tool to achieve everything.

Why it is important for the community

  • It would make developing new wallets, or even projects nobody has thought of yet, easier for new developers, and so attract more developers creating products for the Monero ecosystem.
  • It would improve audits of the Monero source.
  • Me getting familiar with the Monero source would also create the opportunity to help on that very source one day.
  • Possibility of finding hidden bugs, since I will need to read and understand the source code intensively.
  • Maybe the possibility to get parts of Monero working on even more restricted/limited hardware, like microcontrollers for hardware wallets or similar products.
  • Getting XmrSigner done well; at the moment its state is unsatisfying at best.

Milestones

Since it is impossible to predict or estimate anything while I'm not yet familiar with the whole source, I propose to work at least 130h/month for 45 XMR (I don't care about the exchange rate, because XMR should be the anchor) for 3 months, and will very probably write a new proposal to continue. Everything I work above 130h in one month shall carry over into the next month/milestone. I will give a biweekly report below the proposal in GitLab and commit and push the changes each day (normally I push only after a big chunk is finished, but here it makes sense to push daily for accountability reasons).

This results in 3 milestones of 130 hours for 45 XMR each, in total 390 hours for 135 XMR.

Project Timeline

The project timeline is open, since the time cannot really be estimated (at least for now; maybe it becomes easier after the third milestone). So the plan is simply to process as much as possible per day.

Edited by Thorsten Kaiser

Closed by Thorsten Kaiser 4 months ago (Aug 31, 2024 2:11pm UTC)

Merge details

  • The changes were not merged into master.

Activity

  • Thorsten Kaiser changed the description


  • Thorsten Kaiser changed the description


    • Separations of concerns. All functionalities should be separated modules to cherry pick what to use and/or compile. E.g. if you build an hardware wallet, you don't need a huge part of the code, but at the moment it is a very difficult part to separate it.

      For the integrator use-case, why can this not be fulfilled by third-party libraries such as monero-oxide (the soon-to-be rebrand of monero-serai)? monero-oxide is a collection of libraries, including one for primitives, one for each ZK proof, one for the various transaction types, one for addresses, one for wallet functionality, one for seeds, etc.

      Other dependencies like boost should be minimized as far as possible, and if not possible to get rid of the dependency at least it should be possible to compile without if the provided functionality is not needed. All code that can be substituted to Qt6/C++ will be substituted with the hope of eliminate things like epee.

      Why minimize boost? While it's incredibly large, I'd assume it's the best bet to minimize usage of epee. Also, why Qt6? I'd much prefer boost (a collection of libraries) to Qt6 (a collection of libraries culminating in a graphical stack, with a variety of associated dependencies, and various states of support across ecosystems). Also, boost isn't optional and Qt6 is.

      Improved readability and understandability. How almost all touching real world value, a new developer should have it easier to understand what is going on, then browsing through a source file with 15K LOC and then jump through external helper functions or complicated constructs. Source should be as simple as possible but not simpler.

      I completely support and agree with this goal. What examples of target files do you have besides wallet2, which has been a long-standing pain point and already has such efforts underway? Can you re-scope your PR to just the most egregious sections instead of the entire codebase?

      Flexible compilation. The source should be restructured in a way that you can cherry-pick which code to import into your project, have it easy to update modules when modules change. Also should it be easy to generate a library where you cherry-pick the needed modules/functionality.

      Modular architecture. Redesigning the architecture so that it is possible to exchange modules easily, like want to test a new blockchain back-end, simply inherit from the blockchain interface and write the new module using another database or protocol to retrieve the new blocks....

      These both seem a restatement of point 2.

      Comprehensive documentation.

      I'd argue this could be its own PR which should be done before this PR. It'd demonstrate understanding of the codebase and establish a track record.

      Enhanced auditability

      I agree with this goal and believe it would follow such an architectural shift.

      Facilitate future translations

      I don't believe this would be achieved. The existing API can be bound as could any new API. Modular code may minimize the size of the library linked, yet not the fact a library is built and linked.

      Cleaner Code

      This is either a byproduct of the other points (as somewhat conceded) or involves not just modularizing, yet rewriting Monero. I legitimately think this entire effort may be better scoped to a new Monero node implementation, as it almost proposes something at that scope, this PR seems quite opinionated, and then it doesn't involve the Monero project (in developers, project management, nor funding) until it's had enough work put in to be notable (demonstrating prior effort, competency, perseverance, etc). That commentary does omit funding for the work, at least at its start, so I don't want to encourage it as the best/sole option. I'm trying to point an option and encourage this be downscoped/focused.

      Another language, introducing new libraries or tool-kits. The very purpose is less of all with the same functionality. And sticking with Qt6/C++ is very intentional

      As I prior noted, Qt6 is optional.

      Challenges ... Qt6/C++ newest C++ standards and best practices how I'm a novice on C++.

      This is a massive effort, explicitly stating it's about following C++ best practices, yet you appear to admit you don't have the expertise here.

      It would make developing new wallets or even projects not heard about easier for new developer and so attracting more developers creating products for the Monero Ecosystem.

      I'll repeat my question why monero-oxide cannot serve this use case. I do not want to force my own work down the throats of others, yet monero-oxide (with the FCMP++ effort) will become part of the Monero project to some degree. Presumably, just as there's monero-clsag in the monero-oxide tree, there'll be monero-full-chain-membership-proofs, built to the standards and form of monero-oxide, which is the FCMP API used by Monero proper and a library within monero-oxide. While I never want to say we shouldn't have more options, especially across languages (monero-oxide being in Rust, though it'll likely be an inevitable dependency of any wallet code, even with the modularization proposed here), the argument to build this option must respond to why the existing options are insufficient.

      Me getting familiar with the monero source would create also the opportunity to help one day on that very source.

      This circles back to my note on potentially starting with a CCS for documentation/focused cleanup. I don't believe a proposal this large, with an author who doesn't have extensive experience with the codebase, has any chance of producing results close to the intent.

      Possibility of finding hidden bugs

      I think this happens with every CCS involving a developer for the Monero project and doesn't need to be explicitly stated.

    • First of all, as I have stated, I will close this MR. It seems you spent about an hour writing all this and didn't see that because you were writing here. I'm sorry that it cost you time, and that is also a reason to answer here even though I will close this MR. Thank you for your input 😄

      For the integrator use-case, why can this not be fulfilled by third-party libraries such as monero-oxide (the soon-to-be rebrand of monero-serai)? monero-oxide is a collection of libraries, including one for primitives, one for each ZK proof, one for the various transaction types, one for addresses, one for wallet functionality, one for seeds, etc.

      I had not heard of monero-oxide before, and I think I was not aware of monero-serai either, though I could not swear to it. I'm trying to remember whether I looked into Cuprate or monero-serai; I can only remember that it didn't accomplish what I needed at the time. What I meant is a feature-complete, modularized codebase with documentation. I never called it an SDK, but that is mostly what I imagine: taking the whole or just the blocks, without the need to go deep into the source to build something. In a perfect world you simply have a modular library and a manual, but it would also work out of the box as a complete wallet. I mean really covering the whole spectrum the Monero source covers.

      Why minimize boost? While it's incredibly large, I'd assume it's the best bet to minimize usage of epee. Also, why Qt6? I'd much prefer boost (a collection of libraries) to Qt6 (a collection of libraries culminating in a graphical stack, with a variety of associated dependencies, and various states of support across ecosystems). Also, boost isn't optional and Qt6 is.

      As I stated before, I was mostly seeing wallet2 through the lens of monero-gui and Feather while reading the code, and I wrongly assumed that the Monero source is also Qt/C++, where in fact it is C++ without Qt (if I see it right now). I had seen pretty complicated statements with epee and boost that I could do with Qt in just one line with two QString methods. But I was wrong in thinking that Monero uses Qt anyway; it seems to be relevant only to monero-gui. The thought behind it is simply to have the fewest dependencies in the source. And since the modularized code would be a completely new thing based on the Monero source, there would be no restriction on what is optional and what is not. But now I will have more time to think about it. As I said, I will close the MR, but the objective will not leave my mind unless I see something that accomplishes exactly what I wish existed.

      Improved readability and understandability. How almost all touching real world value, a new developer should have it easier to understand what is going on, then browsing through a source file with 15K LOC and then jump through external helper functions or complicated constructs. Source should be as simple as possible but not simpler.

      I completely support and agree with this goal. What examples of target files do you have besides wallet2, which has been a long-standing pain point and already has such efforts underway? Can you re-scope your PR to just the most egregious sections instead of the entire codebase?

      I have no examples, because while jumping between wallet2 and wallet_rpc_server_* to implement the encryption and decryption of the key images, figuring out the file format and how everything is related, I got dizzy. Again, I will close this MR, but I still think I will somehow take the whole codebase apart to modularize it; at the moment that should not be a concern. And thank you for supporting 😄

      These both seem a restatement of point 2.

      I think we have different views there. I mean, e.g., that the smallest part to pick is not "key images", but only "export key images" and only "import key images". But I have also not finished thinking through how to achieve that in the best way.

      Comprehensive documentation.

      I'd argue this could be its own PR which should be done before this PR. It'd demonstrate understanding of the codebase and establish a track record.

      I disagree on that. How would you document something without testing, reiterating and checking? Even more so when it is source from various authors and new to you at this size; again, 15K LOC in wallet2 alone, and from there jumping through other parts. Maybe it is me, but I could not write documentation without throwing a wrench into the gears, too. It would probably establish a track record, but I wonder who would be willing to go through that pain without any pleasure.

      I agree with this goal and believe it would follow such an architectural shift.

      Thank you for agreeing on that point, but I don't think you could call it an architectural shift, since it is not meant to replace the existing Monero core source but to make it easy to use parts of it.

      I don't believe this would be achieved. The existing API can be bound as could any new API. Modular code may minimize the size of the library linked, yet not the fact a library is built and linked.

      I did not mean that, but translating to other languages, like I translated polyseed from the C implementation to Python (not wrapping the native code, but implementing it in Python). That is a lot easier if you can translate parts instead of taking the whole chunk at once. At some point I would probably like to translate parts to Kotlin, Rust, Zig or even Erlang. But if you first need to deep-dive into the code for weeks, that is no fun at all...

      This is either a byproduct of the other points (as somewhat conceded) or involves not just modularizing, yet rewriting Monero. I legitimately think this entire effort may be better scoped to a new Monero node implementation,

      How would a node implementation achieve that? Is there not Cuprate already?

      Cleaner Code. Somehow a bit redundant to point 2, 3 and 4, only addition to adhere to modern C++ practices and clean code principles.

      as it almost proposes something at that scope, this PR seems quite opinionated,

      I hope it is not wrong to have an opinion (maybe I don't get it, since I'm not a native English speaker).

      and then it doesn't involve the Monero project (in developers, project management, nor funding) until it's had enough work put in to be notable (demonstrating prior effort, competency, perseverance, etc). That commentary does omit funding for the work, at least at its start, so I don't want to encourage it as the best/sole option. I'm trying to point an option and encourage this be downscoped/focused.

      First, it was never meant to attack anybody or talk badly about whoever wrote the code. If you reread it: "Somehow a bit redundant to point 2, 3 and 4" and "only addition to adhere to modern C++ practices and clean code principles". Additionally, it is the last point of the objectives, which are "(Ordered by importance)". Also, I don't understand the part "and then it doesn't involve the Monero project (in developers, project management, nor funding)". I could understand it if I proposed to change the Monero core source, but not for a modularization aimed mainly at third parties. If I gave that impression, it was never my intent.

      As I prior noted, Qt6 is optional.

      Yeah, I only got it after writing the proposal that the core is C++ without Qt, my bad. It is not that I'm clinging to Qt; I was only thinking, if one already uses Qt, what is all the other stuff like boost and epee for? I assumed those were only in there for historical reasons (I assumed, wrongly as I now see, that it was initially written in C++ or even C and then migrated over to Qt). I only got that impression because I was reading the monero-gui and Feather code to see how they achieve things, jumping between source files with some 20 tabs open in vim.

      Challenges ... Qt6/C++ newest C++ standards and best practices how I'm a novice on C++.

      This is a massive effort, explicitly stating it's about following C++ best practices, yet you appear to admit you don't have the expertise here.

      Of course I admit what is obvious: that I have no expertise here yet. I never had a reason to learn it before, because it was never the choice for any project I did. "Following C++ best practices" is stated as the lowest priority of all objectives, and I mention that it is an additional challenge. Should I instead wish the code were in C, Java, Python, or whatever? The source is C++, so I learn C++. I have lately had issues with my eyes and I'm not as young anymore as I would like to be, but I'm still able to read and learn. Will I ever become a C++ expert? No. A language is for me only a tool to achieve something. I take whatever tool is necessary.

      It would make developing new wallets or even projects not heard about easier for new developer and so attracting more developers creating products for the Monero Ecosystem.

      I'll repeat my question why monero-oxide cannot serve this use case.

      I personally don't think so. Maybe I have not seen it yet, but I don't see that I can quickly build a wallet with it, or quickly build an offline signing wallet, which I can first take as a whole and then sort out. Neither have I seen extensive documentation, or that it is meant to be feature complete. But I think I now see that the most important point is only in my head and not reflected in this proposal, and that I cause a lot of confusion with it.

      I do not want to force my own work down the throats of others,

      Don't want to comment on that at the moment ;)

      yet monero-oxide (with the FCMP++ effort) will become part of the Monero project to some degree. Presumably, just as there's monero-clsag in the monero-oxide tree, there'll be monero-full-chain-membership-proofs, built to the standards and form of monero-oxide, which is the FCMP API used by Monero proper and a library within monero-oxide. While I never want to say we shouldn't have more options, especially across languages (monero-oxide being in Rust, though it'll likely be an inevitable dependency of any wallet code, even with the modularization proposed here)

      The modularized source would of course adapt to stay current, and I can't say yet how I would handle it: pulling the rest of the source to Rust, or translating the Rust code to C++ (almost insane :D). Maybe it would also become obsolete, which I don't think, though that would be the best case. I wish exactly what I wanted to do in this proposal existed right now.

      , the argument to build this option must respond to why the existing options are insufficient.

      I think I mentioned that. It expects an app(lication) developer to get deeper into the source and into things he is not really interested in (at least while trying to achieve something). I should rather have called this proposal an SDK (I was not aware of it while writing the proposal), but that is also only an end goal that cannot be reached in 3 months either (at least not by me).

      This circles back to my note on potentially starting with a CCS for documentation/focused cleanup. I don't believe a proposal this large, with an author who doesn't have extensive experience with the codebase, has any chance of producing results close to the intent.

      Fair take. But with documentation only I would quit after two days at most. It would be like school: I can't learn anything without purpose, but with purpose I can learn everything on the way. The new plan is to build a library for an offline signing wallet with a C ABI and hope that I can repurpose it to come closer to the end goal :D

      I think this happens with every CCS involving a developer for the Monero project and doesn't need to be explicitly stated.

      Well, I was assuming that not many people will go through the whole source, analyzing and refactoring...

      Coming back to monero-oxide once more: if I had to choose between quickly learning Rust or C++, I would prefer Rust, but I think it is important to first understand the complete picture (so, feature complete). I can very well imagine filling the gaps in monero-oxide, but not right at this moment; getting proficient in C++ and Rust and understanding the Monero monster all at once would take more time per day than I can deliver. To do anything meaningful, I think I first need to swallow the monster entirely; then I can reimplement from the modularized source much more easily and without constraints (in a first transformation you still think similarly to the code you are working on; well, in reality I can only speak for myself). But mostly it is better to think out of the box, since every language somehow has its own patterns. I can't explain it well, but I hope you get what I mean.

      I thank you once more very much for your time and effort in writing a comment, it's very appreciated! :) And sorry that it was (at the moment) for nothing, as I will close this MR and write one focused on XmrSigner: an offline signing library with a C ABI. There I would already jump a bit more through the Monero source, and I hope I can reuse it later to come closer to something that makes onboarding easy for building new stuff.

  • I will close this MR, in favor of the following MR @see !495 (merged)

    Edited by Thorsten Kaiser