Compilation time reduction and housekeeping

Merged mj requested to merge mj/ccs-proposals:compil-time-reduction-and-housekeeping into master
+ 21
27
@@ -3,18 +3,14 @@ layout: fr
title: Compilation time reduction and housekeeping
author: mj
date: April 15, 2020
amount: 59.6 XMR
amount: 52.9 XMR
milestones:
- name: ccache for CMake (demo)
funds: 0 XMR
done:
status: finished
- name: Dynamic linkage
funds: 1.1 XMR
done:
status: unfinished
- name: CLang-9/10 compilation for advanced analysis
funds: 1.5 XMR
funds: 0.9 XMR
done:
status: unfinished
- name: Automated reports of ClangBuildAnalyser and iwyy
@@ -30,7 +26,7 @@ milestones:
done:
status: unfinished
- name: Moving boost headers out of own headers 1/3
funds: 10 XMR
funds: 5 XMR
done:
status: unfinished
- name: Moving boost headers out of own headers 2/3
@@ -114,7 +110,7 @@ It becomes obvious, that the wallet2.h is the largest "hot spot" of the whole pr
# Milestones
What can be done about this is to create as many wrappers around the boost library as possible and to move as much implementation code as possible into .cpp files, instead of insisting on writing it inline in spots that aren't bottlenecks. Making a trivial method inline may help, but only when it is called very frequently, and only if the gain is a large share of the total run time, which it usually isn't. Inlining has to be justified by profiling the software rather than being a default policy. Otherwise it brings nothing while costing a lot: the inlined code is recompiled in every .cpp file that includes it, and every change to the inlined implementation triggers those recompiles again.
I'd like to earn something like 50$/h. It's hard now to assess how much time it will take, so I'm not strict on the concrete values. If I happen to finish ahead of time, thus becoming overpaid, I will admit it. I will be writing the time of work in each of my PRs.
I'd like to earn something like 40$/h. It's hard now to assess how much time it will take, so I'm not strict on the concrete values. If I happen to finish ahead of time, thus becoming overpaid, I will admit it. I will be writing the time of work in each of my PRs.
For estimating the payments, I will assume an XMR price of 125$.
@@ -125,45 +121,43 @@ Previous text:
## Milestone 2: Dynamic linkage
Static libraries tend to grow in size exponentially and slow down the generation of the final binaries. I would like to enable opt-in dynamic linkage in CMake for development purposes. Also, once you are done writing a new test and only want to modify the production code and rerun the test, the test binary can be set up so that it doesn't have to relink when the production code's shared library changes.
This is quite a low-hanging fruit. There are 70 CMakeLists.txt files, so 2 minutes for each gives 2.33h; adding 0.5h (30 minutes) for creating a framework for further such changes gives 2.83h, which equals 1.13 XMR.
This is quite a low-hanging fruit. There are 70 CMakeLists.txt files, so 2 minutes for each gives 2.33h; adding 0.5h (30 minutes) for creating a framework for further such changes gives 2.83h, which equals 0.9 XMR.
## Milestone 3: CLang-9/10 compilation for advanced analysis
Monero currently can't be compiled with Clang. If it could, some advanced tools could be employed that help in dynamic assessment of the code quality from many perspectives. For my purpose, I could use [ClangBuildAnalyzer](https://github.com/aras-p/ClangBuildAnalyzer), which objectively measures which parts of the code take the longest to compile. There's also the Clang-based [Include What You Use](https://include-what-you-use.org/) tool, which not only advises how to optimize the bottlenecks, but also tries to do so automatically.
## Milestone 3: Automated reports of ClangBuildAnalyser and iwyy
Some advanced tools can be employed that help in dynamic assessment of the code quality from many perspectives. For my purpose, I could use [ClangBuildAnalyzer](https://github.com/aras-p/ClangBuildAnalyzer), which objectively measures which parts of the code take the longest to compile. There's also the Clang-based [Include What You Use](https://include-what-you-use.org/) tool, which not only advises how to optimize the bottlenecks, but also tries to do so automatically (however, it's better to use it just as a hint).
Last but not least, there are tools that help find dangerous constructs in the code which could lead to segmentation faults at runtime.
## Milestone 4: Automated reports of ClangBuildAnalyser and iwyy
I'd like to use my low-powered PCs to generate a daily report from the Clang tools and publish it to a dedicated webpage that I'd manage. I will of course contribute the scripts that generate the reports to the Monero source tree. Setting up the tools will take some time.
## Milestone 5: Automated reports of Valgrind (test bottlenecks)
## Milestone 4: Automated reports of Valgrind (test bottlenecks)
Similar to the above, but done weekly, since this will take more time. The context is somewhat different here, however: Valgrind can profile the tests while executing them and so catch new bottlenecks. I think it would be beneficial if such reports were available to the public, since generating them costs plenty of time.
This task is somewhat easier, but I'd just like to get compensated for the power costs on this one, so I think that 1 XMR should be fair.
## Milestone 6: Optional precompiled headers for latest cmake
## Milestone 5: Optional precompiled headers for latest cmake
A situation will surely occur where a boost header cannot be reasonably wrapped because it is used in template code. Such headers are best handled by precompiled headers, which can reduce the compilation time by up to 50% per precompiled header. CMake 3.16 is able to generate them natively. Since some users will still be using older versions of CMake, this has to remain optional. I will start with this one before moving the headers away, as it is a low-hanging fruit delivered by the CMake devs.
## Milestone 7: Moving boost headers out of own headers 1/3
## Milestone 6: Moving boost headers out of own headers 1/3
If the compilation is to become faster, all of the large 3rd party headers have to be moved out of our own headers, which prevents them from propagating into files that don't need them and would only waste time parsing them. This can be done via forward declarations and careful analysis of the dependency tree.
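The idea can be sketched in a single file as follows. This is only an illustration under assumptions: `heavy::Parser` stands in for a large third-party type, and `AddressBook`, `address_book.h` and `address_book.cpp` are hypothetical names, not parts of the Monero tree. Note that many boost types are templates with default arguments and cannot be safely forward-declared by hand, which is one reason why wrappers may be needed.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// ---- would live in the header (hypothetical address_book.h) ----
namespace heavy { class Parser; }  // forward declaration only, no heavy #include

class AddressBook {
public:
    AddressBook();
    ~AddressBook();  // defined in the .cpp, where heavy::Parser is complete
    std::size_t field_count(const std::string& line) const;
private:
    std::unique_ptr<heavy::Parser> parser_;  // a pointer to an incomplete type is fine
};

// ---- would live in the .cpp (hypothetical address_book.cpp) ----
// In real code this part would be an #include of the heavy header.
namespace heavy {
class Parser {
public:
    // Counts comma-separated fields; an empty line has none.
    std::size_t fields(const std::string& line) const {
        if (line.empty()) return 0;
        std::size_t n = 1;
        for (char c : line) {
            if (c == ',') ++n;
        }
        return n;
    }
};
}  // namespace heavy

AddressBook::AddressBook() : parser_(std::make_unique<heavy::Parser>()) {}
AddressBook::~AddressBook() = default;

std::size_t AddressBook::field_count(const std::string& line) const {
    return parser_->fields(line);
}
```

Files that include only the header part never parse the heavy definition, so a change inside it forces a recompile of one .cpp instead of every client.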
My header analysis shows that there are currently 369 occurrences of boost headers. Since each compilation costs 8.5 minutes and each change 2.5 minutes, we are at 11/60 * 369 = 67.65h of active work, excluding the time for testing and verifying the speed improvement (passive work). This leaves us with 27 XMR for the active work. Let's round it up to 30 because of uncertainty and required passive work, as well as power costs. This forces me to split the task into 3 parts for simplicity. But as before, if I'm done earlier than I calculated, I will admit this and will report the work time for each PR.
My header analysis shows that there are currently 369 occurrences of boost headers. Since each compilation costs 8.5 minutes and each change 2.5 minutes, we are at 11/60 * 369 = 67.65h of active work, excluding the time for testing and verifying the speed improvement (passive work). This leaves us with 21.6 XMR for the active work. Let's round it up to 25 because of uncertainty and required passive work, as well as power costs. This forces me to split the task into 3 parts for simplicity. But as before, if I'm done earlier than I calculated, I will admit this and will report the work time for each PR.
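The estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the figures stated in this proposal (8.5 and 2.5 minutes, 369 occurrences, 40$/h, 125$ per XMR); the function and constant names are made up for illustration.

```cpp
// Hours of active work: minutes per item, times the number of items.
constexpr double active_hours(double minutes_per_item, int items) {
    return minutes_per_item / 60.0 * items;
}

// Converts hours to XMR at a given hourly rate and XMR price (both in USD).
constexpr double hours_to_xmr(double hours, double usd_per_hour, double usd_per_xmr) {
    return hours * usd_per_hour / usd_per_xmr;
}

// 8.5 min compilation + 2.5 min change, for 369 boost header occurrences.
constexpr double boost_hours = active_hours(8.5 + 2.5, 369);

// 40$/h at an assumed XMR price of 125$.
constexpr double boost_xmr = hours_to_xmr(boost_hours, 40.0, 125.0);

static_assert(boost_hours > 67.64 && boost_hours < 67.66, "about 67.65h");
static_assert(boost_xmr > 21.64 && boost_xmr < 21.66, "about 21.6 XMR");
```

The same helper gives roughly 27 XMR at the earlier rate of 50$/h.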
## Milestone 8: Moving boost headers out of own headers 2/3
See milestone 7.
## Milestone 7: Moving boost headers out of own headers 2/3
See milestone 6.
## Milestone 9: Moving boost headers out of own headers 3/3
See milestone 7.
## Milestone 8: Moving boost headers out of own headers 3/3
See milestone 6.
## Milestone 10: Forward declarations of own classes + interfaces
## Milestone 9: Forward declarations of own classes + interfaces
It would help a lot if abstractions (interfaces) were used instead of concrete implementations. A client of the interface can then get by with forward declarations of the parts it doesn't use, and include only the parts it needs. This can be achieved quite easily by creating and returning a unique pointer to an implementation object from a static function of the interface.
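A minimal sketch of such an interface with a static factory, shown in one file for brevity. `IGreeter`, `Greeter` and the file names in the comments are hypothetical and only illustrate the pattern; they are not existing Monero classes.

```cpp
#include <memory>
#include <string>

// ---- would live in the interface header (hypothetical i_greeter.h) ----
class IGreeter {
public:
    virtual ~IGreeter() = default;
    virtual std::string greet(const std::string& name) const = 0;

    // Static factory: clients obtain an implementation without seeing its class.
    static std::unique_ptr<IGreeter> create();
};

// ---- would live in the implementation file (hypothetical greeter.cpp) ----
namespace {
// The concrete class is invisible outside this translation unit.
class Greeter final : public IGreeter {
public:
    std::string greet(const std::string& name) const override {
        return "hello, " + name;
    }
};
}  // namespace

std::unique_ptr<IGreeter> IGreeter::create() {
    return std::make_unique<Greeter>();
}
```

A client would call `IGreeter::create()` and include only the small interface header, so changes to `Greeter` recompile a single .cpp file.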
There are 358 .cpp files, and definitely more classes than that. If I were to start with the "hottest" 50 classes, to achieve the largest results at the beginning, I'd need 20 hours, assuming 15 minutes of active work on a class and 8.5 minutes of compilation time ((8.5+15)/60 * 50 = 19.58). This would equate to 7.8 XMR. Rounding up for the power costs, let's say 8 XMR.
There are 358 .cpp files, and definitely more classes than that. If I were to start with the "hottest" 50 classes, to achieve the largest results at the beginning, I'd need 20 hours, assuming 15 minutes of active work on a class and 8.5 minutes of compilation time ((8.5+15)/60 * 50 = 19.58). This would equate to 6.26 XMR. Rounding up for the power costs, let's say 7 XMR.
## Milestone 11: One class per header
## Milestone 10: One class per header
Declaring classes one per header also helps reduce the probability of having to recompile a large chunk of sources. The finer granularity also helps ccache reuse its cache.
Since this is quite a mechanical work, not needing ANY analysis, I'd say 2 XMR would be enough.
## Milestone 12: Parallel tests (ctest -jN)
## Milestone 11: Parallel tests (ctest -jN)
Did you know that ctest can run tests in parallel, just like make does? The problem is that tests which use the same resources during execution might affect each other (and in our case they do). The task here would be to group the tests that use the same resources and run each such group sequentially, while running the different groups in parallel.
I honestly haven't done any analysis on this one yet (other than proving that it doesn't work yet), as there are other things to do, that's why I'll just shoot in the dark here with 5 XMR, or could leave it for somebody else to do.
## Milestone 13: Static methods of the wallet2
## Milestone 12: Static methods of the wallet2
I'd like to address here the problems mentioned by Endogenic [highlighted at Konferenco](https://www.youtube.com/watch?v=AsJaMw-3gGE&feature=youtu.be&t=25614) (thanks, Scott Anecito!), namely making the wallet2 as stateless as possible. I propose here 4 XMR, as this is one of the largest classes in the whole project (if not the largest).
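To illustrate what "stateless" could mean here, a hedged sketch follows. The `Wallet` class, its member and both methods are simplified stand-ins invented for this example; they do not reflect the real wallet2 API.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in; not the real wallet2 class.
class Wallet {
public:
    // Before: the result depends on hidden member state.
    std::uint64_t fee_stateful(std::uint64_t bytes) const {
        return bytes * fee_per_byte_;
    }

    // After: a static method whose inputs are all explicit, so it can be
    // called and tested without constructing a wallet object.
    static constexpr std::uint64_t fee(std::uint64_t bytes, std::uint64_t fee_per_byte) {
        return bytes * fee_per_byte;
    }

private:
    std::uint64_t fee_per_byte_ = 2;  // made-up value for the example
};

static_assert(Wallet::fee(100, 2) == 200, "static version needs no instance");
```

The stateful method can then delegate to the static one, which keeps existing callers working while the logic becomes independent of the object.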
## Milestone 13: Proper ordering of headers (general last)
@@ -173,4 +167,4 @@ Shall we make it 3 XMR?
# Expiration date
1 Jan, 2025
\ No newline at end of file
1 Jan, 2025