Draft: mj part time coding 2022-03
mj-part-time-2022-03.md
In the last 3 months, I made considerable progress on my [tsqsim tool](https://github.com/mj-xmr/tsqsim), which you can read about in my [latest dev report on Reddit](https://www.reddit.com/r/Monero/comments/sfs195/mjs_dev_report_2022_jan/). Briefly, I was able to deliver a minimalistic yet working version of the simulator for the Monero Researchers that allows for:
For most of my time, I'd like to continue working on tsqsim to make it more accessible. About once a week I'll proactively take a closer look at the open Pull Requests and review those where I'm a good match, as well as check that the Continuous Integration works fine. Otherwise I'm always available to the Team on request, whenever they find a fitting task for me before I discover it myself. This work model has already been practiced with great success, IMHO.
The tsqsim simulator is needed for [OSPEAD](https://ccs.getmonero.org/proposals/Rucknium-OSPEAD-Fortifying-Monero-Against-Statistical-Attack.html), but more uses [are already envisioned](https://github.com/monero-project/meta/issues/651) wherever any kind of prediction or outlier detection is needed. The simulator is written in a modular way that allows mixing various research branches at the same time.
- Rucknium's 2nd request was for weekly discrete time steps, in order to cancel out the strong intra-day seasonality. Currently I use the following periods: 1 min, 5 min, 15 min, 30 min, 1 h, 2 h, 4 h, 12 h, and finally 1 day. I will take the liberty of adding the monthly period at the same time. Effort: easy.
- The 1st difference transformation of the original series turned out to be essential for Monero, even though I thought it would be just a nice addition. For context: in TSA (Time Series Analysis), the 1st difference transformation is applied to remove the trend from the original series, so that standard prediction models (like ARIMA) can be used. Monero's transaction volume doesn't trend in the short term, but it does on higher time scales, where it clearly trends upward over time. Please see below how the TSA tool called the Autocorrelation Function interprets the original series:
Taking a look at the most important lags, namely the ones to the left, you can see that their autocorrelation (blue plot) is low, so rather random, and pretty close to being statistically insignificant (within the gray horizontal bands). OTOH, if we apply the 1st difference transformation to the same data, we are left with the following, more promising plot:
As you can see, even though both series appear stationary, the differenced one achieved a better score in this metric (-12.006 for the differenced series vs. -7.492 for the original one). On higher time scales, where the trends start to be significant, the discrepancy between the original and the differenced series becomes even more apparent.
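To make the autocorrelation comparison above more concrete, here is a minimal Python sketch. It is not tsqsim code. The function names, the synthetic white-noise input, and the use of the common approximate 95% band of 1.96/sqrt(N) are all assumptions made for illustration, and no Monero data is involved.

```python
import math
import random


def first_difference(series):
    """Return the changes between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 1..max_lag (hypothetical helper)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    sum_sq = sum(c * c for c in centered)  # lag-0 term used for normalisation
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / sum_sq
        for lag in range(1, max_lag + 1)
    ]


def significance_band(n):
    """Approximate 95% band for the autocorrelation of n white-noise samples."""
    return 1.96 / math.sqrt(n)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, synthetic data only
    original = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    differenced = first_difference(original)

    for name, data in (("original", original), ("differenced", differenced)):
        band = significance_band(len(data))
        for lag, value in enumerate(autocorrelation(data, 3), start=1):
            flag = "significant" if abs(value) > band else "within band"
            print(f"{name:12s} lag {lag}: {value:+.3f} ({flag})")
```

For pure white noise, theory gives the differenced series a lag-1 autocorrelation of -0.5, so the synthetic example only demonstrates the mechanics of the comparison. It says nothing about what the real transaction volume looks like.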
The feature is actually coded, but it still contains a small but nasty bug: when the prediction is reconstructed from the differenced domain (the differences, or changes, of the volume) back to the original domain (the volume itself), there are some discrepancies at the beginning of the reconstructed series that distort the further predictions. I have already isolated the problem via the corresponding unit tests: [Test 1](https://github.com/mj-xmr/tsqsim/blob/master/tests/test-tsqsim/src/TSXformImplTest.cpp#L291), [Test 2](https://github.com/mj-xmr/tsqsim/blob/master/tests/test-tsqsim/src/TSXformImplTest.cpp#L251), [Test 3](https://github.com/mj-xmr/tsqsim/blob/master/tests/test-tsqsim/src/TSXformImplTest.cpp#L268).
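The round trip between the two domains can be sketched in a few lines of Python. This is an illustration of the general technique only, not the tsqsim implementation and not a diagnosis of the bug. The helper names `difference` and `reconstruct` and the sample numbers are hypothetical.

```python
from itertools import accumulate


def difference(series):
    """Return the first value and the list of first differences."""
    return series[0], [b - a for a, b in zip(series, series[1:])]


def reconstruct(first_value, diffs):
    """Invert difference() by cumulative summation, starting at first_value."""
    return list(accumulate(diffs, initial=first_value))


if __name__ == "__main__":
    volume = [120, 135, 128, 150, 149]  # made-up numbers, not Monero data
    first, diffs = difference(volume)

    # With the correct starting value the round trip is exact.
    print(reconstruct(first, diffs) == volume)

    # With a wrong starting value, every later point is off by the same amount.
    print(reconstruct(0, diffs))
```

Because the reconstruction is a running sum, any error in the first elements carries through to every later value. That is why a discrepancy at the beginning of a reconstructed series matters for all further predictions.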