add proposal for monero kubernetes operator
layout: fr
title: Monero Kubernetes Operator
author: Ciro S. Costa (utxobr)
date: May 3, 2021
amount: 9.89
milestones:
- name: Proof of concept
funds: 0
done: 02 May 2021
status: finished
- name: Prototype refactoring, installation improvements and docs
funds: 2.47
done:
status: unfinished
- name: Support anonymity networks
funds: 3.71
done:
status: unfinished
- name: Improve observability of nodes
funds: 3.71
done:
status: unfinished
payouts:
- date:
amount:
- date:
amount:
- date:
amount:
Brief Intro
My name is Ciro S. Costa (https://github.com/cirocosta, https://twitter.com/utxobr). I'm currently a staff engineer, having previously been a core contributor to https://concourse-ci.org.
Monero-wise, I've been mostly focused on the networking side of it, having implemented the basics of Levin's handshake in Go (https://github.com/cirocosta/go-monero) with full support for the Portablestorage format, which lets me create some interesting reports on node distribution (see https://twitter.com/utxobr/status/1386458317405540360) by crawling the P2P network.
Problem
tl;dr: there's no good solution for running a large number of monero nodes
For those with more than a machine or two to run Monero nodes (or even miners), there's not a good solution out there for having those up and running in an easy to upgrade fashion.
It's great that folks like Seth provide wonderful guides on how to run Monero nodes (see https://sethsimmons.me/guides/run-a-monero-node-advanced/), and that within the functional tests in the codebase we can tell how to run regtest, but none of that helps with running a larger-scale setup.
Proposal
tl;dr: extend the Kubernetes API via its common extension system to provide semantics that make it easy to deploy clusters of monero nodes or miners. See the proof of concept at https://github.com/cirocosta/monero-operator
Kubernetes (see what is kubernetes) provides us with a vendor-neutral API for expressing what the desired state should be; behind the scenes, that state is achieved (and maintained) by small programs whose whole job is to go from the current state to the desired state.
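To make the "current state to desired state" idea concrete, here is a minimal sketch of what such a program does on each pass. It is not code from the operator: the `State` and `Action` types and the `reconcile` function are hypothetical names made up for illustration, and a real controller would talk to the Kubernetes API instead of returning strings.

```go
package main

import "fmt"

// State is a hypothetical, heavily simplified view of a workload:
// only the number of replicas.
type State struct {
	Replicas int
}

// Action describes one step a controller would take to converge.
type Action string

// reconcile compares the desired and current state and returns the
// actions needed to move current toward desired. It does not mutate
// its inputs, so it can be called repeatedly until nothing is left to do.
func reconcile(desired, current State) []Action {
	var actions []Action
	// Scale up: create the replicas that are missing.
	for i := current.Replicas; i < desired.Replicas; i++ {
		actions = append(actions, Action(fmt.Sprintf("create replica %d", i)))
	}
	// Scale down: delete the surplus replicas, highest index first.
	for i := current.Replicas - 1; i >= desired.Replicas; i-- {
		actions = append(actions, Action(fmt.Sprintf("delete replica %d", i)))
	}
	return actions
}

func main() {
	desired := State{Replicas: 3}
	current := State{Replicas: 1}
	for _, a := range reconcile(desired, current) {
		fmt.Println(a)
	}
}
```

When desired and current already match, `reconcile` returns no actions, which is the steady state a controller keeps checking for.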
Aside from being offered by pretty much every cloud provider (and many VPS offerings out there too) and still remaining not vendor-specific, its API is open for extension, which we can leverage to provide extra functionality that it didn't have before.
By extending the Kubernetes API via Custom Resources, we're able to provide new semantics for the users of those clusters, which greatly simplifies running, say, a few Monero nodes all configured the same across different machines:
kind: MoneroNodeSet
apiVersion: utxo.com.br/v1alpha1
metadata:
  name: nodes
spec:
  replicas: 3
  hardAntiAffinity: true
  monerod:
    image: utxobr/monerod:v0.17.2.0  # if testing a release candidate, then
    args:                            # just bump the image and the operator
      - --public                     # will take care of rolling out, preserving
      - --enable-dns-blocklist       # the data already synced.
      - --enforce-dns-checkpointing
      - --out-peers=1024
      - --in-peers=1024
      - --limit-rate=128000
which could be very useful for businesses like CakeWallet that run sets of full nodes (or literally anyone wanting to run highly-available monerod deployments), but it can also be useful for folks doing research like me, wanting to roll out a regtest network with many peers:
kind: MoneroNetwork
apiVersion: utxo.com.br/v1alpha1
metadata:
  name: regtest
spec:
  replicas: 20
  template:
    spec:
      monerod:
        args:                       # each replica has these args
          - --regtest               # plus `--add-exclusive-node`
          - --fixed-difficulty=1    # pointing just at the other
                                    # peers, forming a closed net
(^ which under the hood gets materialized in the form of monerod instances pointing at one another, with volumes attached and everything you'd want for a real setup.)
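As a rough sketch of how that closed network could be wired up (this is an illustration, not the operator's actual code), each replica gets the shared template args plus one `--add-exclusive-node` flag per other peer. The `peerArgs` function, the `regtest-N` host naming, and the use of port 18080 are all assumptions made for the example.

```go
package main

import "fmt"

// peerArgs returns the monerod arguments for replica i of an n-replica
// closed network: a copy of the shared template args followed by one
// --add-exclusive-node flag for every peer other than i.
// The "<name>-<index>" host naming is a hypothetical convention.
func peerArgs(name string, n, i int, template []string, port int) []string {
	// Copy the template so the caller's slice is never modified.
	args := append([]string{}, template...)
	for j := 0; j < n; j++ {
		if j == i {
			continue // a node does not list itself as a peer
		}
		args = append(args, fmt.Sprintf("--add-exclusive-node=%s-%d:%d", name, j, port))
	}
	return args
}

func main() {
	template := []string{"--regtest", "--fixed-difficulty=1"}
	// Print the arguments each of three replicas would be started with.
	for i := 0; i < 3; i++ {
		fmt.Println(i, peerArgs("regtest", 3, i, template, 18080))
	}
}
```

With 20 replicas, as in the manifest above, each instance would end up with 19 exclusive peers.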
Naturally, we can do the same for miners. For instance, we can run 10 replicas of xmrig against a pool like so:
kind: MoneroMiningNodeSet
apiVersion: utxo.com.br/v1alpha1
metadata:
  name: miners
spec:
  replicas: 10
  hardAntiAffinity: true
  xmrig:
    args:
      - -o
      - cryptonote.social:5556
      - -u
      - 891B5keCnwXN14hA9FoAzGFtaWmcuLjTDT5aRTp65juBLkbNpEhLNfgcBn6aWdGuBqBnSThqMPsGRjWVQadCrhoAT6CnSL3.node-$(id)
      - --tls
and then, if we regret choosing that pool, all it takes is patching the object, and under the hood our extension to Kubernetes takes care of rolling the updates out.
(aside: couple this with horizontal pod autoscaler (HPA) and you don't even need to pre-provision any underlying machines - if your provider supports HPA - as by making use of proper resource reservation, asking for extra replicas would trigger the creation of new machines).
The scope
I currently have a working proof of concept (https://github.com/cirocosta/monero-operator) that implements the three custom resources mentioned above (MoneroMiningNodeSet, MoneroNodeSet, and MoneroNetwork).
This CCS would cover:
- boosting the confidence in the codebase by providing more tests to cover edge cases glossed over while building the prototype, as well as improving installation and documentation as a whole
- adding support for Tor and I2P so that nodes and networks can be deployed on anonymity networks with a line or two in the yaml while still running the services with high availability
- improving the observability of the deployed monerod instances by introducing a sidecar that exposes monerod metrics to any OpenMetrics consumer (like Prometheus)
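For the observability point, a minimal sketch of what "exposing metrics" means on the wire: the Prometheus text exposition format is one `name value` sample per line, optionally preceded by a `# TYPE` comment. The `render` function below is a hypothetical helper written for this example, not part of the operator; the metric names in `main` are only sample inputs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// render formats a set of gauges in the Prometheus text exposition
// format: a "# TYPE <name> gauge" line followed by "<name> <value>".
// Names are sorted so that the output is stable between calls.
func render(gauges map[string]float64) string {
	names := make([]string, 0, len(gauges))
	for name := range gauges {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "# TYPE %s gauge\n%s %g\n", name, name, gauges[name])
	}
	return b.String()
}

func main() {
	// Sample values only; a real sidecar would read them from monerod's RPC.
	fmt.Print(render(map[string]float64{
		"monero_info_synchronized":               1,
		"monero_info_incoming_connections_count": 62,
	}))
}
```

A sidecar would typically serve this string over HTTP on a `/metrics` path so that a scraper can collect it periodically.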
As a result, the community will end up with:
- a Kubernetes extension that lets anyone deploy highly-available monerod (and miners) on any Kubernetes-enabled platform
- a Go package that they can rely on for interacting with monerod
The structure, milestones, and price.
Working on this during my personal hours, I plan to do the work a few hours a day on the side (with a few healthy periods of break) until completion.
The proposal is structured to be paid along with the delivery of the three points above:
- confidence in the codebase + installation/doc guides: ~10Hr
- support for Tor and I2P for full nodes and whole networks: ~15Hr
- observability of monerod: ~15Hr
Assuming a rate of 61.75$/hr and a current rate of 250 USD/xmr (June 1st, 2021):
deliverable | hours | usd | xmr |
---|---|---|---|
1 | 10 | $ 617.5 | XMR 2.47 |
2 | 15 | $ 927.5 | XMR 3.71 |
3 | 15 | $ 927.5 | XMR 3.71 |
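As a quick sanity check of the table, the USD column can be reproduced from the XMR amounts at the stated 250 USD/XMR, and the XMR amounts add up to the 9.89 requested in the header. The sketch below keeps everything in integers (XMR in hundredths, USD in cents) to avoid floating-point rounding; the type and function names are made up for the example.

```go
package main

import "fmt"

// usdPerXMR is the exchange rate stated in the proposal (250 USD/XMR).
const usdPerXMR = 250

// deliverable holds one row of the table. XMR is stored in hundredths
// (247 means 2.47 XMR) so that all arithmetic stays in integers.
type deliverable struct {
	hours         int
	xmrHundredths int
}

// usdCents converts an XMR amount in hundredths into USD cents:
// hundredths of XMR times USD per XMR gives hundredths of USD.
func usdCents(xmrHundredths int) int {
	return xmrHundredths * usdPerXMR
}

// totals sums the XMR (in hundredths) and USD (in cents) over all rows.
func totals(rows []deliverable) (xmr, cents int) {
	for _, r := range rows {
		xmr += r.xmrHundredths
		cents += usdCents(r.xmrHundredths)
	}
	return xmr, cents
}

func main() {
	rows := []deliverable{{10, 247}, {15, 371}, {15, 371}}
	for i, r := range rows {
		c := usdCents(r.xmrHundredths)
		fmt.Printf("%d | %d h | $%d.%02d | XMR %d.%02d\n",
			i+1, r.hours, c/100, c%100, r.xmrHundredths/100, r.xmrHundredths%100)
	}
	xmr, cents := totals(rows)
	fmt.Printf("total | $%d.%02d | XMR %d.%02d\n", cents/100, cents%100, xmr/100, xmr%100)
}
```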
Activity
- Edited by utxobr
Hey, I know all of these words!
I'm in an industry which leverages containers and Kubernetes extensively and have been operating k8s clusters for years. As Monero grows and more organizations start utilizing it they will definitely evaluate deploying nodes in some scalable manner. This proposal will essentially provide those orgs with a set of tools to deploy Monero nodes/miners in clusters and infrastructure they already have.
Not super useful for average joe (unless you're running minikube/k3s/etc or cluster on cloud for something), but any org which needs that level of scale could absolutely use something like this instead of creating from scratch. Self-healing and auto-scaling nodes is a must have for enterprise deployments which we will be seeing more and more of as adoption increases.
Edited by lza_menace
I support this initiative but I have 4 points:
- I think you have overestimated your hourly salary. Please compare it with other core devs, who declare about half of this.
- At the same time it seems to me, that you have underestimated the time, that you'll need to fully implement this, unless it's already near completion?
- What about future maintenance? Do you plan to have a separate CCS proposal just for maintenance? If yes, will it still be 100 $/h or less? Do you intend to fully document the feature, so that it can be taken up by others?
- Why does it need to be a Go package, when the whole project is consequently built in C++ and some Python? Another language (+ compiler) introduces additional burden of maintenance and will make some C++ devs scratch their heads for some time.
Edited by mj
Hey @mj, sorry for the delay, thx for considering it, super valid points!
-
1 I think you have overestimated your hourly salary. Please compare it with other core devs, who declare about half of this.
That's indeed a fair point - my base for the 100/hr is essentially what the rate is for a staff engineer in the area that I work in, in the geography I'm based in.
In all fairness, I was actually very surprised to see the average quote for the work that core devs have been putting in - especially with the appreciation in value for XMR, I think that rate should go higher, and I'm sure the community wouldn't see that as a problem; after all, the core devs put in so much extremely valuable work.
-
2 At the same time it seems to me, that you have underestimated the time, that you'll need to fully implement this, unless it's already near completion?
That indeed seems low, and there are two reasons for it:
- the majority of it has already been implemented in what I marked as "milestone 0 - Proof of concept": anyone wanting to try it out can already do so. The CCS is then to improve it with the additions mentioned in it.
- this is work that I do on a day-to-day basis, working side-by-side with core contributors to both Kubernetes and the machinery to extend it, so I'm fairly familiar with the details to make it work
-
3 What about future maintenance? Do you plan to have a separate CCS proposal just for maintenance? If yes, will it still be 100 $/h or less? Do you intend to fully document the feature, so that it can be taken up by others?
I'm so glad you asked about it! I didn't account for it at all (my bad!) but I'm super happy to see your concern regarding future maintainability.
I think it'd be worth a separate CCS if there's a meaningful set of features/upgrades the community needs that would take more than say, 5hr/week of work. For small changes here and there, or keeping up with pull requests, I wouldn't say a CCS would be needed and I'd be willing to keep up with it.
With regards to documentation: YES, 100% - that's why in the CCS the first fundable milestone is refactoring + documentation. Having worked in a large OSS project before, the last thing I'd want would be to create something for the community that others can't help / maintain.
-
4 Why does it need to be a Go package, when the whole project is consequently built in C++ and some Python? Another language (+ compiler) introduces additional burden of maintenance and will make some C++ devs scratch their heads for some time.
Good point! It's a pragmatic choice - Kubernetes being Go-based, the best maintained libraries out there (i.e., the ones that Kubernetes itself uses) are in Go. For instance, despite nginx being C-based, nginx-ingress is in Go. Despite tikv being Rust-based, tikv-operator is in Go (same goes for many other projects).
At the end of the day, the Kubernetes API boils down to a REST API (json for pretty much everything, except some protobuf for core types), and thus projects like kopf exist to provide the base functionality for Python folks, but that's very very far from being the way that most of the community does it - when it comes to integrating with it, Go is the way.
I think that if we were to expect greater contributions to it, going the Python way wouldn't really be the best approach as those with the knowledge in the area would certainly already be folks used to Go (given that all other projects in the area use it).
Please let me know if there's anything else you'd like me to clarify / something is still unclear. Happy to help!
Edited by utxobr
Ad. 1
That's indeed a fair point - my base for the 100/hr is essentially what the rate is for a staff engineer in the area that I work in, in the geography I'm based in.
OK, if the rates are so high there and you already bring a lot of experience, then let the Community be the final judge :)
Ad. 4
I guess it has to be Go then, and there are reasons behind it, for as long as the feature itself is concerned.
All my questions are answered. Thanks :)
In all fairness, I was actually very surprised to see the average quote for the work that core devs have been putting in - especially with the appreciation in value for XMR, I think that rate should go higher, and I'm sure the community wouldn't see that as a problem; after all, the core devs put in so much extremely valuable work.
Yeah. IMO, I think the reason folks offer to do work for the lower end of the payscale is because there is a general assumption that monero will increase in value, so it's best to get a CCS approved as soon as possible.
well, that's what I would do if I put in a CCS.
Hey folks!
An update on the current progress:
- bugs: 1) now working on kubernetes clusters that have nodes with hugepages enabled, 2) no more trouble w/ incoming connection problems due to source-nat'ing
- tor: pretty much complete support - automatically generates the hidden service keys in a secret for you, brings up the service ... lets you rotate the credentials and have tor automatically restarted .. all good! (except the code, which is in "prototype mode")
- i2p: no work started yet aside from trying i2p out; pretty interesting stuff though!
- observability: now exposing all sorts of statistics for prometheus to collect, and starting to create some fancy (useful, I guess?) graphs. see below some examples
some of the metrics are distributions, so they are reported in terms of "quantiles".
so far, this is what's being exposed (ellipsis on those that are labelled):
monero_bans 0
monero_connections{country="AE"} 1
...
monero_connections{country="US"} 28
monero_connections{country="ZA"} 1
monero_connections_livetime{quantile="0.25"} 407
...
monero_connections_livetime_sum 112251
monero_connections_livetime_count 94
monero_fee_estimate 6664
monero_height_divergence{quantile="0.25"} 0
...
monero_height_divergence_sum 1.5950052e+07
monero_height_divergence_count 94
monero_info_alt_blocks_count 0
monero_info_block_size_limit 600000
monero_info_block_size_median 300000
monero_info_busy_syncing 0
monero_info_cumulative_difficulty 1.0994022525902091e+17
monero_info_difficulty 2.88813345073e+11
monero_info_free_space 6.52636684288e+11
monero_info_grey_peerlist_size 4997
monero_info_height 2.373955e+06
monero_info_height_without_bootstrap 2.373955e+06
monero_info_incoming_connections_count 62
monero_info_mainnet 1
monero_info_offline 0
monero_info_outgoing_connections_count 32
monero_info_rpc_connections_count 8
monero_info_stagenet 0
monero_info_start_time 1.622584598e+09
monero_info_synchronized 1
monero_info_target 120
monero_info_target_height 0
monero_info_testnet 0
monero_info_tx_count 1.3970344e+07
monero_info_tx_pool_size 10
monero_info_untrusted 0
monero_info_was_bootstrap_ever_used 0
monero_info_white_peerlist_size 1000
monero_last_block_header_block_size 155671
monero_last_block_header_block_weight 155671
monero_last_block_header_cumulative_difficulty 1.0994022525902091e+17
monero_last_block_header_cumulative_difficulty_top64 0
monero_last_block_header_depth 0
monero_last_block_header_difficulty 2.90231769832e+11
monero_last_block_header_difficulty_top64 0
monero_last_block_header_height 2.373954e+06
monero_last_block_header_long_term_weight 155671
monero_last_block_header_major_version 14
monero_last_block_header_minor_version 14
monero_last_block_header_nonce 35685
monero_last_block_header_num_txes 80
monero_last_block_header_orphan_status 0
monero_last_block_header_reward 1.032122444405e+12
monero_last_block_header_timestamp 1.622588025e+09
monero_last_block_txn_fee{quantile="0.25"} 9.71e+06
monero_last_block_txn_fee{quantile="0.5"} 1.311e+07
monero_last_block_txn_fee{quantile="0.75"} 1.548e+07
monero_last_block_txn_fee{quantile="0.9"} 7.638e+07
monero_last_block_txn_fee{quantile="0.95"} 3.2837e+08
monero_last_block_txn_fee{quantile="0.99"} 1.310143e+10
monero_last_block_txn_fee{quantile="1"} 1.652672e+10
monero_last_block_txn_fee_sum 3.241189e+10
monero_last_block_txn_fee_count 80
monero_last_block_txn_size{quantile="0.25"} 2908
monero_last_block_txn_size{quantile="0.5"} 3926
monero_last_block_txn_size{quantile="0.75"} 3936
monero_last_block_txn_size{quantile="0.9"} 3944
monero_last_block_txn_size{quantile="0.95"} 4964
monero_last_block_txn_size{quantile="0.99"} 7404
monero_last_block_txn_size{quantile="1"} 9066
monero_last_block_txn_size_sum 294204
monero_last_block_txn_size_count 80
monero_last_block_vin{quantile="0.25"} 1
...
monero_last_block_vin_sum 133
monero_last_block_vin_count 80
monero_last_block_vout{quantile="0.25"} 2
...
monero_last_block_vout_sum 188
monero_last_block_vout_count 80
monero_mempool_bytes_max 1970
monero_mempool_bytes_med 1454
monero_mempool_bytes_min 1451
monero_mempool_bytes_total 15564
monero_mempool_fee_total 4.8281e+08
monero_mempool_histo_98pc 0
monero_mempool_num_10m 0
monero_mempool_num_double_spends 0
monero_mempool_num_failing 0
monero_mempool_num_not_relayed 0
monero_mempool_oldest 1.622588025e+09
monero_mempool_txs_total 10
monero_net_total_in_bytes 1.04265075e+08
monero_net_total_out_bytes 1.980992005e+09
monero_peers_new{country="AE"} 2
monero_peers_new{country="ZA"} 3
...
monero_rpc_count{method="get_bans"} 239
monero_rpc_count{method="get_base_fee_estimate"} 239
monero_rpc_count{method="get_block"} 239
monero_rpc_count{method="get_blocks"} 70
monero_rpc_count{method="get_connections"} 717
monero_rpc_count{method="get_height"} 10
monero_rpc_count{method="get_info"} 289
monero_rpc_count{method="get_last_block_header"} 478
monero_rpc_count{method="get_net_stats"} 239
monero_rpc_count{method="get_peer_list"} 239
monero_rpc_count{method="get_transaction_pool_hashes"} 70
monero_rpc_count{method="get_transaction_pool_stats"} 239
monero_rpc_count{method="get_transactions"} 291
monero_rpc_count{method="get_version"} 1
monero_rpc_count{method="getblockcount"} 239
monero_rpc_count{method="rpc_access_tracking"} 177
monero_rpc_time{method="get_bans"} 1.635975e+06
monero_rpc_time{method="get_base_fee_estimate"} 3.5600999e+07
monero_rpc_time{method="get_block"} 4.0346812e+07
monero_rpc_time{method="get_blocks"} 2.732984e+06
monero_rpc_time{method="get_connections"} 3.264220388e+09
monero_rpc_time{method="get_height"} 198018
monero_rpc_time{method="get_info"} 1.5722896366e+10
monero_rpc_time{method="get_last_block_header"} 1.08118982e+08
monero_rpc_time{method="get_net_stats"} 856480
monero_rpc_time{method="get_peer_list"} 1.392681159e+09
monero_rpc_time{method="get_transaction_pool_hashes"} 9.761534e+06
monero_rpc_time{method="get_transaction_pool_stats"} 1.425162068e+10
monero_rpc_time{method="get_transactions"} 3.855662833e+09
monero_rpc_time{method="get_version"} 7904
monero_rpc_time{method="getblockcount"} 2.100916e+06
monero_rpc_time{method="rpc_access_tracking"} 3.746814e+06
also, "$ per/hour sale!!!": updating the issue to keep the same XMR amount at the current rate of $250 per XMR
jokes aside, i'll keep you all posted. have a good one everyone!
Edited by utxobr
On the salary, I think monero contributors should expect to be paid what they would normally receive working anywhere else professionally. Not everyone can wait for price to appreciate. It's the community that decides whether the compensation is worth the work or if it's been overestimated. Contributing to monero should be competitive enough to attract contributors away from their IRL careers and come work for monero full/part time.
just my 2 piconero
mentioned in commit 8d961edc
Any updates on this? I'm totally out of the loop on this one (just watched the youtube video demo, great project)