add proposal for monero kubernetes operator

Merged utxobr requested to merge cirocosta/ccs-proposals:monero-k8s-operator into master
layout: fr
title: Monero Kubernetes Operator
author: Ciro S. Costa (utxobr)
date: May 3, 2021
amount: 9.89
milestones:
  - name: Proof of concept
    funds: 0
    done: 02 May 2021
    status: finished
  - name: Prototype refactoring, installation improvements and docs
    funds: 2.47
    done:
    status: unfinished
  - name: Support anonymity networks
    funds: 3.71
    done:
    status: unfinished
  - name: Improve observability of nodes
    funds: 3.71
    done:
    status: unfinished
payouts:
  - date:
    amount:
  - date:
    amount:
  - date:
    amount:

Brief Intro

My name is Ciro S. Costa (https://github.com/cirocosta, https://twitter.com/utxobr). I'm currently a staff engineer and was previously a core contributor to https://concourse-ci.org.

Monero-wise, I've mostly focused on the networking side, having implemented the basics of the Levin handshake in Go (https://github.com/cirocosta/go-monero) with full support for the portable storage format. That lets me create some interesting reports on node distribution (see https://twitter.com/utxobr/status/1386458317405540360) by crawling the P2P network.

Problem

tl;dr: there's no good solution for running a large number of monero nodes

For those with more than a machine or two to run Monero nodes (or even miners), there's no good solution out there for keeping them up and running in an easy-to-upgrade fashion.

It's great that folks like Seth provide wonderful guides on how to run Monero nodes (see https://sethsimmons.me/guides/run-a-monero-node-advanced/), and that within the functional tests in the codebase we can tell how to run regtest, but none of that helps with running a larger-scale setup.

Proposal

tl;dr: extend the Kubernetes API via its common extension system to provide semantics that make it easy to deploy clusters of Monero nodes or miners. See the proof of concept at https://github.com/cirocosta/monero-operator

Kubernetes (see "what is kubernetes") provides a vendor-neutral API for expressing what the desired state should be. Behind the scenes, that state is then achieved (and maintained) by small programs whose whole job is to move the system from its current state to the desired one.
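
To make the "current state to desired state" idea concrete, here is a minimal sketch of one reconciliation pass. It is plain Python for illustration only and is not code from monero-operator (which is written in Go); the function names and the `nodes-<index>` naming scheme are assumptions.

```python
def reconcile(desired_replicas, running):
    """Compute the actions that move `running` towards the desired state.

    `running` is the set of replica names that currently exist. The
    `nodes-<index>` naming scheme is hypothetical.
    """
    wanted = {f"nodes-{i}" for i in range(desired_replicas)}
    creates = [("create", name) for name in sorted(wanted - running)]
    deletes = [("delete", name) for name in sorted(running - wanted)]
    return creates + deletes


def apply_actions(actions, running):
    """Apply actions to an in-memory set standing in for the cluster."""
    updated = set(running)
    for verb, name in actions:
        if verb == "create":
            updated.add(name)
        else:
            updated.discard(name)
    return updated


if __name__ == "__main__":
    state = {"nodes-0", "nodes-7"}   # one expected replica, one stray
    actions = reconcile(3, state)
    print(actions)
    state = apply_actions(actions, state)
    print(reconcile(3, state))       # empty list: nothing left to do
```

A real controller runs this kind of comparison every time the object or its children change, which is what lets it repair drift as well as roll out edits.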

Aside from being offered by pretty much every cloud provider (and many VPS offerings out there too) while remaining vendor-neutral, its API is open for extension, which we can leverage to provide functionality that it didn't have before.

By extending the Kubernetes API with Custom Resources, we can give users of those clusters new semantics that greatly simplify running, say, a few Monero nodes all configured the same way across different machines:

kind: MoneroNodeSet
apiVersion: utxo.com.br/v1alpha1
metadata:
  name: nodes
spec:
  replicas: 3
  hardAntiAffinity: true
  monerod:
    image: utxobr/monerod:v0.17.2.0     # if testing a release candidate,  then
    args:                               # just bump the image and the operator
      - --public                        # will take care of rolling out, preserving
      - --enable-dns-blocklist          # the data already synced.
      - --enforce-dns-checkpointing
      - --out-peers=1024
      - --in-peers=1024
      - --limit-rate=128000

which could be very useful for businesses like CakeWallet that run sets of full nodes (or literally anyone wanting to run highly-available monerod deployments). It can also be useful for folks doing research, like me, who want to roll out a regtest network with many peers:

kind: MoneroNetwork
apiVersion: utxo.com.br/v1alpha1
metadata:
  name: regtest
spec:
  replicas: 20

  template:
    spec:
      monerod:
        args:                           # each replica has these args
          - --regtest                   # plus `--add-exclusive-node`
          - --fixed-difficulty=1        # pointing just at the other
                                        # peers, forming a closed net

(^ which under the hood gets materialized in the form of monerod instances pointing one at each other, with volumes attached and everything you'd want for a real setup.)
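
As a rough sketch of what "pointing one at each other" could mean, the snippet below derives the monerod arguments for each replica of a closed network. The `<name>-<index>` peer names and the port value are assumptions made for illustration; only `--add-exclusive-node` and the base args come from the example above, and the operator's actual logic may differ.

```python
def replica_args(name, replicas, base_args, port=18080):
    """Build per-replica monerod args for a closed network.

    Each replica gets the shared `base_args` plus one
    `--add-exclusive-node` per *other* replica. Peer names and the
    default port are hypothetical.
    """
    peers = [f"{name}-{i}" for i in range(replicas)]
    result = {}
    for me in peers:
        exclusive = [
            f"--add-exclusive-node={peer}:{port}"
            for peer in peers
            if peer != me            # a node never lists itself
        ]
        result[me] = list(base_args) + exclusive
    return result


if __name__ == "__main__":
    args = replica_args("regtest", 3, ["--regtest", "--fixed-difficulty=1"])
    for replica, argv in args.items():
        print(replica, " ".join(argv))
```

With 20 replicas each node ends up with 19 exclusive peers, which is the kind of bookkeeping that is tedious by hand and easy for a controller.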

Naturally, we can do the same for miners. For instance, we can run 10 replicas of xmrig against a pool like so:

kind: MoneroMiningNodeSet
apiVersion: utxo.com.br/v1alpha1
metadata:
  name: miners
spec:
  replicas: 10
  hardAntiAffinity: true

  xmrig:
    args:
      - -o
      - cryptonote.social:5556
      - -u
      - 891B5keCnwXN14hA9FoAzGFtaWmcuLjTDT5aRTp65juBLkbNpEhLNfgcBn6aWdGuBqBnSThqMPsGRjWVQadCrhoAT6CnSL3.node-$(id)
      - --tls

and then, if we regret choosing that pool, all it takes is patching the object; under the hood, our extension to Kubernetes takes care of rolling the updates out.
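
One way to patch such an object is a JSON merge patch (RFC 7396), which the Kubernetes API accepts for custom resources. The sketch below implements the merge rules in plain Python to show what a pool change would look like; the args list is shortened and the replacement pool `pool.example.org:3333` is a made-up placeholder.

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7396) and return the patched copy."""
    if not isinstance(patch, dict):
        return patch                      # scalars and lists replace wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # null deletes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


miners = {
    "kind": "MoneroMiningNodeSet",
    "spec": {
        "replicas": 10,
        "hardAntiAffinity": True,
        "xmrig": {"args": ["-o", "cryptonote.social:5556", "--tls"]},
    },
}

# Hypothetical replacement pool; the full args list must be restated.
patch = {"spec": {"xmrig": {"args": ["-o", "pool.example.org:3333", "--tls"]}}}

if __name__ == "__main__":
    patched = merge_patch(miners, patch)
    print(patched["spec"]["xmrig"]["args"])
    print(patched["spec"]["replicas"])    # untouched fields survive
```

Note that a merge patch replaces lists as a whole, so changing one xmrig argument means sending the complete `args` list again.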

(aside: couple this with the horizontal pod autoscaler (HPA) and node autoscaling (for example the cluster autoscaler, if your provider supports it) and you don't even need to pre-provision the underlying machines: with proper resource reservations, asking for extra replicas triggers the creation of new machines).

The scope

I currently have a working proof of concept (https://github.com/cirocosta/monero-operator) that implements those three custom resources mentioned above (MoneroMiningNodeSet, MoneroNodeSet, and MoneroNetwork).

This CCS would cover:

  1. boosting the confidence in the codebase by providing more tests to cover edge cases glanced over while building the prototype, as well as improving installation and documentation as a whole

  2. adding support for Tor and I2P so that nodes and networks can be deployed on anonymity networks with a line or two in the yaml while still running the services with high availability

  3. improving the observability of the deployed monerod instances by introducing a sidecar that exposes monerod metrics to any OpenMetrics consumer (like Prometheus)
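
To illustrate the third point, the sketch below shows how a sidecar might turn numeric fields from a monerod `get_info`-style response into the text format that Prometheus scrapes. It is plain Python for illustration, not the operator's implementation; the function name is hypothetical and the `monero_info_` prefix simply mirrors the metric names quoted in the progress update further down this page.

```python
def render_gauges(info, prefix="monero_info_"):
    """Render numeric fields of `info` as Prometheus text-format gauges."""
    lines = []
    for key in sorted(info):
        value = info[key]
        if isinstance(value, bool):
            value = int(value)            # booleans become 0 or 1
        if not isinstance(value, (int, float)):
            continue                      # strings have no sample value
        lines.append(f"# TYPE {prefix}{key} gauge")
        lines.append(f"{prefix}{key} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A hand-written subset of fields; a real sidecar would fetch these
    # from monerod's RPC on every scrape.
    sample = {
        "height": 2373955,
        "synchronized": True,
        "tx_pool_size": 10,
        "status": "OK",
    }
    print(render_gauges(sample), end="")
```

Keeping the exporter in a sidecar means monerod itself stays unmodified while any scraper that understands the text format can collect the data.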

As a result, the community will end up with:

  • a Kubernetes extension that lets anyone deploy highly-available monerod (and miners) on any Kubernetes-enabled platform

  • a Go package that they can rely on for interacting with monerod

The structure, milestones, and price.

I'll be working on this in my personal time, a few hours a day on the side (with a few healthy breaks), until completion.

The proposal is structured to be paid along with the delivery of the three points above:

  1. confidence in the codebase + installation/doc guides: ~10Hr
  2. support for Tor and I2P for full nodes and whole networks: ~15Hr
  3. observability of monerod: ~15Hr

Assuming a rate of 61.75 USD/hr and an exchange rate of 250 USD/XMR (June 1st, 2021):

deliverable   hours   USD       XMR
1             10      $617.50   2.47
2             15      $927.50   3.71
3             15      $927.50   3.71
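
The figures above can be cross-checked with a few lines of Python. This is only a sanity check on the published numbers: it starts from the XMR amounts and the 250 USD/XMR price and derives the USD value and the implied hourly rate for each deliverable.

```python
from decimal import Decimal

XMR_USD = Decimal("250")

MILESTONES = [  # (deliverable, hours, xmr)
    (1, 10, Decimal("2.47")),
    (2, 15, Decimal("3.71")),
    (3, 15, Decimal("3.71")),
]


def breakdown(milestones, xmr_usd):
    """Return (deliverable, hours, usd, usd_per_hour) for each milestone."""
    rows = []
    for deliverable, hours, xmr in milestones:
        usd = xmr * xmr_usd
        rows.append((deliverable, hours, usd, usd / hours))
    return rows


def total_xmr(milestones):
    """Sum the XMR amounts across all milestones."""
    return sum((xmr for _, _, xmr in milestones), Decimal("0"))


if __name__ == "__main__":
    for deliverable, hours, usd, rate in breakdown(MILESTONES, XMR_USD):
        print(f"{deliverable}: {hours}h  {usd:.2f} USD  {rate:.2f} USD/hr")
    print("total XMR:", total_xmr(MILESTONES))
```

The total comes to 9.89 XMR, matching `amount` in the front matter. The implied rate for deliverables 2 and 3 is about 61.83 USD/hr rather than 61.75, because 15 hours at 61.75 USD is 3.705 XMR, which was rounded up to 3.71.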

Merged by luigi1111 (Jul 8, 2021 3:34pm UTC)

Activity

  • Hey, I know all of these words!

    I'm in an industry which leverages containers and Kubernetes extensively and have been operating k8s clusters for years. As Monero grows and more organizations start utilizing it they will definitely evaluate deploying nodes in some scalable manner. This proposal will essentially provide those orgs with a set of tools to deploy Monero nodes/miners in clusters and infrastructure they already have.

    Not super useful for average joe (unless you're running minikube/k3s/etc or cluster on cloud for something), but any org which needs that level of scale could absolutely use something like this instead of creating from scratch. Self-healing and auto-scaling nodes is a must have for enterprise deployments which we will be seeing more and more of as adoption increases.

    Edited by lza_menace
  • Contributor

    I support this initiative but I have 4 points:

    1. I think you have overestimated your hourly salary. Please compare it with other core devs, who declare about half of this.
    2. At the same time, it seems to me that you have underestimated the time you'll need to fully implement this, unless it's already near completion?
    3. What about future maintenance? Do you plan to have a separate CCS proposal just for maintenance? If yes, will it still be 100 $/h or less? Do you intend to fully document the feature, so that it can be taken up by others?
    4. Why does it need to be a Go package, when the whole project is consequently built in C++ and some Python? Another language (+ compiler) introduces additional burden of maintenance and will make some C++ devs scratch their heads for some time.
    Edited by mj
    • Author Contributor

      Hey @mj, sorry for the delay, thx for considering it, super valid points!

      • 1 I think you have overestimated your hourly salary. Please compare it with other core devs, who declare about half of this.

      That's indeed a fair point - my base for the 100/hr is essentially the going rate for a staff engineer in the area I work in, in the geography I'm based in.

      In all fairness, I was actually very surprised to see the average quote for the work that core devs have been putting in. Especially with the appreciation in value for XMR, I think that rate should go higher, and I'm sure the community wouldn't see that as a problem; after all, the core devs put in so much extremely valuable work.

      • 2 At the same time it seems to me, that you have underestimated the time, that you'll need to fully implement this, unless it's already near completion?

      That does indeed seem low, and there are two reasons for it:

      1. the majority of it has already been implemented in what I marked as "milestone 0 - Proof of concept": anyone wanting to try it out can already do so. The CCS is then to improve it with the additions mentioned in it.

      2. this is work that I do on a day-to-day basis, working side-by-side with core contributors to both Kubernetes and the machinery to extend it, so I'm fairly familiar with the details to make it work

      • 3 What about future maintenance? Do you plan to have a separate CCS proposal just for maintenance? If yes, will it still be 100 $/h or less? Do you intend to fully document the feature, so that it can be taken up by others?

      I'm so glad you asked about it! I didn't account for it at all (my bad!) but I'm super happy to see your concern regarding future maintainability.

      I think it'd be worth a separate CCS if there's a meaningful set of features/upgrades the community needs that would take more than say, 5hr/week of work. For small changes here and there, or keeping up with pull requests, I wouldn't say a CCS would be needed and I'd be willing to keep up with it.

      With regards to documentation: YES, 100% - that's why in the CCS the first fundable milestone is refactoring + documentation. Having worked in a large OSS project before, the last thing I'd want would be to create something for the community that others can't help / maintain.

      • 4 Why does it need to be a Go package, when the whole project is consequently built in C++ and some Python? Another language (+ compiler) introduces additional burden of maintenance and will make some C++ devs scratch their heads for some time.

      Good point! It's a pragmatic choice: Kubernetes being Go-based, the best-maintained libraries out there (i.e., the ones that Kubernetes itself uses) are in Go. For instance, despite nginx being C-based, nginx-ingress is in Go. Despite tikv being Rust-based, tikv-operator is in Go (same goes for many other projects).

      At the end of the day, the Kubernetes API boils down to a REST API (JSON for pretty much everything, except some protobuf for core types), and thus projects like kopf exist to provide the base functionality for Python folks. But that's very, very far from how most of the community does it: when it comes to integrating with Kubernetes, Go is the way.

      I think that if we were to expect greater contributions to it, going the Python way wouldn't really be the best approach as those with the knowledge in the area would certainly already be folks used to Go (given that all other projects in the area use it).
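
To illustrate the "it's a REST API with JSON" point, the sketch below builds the URL path and JSON body that a client in any language would send for a custom resource. The path layout is the standard one for namespaced custom resources; the plural name `moneronodesets`, the `default` namespace, and the helper names are assumptions for illustration.

```python
import json


def custom_resource_path(group, version, plural, namespace, name=None):
    """Build the REST path for a namespaced custom resource."""
    path = f"/apis/{group}/{version}/namespaces/{namespace}/{plural}"
    return f"{path}/{name}" if name else path


def node_set_body(name, replicas):
    """Serialize a minimal MoneroNodeSet object as JSON."""
    return json.dumps({
        "apiVersion": "utxo.com.br/v1alpha1",
        "kind": "MoneroNodeSet",
        "metadata": {"name": name},
        "spec": {"replicas": replicas},
    })


if __name__ == "__main__":
    # Creating an object is a POST to the collection path ...
    print("POST", custom_resource_path(
        "utxo.com.br", "v1alpha1", "moneronodesets", "default"))
    print(node_set_body("nodes", 3))
    # ... and reading it back is a GET on the named path.
    print("GET", custom_resource_path(
        "utxo.com.br", "v1alpha1", "moneronodesets", "default", "nodes"))
```

The wire format is language-neutral; what the Go ecosystem adds on top is the caching, watching, and code generation machinery that makes writing the controller itself much less work.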


      Please let me know if there's anything else you'd like me to clarify / something is still unclear. happy to help!

      Edited by utxobr
    • Contributor

      Ad. 1

      That's indeed a fair point - my base for the 100/hr is essentially the going rate for a staff engineer in the area I work in, in the geography I'm based in.

      OK, if the rates are so high there and you already bring a lot of experience, then let the Community be the final judge :)

      Ad. 4

      I guess it has to be Go then, and there are reasons behind it, for as long as the feature itself is concerned.

      All my questions are answered. Thanks :)

  • In all fairness, I was actually very surprised to see the average quote for the work that core devs have been putting in. Especially with the appreciation in value for XMR, I think that rate should go higher, and I'm sure the community wouldn't see that as a problem; after all, the core devs put in so much extremely valuable work.

    Yeah. IMO, the reason folks offer to do work at the lower end of the pay scale is that there is a general assumption that monero will increase in value, so it's best to get a CCS approved as soon as possible.

    well, that's what I would do if I put in a CCS.

  • utxobr changed the description

  • Author Contributor

    Hey folks!

    An update on the current progress:

    • bugs: 1) now working on kubernetes clusters that have nodes with hugepages enabled, 2) no more trouble w/ incoming connection problems due to source-nat'ing

    • tor: pretty much complete support - automatically generates the hidden service keys in a secret for you, brings up the service ... lets you rotate the credentials and have tor automatically restarted .. all good! (except the code, which is in "prototype mode")

    • i2p: no work started yet aside from trying i2p out; pretty interesting stuff though!

    • observability: now exposing all sorts of statistics for prometheus to collect, and starting to create some fancy (useful, I guess?) graphs. see some examples below

    [image: Screen_Shot_2021-06-01_at_6.50.11_PM]

    some of the metrics are distributions, so it reports in terms of "quantiles", like

    [image: Screen_Shot_2021-06-01_at_6.51.23_PM]

    so far, this is what's being exposed (ellipsis on those that are labelled):

    monero_bans 0
    monero_connections{country="AE"} 1
    ...
    monero_connections{country="US"} 28
    monero_connections{country="ZA"} 1
    monero_connections_livetime{quantile="0.25"} 407
    ...
    monero_connections_livetime_sum 112251
    monero_connections_livetime_count 94
    monero_fee_estimate 6664
    monero_height_divergence{quantile="0.25"} 0
    ...
    monero_height_divergence_sum 1.5950052e+07
    monero_height_divergence_count 94
    monero_info_alt_blocks_count 0
    monero_info_block_size_limit 600000
    monero_info_block_size_median 300000
    monero_info_busy_syncing 0
    monero_info_cumulative_difficulty 1.0994022525902091e+17
    monero_info_difficulty 2.88813345073e+11
    monero_info_free_space 6.52636684288e+11
    monero_info_grey_peerlist_size 4997
    monero_info_height 2.373955e+06
    monero_info_height_without_bootstrap 2.373955e+06
    monero_info_incoming_connections_count 62
    monero_info_mainnet 1
    monero_info_offline 0
    monero_info_outgoing_connections_count 32
    monero_info_rpc_connections_count 8
    monero_info_stagenet 0
    monero_info_start_time 1.622584598e+09
    monero_info_synchronized 1
    monero_info_target 120
    monero_info_target_height 0
    monero_info_testnet 0
    monero_info_tx_count 1.3970344e+07
    monero_info_tx_pool_size 10
    monero_info_untrusted 0
    monero_info_was_bootstrap_ever_used 0
    monero_info_white_peerlist_size 1000
    monero_last_block_header_block_size 155671
    monero_last_block_header_block_weight 155671
    monero_last_block_header_cumulative_difficulty 1.0994022525902091e+17
    monero_last_block_header_cumulative_difficulty_top64 0
    monero_last_block_header_depth 0
    monero_last_block_header_difficulty 2.90231769832e+11
    monero_last_block_header_difficulty_top64 0
    monero_last_block_header_height 2.373954e+06
    monero_last_block_header_long_term_weight 155671
    monero_last_block_header_major_version 14
    monero_last_block_header_minor_version 14
    monero_last_block_header_nonce 35685
    monero_last_block_header_num_txes 80
    monero_last_block_header_orphan_status 0
    monero_last_block_header_reward 1.032122444405e+12
    monero_last_block_header_timestamp 1.622588025e+09
    monero_last_block_txn_fee{quantile="0.25"} 9.71e+06
    monero_last_block_txn_fee{quantile="0.5"} 1.311e+07
    monero_last_block_txn_fee{quantile="0.75"} 1.548e+07
    monero_last_block_txn_fee{quantile="0.9"} 7.638e+07
    monero_last_block_txn_fee{quantile="0.95"} 3.2837e+08
    monero_last_block_txn_fee{quantile="0.99"} 1.310143e+10
    monero_last_block_txn_fee{quantile="1"} 1.652672e+10
    monero_last_block_txn_fee_sum 3.241189e+10
    monero_last_block_txn_fee_count 80
    monero_last_block_txn_size{quantile="0.25"} 2908
    monero_last_block_txn_size{quantile="0.5"} 3926
    monero_last_block_txn_size{quantile="0.75"} 3936
    monero_last_block_txn_size{quantile="0.9"} 3944
    monero_last_block_txn_size{quantile="0.95"} 4964
    monero_last_block_txn_size{quantile="0.99"} 7404
    monero_last_block_txn_size{quantile="1"} 9066
    monero_last_block_txn_size_sum 294204
    monero_last_block_txn_size_count 80
    monero_last_block_vin{quantile="0.25"} 1
    ...
    monero_last_block_vin_sum 133
    monero_last_block_vin_count 80
    monero_last_block_vout{quantile="0.25"} 2
    ...
    monero_last_block_vout_sum 188
    monero_last_block_vout_count 80
    monero_mempool_bytes_max 1970
    monero_mempool_bytes_med 1454
    monero_mempool_bytes_min 1451
    monero_mempool_bytes_total 15564
    monero_mempool_fee_total 4.8281e+08
    monero_mempool_histo_98pc 0
    monero_mempool_num_10m 0
    monero_mempool_num_double_spends 0
    monero_mempool_num_failing 0
    monero_mempool_num_not_relayed 0
    monero_mempool_oldest 1.622588025e+09
    monero_mempool_txs_total 10
    monero_net_total_in_bytes 1.04265075e+08
    monero_net_total_out_bytes 1.980992005e+09
    monero_peers_new{country="AE"} 2
    monero_peers_new{country="ZA"} 3
    ...
    monero_rpc_count{method="get_bans"} 239
    monero_rpc_count{method="get_base_fee_estimate"} 239
    monero_rpc_count{method="get_block"} 239
    monero_rpc_count{method="get_blocks"} 70
    monero_rpc_count{method="get_connections"} 717
    monero_rpc_count{method="get_height"} 10
    monero_rpc_count{method="get_info"} 289
    monero_rpc_count{method="get_last_block_header"} 478
    monero_rpc_count{method="get_net_stats"} 239
    monero_rpc_count{method="get_peer_list"} 239
    monero_rpc_count{method="get_transaction_pool_hashes"} 70
    monero_rpc_count{method="get_transaction_pool_stats"} 239
    monero_rpc_count{method="get_transactions"} 291
    monero_rpc_count{method="get_version"} 1
    monero_rpc_count{method="getblockcount"} 239
    monero_rpc_count{method="rpc_access_tracking"} 177
    monero_rpc_time{method="get_bans"} 1.635975e+06
    monero_rpc_time{method="get_base_fee_estimate"} 3.5600999e+07
    monero_rpc_time{method="get_block"} 4.0346812e+07
    monero_rpc_time{method="get_blocks"} 2.732984e+06
    monero_rpc_time{method="get_connections"} 3.264220388e+09
    monero_rpc_time{method="get_height"} 198018
    monero_rpc_time{method="get_info"} 1.5722896366e+10
    monero_rpc_time{method="get_last_block_header"} 1.08118982e+08
    monero_rpc_time{method="get_net_stats"} 856480
    monero_rpc_time{method="get_peer_list"} 1.392681159e+09
    monero_rpc_time{method="get_transaction_pool_hashes"} 9.761534e+06
    monero_rpc_time{method="get_transaction_pool_stats"} 1.425162068e+10
    monero_rpc_time{method="get_transactions"} 3.855662833e+09
    monero_rpc_time{method="get_version"} 7904
    monero_rpc_time{method="getblockcount"} 2.100916e+06
    monero_rpc_time{method="rpc_access_tracking"} 3.746814e+06
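
As an example of what can be derived from the samples above, the sketch below parses lines in this text format and divides `monero_rpc_time` by `monero_rpc_count` to get a mean time per call for each RPC method. The parser handles only the simple `name{label="value"} number` shape shown here, and the unit of `monero_rpc_time` is not stated in this thread, so the result is in whatever unit the exporter uses.

```python
import re

SAMPLE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Parse simple exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE.match(line)
        if not match:
            continue                      # skips elided "..." lines
        name, labels, value = match.groups()
        samples.append((name, dict(LABEL.findall(labels or "")), float(value)))
    return samples


def mean_rpc_time(samples):
    """Return {method: total_time / call_count} for methods with calls."""
    counts = {l["method"]: v for n, l, v in samples if n == "monero_rpc_count"}
    times = {l["method"]: v for n, l, v in samples if n == "monero_rpc_time"}
    return {m: times[m] / counts[m] for m in counts if m in times and counts[m]}


if __name__ == "__main__":
    text = """
    monero_rpc_count{method="get_height"} 10
    monero_rpc_count{method="get_version"} 1
    ...
    monero_rpc_time{method="get_height"} 198018
    monero_rpc_time{method="get_version"} 7904
    """
    for method, mean in sorted(mean_rpc_time(parse(text)).items()):
        print(method, mean)
```

In a Prometheus setup the same ratio would normally be computed in a query over the two series rather than in client code.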

    also, "$ per hour sale!!!": updating the issue to keep the same XMR amounts at the current price of $250/XMR :sweat_smile: jokes aside, I'll keep you all posted.

    have a good one everyone!

    Edited by utxobr
  • utxobr changed the description

  • On the salary, I think monero contributors should expect to be paid what they would normally receive working anywhere else professionally. Not everyone can wait for price to appreciate. It's the community that decides whether the compensation is worth the work or if it's been overestimated. Contributing to monero should be competitive enough to attract contributors away from their IRL careers and come work for monero full/part time.

    just my 2 piconero

  • luigi1111 mentioned in commit 8d961edc

  • merged

  • Any updates on this? I'm totally out of the loop on this one (just watched the youtube video demo, great project)
