How we perform node upgrades before everyone else, without downtime
Recently we were asked how we manage to upgrade to new versions of the eosio/nodeos software without losing any uptime.
A good question, and one that we asked ourselves when we were setting up our infrastructure. Here’s what we said:
Current situation as non-rewarded backup BP
As an EOS Block Producer Candidate that currently receives no rewards (we hold less than 0.5% of the votes), we run a scaled-down infrastructure without load balancing. Still, we manage to upgrade very quickly and with very limited downtime, thanks to our mostly automated processes:
- Our backend servers continuously check whether a new eosio version is available. Once one is, it is built and stored in our Docker registry automatically. This way, we always have the latest build at hand.
- When we have determined that we can start deployment, we change the Docker image version for the affected nodes.
- We do a rolling restart of our nodes, which takes about 2 seconds per node. After that, the new version is live.
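The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual tooling: the function names, the plain `v1.2.3` tag format, and the `restart` / `is_healthy` callables are all hypothetical stand-ins for whatever polls the releases, talks to the Docker registry, and restarts a node.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn a tag such as 'v1.8.1' into (1, 8, 1) so tags compare numerically.

    Assumes purely numeric tags; suffixes like '-rc1' are not handled.
    """
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def newest_unbuilt(available: Iterable[str], built: Iterable[str]) -> Optional[str]:
    """Return the newest released tag that is not yet in the registry, or None."""
    built_set = set(built)
    candidates = [tag for tag in available if tag not in built_set]
    return max(candidates, key=parse_version, default=None)


def rolling_restart(
    nodes: Sequence[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Restart nodes one at a time and stop at the first unhealthy node.

    Restarting sequentially means at most one node is down at any moment.
    """
    restarted: List[str] = []
    for node in nodes:
        restart(node)  # hypothetical: e.g. recreate the container on the new image
        if not is_healthy(node):
            raise RuntimeError(f"{node} is unhealthy after restart; rollout stopped")
        restarted.append(node)
    return restarted
```

Stopping at the first unhealthy node is a design choice of this sketch: it keeps the remaining nodes on the old, known-good version if a new build misbehaves.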
So technically speaking, our uptime on a day when we update our nodes is not 100% but more like 99.99%. Uptime is polled every 5 minutes, so when a node is down for only a few seconds the outage often falls between two polls and the measured uptime stays at 100%.
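The arithmetic behind these figures can be checked with a short sketch. It assumes that the only downtime on an upgrade day is the roughly 2-second restart, and that the 5-minute polling is an uptime monitor sampling at a fixed interval whose phase is random relative to the restart; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_uptime_percent(downtime_seconds: float) -> float:
    """Uptime over one day, in percent, for a given total downtime."""
    return 100.0 * (1 - downtime_seconds / SECONDS_PER_DAY)


def downtime_budget_seconds(uptime_percent: float) -> float:
    """Seconds of downtime per day that a given uptime percentage allows."""
    return SECONDS_PER_DAY * (1 - uptime_percent / 100.0)


def chance_poll_sees_outage(downtime_seconds: float, poll_interval_seconds: float) -> float:
    """Probability that one periodic poll lands inside a single short outage."""
    return min(1.0, downtime_seconds / poll_interval_seconds)


print(round(daily_uptime_percent(2), 3))            # 99.998
print(round(downtime_budget_seconds(99.99), 2))     # 8.64
print(round(chance_poll_sees_outage(2, 300), 4))    # 0.0067
```

Under these assumptions a 2-second restart stays well inside the 8.64 seconds per day that 99.99% allows, and a 5-minute poll would notice it less than 1% of the time.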
An uptime of 99.99% is fair for a BP with limited resources. At this moment our priority is to get more votes. Once we have enough votes (0.5%) to fund our operations, we will raise our uptime further. If you like the idea, please vote for us (dutcheosxxxx) and let’s raise the standard!
There are actually many more reasons to vote for us: