Node Backup and Restore

Once you have your node up and running, it's time to prepare for disaster recovery. Should your machine ever have a catastrophic failure due to either hardware or software issues, or even a case of natural disaster, it's best to be prepared for such a situation by making a backup.

When running, a complete node installation along with the database can grow to be multiple gigabytes in size. Having to back up and restore such a large volume of data can be expensive, complicated and time-consuming. Luckily, there is a better way.

Instead of having to back up and restore everything, we need to back up only what is essential, that is, those files that cannot be reconstructed because they are unique to your node. For MetalGo node, unique files are those that identify your node on the network, in other words, files that define your NodeID.

Even if your node is a validator on the network and has multiple delegations on it, you don't need to worry about backing up anything else, because the validation and delegation transactions are also stored on the blockchain and will be restored during bootstrapping, along with the rest of the blockchain data.

The installation itself can be easily recreated by installing the node on a new machine, and all the remaining gigabytes of blockchain data can be easily recreated by the process of bootstrapping, which copies the data over from other network peers. However, if you would like to speed up the process, see the Database Backup and Restore section

NodeID

danger

If more than one running nodes share the same NodeID, the communications from other nodes in the Metal network to this NodeID will be random to one of these nodes. If this NodeID is of a validator, it will dramatically impact the uptime calculation of the validator which will very likely disqualify the validator from receiving the staking rewards. Please make sure only one node with the same NodeID run at one time.

NodeID is a unique identifier that differentiates your node from all the other peers on the network. It's a string formatted like NodeID-5mb46qkSBj81k9g9e4VFjGGSbaaSLFRzD. You can look up the technical background of how the NodeID is constructed here. In essence, NodeID is defined by three files:

staker.crt
staker.key
signer.key

In the default installation, they can be found in the working directory, specifically in ~/.metalgo/staking/. All we need to do to recreate the node on another machine is to run a new installation with those same two files.

caution

If you have users defined in the keystore of your node, then you need to back up and restore those as well. Keystore API has methods that can be used to export and import user keys. Note that Keystore API is used by developers only and not intended for use in production nodes. If you don't know what a keystore API is and have not used it, you don't need to worry about it.

Backup

To back up your node, we need to store staker.crt, staker.key and signer.keyfiles somewhere safe and private, preferably to a different computer, to your private storage in the cloud, a USB stick or similar. Storing them to a couple of different, secure locations increases the safety.

caution

If someone gets a hold of your staker files, they still cannot get to your funds, as they are controlled by the wallet private keys, not by the node. But, they could re-create your node somewhere else, and depending on the circumstances make you lose the staking rewards. So make sure your staker files are secure.

Let's get the staker files off the machine running the node.

From Local Node

If you're running the node locally, on your desktop computer, just navigate to where the files are and copy them somewhere safe.

On a default Linux installation, the path to them will be /home/USERNAME/.metalgo/staking/, where USERNAME needs to be replaced with the actual username running the node. Select and copy the files from there to a backup location. You don't need to stop the node to do that.

From Remote Node Using `scp`

scp is a 'secure copy' command line program, available built-in on Linux and MacOS computers. There is also a Windows version, pscp, as part of the PuTTY package. If using pscp, in the following commands replace each usage of scp with pscp -scp.

To copy the files from the node, you will need to be able to remotely log into the machine. You can use account password, but the secure and recommended way is to use the SSH keys. The procedure for acquiring and setting up SSH keys is highly dependent on your cloud provider and machine configuration.

When you have means of remote login into the machine, you can copy the files over with the following command:

scp -r ubuntu@PUBLICIP:/home/ubuntu/.metalgo/staking ~/metal_backup

This assumes the username on the machine is ubuntu, replace with correct username in both places if it is different. Also, replace PUBLICIP with the actual public IP of the machine. If scp doesn't automatically use your downloaded SSH key, you can point to it manually:

scp -i /path/to/the/key.pem -r ubuntu@PUBLICIP:/home/ubuntu/.metalgo/staking ~/metal_backup

Once executed, this command will create metal_backup directory in you home directory and place staker files in it. You need to store them somewhere safe.

Restore

To restore your node from a backup, we need to do the reverse: restore staker.key and staker.crt from the backup to the working directory of the node.

First, we need to do the usual installation of the node. This will create a new NodeID, which we need to replace. When the node is installed correctly, log into the machine where the node is running and stop it:

sudo systemctl stop metalgo

We're ready to restore the node.

To Local Node

If you're running the node locally, just copy the staker.key and staker.crt files from the backup location into the working directory, which on the default Linux installation will be /home/USERNAME/.metalgo/staking/. Replace USERNAME with the actual username used to run the node.

To Remote Node Using `scp`

Again, the process is just the reverse operation. Using scp we need to copy the staker.crt, staker.key and signer.key files from the backup location into the remote working directory. Assuming the backed up files are located in the directory where the above backup procedure placed them:

scp ~/metal_backup/staker.* ubuntu@PUBLICIP:/home/ubuntu/.metalgo/staking

Or if you need to specify the path to the SSH key:

scp -i /path/to/the/key.pem ~/metal_backup/staker.* ubuntu@PUBLICIP:/home/ubuntu/.metalgo/staking

And again, replace ubuntu with correct username if different, and PUBLICIP with the actual public IP of the machine running the node, as well as the path to the SSH key if used.

Restart the Node and Verify

Once the files have been replaced, log into the machine and start the node using:

sudo systemctl start metalgo

You can now check that the node is restored with the correct NodeID by issuing the getNodeID API call in the same console you ran the previous command:

curl -X POST --data '{
    "jsonrpc":"2.0",
    "id"     :1,
    "method" :"info.getNodeID"
}' -H 'content-type:application/json;' 127.0.0.1:9650/ext/info

You should see your original NodeID. Restore process is done.

Database

Normally, when starting a new node, you can just bootstrap from scratch. However, there are situations when you may prefer to reuse an existing database (ex: preserve keystore records, reduce sync time).

This tutorial will walk you through compressing your node's DB and moving it to another computer using zip and scp.

Database Backup

First, make sure to stop MetalGo, run:

sudo systemctl stop metalgo

danger

You must stop the Metal node before you back up the database otherwise data could become corrupted.

Once the node is stopped, you can zip the database directory to reduce the size of the backup and speed up the transfer using scp:

zip -r metal_db_backup.zip .metalgo/db

Note: It may take > 30 minutes to zip the node's DB.

Next, you can transfer the backup to another machine:

scp -r ubuntu@PUBLICIP:/home/ubuntu/metal_db_backup.zip ~/metal_db_backup.zip

scp -i /path/to/the/key.pem -r ubuntu@PUBLICIP:/home/ubuntu/metal_db_backup.zip ~/metal_db_backup.zip

Once executed, this command will create metal_db_backup.zip directory in you home directory.

Database Restore

This tutorial assumes you have already completed "Database Backup" and have a backup at ~/metal_db_backup.zip.

First, we need to do the usual installation of the node. When the node is installed correctly, log into the machine where the node is running and stop it:

sudo systemctl stop metalgo

danger

You must stop the Metal node before you restore the database otherwise data could become corrupted.

We're ready to restore the database. First, let's move the DB on the existing node (you can remove this old DB later if the restore was successful):

mv .metalgo/db .metalgo/db-old

Next, we'll unzip the backup we moved from another node (this will place the unzipped files in ~/.metalgo/db when the command is run in the home directory):

unzip metal_db_backup.zip

After the database has been restored on a new node, use this command to start the node:

sudo systemctl start metalgo

Node should now be running from the database on the new instance. To check that everything is in order and that node is not bootstrapping from scratch (which would indicate a problem), use:

sudo journalctl -u metalgo -f

The node should be catching up to the network and fetching a small number of blocks before resuming normal operation (all the ones produced from the time when the node was stopped before the backup).

Once the backup has been restored and is working as expected, the zip can be deleted:

rm metal_db_backup.zip

Database Direct Copy

You may be in a situation where you don't have enough disk space to create the archive containing the whole database, so you cannot complete the backup process as described previously.

In that case, you can still migrate your database to a new computer, by using a different approach: direct copy. Instead of creating the archive, moving the archive and unpacking it, we can do all of that on the fly.

To do so, you will need ssh access from the destination machine (where you want the database to end up) to the source machine (where the database currently is). Setting up ssh is the same as explained for scp earlier in the document.

Same as shown previously, you need to stop the node (on both machines):

sudo systemctl stop metalgo

danger

You must stop the Metal node before you back up the database otherwise data could become corrupted.

Then, on the destination machine, change to a directory where you would like to the put the database files, enter the following command:

ssh -i /path/to/the/key.pem ubuntu@PUBLICIP 'tar czf - .metalgo/db' | tar xvzf - -C .

Make sure to replace the correct path to the key, and correct IP of the source machine. This will compress the database, but instead of writing it to a file it will pipe it over ssh directly to destination machine, where it will be decompressed and written to disk. The process can take a long time, make sure it completes before continuing.

After copying is done, all you need to do now is move the database to the correct location on the destination machine. Assuming there is a default MetalGo node installation, we remove the old database and replace it with the new one:

rm -rf ~/.metalgo/db
mv db ~/.metalgo/db

You can now start the node on the destination machine:

sudo systemctl start metalgo

Node should now be running from the copied database. To check that everything is in order and that node is not bootstrapping from scratch (which would indicate a problem), use:

sudo journalctl -u metalgo -f

The node should be catching up to the network and fetching a small number of blocks before resuming normal operation (all the ones produced from the time when the node was stopped before the backup).

Summary

Essential part of securing your node is the backup that enables full and painless restoration of your node. Following this tutorial you can rest easy knowing that should you ever find yourself in a situation where you need to restore your node from scratch, you can easily and quickly do so.

Node Backup and Restore

NodeID​

Backup​

From Local Node​

From Remote Node Using scp​

Restore​

To Local Node​

To Remote Node Using scp​

Restart the Node and Verify​

Database​

Database Backup​

Database Restore​

Database Direct Copy​

Summary​

NodeID

Backup

From Local Node

From Remote Node Using `scp`

Restore

To Local Node

To Remote Node Using `scp`

Restart the Node and Verify

Database

Database Backup

Database Restore

Database Direct Copy

Summary