Node Operator Bootstrapping Introduction & Guide 🤠

No errors; I checked it twice when running the script. Well, maybe I should try it one more time. Thanks.

@Jared I’ve been bootstrapping multiple vnodes over the last couple of days and noticed the same problem. The shard bootstraps failed 5 times in a row (each time on a different node). I’ve been copying data directly from one node to another to make it work.

Is there a specific shard that the issue is occurring on?

It should be noted that copying from node to node is actually the recommended approach rather than downloading from the server.

It was different shards each time. I’ll note which ones and report back.

I was copying shards from one node to another on the same server. It was the only way to keep them from getting slashed in time.


Copying that many files at once may return an "Argument list too long" error, so I recommend rsync as an alternative:
https://www.linuxfordevices.com/tutorials/linux/copy-large-number-of-files-terminal
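
For example, something like the following avoids building a huge argument list (a rough sketch; both paths are placeholders for your own vnode data directories, and the trailing slashes make rsync copy the directory contents rather than nesting the directory):

  # Copy one shard's data between vnode data directories without hitting the
  # shell's argument-list limit (both paths are placeholders).
  rsync -a --progress /path/to/source-vnode-data/shard-dir/ /path/to/target-vnode-data/shard-dir/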


Pruning was just run on the bootstrap server as of 12-12-2022. Feel free to re-pull shards as needed to take advantage of the smaller storage size.

Before pruning: (screenshot of the bootstrap server directory index, 2022-11-30)

After pruning: (screenshot of the bootstrap server directory index, 2022-12-12)

Pruning has freed up about 22 GB overall! :partying_face: :raised_hands:

*Amounts shown are compressed. The uncompressed size will be larger.


@Jared - afaik, after a node gets slashed, it can get picked for any shard. Assuming that’s still the case, how do I know which shard to bootstrap from?

Is this the preferred bootstrapping sequence post-slashing?

  1. Restart the node and re-stake.
  2. Wait till the node gets assigned a shard and starts syncing with that shard.
  3. Stop the docker instance, delete and bootstrap the selected shard’s data.
  4. After bootstrapping is complete, use option 88 to re-start the docker.

Or is there a more efficient way to bootstrap post-slashing?

Yes, that sequence looks correct. You can delete the previous shard’s data at any time since it will no longer be used (unless you’re lucky and get reassigned to the same shard).
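
For anyone scripting this, a minimal sketch of steps 3 and 4 could look like the following. The container name and data path are placeholders, and the exact bootstrap invocation depends on the script version you are running:

  # Stop the vnode's docker container (container name is an example).
  docker stop inc_mainnet_0
  # Remove the partially synced data for the newly assigned shard before bootstrapping it (path is a placeholder).
  rm -rf /path/to/vnode-data/assigned-shard-dir
  # Re-run the bootstrap script for the newly assigned shard (options vary by version).
  ./bootstrap.sh
  # Restart the node the way your setup normally does (option 88 in the setup script, per step 4).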

Sure, can’t we hard-link beacon chain data across multiple vnodes? I noticed that @J053’s script creates hard links for the selected shard only. I see no reason not to periodically hard-link the already-downloaded .ldb files from each vnode’s beacon/ directory. Won’t that be the biggest storage saver?

Yes, that is the recommended action to take to save space but only for node operators running more than 1 vNode on their server.

Also, bootstrapping and hard-linking are unrelated topics. You can’t hard-link files you don’t have yet.

Sure, understood. I didn’t mean to mix up the bootstrapping discussion with hard-linking of beacon/shard data.

Do we already have a script to hard-link just the common .ldb files from each vnode’s beacon directory? Afaict, J053’s hard-linking script doesn’t touch the beacon directory.

Also, I wouldn’t want to stop a container if the vnode is part of the committee. Is there a REST call I can add to a script to check whether a vnode is in a pending state?

Their script does beacon data as well. Beacon data is actually what you want to focus on as it will recover the most SSD space. I only run hard-linking with their script for my beacon data.
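
For anyone curious what the manual version looks like, here is a minimal sketch of hard-linking the common beacon .ldb files from a seed vnode into another vnode on the same filesystem. The paths are placeholders, and the maintained script adds safety checks (such as skipping nodes in committee) that this sketch does not:

  # Both directories must be on the same filesystem for hard links to work.
  SEED=/path/to/seed-vnode-data/beacon      # placeholder path
  TARGET=/path/to/other-vnode-data/beacon   # placeholder path

  # Replace each .ldb file that exists in both directories with a hard link
  # to the seed's copy, so the data is stored on disk only once.
  for f in "$SEED"/*.ldb; do
      name=$(basename "$f")
      if [ -f "$TARGET/$name" ]; then
          ln -f "$f" "$TARGET/$name"
      fi
  done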

The hard-linking script does all of this for you. It will check which nodes are currently in committee and skip over those.

Looks like the script didn’t hard-link the beacon data in my first attempt because none of the nodes were beacon-synced. I was using the default constant1.ts.

I finally hard-linked all vnodes’ beacon to the seed vnode on my VPS. Thanks for all the assistance, @jared!


@Jared - after bootstrapping a vnode’s shard and submitting a staking request, I notice that the vnode’s SyncState changes to “LATEST” after 1 epoch but the “Role” remains “SYNCING” for several epochs.

For example:

{
  "Status": "ONLINE",
  "Role": "SYNCING",
  "NextEventMsg": "wait 17 epoch",
  "CommitteeChain": "1",
  "SyncState": "LATEST",
  ...
}

Moreover, the number in “wait XYZ epoch” decreases much more slowly than once every 1.5 hours, which I believe is the latest epoch interval. Is that normal?

How long is a node expected to be in this state before it becomes PENDING?
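
On the earlier question about checking a node’s state from a script: assuming you already have a way to fetch status JSON like the example above (the exact endpoint depends on how you monitor your nodes), a small jq check can gate any maintenance step on the node’s Role. The "COMMITTEE" value below is an assumption based on the roles discussed in this thread:

  # status.json is assumed to hold JSON like the example above.
  ROLE=$(jq -r '.Role' status.json)
  if [ "$ROLE" = "COMMITTEE" ]; then
      echo "Node is in committee; skipping maintenance."
  else
      echo "Node role is $ROLE; safe to stop the container."
  fi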

Looks like the newly staked vnode is in the “1/2 pending cycle” wait period. However, I am wondering whether the wait period is still 25 epochs. I guess it should be:

(pending pool size / number of nodes taken out of committee role per epoch) / 2

Afaik, the number of nodes taken out of the committee role each epoch is 5. What’s the approximate pending pool size currently?
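
To make that guess concrete with a made-up pool size: if the pending pool held roughly 250 nodes and 5 nodes were swapped out of the committee each epoch, the formula above would give (250 / 5) / 2 = 25 epochs, which would line up with the 25-epoch figure mentioned earlier.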

The Sync Phase is used to ensure bad actors can’t quickly spin up multiple nodes all at the same time. Even if a node finishes syncing with the network and its status is LATEST, it will remain in the Sync Phase until that duration is complete.

You can read more about it here:

Currently, there are 4,697 nodes on the Incognito network. You can track this number in real time on explorer.incognito.org, and mainnet.incognito.org also shows helpful information such as the current epoch number.

There’s a typo in the path shown for the downloaded tar file.

Version 2.0 was just released. Users on v1.0 or v1.5 will need to delete bootstrap.sh and re-pull the newest version.

If you have any questions or comments, or run into any issues, please leave them below or feel free to PM me.


@Jared, FYI: it looks like the bootstrap script was updated to version 2.1, but when run, version 2.1 still reports version 2.0.


If anyone experiences a BEACON STALL after running @J053’s script or doing hard links manually, check out this script:
