Right, I might add that running 20210302_1 up to block 1M and then switching over to the latest release works fine. The shard synced all the way up to the current block.
I am now running a new test with a full node to check all shards. I did a fresh install and am running the latest recommended script (How to setup your own node in a blink of an eye). It’s not done yet, but it’s not looking good so far: I have blocks with errors on multiple shards. I’ll make a post when all shards are done or stalled.
For those keeping score at home, I am still unable to get shards 0, 2, or 6 to sync fully on a validator. I see there’s a new tag, 20210622_1, which I’m trying now.
Well … at least whatever has been updated fixed the slow sync issue I’ve been (casually) observing for about a week. After an update last week (20210617_1?), one of my pNodes slowed to a crawl on beacon/shard syncing. The pNode was literally in the middle of a sync and saw its sync speed instantly drop by ~75%. It was only syncing ~250,000 blocks per day, if that.
Then whatever change was pushed yesterday broke all my other pNodes, similar to what Devenus observed.
The update today (20210622_1) has restored syncing at a reasonable rate again. The pNode that suddenly couldn’t sync more than ~250,000 blocks in a day is already up to blockheight ~450,000 in a few hours. Last week that took nearly two full days.
Hopefully the beacon chain syncs will be caught up by tomorrow and I’ll be syncing assigned shard chains thereafter.
So far today – one pNode has started resyncing from scratch … again. Another one has been stalled near the current blockheight for nearly an hour, and is now reporting offline in the Node Monitor. I expect it too will start resyncing from scratch – again – shortly. <SIGH>
update: Yep, the stalled one started over AGAIN.
So at least two nodes started a resync from 0 yesterday, synced up to the current blockheight, then inexplicably stalled near the current blockheight and have now started yet another resync from 0 in a ~24-hour period. RIP monthly ISP bandwidth cap.
I want to share my experience here. I stopped all of my Incognito Docker containers and followed the 3rd (infura account) and 4th (run.sh script) steps here (How to host a Virtual Node). My vNodes have run flawlessly (no stalls, no offline) for at least 3 days.
@Josh_Hamon @zes333 My pNodes finally resynced the beacon chain (third time’s the charm, I guess) and have started syncing Shard 0. The Shard 0 blockheight for each is currently above 900,000.
These two are each on 20210622_1:
@abduraman Didn’t need to make changes to scripts or config parameters (not that I could even if I wanted to – these are pNodes).
@Mike_Wagner I agree with you. Sharing my experience was not meant as an answer to the concerns about pNodes above. I wrote here since the topic title says “… Multiple vNodes”.
Yeah, I run multiple nodes on the same infura. I recently changed to a node.js script managing my validators, but I still have an old shell script that runs two nodes on the same machine. It looks a lot like your script, but yours is cleaner, because I didn’t think of looping through array keys instead of values.
i=0
for validator_key in "${validators[@]}"; do
  rpc_port=$(($first_rpc_port + i))
  node_port=$(($first_node_port + i))
  i=$((i+1))
  data_dir=${DATA_DIR}/node_$i
  echo "Starting inc_validator_$i container on $node_port (RPC $rpc_port)"
  set -x
  docker run --restart=always --net inc_net -p $node_port:$node_port -p $rpc_port:$rpc_port \
    -e NODE_PORT=$node_port -e RPC_PORT=$rpc_port -e BOOTNODE_IP=$bootnode \
    -e GETH_NAME=$geth_name -e GETH_PROTOCOL= -e GETH_PORT= -e FULLNODE= \
    -e MININGKEY=${validator_key} -e TESTNET=false -e LIMIT_FEE=1 \
    -v ${data_dir}:/data -d --name inc_validator_$i incognitochain/incognito-mainnet:${latest_tag}
  set +x
done
}
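As an aside on the array-keys point: here is a rough sketch of what looping over keys instead of values could look like, written as a dry run that only prints the container name and ports it would use. The key strings and base ports are placeholders I made up for illustration, and plan_nodes is a hypothetical helper, not part of anyone's actual script.

```shell
#!/usr/bin/env bash
# Dry run: derive container name and ports per validator without calling docker.
# The key strings and base ports below are placeholders, not real values.
validators=("key_a" "key_b")
first_rpc_port=9334
first_node_port=9433

plan_nodes() {
  local i n
  for i in "${!validators[@]}"; do   # iterate over array indices, not values
    n=$((i + 1))                     # containers are numbered from 1
    echo "inc_validator_${n} node=$((first_node_port + i)) rpc=$((first_rpc_port + i))"
  done
}

plan_nodes
```

Using the index directly removes the separate counter, so the port offset and the container number can't drift apart.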
Don’t forget to include the empty -e GETH_PROTOCOL= -e GETH_PORT= settings: if they aren’t set at all, the default values get added and you end up with http://https://mainnet.infura.io/v3/a1b2c3...:8545. It’s quite an ugly piece of code with no checks.
I forked this from @mesquka, so I’m not 100% sure, but I think it might have been intended as -it -d? Per a quick search:
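To illustrate why "set but empty" behaves differently from "not set at all", here is a small sketch. I haven't read the image's entrypoint, so build_geth_url is a made-up stand-in and the http / 8545 defaults are simply taken from the broken URL above; the only thing it demonstrates is the shell's own distinction between an unset variable and an empty one.

```shell
#!/usr/bin/env bash
# Illustration only: build_geth_url is a hypothetical stand-in,
# not the real entrypoint logic of the Incognito image.
build_geth_url() {
  # ${VAR-default} substitutes the default only when VAR is unset;
  # a variable that is set to the empty string stays empty.
  local proto="${GETH_PROTOCOL-http}"
  local port="${GETH_PORT-8545}"
  local url="${GETH_NAME}"
  if [ -n "$proto" ]; then url="${proto}://${url}"; fi
  if [ -n "$port" ]; then url="${url}:${port}"; fi
  echo "$url"
}

GETH_NAME="https://mainnet.infura.io/v3/a1b2c3..."

unset GETH_PROTOCOL GETH_PORT
build_geth_url   # defaults are applied around the full infura URL

GETH_PROTOCOL="" GETH_PORT=""
build_geth_url   # empty values suppress the defaults, URL is left alone
```

Passing -e GETH_PROTOCOL= to docker run sets the variable to an empty string inside the container, which corresponds to the second case.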
"
docker run -it -d --name container_name image_name bash
The above command will create a new container with the specified name from the specified docker image. The container name is optional.
The -i option means that it will be interactive mode (you can enter commands to it)
The -t option gives you a terminal (so that you can use it as if you used ssh to enter the container).
The -d option (daemon mode) keeps the container running in the background.
bash is the command it runs.
"
Though I think combining all three options into one flag isn’t an issue here.
I don’t have -e GETH_PROTOCOL= -e GETH_PORT= but will give that a try.
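For anyone wanting to check what a container was actually started with, something like the sketch below might help. env_state is a hypothetical helper of mine that reads KEY=VALUE lines and says whether each variable is unset, empty, or set; the container name in the usage comment is an assumption based on the naming in the script above.

```shell
#!/usr/bin/env bash
# Reads KEY=VALUE lines on stdin and reports, for each name given as an
# argument, whether it is absent, set but empty, or set to a value.
env_state() {
  local env_lines name state
  env_lines="$(cat)"
  for name in "$@"; do
    if printf '%s\n' "$env_lines" | grep -qxF "${name}="; then
      state="empty"            # line is exactly NAME=
    elif printf '%s\n' "$env_lines" | grep -q "^${name}="; then
      state="set"              # line is NAME=something
    else
      state="unset"            # no line for NAME at all
    fi
    echo "${name}: ${state}"
  done
}

# Usage against a container (the name inc_validator_1 is an assumption):
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' inc_validator_1 \
#     | env_state GETH_PROTOCOL GETH_PORT
printf 'GETH_NAME=example\nGETH_PROTOCOL=\n' | env_state GETH_PROTOCOL GETH_PORT
```

If both variables come back as "unset", the container was started without the empty -e settings.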
UPDATE: Still seeing if shard syncing will make it past the roadblock but with @fredlee’s change I’m already seeing calls to infura, so I’m hopeful.
UPDATE: On Shard0 I’m past the roadblock by adding in the code suggested above. This is using the forked script I mentioned above and the image from 06/26. Later today I’ll work on trying it with other nodes.