Shard 2 Stalling at 433077 on Multiple vNodes

I have more than one vNode stalling at the same place on Shard 2. Stopping the docker container and rerunning the script has no effect. No update was found, which I take to mean they are up to date. Is there something else I can do to be a good validator?

I’m using less than 50% of the disk, and RAM and CPU are both below 100% utilization.

Hi Josh,

Please stop your node and send us the log from where the error happened.
Also give us the Docker tag that you are running.
Have you tried wiping the data and resyncing?

Docker tag is 20210516_2 in all cases

vNode 1 error
2021-05-21 16:20:11.651 utils.go:111 [ERR] Syncker log : Insert block 433078 hash db4f7127444d9425f0789d7e4e6b63102b5c31f69c824d25af5491d2897874ea got error -1051: Instruction Hash Error 
 Expect instruction hash to be 5d809a2ce79370c64f06dc8bb8ddb82335367d447474d36790cfb13828478a1f but get 0000000000000000000000000000000000000000000000000000000000000000 at block 433078 hash db4f7127444d9425f0789d7e4e6b63102b5c31f69c824d25af5491d2897874ea
Instruction Hash Error
github.com/incognitochain/incognito-chain/blockchain.NewBlockChainError
	/Users/autonomous/projects/incognito-chain/blockchain/error.go:385
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).verifyPreProcessingShardBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:384
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).InsertShardBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:190
github.com/incognitochain/incognito-chain/blockchain.(*ShardChain).InsertBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardchain.go:258
github.com/incognitochain/incognito-chain/syncker.InsertBatchBlock
	/Users/autonomous/projects/incognito-chain/syncker/utils.go:105
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).streamFromPeer
	/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:300
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).syncShardProcess
	/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:204
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357, Committee of epoch [121VhftSAygpEJZ6i9jGk6fYNp3yQVGCTr4kLCN7J2M6gxZtg4882aNoKif7xYH9VYVNYfZ8BrLa9Wna2YonB9xen9aK6fBJwpF6wexC6WSrZnJpgvGkpNzrHvUQer8qzjZLvPs1mYBPfiFZQ9QD7Wf5aMGpqjWt8SkUZcfeqRRNzgQspHskBGkXV2EvceWbPYTGUHawCxn9i2KdYkPHETJNcPYQdK1v6a1gBUvCmeWesprkDhXF9Dnxi18gQS8KRdJoFTBKPFKJAphYUjZMUApM5xArDauw5ZCdzZiuEPsgnHLf1KHb28r2mrrsgjw4GwPomgGSdAkAd743DEz42rs65dsL1aaZyQPcVT9tTv6CCtP8iKQBQaeps58hWtZn38AKbZ1RGumBaedAD4qFaFgDASKZvREpYBSw2C345GsY9Vgx 
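To check whether other vNodes stall on the same block, the error line can be pulled out of each node's log automatically. Below is a minimal sketch, assuming the log format matches the excerpt above; the function name `find_stalls` and the `SAMPLE` list are hypothetical, and other node versions may word these lines differently.

```python
import re

# Patterns modeled only on the log excerpt in this thread (an assumption).
INSERT_ERR = re.compile(
    r"Insert block (?P<height>\d+) hash (?P<hash>[0-9a-f]{64}) "
    r"got error (?P<code>-?\d+): (?P<name>[^\n]*?)\s*$"
)
HASH_MISMATCH = re.compile(
    r"Expect instruction hash to be (?P<expected>[0-9a-f]{64}) "
    r"but get (?P<got>[0-9a-f]{64})"
)


def find_stalls(lines):
    """Return one dict per 'Insert block ... got error' line.

    If an 'Expect instruction hash' line follows, its expected and
    actual hashes are attached to the most recent entry.
    """
    stalls = []
    for line in lines:
        m = INSERT_ERR.search(line)
        if m:
            stalls.append({
                "height": int(m["height"]),
                "block_hash": m["hash"],
                "code": int(m["code"]),
                "name": m["name"],
            })
            continue
        m = HASH_MISMATCH.search(line)
        if m and stalls and "expected" not in stalls[-1]:
            stalls[-1]["expected"] = m["expected"]
            stalls[-1]["got"] = m["got"]
    return stalls


# Two lines copied from the vNode 1 log above.
BLOCK_HASH = "db4f7127444d9425f0789d7e4e6b63102b5c31f69c824d25af5491d2897874ea"
EXPECTED = "5d809a2ce79370c64f06dc8bb8ddb82335367d447474d36790cfb13828478a1f"
SAMPLE = [
    "2021-05-21 16:20:11.651 utils.go:111 [ERR] Syncker log : Insert block "
    f"433078 hash {BLOCK_HASH} got error -1051: Instruction Hash Error \n",
    f" Expect instruction hash to be {EXPECTED} but get {'0' * 64} "
    f"at block 433078 hash {BLOCK_HASH}\n",
]

for stall in find_stalls(SAMPLE):
    print(stall["height"], stall["code"], stall["name"])
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `find_stalls(open(path))`, where `path` is wherever your node writes its log.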

Couldn’t find a similar error for vNode 2

vNode 3: I’ll try wiping data0/…/shard2

vNode 3: started syncing and stalled again at the same spot.

Getting the exact same problem on that shard and block. I also tried wiping shard data, but it does not help. I am trying a wipe of the whole data folder now to see if it makes any difference.

2021-05-25 08:50:25.685 utils.go:111 [ERR] Syncker log : Insert block 433078 hash db4f7127444d9425f0789d7e4e6b63102b5c31f69c824d25af5491d2897874ea got error -1051: Instruction Hash Error

Expect instruction hash to be 5d809a2ce79370c64f06dc8bb8ddb82335367d447474d36790cfb13828478a1f but get 0000000000000000000000000000000000000000000000000000000000000000 at block 433078 hash db4f7127444d9425f0789d7e4e6b63102b5c31f69c824d25af5491d2897874ea
Instruction Hash Error
github.com/incognitochain/incognito-chain/blockchain.NewBlockChainError
/Users/autonomous/projects/incognito-chain/blockchain/error.go:385
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).verifyPreProcessingShardBlock
/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:384
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).InsertShardBlock
/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:190
github.com/incognitochain/incognito-chain/blockchain.(*ShardChain).InsertBlock
/Users/autonomous/projects/incognito-chain/blockchain/shardchain.go:258
github.com/incognitochain/incognito-chain/syncker.InsertBatchBlock
/Users/autonomous/projects/incognito-chain/syncker/utils.go:105
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).streamFromPeer
/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:300
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).syncShardProcess
/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:204
runtime.goexit

Removed all data. 12 hours later, sync is done on Beacon, and Shard 2 is up to 433077 again. Boom! Same error, on a clean sync.

2021-05-25 21:50:29.643 shardproducer.go:895 [ERR] BlockChain log: Build Request Action Error -1006: Build request action error%!!(MISSING)(EXTRA []interface {}=[]) -1007: Verify proof and parse receipt%!!(MISSING)(EXTRA []interface {}=[]) invalid character 'i' looking for beginning of value
Verify proof and parse receipt
github.com/incognitochain/incognito-chain/metadata.NewMetadataTxError
	/Users/autonomous/projects/incognito-chain/metadata/error.go:158
github.com/incognitochain/incognito-chain/metadata.(*IssuingETHRequest).verifyProofAndParseReceipt
	/Users/autonomous/projects/incognito-chain/metadata/issuingethrequest.go:198
github.com/incognitochain/incognito-chain/metadata.(*IssuingETHRequest).BuildReqActions
	/Users/autonomous/projects/incognito-chain/metadata/issuingethrequest.go:164
github.com/incognitochain/incognito-chain/blockchain.CreateShardInstructionsFromTransactionAndInstruction
	/Users/autonomous/projects/incognito-chain/blockchain/shardproducer.go:890
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).verifyPreProcessingShardBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:369
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).InsertShardBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:190
github.com/incognitochain/incognito-chain/blockchain.(*ShardChain).InsertBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardchain.go:258
github.com/incognitochain/incognito-chain/syncker.InsertBatchBlock
	/Users/autonomous/projects/incognito-chain/syncker/utils.go:105
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).streamFromPeer
	/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:300
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).syncShardProcess
	/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:204
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357
Build request action error
github.com/incognitochain/incognito-chain/metadata.NewMetadataTxError
	/Users/autonomous/projects/incognito-chain/metadata/error.go:158
github.com/incognitochain/incognito-chain/metadata.(*IssuingETHRequest).BuildReqActions
	/Users/autonomous/projects/incognito-chain/metadata/issuingethrequest.go:166
github.com/incognitochain/incognito-chain/blockchain.CreateShardInstructionsFromTransactionAndInstruction
	/Users/autonomous/projects/incognito-chain/blockchain/shardproducer.go:890
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).verifyPreProcessingShardBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:369
github.com/incognitochain/incognito-chain/blockchain.(*BlockChain).InsertShardBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardprocess.go:190
github.com/incognitochain/incognito-chain/blockchain.(*ShardChain).InsertBlock
	/Users/autonomous/projects/incognito-chain/blockchain/shardchain.go:258
github.com/incognitochain/incognito-chain/syncker.InsertBatchBlock
	/Users/autonomous/projects/incognito-chain/syncker/utils.go:105
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).streamFromPeer
	/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:300
github.com/incognitochain/incognito-chain/syncker.(*ShardSyncProcess).syncShardProcess
	/Users/autonomous/projects/incognito-chain/syncker/shardsyncprocess.go:204
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357
2021-05-25 21:50:29.643 shardchain.go:260 [ERR] BlockChain log: -1051: Instruction Hash Error 
 Expect instruction hash to be 5d809a2ce79370c64f06dc8bb8ddb82335367d447474d36790cfb13828478a1f but get 0000000000000000000000000000000000000000000000000000000000000000 at block 433078 hash db4f7127444d9425f0789d7e4e6b63102b5c31f69c824d25af5491d2897874ea
Instruction Hash Error

:frowning_face:

Update:
I tried rolling back a couple of versions and got the same error in 20210516_1, 20210514_2 and 20210514_1. Then I tried jumping all the way back to 20210302_1, and it moved past the block. So I switched back to 20210516_2 and continued syncing.
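The rollback results above can be written down and sorted to see between which two releases the behavior changed. This is only a sketch: it assumes the tags follow a date-plus-build-number pattern (`YYYYMMDD_N`), which is inferred from how they look, and the names `tag_key`, `RESULTS` and `first_failing` are hypothetical.

```python
def tag_key(tag):
    """Split a tag like '20210516_2' into (date, build) integers for sorting."""
    date, build = tag.split("_")
    return int(date), int(build)


# Results reported in this thread for block 433078.
# True = the node got past the block, False = it stalled with the error.
RESULTS = {
    "20210516_2": False,
    "20210516_1": False,
    "20210514_2": False,
    "20210514_1": False,
    "20210302_1": True,
}


def first_failing(results):
    """Return (last_good, first_bad) among the tested tags, oldest first.

    Either value is None if no such tag was tested. Untested releases
    between the two are not ruled out by this.
    """
    last_good = first_bad = None
    for tag in sorted(results, key=tag_key):
        if results[tag]:
            if first_bad is None:
                last_good = tag
        elif first_bad is None:
            first_bad = tag
    return last_good, first_bad


print(first_failing(RESULTS))
```

This only narrows the change down to the tested tags; any release between the two returned tags would still need to be tried to find the exact one.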

[Screenshot: 2021-05-26 at 00.32.19]

Update 2:
It did not last long. After a while it hit another block with the exact same error and stalled at 451807. This time I tried every single release back to 20210302_1 and they all gave the same error. So in short, from 20210313_3 onward, I keep getting the “invalid character 'i' looking for beginning of value” error. I have not looked into the code, so I am not sure whether the older versions simply let the error through or whether the error was introduced in March. I am going to try letting it sync up to the recent blocks before switching to the new version. We’ll see what happens.

I appreciate all of your investigation!!!

@Support do you have any additional information or troubleshooting suggestions?

Right, I might add that running 20210302_1 up to block 1M and then switching over to the latest release works fine. The shard synced all the way up to the current block.

I am now running a new test with a full node to check all shards. I have made a fresh install and am running the latest recommended script (How to setup your own node in a blink of an eye). It’s not done yet, but I can tell you it’s not looking good so far. I have blocks with errors on multiple shards. I’ll make a post when all shards are done or stalled.

For those keeping score at home, I am still unable to get shards 0, 2 or 6 to sync fully on a validator. I see there’s a new tag, 20210622_1, which I’m trying now.

Well … at least whatever has been updated fixed the slow sync issue I’ve been (casually) observing for about a week. After an update last week (20210617_1?), one of my pNodes slowed to a crawl on beacon/shard syncing. The pNode was literally in the middle of a sync and saw sync speed instantly drop by ~75%. It was only syncing ~250,000 blocks per day, if that.

Then whatever change was pushed yesterday broke all my other pNodes, similar to what Devenus observed.

The update today (20210622_1) has restored syncing at a reasonable rate again. The pNode that suddenly couldn’t sync more than ~250,000 blocks in a day is already up to blockheight ~450,000 in a few hours. Last week that took nearly two full days.

Hopefully the beacon chain syncs will be caught up by tomorrow and I’ll be syncing assigned shard chains thereafter.

Or not.

So far today – one pNode has started resyncing from scratch … again. Another one has been stalled near the current blockheight for nearly an hour, and is now reporting offline in the Node Monitor. I expect it too will start resyncing from scratch – again – shortly. <SIGH>

update: Yep, the stalled one started over AGAIN.

So at least two nodes started a resync from 0 yesterday, synced up to the current blockheight, then inexplicably stalled near the current blockheight and have now started yet another resync from 0 in a ~24-hour period. RIP monthly ISP bandwidth cap.

On the new image, shard 0 stalled earlier than normal, at block 63902.

VERY glad that they didn’t implement slashing yet. Any thoughts @support?

:wave:t6:
You are not the only one :) My two vNodes also hung at block 63902.
:cold_sweat:

Hey all,

I want to share my experience here. I stopped all of my Incognito Docker containers and followed the 3rd (Infura account) and 4th (run.sh script) steps here (How to host a Virtual Node). My vNodes have run flawlessly (no stalls, no offline periods) for at least 3 days.

Btw, your run.sh may be wrong. Please fix it to match what is written here: How to host a Virtual Node

@Josh_Hamon @zes333 My pNodes finally resynced the beacon chain (third time’s the charm, I guess) and have started syncing Shard 0. The Shard 0 blockheight for each is currently above 900,000.

These two are each on 20210622_1: [images]

@abduraman Didn’t need to make changes to scripts or config parameters (not that I could even if I wanted to – these are pNodes).

¯\_(ツ)_/¯

@Mike_Wagner I agree with you. Sharing my experience was not meant as an answer to the concerns about pNodes above. I posted here because the topic title says “… Multiple vNodes”.

Unfortunately, I don’t use such a script. I’ve created another script for each node.

If only I knew why

Have you been able to fully sync shards 0, 2 & 6?

I am using an infura account, but only 3 calls have been made to it.

Per @fredlee that’s not required, but to confirm I’ve asked @rocky in his setup post.
