Offline node w/ 49G log file, that doesn't seem right

One of my nodes showed as offline in the app, and I saw the disk was full and neither mainnet nor ethereum/client-go were running.

In the /data0 folder there were three log files:

  • “-2021-03-14.log”
  • “-2021-03-15.log”
  • “-2021-03-16.log”

3/14’s file is 18G, 3/15’s is 49G, and 3/16’s is 9.4G.

The 49G file is killing me; can I just delete it?

Update: I have updated to the newest tag, 20210320_2, but the node still shows as offline. Telnet also won’t connect to it. I have disabled the firewall, but that had no effect.

@khanhj could you please help check this issue?

Hi @Josh_Hamon,

Please stop all running containers and kill the run.sh process:

khanhlh@staking-khanhle:~$ sudo docker stop $(sudo docker ps -aq)
khanhlh@staking-khanhle:~$ ps -ef | grep run.sh
khanhlh  27285     1  0 09:27 pts/37   00:00:00 bash run.sh
khanhlh  27725 21884  0 09:28 pts/37   00:00:00 grep --color=auto run.sh
khanhlh@staking-khanhle:~$ kill -9 27285

Then you can delete the log files in the /data0 folder.

After that, start your run.sh again.
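If it helps, the same steps can be run as one small script. This is just a rough sketch, assuming the logs live in /data0 and your start script is run.sh in the current directory:

```
#!/bin/bash
# Stop every container so nothing keeps writing to the logs
sudo docker stop $(sudo docker ps -aq)

# Kill any lingering run.sh loop (-f matches the full command line)
sudo pkill -9 -f run.sh || true

# Remove the oversized daily logs; adjust the glob if your dates differ
sudo rm -f /data0/*-2021-03-*.log

# Start the node script again
bash run.sh
```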

Thanks. When I run ps -ef | grep run.sh I get back only one line:
root 7996 7978 0 14:12 pts/0 00:00:00 grep --color=auto run.sh

So I wasn’t able to kill the process. I am using the multi-vnode script, so I grepped for incognito.sh instead and got back 4 lines, but no killable processes.

I removed the logs and restarted docker anyway, but there was no change in the app or over telnet.

How quickly will the app show it’s fixed once we do? I imagine that every time I tap Power in the app it checks the nodes?

Hi @Josh_Hamon

Check docker ps to see whether your node is running.
Run tail -f data0/log_file_name.log to see if it is syncing beacon blocks properly.

The wallet app should show the status every time you tap Power.
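If you want to check all of your nodes at once, something like this works as a rough sketch (it assumes the default inc_mainnet_N container names and the data0/data1/data2 folders from the multi-node script):

```
#!/bin/bash
# Container status for the Incognito nodes
sudo docker ps --filter "name=inc_mainnet"

# Last few lines of the newest log in each data folder
for d in data0 data1 data2; do
  latest=$(ls -t "$d"/*.log 2>/dev/null | head -n 1)
  echo "=== $d: ${latest:-no logs} ==="
  [ -n "$latest" ] && tail -n 5 "$latest"
done
```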

Nodes are running per docker ps.

data0/log_files:
-2021-03-18.log -2021-03-19.log 207.31-2021-03-20.log 207.31-2021-03-21.log error.log

tail -f ./-2021-03-18.log

syncker: receive beacon block 1083226 
syncker: receive beacon block 1083226 
syncker: receive beacon block 1083226 
NumCPU 4
2021-03-18 09:18:05.040 incognito.go:90 [INF] Server log: Version 1.19.0-beta
2021-03-18 09:18:29.373 engine.go:150 [INF] Consensus log: CONSENSUS: NewConsensusEngine
encoded libp2p key: CAMSeTB3AgEBBCBcRdHZo0meGsLkcg5/TWixexdMZ2l7kSKLFjp93LSIqaAKBggqhkjOPQMBB6FEA0IABNmTVHIlfpSp7EmBgHK7+pBBbfp0d61IU3IbJIoqbf2NsRfAdz8nbepoNzub7nDWLq+QO1G5XRroqPAhBFaywg8=
2021-03-18 09:18:29.467 host.go:88 [INF] Peerv2 log: selfPeer: QmX3ffiGh7pmJ37QBtNNUnxce6ph3qGXtdZGmzkNaowmnZ 0.0.0.0 9433
tail -f ./-2021-03-19.log
2021-03-19 06:07:56.849 blockrequester.go:215 [INF] Peerv2 log: [stream] Requesting stream block type BlkBc, spec false, height [1074501..1085100] len 2, from 255 to 255, uuid = b2b91e71-49b0-4c70-a3d0-463ca7580da7
syncker: receive beacon block 1085102 
syncker: receive beacon block 1085102 
syncker: receive beacon block 1085102 
NumCPU 4
2021-03-19 06:08:26.375 incognito.go:90 [INF] Server log: Version 1.19.0-beta
2021-03-19 06:08:42.920 engine.go:150 [INF] Consensus log: CONSENSUS: NewConsensusEngine
encoded libp2p key: CAMSeTB3AgEBBCBMU92lQCzsdWm8180sebYtg+zUkJIhSCnN8TWMU8zvY6AKBggqhkjOPQMBB6FEA0IABH1lemI2EepbNtJMa61j1w1smHNoBRmqVbmAwSBVtUcEVVZV+sKAQmfaAwHHMd+Pd9al/LeVU0AzRix5rGQUwpo=
2021-03-19 06:08:43.035 host.go:88 [INF] Peerv2 log: selfPeer: QmeNXJuus3M6U9KRrXnjBAzDD3Ui1QPPBRV6gAQBkWGDv6 0.0.0.0 9433
tail -f ./207.31-2021-03-20.log
2021-03-20 03:19:46.151 incognito.go:96 [INF] Server log: Version 1.19.0-beta
2021-03-20 03:20:05.451 engine.go:174 [INF] Consensus log: CONSENSUS: NewConsensusEngine
encoded libp2p key: CAMSeTB3AgEBBCAg/YpwJ8x+MAmjQK4GoCgqa2KaYW/E4jiSEX8mI224z6AKBggqhkjOPQMBB6FEA0IABGiJyJEGM1u9LNILUk0pIMI0NkESPZa6qalpcbxZ7s+SfhENGhUOE3onK5XBOVB8fB8KpdiY3+ObB4pgcPsVjBU=
2021-03-20 03:20:05.826 host.go:88 [INF] Peerv2 log: selfPeer: Qmafu8Fw2oZ4E3oLkagYEaos72TmFch7qLtV1vdcNNvadJ 0.0.0.0 9433
NumCPU 4
2021-03-20 21:23:26.589 incognito.go:96 [INF] Server log: Version 1.19.0-beta
2021-03-20 21:23:44.134 engine.go:174 [INF] Consensus log: CONSENSUS: NewConsensusEngine
encoded libp2p key: CAMSeTB3AgEBBCA26Dg+k8R+Qo300VzW7zkN7b6EznuZgqFgBlSqaWi9UqAKBggqhkjOPQMBB6FEA0IABN0UDPCOxRubYBdYnuTjdgUn1dhxIRR+qvl5YdzsQAlm4+w8ps9zMruQlsXg2qSJWdvH6zT2aBruVTvJ2LQsa78=
2021-03-20 21:23:44.336 host.go:88 [INF] Peerv2 log: selfPeer: QmaSKtMD8UHm34VAJaLtNJYQmqMhQLYFM5prAUzyHtmzK7 0.0.0.0 9433
tail -f ./207.31-2021-03-21.log
2021-03-21 17:20:00.353 incognito.go:96 [INF] Server log: Version 1.19.0-beta
2021-03-21 17:20:20.639 engine.go:174 [INF] Consensus log: CONSENSUS: NewConsensusEngine
encoded libp2p key: CAMSeTB3AgEBBCAQikbOCtJVtOMT3ZZbVellBz2C++Tjy3yz/Rrq1PZyR6AKBggqhkjOPQMBB6FEA0IABAUHAWy3V9XrBOriW9rAeTqYGWpP10yTjpQjZ43gn18kqhIGrzRM0TfDgIGUX3ADVqfzHaY1wfXBnz2PcWhCjX8=
2021-03-21 17:20:20.743 host.go:88 [INF] Peerv2 log: selfPeer: QmT47VJBBfePXkLXf4uxVKcgobsoCvCqij2pMsk5Yqh4To 0.0.0.0 9433

tail -f ./error.log is blank

Ports 9334 and 9335 show as actively blocked per telnet, but 9336 connects. The Incognito app also shows the first two as offline and the third as ok.

ufw status
Status: active
Logging: on (high)
Default: deny (incoming), allow (outgoing), deny (routed)
New profiles: skip

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW IN    Anywhere                  
9334/tcp                   ALLOW IN    Anywhere                  
9335/tcp                   ALLOW IN    Anywhere                  
9336/tcp                   ALLOW IN    Anywhere                  
8545,9433/tcp              ALLOW IN    Anywhere                  
9434/tcp                   ALLOW IN    Anywhere                  
9435/tcp                   ALLOW IN    Anywhere                  
8545/tcp                   ALLOW IN    Anywhere                  
8546/tcp                   ALLOW IN    Anywhere                  
9433/tcp                   ALLOW IN    Anywhere                  
30303/udp                  ALLOW IN    Anywhere                  
22/tcp (v6)                ALLOW IN    Anywhere (v6)             
9334/tcp (v6)              ALLOW IN    Anywhere (v6)             
9335/tcp (v6)              ALLOW IN    Anywhere (v6)             
9336/tcp (v6)              ALLOW IN    Anywhere (v6)             
8545,9433/tcp (v6)         ALLOW IN    Anywhere (v6)             
9434/tcp (v6)              ALLOW IN    Anywhere (v6)             
9435/tcp (v6)              ALLOW IN    Anywhere (v6)             
8545/tcp (v6)              ALLOW IN    Anywhere (v6)             
8546/tcp (v6)              ALLOW IN    Anywhere (v6)             
9433/tcp (v6)              ALLOW IN    Anywhere (v6)             
30303/udp (v6)             ALLOW IN    Anywhere (v6)   
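For reference, this is roughly how I’m repeating the port test from another machine (a sketch using nc instead of telnet; NODE_IP is a placeholder for the node’s public address):

```
# NODE_IP is a placeholder; use your node's public address
NODE_IP=1.2.3.4
for port in 9334 9335 9336; do
  # -z: port scan only, -v: verbose, -w 5: five-second timeout
  nc -zv -w 5 "$NODE_IP" "$port" && echo "port $port open" || echo "port $port blocked"
done
```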

@Josh_Hamon,

Can you show me your run_incognito.sh file?
Somehow the output of docker ps is missing in your reply above.

If you run multiple validators on the same machine:

  • make sure your hardware has enough capacity
  • each container syncs and sends votes for its own validator, so each creates a separate data folder and logs
  • make sure you disable any firewall rules in your VPS admin console
output of ```sudo docker ps```

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3fd273b56860 incognitochain/incognito-mainnet:20210320_2 "/bin/bash run_incog…" 11 hours ago Up 11 hours 0.0.0.0:9336->9336/tcp, 0.0.0.0:9435->9435/tcp inc_mainnet_2
f30082125795 incognitochain/incognito-mainnet:20210320_2 "/bin/bash run_incog…" 11 hours ago Up 11 hours 0.0.0.0:9335->9335/tcp, 0.0.0.0:9434->9434/tcp inc_mainnet_1
a43f9ad39d88 incognitochain/incognito-mainnet:20210320_2 "/bin/bash run_incog…" 11 hours ago Up 11 hours 0.0.0.0:9334->9334/tcp, 0.0.0.0:9433->9433/tcp inc_mainnet_0
9266d4e9b39e ethereum/client-go "geth --syncmode lig…" 3 months ago Up 11 hours 0.0.0.0:8545->8545/tcp, 0.0.0.0:30303->30303/tcp, 8546/tcp, 30303/udp eth_mainnet

  • make sure your hardware has enough capacity
    ** Is that officially outlined? I think I have the correct specs, since this is a new problem and I haven’t seen guidance that says “if X validators on one machine, then Y, Z, AA specs are required.”
  • each container syncs and sends votes for its own validator, so each creates a separate data folder and logs
    ** Each container does have separate data (data0, data1, data2) and log files; see the size check after this list.
  • make sure you disable any firewall rules in your VPS admin console
    ** I have verified with my provider that nothing is firewalled by default. So far, having ufw enabled or disabled has made no difference.
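For the size check mentioned above, something like this lists each data folder’s total size and its largest logs (a sketch assuming the dataN layout from the script):

```
# Total size per validator data folder, then its three largest logs
for d in data0 data1 data2; do
  du -sh "$d"
  du -h "$d"/*.log 2>/dev/null | sort -rh | head -n 3
done
```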

Thank you for helping me figure this out; please let me know what you advise.

incognito_eth.sh

```
#!/bin/bash

validator_keys=(
"key1"
"key2"
"key3"
)

run()
{
  bootnode="mainnet-bootnode.incognito.org:9330"
  latest_tag=$1
  current_tag=$2

  # Remove any existing validator containers before deploying the new tag
  for i in "${!validator_keys[@]}"
  do
    docker rm -f "inc_mainnet_$i"
  done

  if [ "$current_tag" != "" ]
  then
    docker image rm -f incognitochain/incognito-mainnet:${current_tag}
  fi

  if [ ! -d "$PWD/eth-mainnet-data" ]
  then
    mkdir $PWD/eth-mainnet-data
    chmod -R 777 $PWD/eth-mainnet-data
  fi

  docker pull incognitochain/incognito-mainnet:${latest_tag}

  docker network create --driver bridge inc_net || true

  docker run --restart=always --net inc_net -d --name eth_mainnet -p 8545:8545 -p 30303:30303 -v $PWD/eth-mainnet-data:/geth -it ethereum/client-go --syncmode light --datadir /geth --rpcaddr 0.0.0.0 --rpcport 8545 --rpc --rpccorsdomain "*"

  # One container per validator key: RPC on 9334+i, node port on 9433+i, data in data${i}
  for i in "${!validator_keys[@]}"
  do
    docker run --restart=always -p $((9334 + $i)):$((9334 + $i)) -p $((9433 + $i)):$((9433 + $i)) --net inc_net -e NODE_PORT=$((9433 + $i)) -e RPC_PORT=$((9334 + $i)) -e BOOTNODE_IP=$bootnode -e GETH_NAME=eth_mainnet -e MININGKEY=${validator_keys[$i]} -e TESTNET=false -e LIMIT_FEE=1 -v $PWD/data${i}:/data -itd --name inc_mainnet_${i} incognitochain/incognito-mainnet:${latest_tag}
  done
}

if [ -x "$(command -v docker)" ]; then
  echo "Docker Installed"
else
  echo "Installing Docker"
  bash -c "wget -qO- https://get.docker.com/ | sh"
  sudo usermod -aG docker $USER
  echo "PLEASE RESTART YOUR COMPUTER AND RE-RUN THIS SCRIPT"
  exit
fi

ps aux | grep 'incognito.sh' | awk '{ print $2}' | grep -v "^$$$" | xargs kill -9

current_latest_tag=""
while [ 1 = 1 ]
do
  # Fetch the list of image tags and pick the newest one
  tags=$(curl -X GET https://registry.hub.docker.com/v1/repositories/incognitochain/incognito-mainnet/tags | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}' | sed -e 's/\n/;/g')

  sorted_tags=($(echo ${tags[*]} | tr " " "\n" | sort -rn))
  latest_tag=${sorted_tags[0]}

  # Re-deploy only when a newer tag has been published
  if [ "$current_latest_tag" != "$latest_tag" ]
  then
    run $latest_tag $current_latest_tag
    current_latest_tag=$latest_tag
  fi

  for i in "${!validator_keys[@]}"
  do
    docker start "inc_mainnet_$i"
  done

  sleep 3600s

done
```

Hey @Josh_Hamon,

I can confirm that this is an issue with the recent Incognito update.
My team is still investigating the root cause. In the meantime, you can try deleting all data and re-running your run.sh.
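If you go that route, here is a minimal sketch of the reset (it assumes the inc_mainnet_N containers and dataN folders from your multi-node script, and that the script is named incognito_eth.sh; double-check the paths first, since this wipes the chain data):

```
#!/bin/bash
# Remove the Incognito containers (the geth container can stay)
sudo docker rm -f inc_mainnet_0 inc_mainnet_1 inc_mainnet_2

# Wipe each validator's data folder so the nodes re-sync from scratch
sudo rm -rf data0 data1 data2

# Re-run the multi-node script in the background
nohup bash incognito_eth.sh > incognito_eth.out 2>&1 &
```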


Any update? I’m seeing other nodes going on/offline. I’m hanging tight in the meantime.

Did you check your nodes to see if the CPU usage was abnormally high?

I had a problem with my vNodes as well: I couldn’t access any of them with RPC commands, and CPU and disk I/O were way higher than normal… I was on an older image and had to manually force the update to the latest version, but that did seem to return my nodes to normal syncing and resource usage.
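For a quick check, something like this snapshots per-container CPU, memory, and block I/O (docker stats is standard; the filter assumes the default inc_mainnet naming):

```
# One-shot snapshot of CPU, memory, and block I/O for the Incognito containers
sudo docker stats --no-stream $(sudo docker ps --filter "name=inc_mainnet" -q)
```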

@Josh_Hamon,
please check my response here

Some are showing that; I haven’t checked the image yet.