@Support, I have noticed that if a node is healthy, online, and on the latest version, it generally stays that way unless something happens to cause an issue. However, if the docker container is stopped, the node doesn’t always stay healthy when it comes back online, for instance after a reboot, a code upgrade, or anything of the sort. About 5-10% of my nodes do not come back up properly when they are started again. Sometimes they just stall, and a stop and start fixes the issue, which is not as big a problem. Other times, however, they will not come back online at all. When I check docker ps -a, I can see that the container is restarting endlessly.
The only fix for this in the end is to delete the shards (sometimes the beacon too, but usually not) and then bootstrap the node.
This has become quite frustrating. I confirmed with @Jared recently that I am not alone in this issue.
Is anyone else having this problem?
Is there something that can be done to get more stability?
It seems strange to me that nodes have to be bootstrapped just because they have been stopped and started.
Would love a solution to this problem. I posted this on the forum instead of sending support a private message in case others are having this problem or know of a solution.