Bind for 0.0.0.0:94xx failed: port is already allocated

brico84 · 20 November 2023 17:16

Hello community. I have been running node validators for years now. In the last few months I have started to get weird errors on some of my nodes. I run the hardLinks script and the node controller script, so my nodes stop and start a few times a day. If you never stop your containers, this error would likely never come up. I am posting here in case anyone else is having this issue, so we can hopefully come up with a solution.

This has now affected 3 of my nodes in the last month. I have tried the following to resolve the issue properly, to no avail:

1) Check if the port is in use with lsof. It is not in use; lsof shows nothing.
2) Remove the container, remove the data folder, prune, restart the Docker service, and re-add the container. Same error.
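For anyone repeating the two checks above, here is a minimal sketch of how they could be scripted. It is not the exact procedure used in this thread. The function names check_port and match_port are hypothetical, the use of sudo and of ss is an assumption, and the port has to be taken from your own error message. One reason to try lsof with root: without it, lsof generally only reports processes owned by the current user, so a listener owned by root would not appear.

```shell
#!/bin/sh

# Hypothetical helper: from `ss -ltn` style text on stdin, keep only the lines
# whose local address (column 4) ends in the given port.
match_port() {
  awk -v port="$1" '{ n = split($4, parts, ":"); if (parts[n] == port) print }'
}

# Hypothetical wrapper around the manual checks; nothing runs until it is called.
check_port() {
  port="$1"
  # Listening TCP sockets on the port, with the owning process (assumes sudo is available).
  sudo lsof -nP -iTCP:"$port" -sTCP:LISTEN
  # Cross-check with ss, in case lsof shows nothing.
  sudo ss -ltnp | match_port "$port"
  # Containers in any state that publish or expose the port.
  docker ps -a --filter "publish=$port"
}
```

Calling check_port with the port from the error message prints whatever each tool finds. If all three come back empty while the bind still fails, the port is probably not held by an ordinary listening process.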

In the end, as a workaround, I was able to get these nodes back up and running by changing the port used in the add process to a port outside the increment of the blink script. This works… but is a workaround, not a solution, as this issue keeps affecting new nodes of mine.

Would be great if I could get to the root cause.

Has anyone seen this kind of error before?

The specific verbiage is …

“Error response from daemon: driver failed programming external connectivity on endpoint inc_mainnet_0 (99ysadef90324nffsad): Bind for 0.0.0.0:XXXX failed: port is already allocated”.

Where XXXX is the port in question.

brico84 · 15 February 2024 01:51

Has anyone else run into this, or am I the only one? This has now affected 15 of my nodes and is becoming a big hassle to deal with.

Hard to believe I am the only one who is experiencing this problem.

Jared · 15 February 2024 15:36

What are the outputs of the following:

Kill Lingering Processes:

Use netstat -tulpn and ps aux to meticulously find any remnants of previous node processes. Kill them forcefully if needed.
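A rough sketch of that check, narrowed to the 94xx port range from the thread title. The helper name listeners_94xx is hypothetical, and searching for docker-proxy is an assumption: it is the process Docker normally starts for each published port when its userland proxy is enabled, so it is one plausible leftover, not a confirmed cause.

```shell
#!/bin/sh

# Hypothetical helper: from `netstat -tulpn` text on stdin, print the local
# address and the PID/program (last column) for listeners on ports 9400-9499.
listeners_94xx() {
  awk '$4 ~ /:94[0-9][0-9]$/ { print $4, $NF }'
}

# Hypothetical wrapper; nothing runs until it is called.
find_lingering() {
  # Root is needed for netstat to show PIDs that belong to other users.
  sudo netstat -tulpn | listeners_94xx
  # The [d] trick keeps grep from matching its own command line.
  ps aux | grep '[d]ocker-proxy'
}
```

Running find_lingering lists candidate PIDs. Anything it prints can then be inspected before deciding whether to kill it.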

brico84 · 15 February 2024 17:46

Doesn’t resolve the issue. I used those commands to find the running processes and killed them, and the issue still persists.

Jared · 16 February 2024 02:34

Okay let’s check some more things.

What’s the output of docker ps -a? Do the old containers show up there?

Check for zombie processes as well with ps aux | grep Z
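One caveat: ps aux | grep Z matches any line that contains a capital Z, including the grep command itself. A slightly stricter sketch filters on the process state column instead. The helper names zombies_only and inspect_node_host are hypothetical, and the docker ps format string is just one way to show names, status, and ports side by side.

```shell
#!/bin/sh

# Hypothetical helper: from `ps -eo pid,ppid,stat,comm` text on stdin, keep
# only the rows whose STAT column (column 3) starts with Z (zombie/defunct).
zombies_only() {
  awk '$3 ~ /^Z/'
}

# Hypothetical wrapper; nothing runs until it is called.
inspect_node_host() {
  # All containers, including stopped ones, with their status and port mappings.
  docker ps -a --format '{{.Names}}\t{{.Status}}\t{{.Ports}}'
  # Zombie processes only, with the parent PID so the parent can be identified.
  ps -eo pid,ppid,stat,comm | zombies_only
}
```

If inspect_node_host prints no zombie rows and no old containers, lingering processes and leftover containers are both unlikely to be what is holding the port.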

brico84 · 21 March 2024 17:24

@Jared

The old containers do not show with docker ps -a, nor is there anything useful listed with ps aux | grep Z.

I am continuing to get new nodes that have this issue. I am up to some 20 nodes that are having this problem now. Any other ideas?