Then he may have some enemies. Unless he has an exclusive ISP, the network is the only parameter he cannot control. I think I have found the problem: his network has gone bad.
(Kind of solved) Problem starting multiple nodes at the same time
Well, that’s the thing. Up until recently, I was hosting 33 nodes on a single 2011 Mac mini, 16GB, 1TB SSD. But now with the new version and committee changes, I am facing performance issues and my nodes are getting bulk slashed.
I’ve been trying to find something stable, but I have problems starting even 2 nodes at the same time. If I stagger the starts by 15 minutes, I can run 4 nodes, but when the 5th starts, everything stops and they all go offline.
I’m giving up on my Mac mini Incognito project for now. It had a good run: for over a year it ran a full node and validated on all shards.
So now 33 of my nodes have no home, I can retire them, move them to another server, or build a new glorious multi-node project. Have not decided what to do with them yet.
Go for glory, my friend. An epic multi-node project. 33 nodes on a single 2011 Mac mini is pretty cool.
You want to allocate 2GB RAM per node. Could you upgrade the RAM?
When you start them up, check the node monitor (monitor.incognito.org) first to make sure it says Sync Status Latest before you start up the next node.
How do you manage so many on such a small storage space? Do you use hard links?
With the resource improvements the devs recently released, there is still hope for your setup. Let’s wait and give it a shot again.
I know it may depend on how many are in committee. But the part I’m always uncertain about is the number of cores needed for X nodes on a system. The fact that Fredlee was previously running 33 nodes on a 2011 Mac mini suggests to me that CPU/cores aren’t really an issue, and that the SSD and RAM are the ones to focus on.
The new node database mode being worked on should drastically help lower resource usage.
I already spoke to the dev in charge of this and we will try to offer a bootstrap version so node operators can quickly change over.
A bootstrap would be greatly appreciated
It’s unfortunately a hardware limitation on the 2011 model. Like you were hinting when we talked, it looks like the memory is the biggest issue, but it also goes hand in hand with storage reaching its limit. So basically, with the beacon and shard growth over the past year, I finally hit an upper limit on what the hardware could handle.
Kind of. I use ZFS and snapshot clones. I also experimented with dedupe between nodes, but that does not work because each node’s database files end up with unique data.
Looking forward to the new improvements. What I have noticed with the current database design: if I bootstrap from one node to another, when I then boot the new node, it rewrites about 30GB of data during the integrity check at startup. I imagine it updates some data and therefore rewrites the database pages? (Just guessing here.) Do you know if this is different in the new storage design? I imagine that if it can avoid updating or rewriting the big immutable block data, cloning or hard linking should work even better.
Yes, that is correct. During normal operation it’s very light on CPU load because of the relatively slow rate of blocks. You really only strain the CPU during shard resyncing or node restarts.
Have you tried using this method for hard links? It was designed and coded by a community member:
I use it to manage a large range of nodes and it works well for me.
Nope, but I saw that one, checked the source and it looked really good.
Unfortunately I would not be able to use it, because it would kill my nodes by doing this:
```javascript
// From the community tool: starts every container back-to-back, no stagger.
// `docker` and `allNodesIndex` are helpers defined elsewhere in that script.
console.group("\nStarting containers.");
for (const nodeIndex of allNodesIndex) {
  console.log(await docker(["container", "start", `inc_mainnet_${nodeIndex}`], (v) => v.split("\n")[0]));
}
```
I wonder though, are the ldb files immutable? Is it 100% certain that old ldb files are never written to? Or is it more of a “seems to work fine for me”?
I believe the old files are not written to. I just checked and confirmed it.
Can you try this, @fredlee?
For Validator, you can choose between Lite or Batch-commit
For Fullnode, you can choose Batch-commit or Archive
Testing this now on a new machine.
If I understand correctly on a fullnode with lite database, I will not be able to pull balances or create transactions in the end? But is it still ok to sync and use that as a bootstrap for validators?
While syncing in lite_ff mode, I noticed it slowed down a lot around 1,700,000 blocks in, and it was eating a lot of memory (>4GB). Being a completist, I decided to restart and test all modes in parallel.
1 hour into my new test, I noticed that the new version seems to be faster than the 20220408_1 version in all database modes. But it’s quite heavy on memory, especially with ffstorage enabled.
CONTAINER ID MEM USAGE HEIGHT SPACE SPACE/BLOCK
inc_220408_1 397.9MiB 40453 1.1G 27kb
inc_arc 615.8MiB 63492 1.7G 27kb
inc_arc_ff 2.145GiB 61619 1.5G 24kb
inc_batch 1.252GiB 59191 917M 15kb
inc_batch_ff 2.692GiB 56996 734M 13kb
inc_lite 596.3MiB 66788 1.2G 18kb
inc_lite_ff 2.140GiB 63964 881M 14kb
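For anyone reproducing the tables: the SPACE/BLOCK column appears to be simply SPACE divided by HEIGHT, in decimal units. A quick sketch:

```javascript
// How SPACE/BLOCK seems to be derived: total on-disk space divided by
// block height, using decimal units (1G = 1e9 bytes, 1 kb = 1000 bytes).
function spacePerBlockKB(spaceBytes, height) {
  return Math.round(spaceBytes / height / 1000);
}

// e.g. inc_220408_1 above: 1.1G at height 40453
// spacePerBlockKB(1.1e9, 40453) -> 27
```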
I know you’ve always recommended 2GB per node before, but in reality they have used less than 500MiB when running as validators. I hope this is not going to 4x or more going forward.
Update, after 5 hours syncing:
CONTAINER ID MEM USAGE HEIGHT SPACE SPACE/BLOCK
inc_220408_1 868MiB 122697 3.5G 28kb
inc_arc 742MiB 204041 7.7G 38kb
inc_arc_ff 2.463GiB 196196 6.8G 35kb
inc_batch 2.318GiB 178799 4.0G 23kb
inc_batch_ff 4.17GiB 178611 3.3G 18kb
inc_lite 709MiB 213209 4.5G 21kb
inc_lite_ff 2.435GiB 203428 3.6G 18kb
The RAM usage should settle down quite a bit once they reach Sync State Latest. I’ve been discussing with the devs about providing bootstrap data for these new node modes, so hopefully we can provide that soon.
Thanks, we’ll see how it looks in a day or two when it has synced everything.
This is after 22 hours
CONTAINER ID MEM USAGE HEIGHT SPACE SPACE/BLOCK
inc_220408_1 1.043GiB 380780 12G 32kb
inc_arc 955.8MiB 684379 32G 47kb
inc_arc_ff 2.85GiB 655377 27G 41kb
inc_batch 5.775GiB 611572 16G 26kb
inc_batch_ff 7.447GiB 596533 13G 22kb
inc_lite 913.9MiB 729802 18G 25kb
inc_lite_ff 2.736GiB 678622 14G 21kb
> 7GB, you sure it’s not leaking?
I’m firing up a new server now and going to run a beta node. I’ll check in with you afterwards and we can compare numbers and data.
If I understand correctly on a fullnode with lite database, I will not be able to pull balances or create transactions in the end?
check my comment here
I sync a fullnode too and it took me 65GB to latest block.
My test node has been running for a while now, roughly 65% of the way done. I’m seeing heavy RAM cache usage and occasional disk I/O bursts. I’m curious how much RAM usage will drop once Sync State reaches Latest.
Side note:
The devs are still working on providing bootstrap files for these new database modes, and we should have them ready before these go live.
| MODE | MEM USAGE | MEM CACHED | HEIGHT | SPACE |
|---|---|---|---|---|
| lite_ff | 1.71 GiB | 14.3 GiB | 1,269,169 | 21.9 GiB |
After this node finishes syncing, I’m going to test out a node with batch_ff to see if I get the same results as you @fredlee.
Update:
My high cache could have been due to low disk I/O. This was a new server, and I had to call my host to have them bump it up to SSD speeds.
Are you also noticing a substantial slowdown at higher block height? Today my nodes pretty much did 100k blocks in 14 hours.
CONTAINER ID MEM USAGE HEIGHT SPACE SPACE/BLOCK
inc_220408_1 1.245GiB 825386 35G 42kb
inc_arc 1.625GiB 1191382 90G 76kb
inc_arc_ff 3.236GiB 1202844 84G 70kb
inc_batch 6.466GiB 1172576 39G 33kb
inc_batch_ff 8.569GiB 1172674 33G 28kb
inc_lite 1.866GiB 1283860 43G 33kb
inc_lite_ff 3.611GiB 1291952 36G 28kb
Our used space differs quite a bit on lite_ff. You’re only syncing a single shard, right?
Ah, I should have mentioned: this node is not yet staked, so it does not have a shard assigned. I bring new nodes online in anticipation of staking a new one.