What privacy problem are you solving?
Incognito Chain has consumed a lot of storage so far (over 100GB in 4 months). At this rate, the typical physical node device (500GB SSD) would run out of storage in about a year and a half, and virtual node expenses would increase substantially. It is thus essential to reduce the current size of Incognito Chain by at least 50% (to about 50GB at the time of writing).
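As a rough check on that estimate, here is the arithmetic as a small Python snippet. It uses only the figures quoted above and assumes growth stays linear, which is a simplification.

```python
# Figures quoted in the paragraph above; linear growth is an assumption.
chain_size_gb = 100      # current chain size
months_elapsed = 4       # time taken to reach that size
disk_capacity_gb = 500   # typical physical node SSD

growth_per_month = chain_size_gb / months_elapsed
months_until_full = (disk_capacity_gb - chain_size_gb) / growth_per_month

print(f"{growth_per_month:.0f} GB/month, disk full in about {months_until_full:.0f} months")
```

At 25 GB per month, the remaining 400GB lasts about 16 months, which is in line with the estimate of roughly a year and a half.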
The Incognito Chain has grown extremely fast. After diving into the source code and architecture design, we identified a lot of duplicated and outdated data.
At the moment, Incognito Chain stores all information in a simple key-value store, an approach taken by blockchains such as Bitcoin and Litecoin. To process a new block, Incognito Chain loads all state data (the data resulting from processed blocks) into RAM. As the state data grows, this will eventually become impossible. Incognito is a UTXO-based blockchain, but its privacy features make its data much bigger than that of Bitcoin or Litecoin. A new database design is essential. Here is a summary of problems with the current database design:
- Cannot support Incognito’s consensus.
- Consumes a lot of storage.
- Unable to handle state data in a forked situation.
- No atomicity or rollback.
What is the solution?
We found Ethereum’s approach to database design handy. It allows us to:
- Integrate with Incognito’s consensus
- Easily handle a forked situation
- Provide rollback ability and atomicity
- Reduce storage consumption
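To illustrate why this style of design helps, here is a minimal sketch. Ethereum keeps its state in a Merkle Patricia trie, where a single root hash identifies a complete snapshot of the state. The code below is a heavily simplified hash trie, not the real Merkle Patricia trie and not Incognito's implementation; the names `StateDB`, `set`, `get` and `EMPTY` are hypothetical and chosen only for this example.

```python
import hashlib
import json

EMPTY = ""  # root hash of the empty state (hypothetical convention)


class StateDB:
    """Toy content-addressed trie: nodes are immutable and keyed by their hash."""

    def __init__(self):
        self.nodes = {}  # node hash -> node; existing nodes are never modified

    def _put(self, node):
        raw = json.dumps(node, sort_keys=True).encode()
        h = hashlib.sha256(raw).hexdigest()
        self.nodes[h] = node
        return h

    @staticmethod
    def _path(key):
        # 64 hex nibbles that decide where the key lives in the trie
        return hashlib.sha256(key.encode()).hexdigest()

    def set(self, root, key, value):
        """Return the root of a NEW state; the state under `root` stays readable."""
        return self._insert(root, self._path(key), 0, value)

    def _insert(self, h, path, depth, value):
        if h == EMPTY:
            return self._put({"leaf": [path, value]})
        node = self.nodes[h]
        if "leaf" in node:
            old_path = node["leaf"][0]
            if old_path == path:  # same key: write a replacement leaf
                return self._put({"leaf": [path, value]})
            # different key: push the old leaf one level down under a branch
            children = [EMPTY] * 16
            children[int(old_path[depth], 16)] = h
        else:
            children = list(node["branch"])
        i = int(path[depth], 16)
        children[i] = self._insert(children[i], path, depth + 1, value)
        return self._put({"branch": children})

    def get(self, root, key):
        path, h, depth = self._path(key), root, 0
        while h != EMPTY:
            node = self.nodes[h]
            if "leaf" in node:
                leaf_path, value = node["leaf"]
                return value if leaf_path == path else None
            h = node["branch"][int(path[depth], 16)]
            depth += 1
        return None


db = StateDB()
r1 = db.set(db.set(EMPTY, "alice", 10), "bob", 5)  # state after block 1
r2a = db.set(r1, "alice", 7)  # block 2 on fork A
r2b = db.set(r1, "bob", 9)    # competing block 2 on fork B

print(db.get(r2a, "alice"), db.get(r2b, "alice"), db.get(r1, "alice"))
```

In this sketch each fork is just a different root hash over the same node store, so both forks can be kept at once and unchanged subtrees are shared rather than copied. Rolling back means going back to an earlier root such as `r1`, and a block's writes only take effect once its new root is recorded, which is what gives the all-or-nothing behavior.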
This approach has been battle-tested and proven effective over almost 6 years, and will save us a great deal of R&D time at this critical juncture.
Which solutions do people resort to because this doesn’t exist yet?
Here are a few other solutions we considered and found wanting:
- Process state data after blocks are finalized. This would harm UX and cause large delays.
- Reorganize the key-value storage schema to handle forked situations. This seems like a good solution, but handling forks and getting rid of outdated data at such a low level is probably overly complicated.
Who are you?
- I’m @hungngo from the Incognito Core team. I’ve been a researcher in the blockchain space for some time. I find many new emerging blockchains interesting, but the bulk of my work so far (apart from Incognito!) is in Ethereum and Bitcoin. I’ve been with Incognito for the last 15 months building out its consensus. I spend most of my time working on the database.
Why do you care?
- Reducing the size of the Incognito Chain is essential to growing our validator community and achieving decentralization. The lower the costs, the lower the requirements, the better.
- A good database design is integral for Incognito’s consensus to continue developing.
What’s your plan? What’s your schedule?
Development has been ongoing for a while (since 15 Nov 2019). Here is what I’ve achieved so far, and my plan going forward:
Step | Task | ETA (days) | Actual Begin Date | Actual End Date |
---|---|---|---|---|
1 | Research and comprehend Ethereum database design | 15 | 15 Nov 2019 | 30 Nov 2019 |
2 | Build prototype according to Incognito Chain database schema | 7 | 1 Dec 2019 | 7 Dec 2019 |
3 | Build Incognito class diagram, link. | 15 | 8 Dec 2019 | 24 Dec 2019 |
4 | Implementation and unit testing | 30 | 25 Dec 2019 | 21 Jan 2020 |
5 | Integration with consensus v1 | 30 | 3 Feb 2020 | 8 Mar 2020 |
6.1 | Review and testing with consensus v1 | 60 | 10 Feb 2020 | 27 Mar 2020 |
6.2 | Deploy new database with consensus v1 | 21 | 21 Mar 2020 | estimated 15 Apr 2020 |
7 | Integration with new code base | 7 | 21 Feb 2020 | 26 Feb 2020 |
The blockchain size will be reduced once Step 6 is completed. Steps 7 and 8 can run in parallel with the previous six steps (developing for both versions simultaneously).
What’s your budget?
Resource | Cost | Quantity | Monthly Cost |
---|---|---|---|
Incognito Protocol Engineer @hungngo (1/2 resource of this job) | 1,000 PRV | 1 | 500 PRV |
TOTAL (x 5 months) | | | 2,500 PRV |
Is there an existing conversation around this idea?
Our validators have been especially concerned about this recently – both physical node owners and virtual node operators. These proposed actions will improve the validator experience.
Is there anything else you would like the community to know?
If you have experience in database design for blockchain products, I would love to hear your thoughts.