For background, there is a design document relating to Bonsai DB in the wiki here.
The issue for Bonsai archive is here and the PR is here. For context, there was originally a draft PR that made a start on delivering the feature, and there is a rebased branch of that PR at https://github.com/jframe/besu/tree/multi-version-flat-db-rebase
Design
Bonsai makes use of additional DB segments to store flat data in addition to the state trie. These segments include:
- ACCOUNT_INFO_STATE
  - Key = hash of account
  - Value = RLP encoded account state
- ACCOUNT_STORAGE_STORAGE
  - Key = hash of account + hash of storage slot key
  - Value = RLP encoded storage value
- CODE_STORAGE
  - Key = hash of code or hash of account (based on the --Xbonsai-code-using-code-hash-enabled config option)
  - Value = the code
These are used to store account state, account storage, and code respectively.
To provide access to historical account state and account storage, two new DB segments have been introduced:
- ACCOUNT_INFO_STATE_ARCHIVE
  - Key = hash of account + block number
- ACCOUNT_STORAGE_ARCHIVE
  - Key = hash of account + hash of storage key + block number
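The key layouts above can be sketched in a few lines of Python. This is only an illustration, not Besu code: the helper names are hypothetical, and the sketch assumes the block number is appended as an 8-byte big-endian integer after the 32-byte hash(es), which is not stated in this page.

```python
from typing import Tuple

HASH_LEN = 32          # hashes of accounts and storage slot keys are 32 bytes
BLOCK_NUMBER_LEN = 8   # ASSUMPTION: block number suffix is 8 bytes, big-endian


def _block_suffix(block_number: int) -> bytes:
    """Encode the block number as a fixed-width big-endian suffix."""
    return block_number.to_bytes(BLOCK_NUMBER_LEN, "big")


def account_archive_key(account_hash: bytes, block_number: int) -> bytes:
    """Hypothetical helper: key for ACCOUNT_INFO_STATE_ARCHIVE."""
    if len(account_hash) != HASH_LEN:
        raise ValueError("account hash must be 32 bytes")
    return account_hash + _block_suffix(block_number)


def storage_archive_key(account_hash: bytes, slot_hash: bytes, block_number: int) -> bytes:
    """Hypothetical helper: key for ACCOUNT_STORAGE_ARCHIVE."""
    if len(account_hash) != HASH_LEN or len(slot_hash) != HASH_LEN:
        raise ValueError("account and slot hashes must be 32 bytes each")
    return account_hash + slot_hash + _block_suffix(block_number)


def split_account_archive_key(key: bytes) -> Tuple[bytes, int]:
    """Split an account archive key back into (account hash, block number)."""
    if len(key) != HASH_LEN + BLOCK_NUMBER_LEN:
        raise ValueError("unexpected key length")
    return key[:HASH_LEN], int.from_bytes(key[HASH_LEN:], "big")
```

Under that assumption, all keys for one account share the same 32-byte prefix and sort by block number when compared byte-wise, because the suffix is fixed-width and big-endian.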
Here is an example ldb query of the DB for a specific account at a specific block:
ubuntu@localhost:~$ ldb --db=. get --key_hex --value_hex --column_family=ACCOUNT_INFO_STATE_ARCHIVE 0x3e26253c5a8b02bb30c92798fa8083f0c966a954b019cad73aa12cdf3b344a7e00000000000113a1
0xF84D0389171FA691DC31EF9AA8A056E81F171BCC55A6FF8345E692C0F86E5B48E01B996CADC001622FB5E363B421A0C5D2460186F7233C927E7DB2DCC703C0E500B653CA82273B7BFAD8045D85A470
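To make the returned value easier to read, here is a minimal sketch of an RLP decoder applied to the output of the query above. It is not Besu code; the function names are hypothetical, and the field order (nonce, balance, storage root, code hash) is the standard Ethereum account encoding, which this sketch assumes is what the archive segment stores.

```python
from typing import List, Tuple, Union

RlpItem = Union[bytes, List["RlpItem"]]


def _decode_at(data: bytes, pos: int) -> Tuple[RlpItem, int]:
    """Decode one RLP item starting at pos; return (item, next position)."""
    prefix = data[pos]
    if prefix < 0x80:                       # single byte is its own encoding
        return data[pos:pos + 1], pos + 1
    if prefix < 0xB8:                       # short string, 0-55 bytes
        start = pos + 1
        length = prefix - 0x80
        return data[start:start + length], start + length
    if prefix < 0xC0:                       # long string, length-of-length form
        start = pos + 1 + (prefix - 0xB7)
        length = int.from_bytes(data[pos + 1:start], "big")
        return data[start:start + length], start + length
    if prefix < 0xF8:                       # short list, payload 0-55 bytes
        start = pos + 1
        length = prefix - 0xC0
    else:                                   # long list, length-of-length form
        start = pos + 1 + (prefix - 0xF7)
        length = int.from_bytes(data[pos + 1:start], "big")
    end = start + length
    items: List[RlpItem] = []
    cursor = start
    while cursor < end:
        item, cursor = _decode_at(data, cursor)
        items.append(item)
    return items, end


def rlp_decode(data: bytes) -> RlpItem:
    """Decode a complete RLP payload, rejecting trailing bytes."""
    item, end = _decode_at(data, 0)
    if end != len(data):
        raise ValueError("trailing bytes after RLP item")
    return item


def decode_account(value: bytes) -> dict:
    """ASSUMPTION: value is the RLP list [nonce, balance, storageRoot, codeHash]."""
    nonce, balance, storage_root, code_hash = rlp_decode(value)
    return {
        "nonce": int.from_bytes(nonce, "big"),
        "balance": int.from_bytes(balance, "big"),
        "storage_root": storage_root.hex(),
        "code_hash": code_hash.hex(),
    }


# Value returned by the ldb query above, without the 0x prefix.
VALUE_HEX = (
    "F84D0389171FA691DC31EF9AA8"
    "A056E81F171BCC55A6FF8345E692C0F86E5B48E01B996CADC001622FB5E363B421"
    "A0C5D2460186F7233C927E7DB2DCC703C0E500B653CA82273B7BFAD8045D85A470"
)
account = decode_account(bytes.fromhex(VALUE_HEX))
print(account)
```

For this value the storage root is the well-known empty-trie root and the code hash is the hash of empty code, so the account at that block looks like a plain externally owned account with nonce 3 and no storage.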
Configuration
Bonsai archive is delivered as a new data storage format, which will initially be experimental. To create a Bonsai archive node, use --data-storage-format=X_BONSAI_ARCHIVE
Rocks DB column families
Currently the column families are identified by a single byte, so to query the DB directly you need to know that, for example, 0x0a maps to "TRIE_LOG_STORAGE". The Bonsai archive work might be a good time to change the column family identifiers from bytes to string names, which would make querying the DB directly easier. This would likely mean moving the DATABASE_METADATA version from 2 to 3, and would require a decision about whether to upgrade existing DBs or only use the new names for new DBs.