Merkle-zation World State

Status

NOT STARTED 

Stakeholders
Outcome
Due date
Owner

Background

Merklezation is a basic approach to proving the integrity and existence of data without having to replicate the entire data store. Merkle's Proofs are widely used across all decentralized networks, from SPV in Bitcoin to Patricia Merkle Trie of the World State in Ethereum.

Problem

When a light client requests data from a full node, it needs a way to check if the data has been modified by the responder.

Solution

Since the Light Client (web or mobile app) does not have the entire blockchain, it asks for proof of existence verified by a list of trusted validators. In current state, Ihora2 has predefined list of validators with known public keys, the client must trust the validators if it is using the network. Having a verified signature generated by the private key underlying the validator's public key should be enough to trust the hash.

Cryptographic proof

Just as it does now with proof of transactions, the hash of a block can contain proof of the state of the world. 

Figure #1: Merkle Tree of Transaction in Block (Bitcoin as a reference)

Merkletree.png


Figure #2: State root in block (Ethereum as a reference)

enter image description here


Constructing a Merkle tree from the key value database used in Iroha2 is not that difficult. Since any record has a unique key, which can be represented by a deterministic hash (for example, sha256) and used as a leaf of the tree.

Unlike a transaction tree, a state tree needs a stable structure (static keys) and dynamic proofs. Therefore, Ethereum, Substrate and other networks use a tree of keys and values, where the key is a static identifier of data or path, and the value is data or proof.

Figure #3: Example of KV Merkle Tree


Accordingly, in Figure 3, each leaf node has data (literally hash of data) and a static identifier - a key. At each step along the path to the root tree, a combined hash of child hashes is stored that immediately reflects any changes to the underlying data.

Cryptographic Proofs

The response to the request must contain your data, along with either cryptographic proofs, or with a corresponding error, if such data for some reason is absent from the blockchain.

Cryptographic proof is a response to a read request made through a light client that:

  1. Verifies the authenticity of the included data
  2. Certifies that the specified data is safely stored on the blockchain.

Validation of response should be done next steps:

  1. As soon as we get the data, we verify the block authenticity according to the downloaded/predefined set of keys of the validators
  2. Locally build hash of responded data and compare with hash stored in tree (provided as a proof)
  3. Validate each step of merkle tree up to root
  4. Validate integrity of provided state root to block hash

Decisions

  1. Measuring the difference in cost between self-implementation and using a third-party implementation
  2. Evaluate the value of having proofs of existence and integrity in responses for light clients

Concerns

  1. Partial data request will not allow building a data hash locally (could be solve with Tree of trees or prohibited by design)

Additional Information

ELI5: How does a Merkle Tree work?

Exonum MerkleDB

Implementation of the Base-16 Modified Merkle Tree ("Trie") by Paritytech