Skip to main content

What are Merkle Trees and why are they important for blockchain?

What are merkle trees?
Let's start with the basics by brainstorming a solution to this problem:
How can you compare two copies of a large file across two machines?
A naive approach would be to send the file from one to the other machine, and compare the contents using a sting matching algorithm.

Let's do a little better with the understanding that utilizing CPU time locally is cheaper than sending something over the network.

A better approach would then be to generate a hash value (i.e., a checksum) from each of the two files and compare the results. This approach starts making even more sense as you increase the number of machines on which the file is replicated and comparison needs to be performed--each machine can generate a checksum and share the results.

Now, let's ask another question: "When you find differences in the hashes generated for file comparison, how can you quickly find out which part of the file is different?"

This is more challenging because a checksum/ simple hash value is generated for the entire file and not certain parts of it, and you cannot find which part of the file was different by just looking at the checksum. That's where Merkle trees help.

Instead of generating just one hash value, you can split the file into multiple chunks and generate hash value for each chunk separately. And to make the process efficient, instead of sharing all the hash values, you generate a hash of two hash values, and keep building the tree. The final, single hash value is for the entire file and can be shared. When differences are found, you can compare the hash values at the next level (and so on), till you identify the block which was different.


But what does this have to do with blockchain?
A public blockchain network could have thousands of computers. How do you check and ensure the integrity of the blockchain across all these computers? Do you share the hash values or the entire chain for comparison? And how do you find out which part of the chain had a mismatch? How can you quickly rebuild the hash after changing some part of the data?

Merkle trees are your friends.

Comments

Popular posts from this blog

So, is Ether = Bitcoin and Ethereum = Bitcoin blockchain?

Is Ethereum also a cryptocurrency? Well, formally speaking, Ethereum is a blockchain based distributed platform, and Ether is the currency used on that blockchain. Is Ether same as Bitcoin? Just like Bitcoin, Ethereum blockchain also needs mining (i.e., proof-of-work and thus, computational resources to mine the blocks). Miners are rewarded with Ether. But Ethereum also has the concept of smart contracts and an associated virtual machine to execute the smart contracts. Besides rewarding miners with Ether, computational resources consumed by the code executed on Ethereum Virtual Machine is controlled by spending Ether , which is termed as gas . I understand the Ethereum blockchain supports smart contracts, but is Ether similar to Bitcoin? The answer to " why we need a cryptocurrency? " is different for the two blockchains. For Bitcoin, the cryptocurrency is the reason why blockchain exists; i.e., the sole purpose of the blockchain is to store bitcoins. For ...

Blockchain 101 for Developers - Part 2 - Smart Contracts (concepts)

I have heard of Bitcoin but what is Ethereum? While bitcoin (traditionally) allows simple addition and subtraction operations, which are sufficient for exchange of cryptocurrency, Ethereum enables more sophisticated operations. Ethereum positions itself as a programmable blockchain . Ethereum takes the blockchain concept popularised by bitcoin and introduces a full Turing-complete virtual machine on top of it. The machine is capable of executing code, which is usually written in Solidity programming language. But what does it do? Most commonly, the programmable code is termed as a smart contract ; it establishes conditions under which things of value are exchanged. In other words, instead of using a traditional legal framework to bind two parties, there is code deployed within the blockchain which automatically enforces an agreed contract. The contract could be as basic as "money is transferred from an escrow account to Alice's account, only when today is a Sunday...

Blockchain 101 for Developers - Part 1 - Concepts

What problem does Blockchain address? (In very simple terms) Blockchain primarily addresses aspects of record keeping across multiple, distributed peers such that trust between them is no longer an issue and no central authority is needed either. Somewhat more formally, blockchain is an immutable, distributed ledger of transactions . In today’s world, each organization/ peer participating in a business network performs its own bookkeeping of facts of interest. For example, a car manufacturer would have its own data related to the cars manufactured and a car distributor/ reseller will have its own data---even through there are overlaps and they might be talking about the same asset (i.e., a specific car). Same goes for financial institutions---each bank will have its own data. Such silos definitely need either central authorities/ middle-man who everyone can trust as well as settlements and reconciliation processes to iron out conflicts. Blockchain addresses all these problems b...