What are Merkle Trees and why are they important for blockchain?

What are merkle trees?
Let's start with the basics by brainstorming a solution to this problem:

How can you compare two copies of a large file across two machines?

A naive approach would be to send the file from one to the other machine, and compare the contents using a sting matching algorithm.

Let's do a little better with the understanding that utilizing CPU time locally is cheaper than sending something over the network.

A better approach would then be to generate a hash value (i.e., a checksum) from each of the two files and compare the results. This approach starts making even more sense as you increase the number of machines on which the file is replicated and comparison needs to be performed--each machine can generate a checksum and share the results.

Now, let's ask another question: "When you find differences in the hashes generated for file comparison, how can you quickly find out which part of the file is different?"

This is more challenging because a checksum/ simple hash value is generated for the entire file and not certain parts of it, and you cannot find which part of the file was different by just looking at the checksum. That's where Merkle trees help.

Instead of generating just one hash value, you can split the file into multiple chunks and generate hash value for each chunk separately. And to make the process efficient, instead of sharing all the hash values, you generate a hash of two hash values, and keep building the tree. The final, single hash value is for the entire file and can be shared. When differences are found, you can compare the hash values at the next level (and so on), till you identify the block which was different.

But what does this have to do with blockchain?
A public blockchain network could have thousands of computers. How do you check and ensure the integrity of the blockchain across all these computers? Do you share the hash values or the entire chain for comparison? And how do you find out which part of the chain had a mismatch? How can you quickly rebuild the hash after changing some part of the data?

Merkle trees are your friends.

Comments

So, is Ether = Bitcoin and Ethereum = Bitcoin blockchain?

Is Ethereum also a cryptocurrency? Well, formally speaking, Ethereum is a blockchain based distributed platform, and Ether is the currency used on that blockchain. Is Ether same as Bitcoin? Just like Bitcoin, Ethereum blockchain also needs mining (i.e., proof-of-work and thus, computational resources to mine the blocks). Miners are rewarded with Ether. But Ethereum also has the concept of smart contracts and an associated virtual machine to execute the smart contracts. Besides rewarding miners with Ether, computational resources consumed by the code executed on Ethereum Virtual Machine is controlled by spending Ether , which is termed as gas . I understand the Ethereum blockchain supports smart contracts, but is Ether similar to Bitcoin? The answer to " why we need a cryptocurrency? " is different for the two blockchains. For Bitcoin, the cryptocurrency is the reason why blockchain exists; i.e., the sole purpose of the blockchain is to store bitcoins. For ...

Blockchain 101 for Developers - Part 2 - Smart Contracts (concepts)

I have heard of Bitcoin but what is Ethereum? While bitcoin (traditionally) allows simple addition and subtraction operations, which are sufficient for exchange of cryptocurrency, Ethereum enables more sophisticated operations. Ethereum positions itself as a programmable blockchain . Ethereum takes the blockchain concept popularised by bitcoin and introduces a full Turing-complete virtual machine on top of it. The machine is capable of executing code, which is usually written in Solidity programming language. But what does it do? Most commonly, the programmable code is termed as a smart contract ; it establishes conditions under which things of value are exchanged. In other words, instead of using a traditional legal framework to bind two parties, there is code deployed within the blockchain which automatically enforces an agreed contract. The contract could be as basic as "money is transferred from an escrow account to Alice's account, only when today is a Sunday...

Best way to start learning Smart Contract programming

What's the best toolset to learn smart contract programming? The main thing is which blockchain you are targeting, but if you don't have a preference there already, Ethereum should be the default choice. Not only it's the oldest (and in my opinion mature as compared to others), there is a lot of traction in the industry around Ethereum. See, for example, the members directory of Enterprise Ethereum Alliance . So, if you are choosing Ethereum, a nice way to start programming for the Ethereum blockchain is learn Solidity. One nice thing about Solidity is the browser based Remix IDE . There are other ways, but Remix is preferred as the first exposure to Solidity and smart contracts. Can you give me a very simply Solidity contract example? Here is a completely impractical decentralized world-population counter in Solidity: Basically, we have created a decentralized storage variable of type unsigned int, which is part of the blockchain. No matter which node you choose ...

Software Development in a Connected World

Search This Blog