Liu Qi's Blog

Blockchain and Bitcoin (5)

blockchain ethereum solidity compiler smart-contract

1. Introduction

This post lists some resources on how to create smart contract from scratch. To browse online smart contracts, Etherscan and Etherchain are two good websites.

2. Programming Resources

There is an excellent tutorial about how to program smart contract using Solidity here Smart Contract.

3. Hello World with Geth

This page will help you build a Hello, World contract with the ethereum Geth. The code is taken here for future reference:

4. Mix IDE

Here is a post on how to program and debug with Mix IDE.

Read More

Blockchain and Bitcoin (4)

blockchain ethereum transaction data-structure state-machine

1. Introduction

Ethereum is becoming a popular altnative coin of Bitcoin. Ethereum implements blockchain in a generalised manner. Furthermore it provides a plurality of resources, each with a distinct state and operating code but able to interact through a message-passing framework with others. The smart contract scheme allows users to customise transaction procedures and enable blockchain of sophisticated business logics. In this post, we will talk about the data structures and its extensions from Bitcoin.

2. Bitcoin As A State Transition System

From a technical standpoint, the Bitcoin ledger can be thought of as a state transition system, where there is a “state” consisting of the ownership status of all existing bitcoins and a “state transition function” that takes a state and a transaction and outputs a new state which is the result. Here is the fomulation:

APPLY(S, TX)->S'

For example:

APPLY({ Alice: $50, Bob: $50 }, "send $20 from Alice to Bob") = { Alice: $30, Bob: $70 }

The “state” in Bitcoin is the collection of all coins (technically, “unspent transaction outputs” or UTXO) that have been minted and not yet spent, with each UTXO having a denomination and an owner. A transaction contains one or more inputs, with each input containing a reference to an existing UTXO and a cryptographic signature produced by the private key associated with the owner’s address, and one or more outputs, with each output containing a new UTXO to be added to the state.

The state transition function APPLY(S,TX) ­> S’ can be defined roughly as follows:

  1. For each input in TX: i. If the referenced UTXO is not in S, return an error. ii. If the provided signature does not match the owner of the UTXO, return an error.
  2. If the sum of the denominations of all input UTXO is less than the sum of the denominations of all output UTXO, return an error.
  3. Return S with all input UTXO removed and all output UTXO added.

3. Currency unit of Ethereum

Ethereum has an intrinsic currency, Ether, known also as ETH. The smallest subdenomination of Ether, and thus the one in which all integer values of the currency are counted, is the Wei. One Ether is defined as being 1018 Wei. Here is a summarised list of units:

4. Ethereum Blocks, State and Transactions

4.1 World States

The world state (state), is a mapping between addresses (160-bit identifiers) and account states. The account state comprises the following four fields:

  1. nonce: A scalar value equal to the number of transactions sent from this address or, in the case of accounts with associated code, the number of contract-creations made by this account.

  2. balance: A scalar value equal to the number of Wei owned by this address.

  3. storageRoot: A 256-bit hash of the root node of a Merkle Patricia tree that encodes the storage contents of the account (a mapping between 256-bit integer values), encoded into the trie as a mapping from the Keccak 256-bit hash of the 256-bit integer keys to the RLP-encoded 256-bit integer values.

  4. codeHash: The hash of the EVM code of this account—this is the code that gets executed should this address receive a message call; it is immutable and thus, unlike all other fields, cannot be changed after construction. All such code fragments are contained in the state database under their corresponding hashes for later retrieval.

4.2 Transactions

There are two types of transactions: those which result in message calls and those which result in the creation of new accounts with associated code (known informally as ‘contract creation’). Both types specify a number of common fields:

  1. nonce: A scalar value equal to the number of transactions sent by the sender.

  2. gasPrice: A scalar value equal to the number of Wei to be paid per unit of gas for all computation costs incurred as a result of the execution of this transaction.

  3. gasLimit: A scalar value equal to the maximum amount of gas that should be used in executing this transaction. This is paid up-front, before any computation is done and may not be increased later.

  4. to: The 160-bit address of the message call’s recipient or, for a contract creation transaction, the to is 0.

  5. value: A scalar value equal to the number of Wei to be transferred to the message call’s recipient or, in the case of contract creation, as an endowment to the newly created account.

  6. v, r, s: Values corresponding to the signature of the transaction and used to determine the sender of the transaction.

Additionally, a contract creation transaction contains: init: An unlimited size byte array specifying the EVM-code for the account initialisation procedure.

In contrast, a message call transaction contains: data: An unlimited size byte array specifying the input data of the message call.

4.3 Blocks

The block in Ethereum is the collection of relevant pieces of information (known as the block header), H, together with information corresponding to the comprised transactions, T, and a set of other block headers U that are known to have a parent equal to the present block’s parent’s parent (such blocks are known as ommers). The block header contains several pieces of information:

  1. parentHash: The Keccak 256-bit hash of the parent block’s header, in its entirety.

  2. ommersHash: The Keccak 256-bit hash of the ommers list portion of this block.

  3. beneficiary: The 160-bit address to which all fees collected from the successful mining of this block be transferred.

  4. stateRoot: The Keccak 256-bit hash of the root node of the state trie, after all transactions are executed and finalisations applied.

  5. transactionsRoot: The Keccak 256-bit hash of the root node of the trie structure populated with each transaction in the transactions list portion of the block.

  6. receiptsRoot: The Keccak 256-bit hash of the root node of the trie structure populated with the receipts of each transaction in the transactions list portion of the block.

  7. logsBloom: The Bloom filter composed from indexable information (logger address and log topics) contained in each log entry from the receipt of each transaction in the transactions list.

  8. difficulty: A scalar value corresponding to the difficulty level of this block. This can be calculated from the previous block’s difficulty level and the timestamp.

  9. number: A scalar value equal to the number of ancestor blocks. The genesis block has a number of zero.

  10. gasLimit: A scalar value equal to the current limit of gas expenditure per block.

  11. gasUsed: A scalar value equal to the total gas used in transactions in this block.

  12. timestamp: A scalar value equal to the reasonable output of Unix’s time() at this block’s inception.

  13. extraData: An arbitrary byte array containing data relevant to this block. This must be 32 bytes or fewer.

  14. mixHash: A 256-bit hash which proves combined with the nonce that a sufficient amount of computation has been carried out on this block.

  15. nonce: A 64-bit hash which proves combined with the mix-hash that a sufficient amount of computation has been carried out on this block.

5. Gas and Payment

In order to avoid issues of network abuse and to sidestep the inevitable questions stemming from Turing completeness, all programmable computation in Ethereum is subject to fees. The fee schedule is specified in units of gas. Thus any given fragment of programmable computation (this includes creating contracts, making message calls, utilising and accessing account storage and executing operations on the virtual machine) has a universally agreed cost in terms of gas.

Every transaction has a specific amount of gas associated with it: gasLimit. This is the amount of gas which is implicitly purchased from the sender’s account balance. The purchase happens at the according gasPrice, also specified in the transaction. The transaction is considered invalid if the account balance cannot support such a purchase. It is named gasLimit since any unused gas at the end of the transaction is refunded (at the same rate of purchase) to the sender’s account. Gas does not exist outside of the execution of a transaction. Thus for accounts with trusted code associated, a relatively high gas limit may be set and left alone.

In general, Ether used to purchase gas that is not refunded is delivered to the beneficiary address, the address of an account typically under the control of the miner. Transactors are free to specify any gasPrice that they wish, however miners are free to ignore transactions as they choose. A higher gas price on a transaction will therefore cost the sender more in terms of Ether and deliver a greater value to the miner and thus will more likely be selected for inclusion by more miners. Miners, in general, will choose to advertise the minimum gas price for which they will execute transactions and transactors will be free to canvas these prices in determining what gas price to offer. Since there will be a (weighted) distribution of minimum acceptable gas prices, transactors will necessarily have a trade-off to make between lowering the gas price and maximising the chance that their transaction will be mined in a timely manner.

6. Execution Model

The EVM is a simple stack-based architecture. The word size of the machine (and thus size of stack item) is 256-bit. This was chosen to facilitate the Keccak-256 hash scheme and elliptic-curve computations. The memory model is a simple word-addressed byte array. The stack has a maximum size of 1024. The machine also has an independent storage model; this is similar in concept to the memory but rather than a byte array, it is a wordaddressable word array. Unlike memory, which is volatile, storage is non volatile and is maintained as part of the system state. All locations in both storage and memory are well-defined initially as zero. The machine does not follow the standard von Neumann architecture. Rather than storing program code in generally-accessible memory or storage, it is stored separately in a virtual ROM interactable only through a specialised instruction.

7. Final remarks

This post introduces the data structures and extensions of Ethereum. In next post, we will talk about how to program smart contract with Ethereum.

Read More

Blockchain and Bitcoin (3)

blockchain bitcoin transaction block-format data-structure

1. Introduction

In this post, we will look at real data structures of blocks and transactions, real scripts Bitcoin use. Let’s start with transaction structures. The figure below shows the data structure of a transaction. To view the real transactions happening in Bitcoin network, some websites can be referred to such as blockchain.info. As we can see from the figure, there are three parts to a transaction: some metadata, a series of inputs, and a series of outputs.

  1. Metadata. There’s some housekeeping information — the size of the transaction, the number of inputs, and the number of outputs. There’s the hash of the entire transaction which serves as a unique ID for the transaction. That’s what allows us to use hash pointers to reference transactions. Finally there’s a “lock_time” field. The way it works is that if you specify any value other than zero for the lock time, it tells miners not to publish the transaction until the specified lock time. The transaction will be invalid before either a specific block number, or a specific point in time, based on the timestamps that are put into blocks. So this is a way of preparing a transaction that can only be spent in the future if it isn’t already spent by then.

  2. Inputs. The transaction inputs form an array, and each input has the same form. An input specifies a previous transaction, so it contains a hash of that transaction, which acts as a hash pointer to it. The input also contains the index of the previous transaction’s outputs that’s being claimed. And then there’s a signature script: scriptSig which we will talk about it later. Remember that we have to sign to show that we actually have the ability to claim those previous transaction outputs.

  3. Outputs. The outputs are again an array. Each output has just two fields. They each have a value, and the sum of all the output values has to be less than or equal to the sum of all the input values. If the sum of the output values is less than the sum of the input values, the difference is a transaction fee to the miner who publishes this transaction. The other field is scriptPubKey script.

2. Bitcoin Scripts

In this section, we will talk about the two scripts: scriptSig and scriptPubKey.

The most common type of transaction in Bitcoin is to redeem a previous transaction output by signing with the correct key. In this case, we want the transaction output to say, “this can be redeemed by a signature from the owner of address X.” An address is a hash of a public key. So merely specifying the address X doesn’t tell us what the public key is, and doesn’t give us a way to check the signature. So instead the transaction output must say “this can be redeemed by a public key that hashes to X, along with a signature from the owner of that public key.”

We concatenate the two scripts, and the resulting script must run successfully in order for the transaction to be valid. In the simplest case, the scriptPubKey just specifies a public key (or an address to which the public key hashes), and scriptSig specifies a signature with that public key.

An example of combined script is shown above. The first part is scriptSig and the second part is scriptPubKey. The script is a stack based language and it does not have loop in it. Thus, we just need to run it line by line to check its validity. There are only two possible outcomes when a Bitcoin script is executed. It either executes successfully with no errors, in which case the transaction is valid. Or, if there’s any error while the script is executing, the whole transaction will be invalid and shouldn’t be accepted into the block chain. Some commonly used commands are shown below:

To execute a script in a stack-based programming language, all we’ll need is a stack that we can push data to and pop data from. The figure below illustrates what happens to the stack after each operation:

The Bitcoin scripting language is very small. There’s only room for 256 instructions, because each one is represented by one byte. Of those 256, 15 are currently disabled, and 75 are reserved. The reserved instruction codes haven’t been assigned any specific meaning yet, but might be instructions that are added later in time. Although this stack based language is not Turing-complete, it still makes Bitcoin more programmable and flexible. It has many applications such green address, escrow transactions, smart contract…

3. Block Structure

The block chain is a clever combination of two different hash-based data structures. The first is a hash chain of blocks. Each block has a block header, a hash pointer to some transaction data, and a hash pointer to the previous block in the sequence. The second data structure is a per-block tree of all of the transactions that are included in that block. This is a Merkle tree and allows us to have a digest of all the transactions in the block in an efficient way. The figure below shows an example:

A block in block chain mainly has a header and a list of transactions. The format of the header is shown below:

The transactions are accumulated in 10 minutes and put into one block for efficiency reasons. There is a special type of transaction called coinbase which is the transaction that generates new Bitcoins. It mostly looks like a normal transaction but with several differences: (1) it always has a single input and a single output, (2) the input doesn’t redeem a previous output and thus contains a null hash pointer, since it is minting new bitcoins and not spending existing coins, (3) the value of the output is currently a little over 25 Bitcoins. The output value is the miner’s revenue from the block. It consists of two components: a flat mining reward, which is set by the system and which halves every 210,000 blocks (about 4 years), and the transaction fees collected from every transaction included in the block. (4) There is a special “coinbase” parameter, which is completely arbitrary — miners can put whatever they want in there.

4. Final remarks

This post introduces how the data structures of Bitcoin transactions and blocks. In next post, we will talk about the security and applications of Blockchain.

Read More

Blockchain and Bitcoin (2)

blockchain bitcoin consensus-algorithm double-spend-attack decentralisation

1. Introduction

We will discuss decentralization of Bitcoin in this post. The scheme is first proposed in the whitepaper published by Satoshi. In peer-to-peer cryptocurrency systems, lots of obstacles arise compared to centralised systems such as latency of network, double-spend-attack, loss of data etc. Because of those constraints, Bitcoin does not purely rely on technical methods to achieve decentralisation, but it’s a combination of technical methods and clever incentive engineering. In this post, we first give the simplied protocol Bitcoin used and then elaborate on this protocol.

2. Distributed consensus protocol

Imagine you’re in charge of the backend for a large social networking company like Facebook. Systems of this sort typically have thousands or even millions of servers, which together form a massive distributed database that records all of the actions that happen in the system. Each piece of information must be recorded on several different nodes in this backend, and the nodes must be in sync about the overall state of the system. Thus, a consensus protocol is needed in this context. A distributed consensus protocol should have two properties as discussed below:

Distributed consensus protocol: There are n nodes that each have an input value. Some of these nodes are faulty or malicious. A distributed consensus protocol has the following two properties:

  1. It must terminate with all honest nodes in agreement on the value

  2. The value must have been generated by an honest node

Thus, distributed consensus protocol plays a central role in the design of distributed systems. However, the lack of global time heavily constrains the set of algorithms that can be used in the consensus protocols. In fact, because of these constraints, much of the literature on distributed consensus is somewhat pessimistic, and many impossibility results have been proven. One very well known impossibility result is the Byzantine Generals Problem.

2.1 Blockchain protocol

Bitcoin uses Blockchain to record the transactions. Thus, Blockchain needs protocol to sync the blocks for nodes in the network. Suppose there is somehow an ability to pick a random node in the system (in Bitcoin, this random node is picked by a procedure called mining), this node gets to propose the next block in the chain. Here is the simplified protocol.

Bitcoin consensus algorithm (simplified) This algorithm is simplified in that it assumes the ability to select a random node in a manner that is not vulnerable to Sybil attacks.

  1. New transactions are broadcast to all nodes

  2. Each node collects new transactions into a block

  3. In each round a random node gets to broadcast its block

  4. Other nodes accept the block only if all transactions in it are valid (unspent, valid signatures)

  5. Nodes express their acceptance of the block by including its hash in the next block they create

By this manner, the system is robust enough to prevent potential attacks which we will talk about later. First, we will talk about how other nodes process the block they received a block(step 4 and step 5).

2.1.1 Receive block

When receiving a new block message broadcasted from step 3, the node should extend the blockchain they are maintaining. There are three kinds of blocks in the blockchain:

  1. blocks in the main branch. The transactions in these blocks are considered at least tentatively confirmed

  2. blocks on side branches off the main branch. These blocks have at least tentatively lost the race to be in the main branch

  3. orphan blocks. These are blocks which don’t link into the main branch, normally because of a missing predecessor or nth-level predecessor.

The largest branch in the blockchain is called the main branch. When receiving a new block, the node will try to add this into main branch, side branches or orphan block pool depending on the hash pointer. After adding it, three cases will arise:

  1. block further extends the main branch;

  2. block extends a side branch but does not add enough difficulty to make it become the new main branch;

  3. block extends a side branch and makes it the new main branch.

Depending on which case happens, the blockchain takes different actions. An overview of the actions is available at protocol.

2.1.2 Advantages

The protocol can prevent many kinds of attacks. Let’s now try to understand why this consensus algorithm works. To do this, let’s consider how a malicious adversary — who we’ll call Alice — may be able to subvert this process.

Stealing Bitcoins. Can Alice simply steal bitcoins belonging to another user at an address she doesn’t control? No. Even if it is Alice’s turn to propose the next block in the chain, she cannot steal other users’ bitcoins. Doing so would require Alice to create a valid transaction that spends that coin. This would require Alice to forge the owners’ signatures which she cannot do if a secure digital signature scheme is used. So as long as the underlying cryptography is solid, she’s not able to simply steal bitcoins.

Denial of service attack. Let’s consider another attack. Say Alice really dislikes some other user Bob. Alice can then decide that she will not include any transactions originating from Bob’s address in any block that she proposes to get onto the block chain. In other words, she’s denying service to Bob. While this is a valid attack that Alice can try to mount, luckily it’s nothing more than a minor annoyance. If Bob’s transaction doesn’t make it into the next block that Alice proposes, he will just wait until an honest node gets the chance to propose a block and then his transaction will get into that block. So that’s not really a good attack either.

Double-spend attack. Let’s assume that Alice is a customer of some online merchant or website run by Bob, who provides some online service in exchange for payment in bitcoins. Let’s say Bob’s service allows the download of some software. So here’s how a double-spend attack might work. Alice adds an item to her shopping cart on Bob’s website and the server requests payment. Then Alice creates a Bitcoin transaction from her address to Bob’s and broadcasts it to the network. Let’s say that some honest node creates the next block, and includes this transaction in that block. So there is now a block that was created by an honest node that contains a transaction that represents a payment from Alice to the merchant Bob. The graph below shows the effect to Blockchain.

And how do we know if this double spend attempt is going to succeed or not? Well, that depends on which block will ultimately end up on the long-term consensus chain — the one with the Alice → Bob transaction or the one with the Alice → Alice transaction. What determines which block will be included? Honest nodes follow the policy of extending the longest valid branch, so which branch will they extend? There is no right answer! At this point, the two branches are the same length — they only differ in the last block and both of these blocks are valid. The node that chooses the next block then may decide to build upon either one of them. Thus, only one of two transactions will finally be kept in the blockchain. Alice cannot achieve double-spend attack in this context. The merchart Bob can wait confirmations for transactions to decide whether let Alice download the software. Thus, the protocol can prevent double-spend attack.

3. Pick up a random node

In this section, we will talk about how to randomly choose a node from the network. Nodes in the network are rewarded by being chosen as this node. Thus, this motivates people to get the opportunity for being chosen through ‘mining’. The rewards include the creation of new coins and transaction fees. At the time of this writing, the value of the block reward is fixed at 25 Bitcoins. But it actually halves every 210,000 blocks. Based on the rate of block creation that we will see shortly, this means that the rate drops roughly every four years (because in expection, a new block is generated every 10 minutes). We’re now in the second period. For the first four years of Bitcoin’s existence, the block reward was 50 bitcoins; now it’s 25. The node chosen will get 25 newly created Bitcoins.

Now, we introduce how the node is chosen through ‘mining’. The node is chosen through proof-of-work. The key idea behind proof-of-work is that we approximate the selection of a random node by instead selecting nodes in proportion to a resource that we hope that nobody can monopolize. If, for example, that resource is computing power, then it’s a proof-of-work system.

Bitcoin achieves proof-of-work using hash puzzles. In order to create a block, the node that proposes that block is required to find a number, or nonce, such that when you concatenate the nonce, the previous hash, and the list of transactions that comprise that block and take the hash of this whole string, then that hash output should be a number that falls into a target space that is quite small in relation to the much larger output space of that hash function. We can define such a target space as any value falling below a certain target value. In this case, the nonce will have to satisfy the inequality:

H(nonce || prev_hash || tx || tx || ... || tx) < target.

The first node which finds this nonce proposes the block. Thus, in order to be chosen, the node should have a great computing power. The incentive to get Bitcoins flourishes the Bitcoin mining industry.

4. Final remarks

This post introduces how Bitcoin achieves concensus and how to choose random node to propose next block. In next post, we will introduce the data format of blocks.

Read More

Blockchain and Bitcoin (1)

blockchain bitcoin data-structure cryptography

1. Introduction

Bitcoin is a peer-to-peer digital asset and a payment system. The system is invented by Satoshi Nakamoto and released as an open-source software in 2009.

Bitcoins are created as a reward for payment processing work in which users offer their computing power to verify and record payments into a public ledger. This activity is called mining and miners are rewarded with transaction fees and newly created bitcoins. Besides being obtained by mining, bitcoins can be exchanged for other currencies, products, and services.

Bitcoin is revolutionising they way people pay and exchange products. In former days, currency is created and exchanged based on trusted institutions. The central bank issues currency and people make their payments via various ways such as commercial bank, PayPal, Alipay, Apple Wallet, etc. This centralised scheme has its own problems like counterfeit. On the contrary, In bitcoin system, transactions take place between users directly, without an intermediary. These transactions are verified by network nodes and recorded in a public distributed ledger called the blockchain.

This blog series is based on some excellent books and tutorials such as Master Bitcoin and Bitcoin and Cryptocurrency Technologies. The official documentation of bitcoin is also very helpful.

In the first post, we give some introduction on data structures and cryptography concepts used in blockchain and bitcoin.

2. Cryptographic Hash function

2.1 Properties

Cryptographic Hash function is used often in blockchain. For a hash function, it takes variable length input x and map it to a fixed size output H(x) (e.g. 256 bits for MD5 and SHA-256). A hash function must have three properties to become a cryptographic hash function:

  1. Collision-resistance: A hash function H is said to be collision resistant if it is infeasible to find two values, x and y , such that x != y , yet H(x)= H(y).

  2. Hiding: A hash function H is hiding if: when a secret value r is chosen from a probability distribution that has high entropy, then given H(r ‖ x) it is infeasible to find x. means concatenation of two strings.

  3. Puzzle friendliness. A hash function H is said to be puzzle-friendly if for every possible n-bit output value y , if k is chosen from a distribution with high entropy, then it is infeasible to find x such that H(k ‖ x) = y in time significantly less than 2n.

Given three properties, those hash functions can have a lot of practical applications. Hiding and puzzle friendliness guaranttee that it is infeasible to reverse-engineer intput x given hash value H(x). In Blockchain system, SHA-256 is the hardcore of the system. Given a data block, SHA-256 maps the data block to 256 bits. According to collision-resistance, since it is infeasible to find two different values that are mapped to the same hash value, we can safely use the 256 bits as the ID of the data block. SHA-256 is generated by a procedure called Merkle-Damgard transform. The procedure is shown below:

2.2 Hash Pointers

A hash pointer is simply a pointer to where some information is stored together with a cryptographic hash of the information. Whereas a regular pointer gives you a way to retrieve the information, a hash pointer also gives you a way to verify that the information hasn’t changed. In regular pointer such as in C/C++, the pointer stores the variable address. In a hash pointer, it simply stores the hash value H(x) of the data x it points to. An example is shown below.

In practice, the hash pointers can be indexed by database or structures like hash table to support efficient retrieval. Given hash pointers, we can use it to build all kinds of data structures. Intuitively, we can take a familiar data structure that uses pointers such as a linked list or a binary search tree and implement it with hash pointers, instead of pointers as we normally would. Here is two examples on blockchain and merkle tree.

2.2.1 Blockchain

Blockchain is just a linked list using hash pointers. Whereas as in a regular linked list where you have a series of blocks, each block has data as well as a pointer to the previous block in the list, in a block chain the previous block pointer will be replaced with a hash pointer. So each block not only tells us where the value of the previous block was, but it also contains a digest of that value that allows us to verify that the value hasn’t changed. We store the head of the list, which is just a regular hash-pointer that points to the most recent data block.

This structures have the property that it is a tamper-evident log, i.e., if the attacher changes the data in the middle, he has to continue changing the hash pointers until the root of the blockchain. In a distributed environment, if others store a copy of the root of the blockchain, it can easily detect the blockchain in the attacker’s machine has been tampered.

2.2.2 Merkle trees

Another useful data structure that we can build using hash pointers is a binary tree. A binary tree with hash pointers is known as a Merkle tree, after its inventor Ralph Merkle. Suppose we have a number of blocks containing data. These blocks comprise the leaves of our tree. We group these data blocks into pairs of two, and then for each pair, we build a data structure that has two hash pointers, one to each of these blocks. These data structures make the next level up of the tree. We in turn group these into groups of two, and for each pair, create a new data structure that contains the hash of each. We continue doing this until we reach a single block, the root of the tree. For example, suppose we have three data blocks, a, b, c. We can generate a merkle tree by doing the following steps:

d1 = dhash(a)

d2 = dhash(b)

d3 = dhash(c)

d4 = dhash(c) #since we have odd number of blocks, we append it with another c

d5 = dhash(d1 ‖ d2)

d6 = dhash(d3 ‖ d4)

d7 = dhash(d5 ‖ d6)

Here is graphical example of merkle tree with 8 data blocks.

Based on the structure of merkle, it can check the membership of data in O(logn) time and O(logn) space. Here is a nice post from Quora illustrating why.

3. Signatures

A digital signature is supposed to be the digital analog to a handwritten signature on paper. We desire two properties from digital signatures that correspond well to the handwritten signature analogy. Firstly, only you can make your signature, but anyone who sees it can verify that it’s valid. Secondly, we want the signature to be tied to a particular document so that the signature cannot be used to indicate your agreement or endorsement of a different document. For handwritten signatures, this latter property is analogous to assuring that somebody can’t take your signature and snip it off one document and glue it onto the bottom of another one.

In practice, the signature is usually implemented as private-public key encryption. The popular algorithm is RSA. A digital signature scheme consists of the following three algorithms:

  1. (sk, pk) := generateKeys(keysize) The generateKeys method takes a key size and generates a key pair. The secret key sk is kept privately and used to sign messages. pk is the public verification key that you give to everybody. Anyone with this key can verify your signature.
  2. sig := sign(sk, message) The sign method takes a message, msg, and a secret key, sk, as input and outputs a signature for the msg under sk.
  3. isValid := verify(pk, message, sig) The verify method takes a message, a signature, and a public key as input. It returns a boolean value, isValid, that will be true if sig is a valid signature for message under public key pk, and false otherwise.

We require that the following two properties hold:

  1. Valid signatures must verify: verify(pk, message, sign(sk, message))== true

  2. Signatures are existentially unforgeable

With signature, the receiver of the message can verify the validity.

4. Final remarks

Given the building blocks of blockchain and bitcoin, in next post, we will introduce the consensus and decentralisation schemes of blockchain and other more advanced features.

Read More