A Blockchain is

A distributed database that maintains a continuously growing list of records, called blocks.

A block contains the following information:

  • Index (Block #): Which block is it? (The Genesis Block has index 0.)

  • Hash: Is the block valid?

  • Previous Hash: Is the previous block valid?

  • Timestamp: When was the block added?

  • Data: What information is stored on the block?

  • Nonce: How many iterations did we go through before we found a valid block?
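The fields listed above can be sketched as a small class. This is only an illustration: the class name `Block` and the placeholder values are assumptions, not part of any particular implementation.

```javascript
// Hypothetical Block class; field names follow the list above.
class Block {
  constructor(index, previousHash, timestamp, data, hash, nonce) {
    this.index = index               // position in the chain (0 for the Genesis Block)
    this.previousHash = previousHash // hash of the block before this one
    this.timestamp = timestamp       // when the block was added
    this.data = data                 // information stored on the block
    this.hash = hash                 // hash of this block
    this.nonce = nonce               // iterations needed to find a valid hash
  }
}

// Placeholder values, for illustration only.
const block = new Block(1, "<hash of block 0>", 1700000000, "example data", "<hash of this block>", 42)
console.log(block.index, block.nonce)
```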

Genesis Block

Every blockchain starts with the Genesis Block. Each block on the blockchain depends on the previous block, so the Genesis Block is needed to mine our first block.
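The dependency between consecutive blocks can be sketched as a simple link check. The function name `isValidNextBlock` and the placeholder hash strings are assumptions made for this example; a real chain would use actual hash values.

```javascript
// Hypothetical Genesis Block: it has no predecessor, so previousHash is a placeholder.
const genesisBlock = { index: 0, previousHash: "0", hash: "placeholder-genesis-hash" }

// A block is linked correctly if it directly follows the previous block
// and records that block's hash.
function isValidNextBlock(previousBlock, nextBlock) {
  return nextBlock.index === previousBlock.index + 1 &&
    nextBlock.previousHash === previousBlock.hash
}

const firstBlock = { index: 1, previousHash: genesisBlock.hash, hash: "placeholder-hash-1" }
const unlinkedBlock = { index: 1, previousHash: "something-else", hash: "placeholder-hash-x" }

console.log(isValidNextBlock(genesisBlock, firstBlock))    // true
console.log(isValidNextBlock(genesisBlock, unlinkedBlock)) // false
```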

Hash

A hash value is a numeric value of a fixed length that, for practical purposes, uniquely identifies data.

The hash is calculated by taking the index, previous block hash, timestamp, block data, and nonce as input.

CryptoJS.SHA256(index + previousHash + timestamp + data + nonce)
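As a self-contained sketch of the same calculation, the example below uses Node's built-in `crypto` module instead of CryptoJS (a substitution made here so the code runs without extra dependencies). The function name `calculateHash` and the sample inputs are assumptions.

```javascript
const crypto = require("crypto")

// Concatenate the block fields into one string and hash it with SHA-256.
function calculateHash(index, previousHash, timestamp, data, nonce) {
  const input = String(index) + previousHash + timestamp + data + nonce
  return crypto.createHash("sha256").update(input).digest("hex")
}

const hash = calculateHash(0, "0", 1700000000, "genesis", 0)
console.log(hash.length) // 64 (a SHA-256 digest is 256 bits, i.e. 64 hex characters)
```

Changing any one input, including the nonce, produces a different hash, which is what the mining loop relies on.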

Leading Zeros in the Block Hash

A valid hash must start with at least a specified number of leading zeros. The number of leading zeros required is called the difficulty.

This is also known as the Proof-of-Work system.
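A minimal sketch of the difficulty check, assuming the hash is a hex string. Passing `difficulty` as a second parameter is a choice made for this example; an implementation could equally keep it as shared state.

```javascript
// A hash is valid if it starts with at least `difficulty` zero characters.
function isValidHashDifficulty(hash, difficulty) {
  return hash.startsWith("0".repeat(difficulty))
}

console.log(isValidHashDifficulty("000a3f", 3)) // true
console.log(isValidHashDifficulty("00a3f1", 3)) // false
```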

Nonce

A nonce is a number used to find a valid hash.

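let nonce = 0
// Hash the block with the starting nonce; toString() gives the hex string
let input = index + previousHash + timestamp + data + nonce
let hash = CryptoJS.SHA256(input).toString()

// Keep incrementing the nonce until the hash meets the difficulty requirement
while (!isValidHashDifficulty(hash)) {
  nonce += 1
  input = index + previousHash + timestamp + data + nonce
  hash = CryptoJS.SHA256(input).toString()
}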

The nonce is incremented until the hash is valid.

The process of finding a nonce that corresponds to a valid hash is mining.

As the difficulty increases, the number of possible valid hashes decreases. With fewer possible valid hashes, it takes more processing power to find a valid hash.
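The effect of difficulty on mining effort can be observed with a small experiment. This sketch uses Node's built-in `crypto` module rather than CryptoJS, and the function names `sha256Hex` and `mine` as well as the sample block values are assumptions.

```javascript
const crypto = require("crypto")

function sha256Hex(input) {
  return crypto.createHash("sha256").update(input).digest("hex")
}

// Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros.
function mine(index, previousHash, timestamp, data, difficulty) {
  const prefix = "0".repeat(difficulty)
  let nonce = 0
  let hash = sha256Hex(String(index) + previousHash + timestamp + data + nonce)
  while (!hash.startsWith(prefix)) {
    nonce += 1
    hash = sha256Hex(String(index) + previousHash + timestamp + data + nonce)
  }
  return { nonce, hash }
}

// Print the nonce found for each difficulty level.
for (const difficulty of [1, 2, 3]) {
  const { nonce, hash } = mine(1, "0", 1700000000, "example data", difficulty)
  console.log(difficulty, nonce, hash)
}
```

With hex-encoded hashes, each additional leading zero makes a random hash 16 times less likely to qualify, so the expected number of attempts grows roughly as 16 to the power of the difficulty.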

Ethereum Virtual Machine (EVM)

The Ethereum Virtual Machine or EVM is the runtime environment for smart contracts in Ethereum. It is not only sandboxed but actually completely isolated, which means that code running inside the EVM has no access to the network, filesystem or other processes. Smart contracts even have limited access to other smart contracts.

Accounts

There are two kinds of accounts in Ethereum which share the same address space:

  • External accounts, which are controlled by public-private key pairs (i.e. humans)

  • Contract accounts, which are controlled by the code stored together with the account

The address of an external account is determined from the public key, while the address of a contract is determined at the time the contract is created (it is derived from the creator address and the number of transactions sent from that address, the so-called ‘nonce’).

Every account has a persistent key-value store mapping 256-bit words to 256-bit words called storage.

Every account has a balance in Ether (in Wei, to be exact), which can be modified by sending transactions that include Ether.

Transactions

A transaction is a message that is sent from one account to another account (which might be the same or the special zero-account). It can include binary data (its payload) and Ether.

If the target account contains code, that code is executed and the payload is provided as input data.

If the target account is the zero-account (the account with the address 0), the transaction creates a new contract.

As mentioned, the address of that contract is not the zero address but an address derived from the sender and its number of transactions sent (the nonce). The payload of such a contract creation transaction is taken to be EVM bytecode and executed to generate the new contract.

The output of the bytecode, namely the new contract, is permanently stored.

Gas

Each transaction is charged with a certain amount of gas, whose purpose is to limit the amount of work that is needed to execute the transaction and to pay for this execution.

While the EVM executes the transaction, the gas is gradually depleted according to specific rules.

The gas price is a value set by the creator of the transaction, who has to pay gas_price * gas.
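As a worked example of gas_price * gas, the sketch below computes a fee in Wei using BigInt to avoid rounding. The gas price of 20 Gwei is a hypothetical value; 21000 is the gas consumed by a plain Ether transfer. The function name `transactionFee` is an assumption.

```javascript
// All amounts are in Wei; BigInt avoids floating-point precision loss.
const WEI_PER_GWEI = 10n ** 9n

function transactionFee(gasUsed, gasPriceWei) {
  return gasUsed * gasPriceWei
}

const gasUsed = 21000n                // gas for a plain Ether transfer
const gasPrice = 20n * WEI_PER_GWEI   // hypothetical gas price of 20 Gwei
const fee = transactionFee(gasUsed, gasPrice)
console.log(fee.toString()) // "420000000000000" Wei, i.e. 0.00042 Ether
```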

Storage, Memory and the Stack

Each account has a persistent memory area which is called storage. Storage is a key-value store that maps 256-bit words to 256-bit words.

The second memory area is called memory, of which a contract obtains a freshly cleared instance for each message call. Memory is linear and can be addressed at byte level, but reads are limited to a width of 256 bits, while writes can be either 8 bits or 256 bits wide.

Memory is expanded by a word (256 bits) when accessing (either reading or writing) a previously untouched memory word. At the time of expansion, the cost in gas must be paid. Memory is more costly the larger it grows (it scales quadratically).
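The quadratic growth can be made concrete with the memory cost formula from the Ethereum Yellow Paper, which is not stated in the text above and is brought in here as an assumption: for a memory size of a words, the total cost is 3 * a + floor(a * a / 512) gas. The function names are hypothetical.

```javascript
const G_MEMORY = 3n // gas per memory word (Yellow Paper constant)

// Total gas cost of having `words` 256-bit words of memory allocated.
function memoryCost(words) {
  return G_MEMORY * words + (words * words) / 512n // BigInt division rounds down
}

// Gas charged when memory grows from oldWords to newWords.
function expansionCost(oldWords, newWords) {
  return newWords > oldWords ? memoryCost(newWords) - memoryCost(oldWords) : 0n
}

console.log(memoryCost(32n))          // 98n  (96 linear + 2 quadratic)
console.log(memoryCost(1024n))        // 5120n (3072 linear + 2048 quadratic)
console.log(expansionCost(32n, 1024n)) // 5022n
```

For small sizes the linear term dominates; the quadratic term only becomes significant once memory reaches hundreds of words.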

The EVM is not a register machine but a stack machine, so all computations are performed on an area called the stack. Access to the stack is limited to the top end in the following way:

  • It is possible to copy one of the topmost 16 elements to the top of the stack or swap the topmost element with one of the 16 elements below it.

  • All other operations take the topmost 2 (or 1, or more, depending on the operation) elements from the stack and push the result onto the stack.

  • It is possible to move stack elements to storage or memory.

  • It is not possible to just access arbitrary elements deeper in the stack without removing the top of the stack.
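The access rules above can be illustrated with a toy stack. This is a sketch only, not EVM code: the class `ToyStack` and its method names are assumptions, and `add` stands in for "all other operations". The 1024-element limit is the EVM's maximum stack size.

```javascript
const WORD_MODULUS = 2n ** 256n // stack items are 256-bit words
const MAX_STACK_SIZE = 1024     // maximum number of stack elements

class ToyStack {
  constructor() {
    this.items = []
  }

  push(value) {
    if (this.items.length >= MAX_STACK_SIZE) throw new Error("stack overflow")
    this.items.push(BigInt(value) % WORD_MODULUS)
  }

  pop() {
    if (this.items.length === 0) throw new Error("stack underflow")
    return this.items.pop()
  }

  // Copy the n-th element from the top (n = 1 is the top itself) onto the top.
  dup(n) {
    if (n < 1 || n > 16 || n > this.items.length) throw new Error("invalid dup")
    this.push(this.items[this.items.length - n])
  }

  // Swap the top element with the n-th element below it.
  swap(n) {
    if (n < 1 || n > 16 || n >= this.items.length) throw new Error("invalid swap")
    const top = this.items.length - 1
    const temp = this.items[top]
    this.items[top] = this.items[top - n]
    this.items[top - n] = temp
  }

  // Take the two topmost elements and push their sum (wrapping at 256 bits).
  add() {
    const a = this.pop()
    const b = this.pop()
    this.push(a + b)
  }
}

const stack = new ToyStack()
stack.push(2)
stack.push(3)
stack.dup(2)  // stack is now 2, 3, 2
stack.add()   // stack is now 2, 5
stack.swap(1) // stack is now 5, 2
console.log(stack.items) // [ 5n, 2n ]
```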

Instruction Set

The instruction set of the EVM is kept minimal in order to avoid incorrect implementations which could cause consensus problems. All instructions operate on the basic data type, 256-bit words.

The usual arithmetic, bit, logical and comparison operations are present. Conditional and unconditional jumps are possible. Furthermore, contracts can access relevant properties of the current block like its number and timestamp.

Message Calls

Contracts can call other contracts or send Ether to non-contract accounts by means of message calls.

Message calls are similar to transactions in that they have a source, a target, a data payload, Ether, gas, and return data.

In fact, every transaction consists of a top-level message call which in turn can create further message calls.

A contract can decide how much of its remaining gas should be sent with the inner message call and how much it wants to retain. If an out-of-gas exception (or any other exception) happens in the inner call, this will be signalled by an error value put onto the stack. In this case, only the gas sent together with the call is used up.
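The gas accounting for a single inner call can be sketched with a toy model. It deliberately ignores the finer rules of the real EVM and only shows that a failing inner call costs the forwarded gas while the retained gas survives. The function name `innerCall` and all numbers are assumptions.

```javascript
// Toy model: the caller splits its remaining gas into a forwarded part and a retained part.
function innerCall(gasRemaining, gasForwarded, gasNeededByCallee) {
  if (gasForwarded > gasRemaining) throw new Error("cannot forward more gas than remains")
  const gasRetained = gasRemaining - gasForwarded
  if (gasNeededByCallee > gasForwarded) {
    // Out-of-gas in the inner call: the forwarded gas is used up,
    // the caller continues with what it retained.
    return { success: false, gasLeft: gasRetained }
  }
  // Success: unused forwarded gas comes back to the caller.
  return { success: true, gasLeft: gasRetained + (gasForwarded - gasNeededByCallee) }
}

const ok = innerCall(100000, 30000, 20000)
const failed = innerCall(100000, 30000, 50000)
console.log(ok)     // { success: true, gasLeft: 80000 }
console.log(failed) // { success: false, gasLeft: 70000 }
```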

In summary, a called contract (which can be the same as the caller) will receive a freshly cleared instance of memory and has access to the call payload, which will be provided in a separate area called the calldata. After it has finished execution, it can return data, which will be stored at a location in the caller’s memory preallocated by the caller.

Calls are limited to a depth of 1024, which means that for more complex operations, loops should be preferred over recursive calls.

Delegatecall / Callcode and Libraries

There exists a special variant of a message call, named delegatecall, which is identical to a message call apart from the fact that the code at the target address is executed in the context of the calling contract and msg.sender and msg.value do not change their values.

This means that a contract can dynamically load code from a different address at runtime. Storage, current address and balance still refer to the calling contract; only the code is taken from the called address.
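The difference in execution context can be modelled with plain objects. This is a conceptual sketch, not EVM behaviour in detail: the objects `callerContract` and `libraryContract`, the addresses, and the helper functions `call` and `delegatecall` are all hypothetical.

```javascript
// "Code" is a function that writes to the storage of whatever context it runs in.
function setValue(context, value) {
  context.storage.set("value", value)
}

const callerContract = { address: "0xCaller", storage: new Map() }
const libraryContract = { address: "0xLibrary", storage: new Map(), code: setValue }

// Normal message call: the target's code runs against the target's own storage.
function call(target, value) {
  target.code(target, value)
}

// Delegatecall: the target's code runs against the calling contract's storage.
function delegatecall(current, target, value) {
  target.code(current, value)
}

call(libraryContract, 1)
delegatecall(callerContract, libraryContract, 2)

console.log(libraryContract.storage.get("value")) // 1
console.log(callerContract.storage.get("value"))  // 2
```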

This makes it possible to implement the ‘library’ feature in Solidity: Reusable library code that can be applied to a contract’s storage to implement a complex data structure.

Logs

It is possible to store data in a special indexed data structure that maps all the way up to the block level. This feature, called logs, is used by Solidity to implement events. Contracts cannot access log data after it has been created, but it can be efficiently accessed from outside the blockchain.

Since some part of the log data is stored in bloom filters, it is possible to search for this data in an efficient and cryptographically secure way, so network peers that do not download the whole blockchain (‘light clients’) can still find these logs.
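How a bloom filter supports this kind of search can be sketched with a toy version. Note the assumptions: this example uses SHA-256 from Node's `crypto` module, whereas Ethereum's filter is built on Keccak-256; the filter size, the function names and the sample strings are chosen for illustration.

```javascript
const crypto = require("crypto")

const BLOOM_BITS = 2048 // size of the toy filter in bits

// Derive three bit positions for an item from the first six bytes of its digest.
function bitPositions(item) {
  const digest = crypto.createHash("sha256").update(item).digest()
  return [0, 2, 4].map(offset => digest.readUInt16BE(offset) % BLOOM_BITS)
}

function addToBloom(bloom, item) {
  for (const pos of bitPositions(item)) {
    bloom[pos >> 3] |= 1 << (pos & 7)
  }
}

// false means "definitely not present"; true means "possibly present".
function mightContain(bloom, item) {
  return bitPositions(item).every(pos => (bloom[pos >> 3] & (1 << (pos & 7))) !== 0)
}

const bloom = new Uint8Array(BLOOM_BITS / 8)
console.log(mightContain(bloom, "Transfer")) // false (nothing added yet)
addToBloom(bloom, "Transfer")
console.log(mightContain(bloom, "Transfer")) // true
```

A client can skip any block whose filter returns false for the item it is looking for, and only needs to inspect the actual logs when the filter returns true. False positives are possible, false negatives are not.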

Create

Contracts can even create other contracts using a special opcode. The only difference between these create calls and normal message calls is that the payload data is executed and the result stored as code, and the caller/creator receives the address of the new contract on the stack.

Self-destruct

The only way code is removed from the blockchain is when a contract at that address performs the selfdestruct operation. The remaining Ether stored at that address is sent to a designated target, and then the storage and code are removed from the state.

Even if a contract’s code does not contain a call to selfdestruct, it can still perform that operation using delegatecall or callcode.