Ethereum Smart Contract Development

EVM

The EVM is a stack-based interpreter with a memory byte array and key-value storage. To visualize a stack, let us return to our brunch buffet analogy and think about how the clean plates are kept one above another: the plate placed on top last is the first one to be taken.

Figure 2.10 illustrates such a stack as a last-in-first-out (LIFO) structure with its two essential operations, push and pop:

Figure 2.10: Stack operations in a nutshell
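The LIFO behavior described above can be sketched in a few lines of Python. This is only an illustration, not code from any Ethereum client: the `Stack` class and its method names are hypothetical, values are masked to 32 bytes to mirror the EVM word size, and the depth limit of 1024 is a property of the EVM that this section does not discuss.

```python
WORD_MASK = 2**256 - 1   # one stack element is a 32-byte (256-bit) word
MAX_DEPTH = 1024         # EVM stack depth limit (not covered in the text)


class Stack:
    """Hypothetical LIFO container holding 32-byte words."""

    def __init__(self):
        self._items = []

    def push(self, value: int) -> None:
        if len(self._items) >= MAX_DEPTH:
            raise OverflowError("stack overflow")
        # Keep only the low 256 bits so every element fits in 32 bytes.
        self._items.append(value & WORD_MASK)

    def pop(self) -> int:
        if not self._items:
            raise IndexError("stack underflow")
        return self._items.pop()


stack = Stack()
for plate in (1, 2, 3):      # plates are stacked in this order
    stack.push(plate)

popped = [stack.pop() for _ in range(3)]
print(popped)                # [3, 2, 1]: the last plate pushed comes off first
```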

Elements on the EVM stack are 32 bytes long, and all keys and values in storage are also 32 bytes. Smart contracts written in high-level languages are compiled into machine-level operation codes (opcodes), which the EVM executes on the blockchain at runtime. These opcodes have access to three types of space to store data:

  • The stack, a LIFO container to which values can be pushed and popped.
  • Memory, an infinitely expandable byte array.
  • The contract's long-term storage is a key-value store. Unlike stack and memory, which reset after computation ends, storage persists for the long term.
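The difference in lifetime between memory and storage can be illustrated with a small sketch. The `Memory` and `Storage` classes and the `run_once` helper below are hypothetical names chosen for this example; the sketch assumes memory grows in 32-byte words and that unset storage slots read as zero.

```python
class Memory:
    """Byte array that expands on demand and is discarded after execution."""

    def __init__(self):
        self.data = bytearray()

    def _extend(self, end: int) -> None:
        # Grow in whole 32-byte words so that offsets up to `end` are valid.
        if end > len(self.data):
            new_size = (end + 31) // 32 * 32
            self.data.extend(bytes(new_size - len(self.data)))

    def store_word(self, offset: int, value: int) -> None:
        self._extend(offset + 32)
        self.data[offset:offset + 32] = value.to_bytes(32, "big")

    def load_word(self, offset: int) -> int:
        self._extend(offset + 32)
        return int.from_bytes(self.data[offset:offset + 32], "big")


class Storage:
    """Key-value store that persists between executions."""

    def __init__(self):
        self.slots = {}

    def store(self, key: int, value: int) -> None:
        self.slots[key] = value

    def load(self, key: int) -> int:
        return self.slots.get(key, 0)   # unset slots read as zero


storage = Storage()                     # created once, survives every call


def run_once(value: int) -> int:
    memory = Memory()                   # fresh for each execution
    memory.store_word(0, value)
    storage.store(1, memory.load_word(0))
    return len(memory.data)             # memory is dropped on return


size = run_once(42)
print(size, storage.load(1))            # 32 42
```

After `run_once` returns, the `Memory` object is gone, while the value written to slot 1 of `storage` is still readable.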

The code can also access the value, sender, and data of the incoming message, as well as block header data, and it can return a byte array of data as output. There are around 100 opcodes, grouped into ranges that start at multiples of 16 (0x00, 0x10, 0x20, and so on). The opcodes have the following general format:

# schema: [opcode, ins, outs, gas]
# e.g., from the crypto group:
0x20: ['SHA3', 2, 1, 30]

The preceding format tells us how many items each opcode pops off the stack (ins) and pushes back onto it (outs), as well as how much gas it consumes. We will discuss gas and the meaning of Turing completeness in the upcoming sections.
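To make the schema concrete, the following sketch reads one entry of such a table. Only the SHA3 row is taken from the text; the `OPCODES` dictionary name and the helper functions `describe`, `stack_height_after`, and `group_base` are assumptions made for this illustration.

```python
# schema: [opcode, ins, outs, gas]
# Hypothetical one-row excerpt; only the SHA3 entry comes from the text.
OPCODES = {
    0x20: ["SHA3", 2, 1, 30],
}


def describe(opcode_byte: int) -> str:
    """Spell out what a table row means."""
    name, ins, outs, gas = OPCODES[opcode_byte]
    return f"{name}: pops {ins}, pushes {outs}, costs {gas} gas"


def stack_height_after(height: int, opcode_byte: int) -> int:
    """Stack height after the opcode runs, given the height before it."""
    _, ins, outs, _ = OPCODES[opcode_byte]
    if height < ins:
        raise IndexError("stack underflow")
    return height - ins + outs


def group_base(opcode_byte: int) -> int:
    """Start of the 16-value range that the opcode belongs to."""
    return opcode_byte // 16 * 16


print(describe(0x20))               # SHA3: pops 2, pushes 1, costs 30 gas
print(stack_height_after(5, 0x20))  # 4
print(hex(group_base(0x20)))        # 0x20
```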

Figure 2.11 shows how the opcodes are broadly segregated into nine different groups based on their functionality:

Figure 2.11: EVM opcode categorization for a Python-based Ethereum client

The EVM is designed to permit untrusted code to run on a global, public, blockchain-based operating system.

In order to accomplish this, the following security restrictions are imposed:

  • Every computational step a program executes must be paid for up front, which prevents denial-of-service attacks.
  • Programs interact with each other only by transmitting a single, arbitrary-length byte array. They do not have access to each other's execution state.
  • Program execution is isolated. A virtual machine program may access and modify only its own internal state, and may trigger the execution of other programs on the EVM.
  • Code execution is fully deterministic and produces identical state transitions in any conforming implementation that begins in an identical state.

The formal execution model of EVM code is surprisingly simple. While the EVM is running, its full computational state can be defined by the tuple (block_state, message, code, memory, transaction, stack, gas, pc), where block_state is the global state, containing all accounts with their balances and storage.

At the start of every execution round, the current instruction is found by taking the byte of code at the position given by the program counter (pc), and each instruction has its own definition in terms of how it affects the tuple.

For example, an ADD operation pops two items from the top of the stack and pushes their sum, reduces gas by one, and increments pc by one, while SSTORE pops the top two items off the stack and inserts the second item into the contract's storage at the index specified by the first item. Even though there are many ways to optimize EVM execution via just-in-time compilation, a basic Ethereum implementation can be done in a few hundred lines of code.
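The fetch-and-execute round described above can be sketched as a toy interpreter. This is a minimal illustration under stated assumptions, not a conforming EVM: the `State` class and the `step` and `run` functions are hypothetical, only four opcodes are supported, the state tuple is reduced to code, gas, pc, stack, and storage, and every instruction is charged a flat one unit of gas as in the simplified description (real clients use per-opcode costs).

```python
from dataclasses import dataclass, field

# Opcode byte values as used by the EVM.
STOP, ADD, SSTORE, PUSH1 = 0x00, 0x01, 0x55, 0x60
WORD = 2**256   # arithmetic wraps around at 32 bytes


@dataclass
class State:
    """Reduced execution state: a subset of the tuple named in the text."""
    code: bytes
    gas: int
    pc: int = 0
    stack: list = field(default_factory=list)
    storage: dict = field(default_factory=dict)


def step(s: State) -> bool:
    """Run one instruction; return False once execution halts."""
    # Fetch the byte at pc; running off the end of the code acts like STOP.
    op = s.code[s.pc] if s.pc < len(s.code) else STOP
    if op == STOP:
        return False
    if s.gas < 1:
        raise RuntimeError("out of gas")
    s.gas -= 1                      # flat illustrative cost, an assumption
    if op == ADD:
        a, b = s.stack.pop(), s.stack.pop()
        s.stack.append((a + b) % WORD)
        s.pc += 1
    elif op == SSTORE:
        index, value = s.stack.pop(), s.stack.pop()
        s.storage[index] = value    # second item stored at the first item's index
        s.pc += 1
    elif op == PUSH1:
        nxt = s.pc + 1
        s.stack.append(s.code[nxt] if nxt < len(s.code) else 0)
        s.pc += 2                   # skip the opcode and its one-byte operand
    else:
        raise ValueError(f"unsupported opcode {op:#04x}")
    return True


def run(code: bytes, gas: int) -> State:
    s = State(code=code, gas=gas)
    while step(s):
        pass
    return s


# PUSH1 2, PUSH1 3, ADD, PUSH1 0, SSTORE, STOP  ->  storage[0] = 2 + 3
program = bytes([PUSH1, 2, PUSH1, 3, ADD, PUSH1, 0, SSTORE, STOP])
final = run(program, gas=100)
print(final.storage, final.gas)     # {0: 5} 95
```

Running the same program with too little gas, for example `run(program, gas=2)`, raises the out-of-gas error before ADD executes, which is how paying for each step bounds the amount of computation.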