EVM Deep Dives: The Path to Shadowy Super Coder 🥷 💻 - Part 5
Another Weapon in the EVM Hacker’s Arsenal - The Power of Delegate Call
If you’re interested in EVM Hacking you’re going to enjoy this installment of the “EVM Deep Dives” series.
Today we’re going to take a close look at the opcodes CALL & DELEGATECALL. This episode builds on concepts covered in Part 2, Part 3 & Part 4; I recommend reading those if you haven’t yet.
We’re going to cover how these opcodes work at the Solidity level, the EVM level and the Geth client level to give you a complete understanding.
Before we dig into each of these we need to first understand the concept of a contract execution context.
Execution Context
When the EVM executes a smart contract, a context is created for it. The context consists of the following.
The Code
The contract bytecode, which is immutable. It is stored on-chain and referenced using a contract address.
The Stack
The EVM stack that opcodes push values to and pop values from. An empty stack is initialised for each EVM contract execution.
The Memory
The contract memory. A clean memory is initialised for each EVM contract execution.
The Storage
The contract storage, which is persisted across executions. It is stored on-chain and is referenced via a contract address and a storage slot.
The Call Data
The input data for a transaction or message call.
The Return Data
The data returned from a contract function call.
Keep these items in your mind as we proceed through the article.
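To make the list above concrete, here is a minimal Python sketch of an execution context. It is a toy model, not how any client represents it: the class name ExecutionContext and its field names are my own. It only shows that the stack and memory start empty on every execution while storage survives.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionContext:
    """Toy model of the context created for one contract execution."""
    code: bytes                    # immutable contract bytecode
    storage: dict                  # persistent, slot -> value, outlives the call
    calldata: bytes = b""          # input data for this call
    stack: list = field(default_factory=list)             # starts empty
    memory: bytearray = field(default_factory=bytearray)  # starts empty
    returndata: bytes = b""        # data returned by a call


# Persistent storage of one hypothetical contract, shared across executions.
contract_storage = {}

first = ExecutionContext(code=b"\x60\x0c", storage=contract_storage)
first.stack.append(12)    # scratch state, gone once the execution ends
first.storage[0] = 12     # persisted state

second = ExecutionContext(code=b"\x60\x0c", storage=contract_storage)
```

The second execution sees slot 0 set to 12 but gets a fresh stack and memory.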
We will start with a DELEGATECALL Solidity example from the Smart Contract Programmer and reference it throughout.
Solidity Example
The diagram below shows the execution of two function calls on the same contract, one which uses DELEGATECALL and the other which uses CALL.
We’ll run through both and compare how they differ.
Let’s start by noting the constants in this interaction. (Note that if you recreate this in Remix your addresses will likely be different.)
We have two contracts, Contract A & B and an EOA.
EOA Address = 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
Contract A Address = 0x7b96aF9Bd211cBf6BA5b0dd53aa61Dc5806b6AcE
Contract B Address = 0x3328358128832A260C76A4141e19E2A943CD4B6D
We’re going to call the 2 functions in Contract A, setVarsDelegateCall & setVarsCall.
We will pass in the parameters Contract B Address, a uint of 12 and a Wei value of 1000000000000000000 (1 ETH).
Delegate Call
An EOA address calls Contract A’s setVarsDelegateCall with Contract B’s address, uint 12 and value 1000000000000000000 Wei. This in turn makes a delegatecall to Contract B’s setVars(uint256) function with uint 12.
The delegatecall executes the setVars(uint256) code from Contract B but updates Contract A’s storage. The execution has the same storage, msg.sender & msg.value as its parent call setVarsDelegateCall.
The values are set in Contract A’s storage: 12 for num, 0x5b38…c4 for sender (EOA Address) & 1000000000000000000 for value. Despite setVars(uint256) being called by Contract A with no value, when we check msg.sender & msg.value we get the values from the original setVarsDelegateCall.
After the execution of this function we can check the num, sender & value state items of Contract A & B. We will see that none of the values are initialised in Contract B while all are set in Contract A.
Call
An EOA address calls Contract A’s setVarsCall with Contract B’s address, uint 12 and value 1000000000000000000 Wei. This in turn makes a call to Contract B’s setVars(uint256) function with uint 12.
The standard call executes the setVars(uint256) code from Contract B in Contract B’s own context. It uses Contract B’s storage, msg.sender is the immediate caller (Contract A) and msg.value is whatever was sent with the call itself.
The values are set in Contract B’s storage: 12 for num, 0x7b96…ce for sender (Contract A Address) & 0 for value. These values correspond with what we expect since setVars(uint256) was called from Contract A and no Wei value was passed into setVars(uint256) (the 1000000000000000000 Wei was passed into the parent call setVarsCall).
Again after the execution of this function we can check the num, sender & value state items of Contract A & B. We see that the reverse is true this time, none of the values are initialised in Contract A while all are set in Contract B.
Conceptually a “Delegate Call” effectively allows you to copy and paste a function from another contract into your contract. It will be run as if it were executed by your contract and will have access to the same storage, msg.sender & msg.value.
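The difference between the two calls can be sketched in a few lines of Python. This is a toy model under stated assumptions, not EVM code: Account, Frame, call and delegatecall are names I made up, and contract_b_set_vars stands in for Contract B’s setVars(uint256), writing num, sender & value to slots 0, 1 & 2.

```python
from dataclasses import dataclass, field

EOA = "0x5B38Da6a701c568545dCfcB03FcB875f56beddC4"
ADDR_A = "0x7b96aF9Bd211cBf6BA5b0dd53aa61Dc5806b6AcE"
ADDR_B = "0x3328358128832A260C76A4141e19E2A943CD4B6D"
ONE_ETH = 10**18


@dataclass
class Account:
    address: str
    storage: dict = field(default_factory=dict)  # slot -> value


@dataclass
class Frame:
    storage_owner: Account  # whose storage the running code writes to
    sender: str             # msg.sender
    value: int              # msg.value


def contract_b_set_vars(frame, num):
    """Stand-in for Contract B's setVars(uint256)."""
    frame.storage_owner.storage[0] = num           # num
    frame.storage_owner.storage[1] = frame.sender  # sender
    frame.storage_owner.storage[2] = frame.value   # value


def call(parent, target, value, num):
    # CALL: target's storage, the caller becomes msg.sender, value is explicit.
    contract_b_set_vars(Frame(target, parent.storage_owner.address, value), num)


def delegatecall(parent, num):
    # DELEGATECALL: storage, msg.sender and msg.value come from the parent.
    contract_b_set_vars(Frame(parent.storage_owner, parent.sender, parent.value), num)


# Scenario 1: EOA -> A.setVarsDelegateCall -> delegatecall B.setVars(12)
a1, b1 = Account(ADDR_A), Account(ADDR_B)
delegatecall(Frame(a1, EOA, ONE_ETH), 12)

# Scenario 2: EOA -> A.setVarsCall -> call B.setVars(12) with no value
a2, b2 = Account(ADDR_A), Account(ADDR_B)
call(Frame(a2, EOA, ONE_ETH), b2, 0, 12)
```

In scenario 1 only a1.storage is populated (with the EOA as sender and 1 ETH as value); in scenario 2 only b2.storage is populated (with Contract A as sender and 0 as value).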
Delegate Call & Storage Layout
In the above example, you may have noticed the comment in the code for Contract B on line 5 stating “NOTE: storage layout must be the same as contract A”.
Remember a function in a contract maps to some static bytecode that is calculated at compile time.
When we look at solidity code we think in terms of variables. We see the state variables num, sender & value.
The compiled bytecode doesn’t see these variables; it sees storage slots instead. Declared state variables are mapped to storage slots (if you’re unsure how this works check out Part 3).
If we look at Contract B’s setVars(uint256), specifically “num = _num”, this is saying store the value _num into storage slot 0.
When we look at contracts involved in a DELEGATECALL, don’t think about the mapping of num → num, sender → sender. That’s not how it works at the bytecode level.
We need to think in terms of mapping slot 0 → slot 0, slot 1 → slot 1.
The diagram below shows this mapping, along with the corresponding variable names.
Think what would happen if we were to change the order in which our state variables are defined.
It would change their storage slot positions and subsequently the bytecode associated with the setVars(uint256) function.
If we updated Contract B by switching lines 6 and 8 we would be declaring the “value” state variable first and the “num” state variable last.
This means line 11 “num = _num” in setVars(uint256) would now say store value _num into storage slot 2. Line 13 “value = msg.value” would now say store msg.value in storage slot 0.
This means our variable mappings between Contract A & B would no longer match relative to their storage slots.
When we run DELEGATECALL the “num” value is going to be stored in storage slot 2 of Contract A, which maps to the “value” state variable. The same applies when “value” is stored: it’s going to update slot 0, which maps to the “num” state variable.
This is one of the reasons DELEGATECALL can be dangerous.
Above we’ve accidentally replaced the “num” state variable with the “value” state variable and vice versa.
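Here is a small Python sketch of that swap. The layouts are plain dictionaries from variable name to slot index, which is my own simplification of how the compiler assigns slots; set_vars and read are hypothetical helpers.

```python
EOA = "0x5B38Da6a701c568545dCfcB03FcB875f56beddC4"

# Variable name -> slot index, in declaration order.
LAYOUT_A = {"num": 0, "sender": 1, "value": 2}
LAYOUT_B_SWAPPED = {"value": 0, "sender": 1, "num": 2}  # num and value switched


def set_vars(storage, layout, num, sender, value):
    """setVars compiled against `layout`: the bytecode only knows slot numbers."""
    storage[layout["num"]] = num
    storage[layout["sender"]] = sender
    storage[layout["value"]] = value


def read(storage, layout):
    """Read storage back through a contract's own variable names."""
    return {name: storage.get(slot) for name, slot in layout.items()}


storage_a = {}
# DELEGATECALL: B's bytecode (B's slot numbers) runs against A's storage.
set_vars(storage_a, LAYOUT_B_SWAPPED, num=12, sender=EOA, value=10**18)
seen_by_a = read(storage_a, LAYOUT_A)
```

Read through Contract A’s layout, num now holds 1000000000000000000 and value holds 12.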
A hacker won’t be accidentally changing your state variables. They will be conducting a targeted attack.
Imagine we have a contract that makes an open delegatecall and we know the slot location where the owner of that contract is stored.
We would be able to construct a contract with a state variable layout and function that allows us to update the “owner” slot location with a different address. This would enable us to claim ownership of that contract.
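The idea can be modelled in Python as follows. Everything here is hypothetical: Victim, forward and attacker_pwn are invented names, the addresses are placeholders, and I assume the owner sits in slot 0. The point is only that code chosen by the caller runs against the victim’s storage.

```python
OWNER_SLOT = 0  # assumption: the victim keeps `owner` in slot 0


class Victim:
    def __init__(self, owner):
        self.storage = {OWNER_SLOT: owner}

    def forward(self, code, sender):
        """Open delegatecall: runs caller-chosen code against our own storage."""
        code(self.storage, sender)


def attacker_pwn(storage, sender):
    """Attacker's function: its first state variable also lives in slot 0."""
    storage[OWNER_SLOT] = sender


victim = Victim(owner="0xDeployer")
victim.forward(attacker_pwn, sender="0xAttacker")
new_owner = victim.storage[OWNER_SLOT]
```

After the call the owner slot holds the attacker’s address, even though the victim contract never had a function for changing the owner.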
If you’re interested in how these hacks work, take a look at these 2 Ethernaut problems, you have the required knowledge to solve them.
Now let’s jump into the opcodes.
Opcodes
We have a rough idea of how DELEGATECALL works so let’s have a look at the opcodes for DELEGATECALL & CALL.
For DELEGATECALL we have the following input variables:
gas: amount of gas to send to the sub context to execute. The gas that is not used by the sub context is returned to this one.
address: the account whose code to execute.
argsOffset: byte offset in memory where the calldata for the sub context starts.
argsSize: byte size to copy (size of the calldata).
retOffset: byte offset in memory where the return data of the sub context is stored.
retSize: byte size to copy (size of the return data).
CALL has exactly the same input variables plus one additional input, value, which is the amount of Wei to send to the called account.
Delegate call doesn’t require a value input as it is inherited from its parent call. Recall when we mentioned that the execution context has the same storage, msg.sender & msg.value as its parent call.
Both have one output variable, “success”, which is 0 if the sub context reverted and 1 otherwise.
DELEGATECALL will also return success (true at the Solidity level) if it is called on an address that is not a contract and so has no code. This can cause bugs if code expects a delegatecall to return false when there is nothing to execute.
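A rough Python model of this pitfall, plus one defensive pattern (checking that the target has code before trusting the success flag). The function names and the code_at dictionary are assumptions for illustration; in a real contract the check would be done on the target’s code size.

```python
def delegatecall(code_at, target, run):
    """Toy DELEGATECALL: returns the success flag."""
    code = code_at.get(target, b"")
    if not code:
        return True       # no code: nothing runs, so nothing reverts
    return run(code)      # run() returns False if the execution reverted


def checked_delegatecall(code_at, target, run):
    """Defensive variant: treat a target without code as a failure."""
    if not code_at.get(target, b""):
        return False
    return delegatecall(code_at, target, run)


code_at = {"0xB": b"\x00"}         # only the hypothetical address 0xB has code
always_ok = lambda code: True      # stand-in for an execution that succeeds

unchecked = delegatecall(code_at, "0xEmpty", always_ok)
checked = checked_delegatecall(code_at, "0xEmpty", always_ok)
```

The unchecked call reports success for the empty address, while the checked variant reports failure.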
To understand the opcode let’s inspect how DELEGATECALL was executed for the earlier example with Contracts A & B.
DELEGATECALL Opcode Inspection With Remix
Below is a snapshot from the Remix IDE as the DELEGATECALL opcode was called. It corresponds to lines 24 - 26 in the earlier code snippet.
We will look at the items on the stack & memory and see how these values determine the call data that is passed into the DELEGATECALL.
We’ll work our way from the opcode → stack → memory → calldata.
Let’s run through this
On line 24 of the Solidity code, a “delegatecall” is made to Contract B’s setVars(uint256) with a value of 12. This results in the DELEGATECALL opcode being executed.
The DELEGATECALL opcode takes 6 inputs, gas, address, argsOffset, argsSize, retOffset & retSize which are taken off the stack.
Gas = 0x45eb
Address = 0x3328358128832A260C76A4141e19E2A943CD4B6D (Address for Contract B)
ArgsOffset = 0xc4
ArgsSize = 0x24
RetOffset = 0xc4
RetSize = 0x00
Let’s focus on argsOffset & argsSize, which locate the calldata that will be passed to Contract B. These 2 values tell us to go to memory location 0xc4 and copy the next 0x24 (36 in decimal) bytes to get our calldata.
This yields 0x6466414b000000000000000000000000000000000000000000000000000000000000000c, which can be split into 0x6466414b, the function selector for setVars(uint256), and 0x000000000000000000000000000000000000000000000000000000000000000c, which is 12 in decimal and represents our input value for num.
This value maps to what is produced by line 25 in the Solidity code abi.encodeWithSignature("setVars(uint256)", _num)
Note the retSize is equal to 0 since setVars(uint256) doesn’t return anything. If it did the retSize value would be updated and the returned value would be stored at the retOffset.
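The walk from stack → memory → calldata can be reproduced in Python. This is a sketch using the values from the Remix snapshot above; the memory is rebuilt by hand (zeroes up to 0xc4, then the calldata), which is an assumption since we only care about the bytes at argsOffset.

```python
# Stack at the moment DELEGATECALL runs, top of the stack last.
stack = [
    0x00,                                        # retSize
    0xC4,                                        # retOffset
    0x24,                                        # argsSize
    0xC4,                                        # argsOffset
    0x3328358128832A260C76A4141e19E2A943CD4B6D,  # address (Contract B)
    0x45EB,                                      # gas
]
gas, address, args_offset, args_size, ret_offset, ret_size = (
    stack.pop() for _ in range(6)
)

# Hand-built memory: padding up to 0xc4, then the encoded call.
encoded = bytes.fromhex("6466414b" + "00" * 31 + "0c")
memory = bytearray(args_offset) + encoded

# Copy argsSize bytes starting at argsOffset to get the calldata.
calldata = bytes(memory[args_offset:args_offset + args_size])
selector = calldata[:4]                        # first 4 bytes
num = int.from_bytes(calldata[4:36], "big")    # one 32-byte uint256 argument
```

Here selector.hex() gives 6466414b and num is 12, matching the split described above.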
This should give you a good idea of what the opcode is doing under the hood and enable you to connect it back to a real Solidity example.
Now let’s look at the Geth Client Implementation.
A warning, the next section is heavy on the code side.
Geth Implementation
I’m going to focus on a specific part of DELEGATECALL within Geth.
The goal is to show you how the DELEGATECALL opcode differs from the CALL opcode at the storage scope level and how this relates to the SLOAD opcode.
The diagram below looks intimidating but we will run through it step by step. By the end, you’ll understand the subtle implementation differences between DELEGATECALL and CALL in Geth.
We have the DELEGATECALL & CALL opcodes labeled on the left-hand side and the SLOAD opcode labeled bottom right. Let’s see how they’re connected.
Note there are two [ 1 ] ‘s on the diagram. These are the Geth functions for the opcodes DELEGATECALL & CALL found in instructions.go. We can see the values we discussed earlier being popped off the stack into variables. Later in the function, we see that interpreter.evm.DelegateCall and interpreter.evm.Call are called with the values off the stack, the “to address” and the current contract scope.
Note there are two [ 2 ] ‘s on the diagram. Both the evm.DelegateCall & evm.Call functions are executed (found in evm.go). I’ve omitted sections of the functions to focus on the NewContract function call which is creating a new contract context for us to execute in.
Note there are two [ 3 ] ‘s on the diagram. The NewContract function call for evm.DelegateCall & evm.Call are very similar except for 2 items.
In DelegateCall the value parameter is set to nil, remember it inherits its value from its parent context so doesn’t take in this parameter.
The second input into the NewContract function is different. In evm.DelegateCall caller.Address( ) is passed in (Contract A address). In evm.Call addrCopy is passed in, which is equal to the toAddr from the opCall function (Contract B address). This difference will be very important later. Note both are of type AccountRef.
DelegateCall’s NewContract will return a Contract struct. The AsDelegate( ) function is called (found in contract.go). It sets the msg.sender & msg.value to that of the original call (EOA address & 1000000000000000000 Wei). This is not done on the Call implementation.
Both evm.DelegateCall & evm.Call execute the NewContract function (found in contract.go). Note “object ContractRef” is the second input variable for NewContract which maps to the AccountRef’s we discussed in [ 3 ]
This “object ContractRef” is used along with a number of other values to initialise a Contract. The “object ContractRef” is mapped to “self” in the Contract struct.
The Contract struct (found in contract.go) has the field “self” which is what we are interested in. You can see some of the other fields that relate to items we discussed earlier when talking about contract execution context.
We are now jumping to the SLOAD opcode implementation in Geth (found in instructions.go). It runs GetState on scope.Contract.Address( ). The “Contract” within this statement refers to the Contract struct in [ 7 ].
The implementation of Address( ) for a Contract object (found in contract.go). It in turn calls self.Address( ).
Self is of type ContractRef so type ContractRef must have an Address( ) function.
ContractRef is an interface (found in contract.go) and defines that for something to be a ContractRef it must implement an Address( ) function that returns a common.Address (common.Address is defined as a byte array of length 20, the length of an Ethereum address).
If we refer back to section [ 3 ] we discussed the different AccountRef values in evm.DelegateCall and evm.Call which became “self” on the Contract objects. We can see that AccountRef is actually just a common.Address but it implements an Address( ) function. AccountRef, therefore, meets the ContractRef interface requirements.
The Address( ) function for an AccountRef just casts the AccountRef to a common.Address, which in our case would be Contract A’s address for evm.DelegateCall & Contract B’s address for evm.Call. This means the SLOAD opcode we looked at in [ 8 ] is looking at Contract A’s storage for the DELEGATECALL opcode and Contract B’s storage for the CALL opcode.
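The chain from Contract → self → Address( ) → storage lookup can be mirrored in Python. To be clear, this is not Geth’s Go code: the names are Python-ised stand-ins (self_ref for the “self” field, sload for the SLOAD implementation), the state is a plain dictionary and the stored values 111 and 222 are made up.

```python
from dataclasses import dataclass
from typing import Protocol


class ContractRef(Protocol):
    """Anything that can report an address (mirrors the interface idea)."""
    def address(self) -> bytes: ...


class AccountRef(bytes):
    """Just an address, but with an address() method, so it is a ContractRef."""
    def address(self) -> bytes:
        return bytes(self)


@dataclass
class Contract:
    caller: ContractRef
    self_ref: ContractRef  # the "self" field discussed above

    def address(self) -> bytes:
        return self.self_ref.address()


def sload(state, contract, slot):
    """Stand-in for SLOAD: read the slot under the contract's address."""
    return state.get(contract.address(), {}).get(slot, 0)


ADDR_A = bytes.fromhex("7b96aF9Bd211cBf6BA5b0dd53aa61Dc5806b6AcE")
ADDR_B = bytes.fromhex("3328358128832A260C76A4141e19E2A943CD4B6D")
state = {ADDR_A: {0: 111}, ADDR_B: {0: 222}}  # made-up slot values

# DelegateCall: "self" is the caller's own address (Contract A).
delegate_frame = Contract(caller=AccountRef(ADDR_A), self_ref=AccountRef(ADDR_A))
# Call: "self" is the address being called (Contract B).
call_frame = Contract(caller=AccountRef(ADDR_A), self_ref=AccountRef(ADDR_B))

from_delegatecall = sload(state, delegate_frame, 0)
from_call = sload(state, call_frame, 0)
```

The same sload reads Contract A’s slot for the delegate frame and Contract B’s slot for the call frame, purely because of what was passed in as “self”.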
Seeing the Geth implementation shows you how the storage, msg.sender & msg.value are altered for a DelegateCall. You should now have a comprehensive understanding of the DELEGATECALL opcode.
You made it, congrats! Until next time.
noxx
Twitter @noxx3xxon