MEV Memoirs: Into the Arena - Chapter 1, Part 1 🤖
From Nothing to Something - How MevAlphaLeak entered the MEV arena
It’s April 2021, MEV Searchers roam the chain looking for their fortunes, snipers are everywhere, the PvP battles are heating up.
A new entrant is about to enter the fray.
On block 12342861 a Searcher appeared from the shadows. He came with nothing and took his first shot.
This is the story of how MevAlphaLeak entered the MEV arena.
Through The Eyes of a Searcher
The exploration of this story and subsequent deep dive gives us a chance to see the world through the Searcher’s eyes at that point in time.
Our task is the same as a mechanic who is taking apart a car to see how it works. Through the exercise of reverse-engineering the parts, the mechanic can begin to understand the thought process of the engineer/designer.
This is far more valuable than simply copying a strategy. As you collect these searcher perspectives you can then build & innovate on the ideas that have come before you.
What Happened?
We’ll start with a high-level overview of what happened in block 12342861 and then zoom in on a specific part of the deployed contract for the deep dive.
MevAlphaLeak’s first transactions are located at the top of block 12342861, so let’s start the investigation there.
This section shows the first 2 transactions from block 12342861 on Etherscan. The first creates an ApeBot Contract and the second interacts with that contract. We can see that mevalphaleak.eth is the EOA that initiated the transactions. Note mevalphaleak.eth was a brand new address with 0 ETH in the account.
The created ApeBot Contract can be viewed on Etherscan. The contract address is 0x666f80a198412bCb987c430831B57AD61facB666. Note that the contract source code has been verified so we can see the Solidity code. Searchers typically leave their contracts unverified to obscure what is going on.
The second transaction in the block interacts with the ApeBot contract. Note that the “Interacted with (To)” section has the ApeBot contract address 0x666f80a198412bCb987c430831B57AD61facB666 as its entry point.
By looking at tokens transferred we get a rough idea of what happened. A flashloan was taken from dYdX and then used in an arbitrage opportunity between EtherDelta 2 & Uniswap V2 for the CST token. We can see the flashloan is paid back at the end.
There are two interesting transfers at the end of the transaction. The first sends ~0.4493 ETH to “Spark Pool” (0x5A0b54D5dc17e0AadC383d2db43B0a0D3E029c4c), a large Ethereum mining pool. The second sends ~0.0004 ETH to 0x90102a92e8E40561f88be66611E5437FEb339e79, the address behind the ENS name mevalphaleak.eth. You’ll soon find out the relevance of these transfers.
The gas price of the transaction is set to zero (the contract creation transaction also had a gas price of zero). At this point you should be asking why any miner would accept a zero gas price transaction.
The reason a miner would accept those transactions relates to the transfer of ~0.4493 ETH to Spark Pool. Let’s take a look at who mined block 12342861.
We can see that block 12342861 was mined by “Spark Pool”, the same “Spark Pool” that was paid in the second mevalphaleak.eth transaction. So what allows a 0 ETH account to make multiple transactions at a gas price of zero?
The secret is that these 2 transactions didn’t take the traditional route of entering the public mempool to be picked up by a miner. Instead, they went via the Flashbots relay.
Flashbots Relay
I won’t go too deep into the details of the Flashbots Relay, but I will provide a brief synopsis for the purposes of this article.
When searchers are executing MEV strategies they often need to execute multiple transactions one after the other and they usually want to be at the top of the block.
Their search is based on the state of the world at block n-1 where n is the block they want to be included in. An Ethereum block can have hundreds of transactions each of which changes the world state.
Imagine “Searcher X” & “Searcher Y” seeing the same arbitrage opportunity and both submitting arbitrage transactions. The transactions in a block are executed sequentially, in the order they appear.
If Searcher X’s transaction is above Searcher Y’s, then by the time the node gets to Searcher Y’s transaction the Ethereum state will have changed. The arbitrage opportunity will no longer be there and Searcher Y’s transaction will either revert or be unprofitable.
This is why it’s so important to have control over the order of transactions in a block.
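To make the ordering effect concrete, here is a minimal sketch in Python. It is a toy model, not the actual CST trade: the pool reserves, the external price and the trade size are all hypothetical numbers, fees are ignored, and the `Pool` class and `arbitrage` function are names invented for this illustration.

```python
from fractions import Fraction


class Pool:
    """Toy constant-product pool (x * y = k), fees ignored for simplicity."""

    def __init__(self, eth, tok):
        self.eth = Fraction(eth)
        self.tok = Fraction(tok)

    def buy_tokens(self, eth_in):
        # Tokens received for eth_in while keeping eth * tok constant.
        tok_out = self.tok * eth_in / (self.eth + eth_in)
        self.eth += eth_in
        self.tok -= tok_out
        return tok_out


# Hypothetical: ETH received per token when selling on some other venue.
EXTERNAL_PRICE = Fraction(121, 10000)


def arbitrage(pool, eth_in):
    """Buy from the pool, sell elsewhere. Returns the profit, or None on revert."""
    snapshot = (pool.eth, pool.tok)
    tok_out = pool.buy_tokens(eth_in)
    profit = tok_out * EXTERNAL_PRICE - eth_in
    if profit <= 0:
        pool.eth, pool.tok = snapshot  # a revert rolls the state back
        return None
    return profit


pool = Pool(eth=100, tok=10000)
# Both searchers sized the same trade against the state at block n-1.
profit_x = arbitrage(pool, Fraction(10))  # Searcher X is higher in the block
profit_y = arbitrage(pool, Fraction(10))  # Searcher Y runs against the changed state
print(profit_x, profit_y)  # prints "1 None"
```

Searcher X captures the whole opportunity, so the identical trade from Searcher Y would lose money and reverts under the sketch’s profit check.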
As I’m sure you know, transaction ordering in a block is an essential component of MEV. Before the Flashbots relay, this was typically done via PGAs (Priority Gas Auctions) or by being a miner.
Flashbots saw this problem and looked to create a solution. They altered the Geth codebase to create mev-geth, an Ethereum client that enabled searchers to submit bundles of transactions to miners via the Flashbots Relay.
The Miner would look at which proposal gave them the biggest reward and place this bundle at the top of the block. This reward could be paid out in gas or with a “coinbase.transfer” which sends the specified amount of ETH to the miner of the block.
Unlike the mempool, the relay is private to everyone except Flashbots. You are able to sign multiple transactions/bundles and you won’t pay any gas unless they are executed.
A cool feature of the Flashbots Relay is their API which shows us for each block whether any bundles were included via the relay.
Let’s take a look at block 12342861.
We can see that MevAlphaLeak’s transactions went via the Flashbots Relay, which explains how he was able to use a 0 ETH address and pay Spark Pool with a coinbase.transfer.
Ok, let’s do a quick review of what we’ve learned so far before the deep dive.
MevAlphaLeak submitted a bundle of transactions via the flashbots relay
The first transaction created the ApeBot contract
The second transaction interacted with the ApeBot contract to execute an arbitrage
The arbitrage did the following (note that I have omitted the WETH / ETH conversions that take place throughout the arbitrage)
Flashloan From dYdX of WETH
Exchange WETH for CST on EtherDelta 2
Exchange CST for ETH on Uniswap V2
Pay back WETH from dYdX Flashloan
At the end of the second transaction, MevAlphaLeak pays the miner 0.449293093766000097 ETH via the coinbase.transfer method
MevAlphaLeak leaves with a contract deployed for free and 0.000449742836602602 ETH
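As a quick sanity check on the two amounts in this review, the arithmetic can be done in wei with Python integers. The observation that the searcher’s cut equals the total divided by 1000 is an inference from the numbers alone; nothing here confirms that the contract computes the split this way.

```python
WEI_PER_ETH = 10**18

miner_payment_wei = 449_293_093_766_000_097  # 0.449293093766000097 ETH
searcher_keep_wei = 449_742_836_602_602      # 0.000449742836602602 ETH

total_wei = miner_payment_wei + searcher_keep_wei
miner_share = miner_payment_wei / total_wei

print(f"total profit: {total_wei / WEI_PER_ETH:.6f} ETH")
print(f"miner share : {miner_share:.2%}")

# The amount kept matches the total floor-divided by 1000 (a 0.1% cut).
print(searcher_keep_wei == total_wei // 1000)
```

In other words, roughly 99.9% of the arbitrage profit went to the miner as the bid for inclusion at the top of the block.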
Now let’s take a deep dive into the ApeBot contract.
A warning, the next section is heavy on the code side and assumes a fair amount of EVM knowledge from the reader.
ApeBot Contract
Now we will investigate the ApeBot Contract. We’ll briefly touch on the entry point of the contract and then zoom in on the block of assembly code that executes the arbitrage.
We can start with a high-level overview of the contract flow and identify where the assembly code we’d like to inspect is located. Note all the code snippets come from the verified contract on Etherscan so the line numbers should match up with that.
We start by calling wfjizxua(uint256,uint256[]). If you’re wondering what is with the strange name try hashing this to get its function signature. It returns the signature 0x00000000 which is more gas efficient than having a function signature that is non zero (A zero byte in calldata costs 4 gas whereas a non-zero byte costs 16 gas). We pass in actionFlags & actionData to the function.
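The saving from an all-zero signature can be worked out directly from the per-byte costs quoted above. This is a small sketch; `calldata_gas` is a helper name made up for the example, and the comparison value is the well-known `transfer(address,uint256)` signature, which is not part of this transaction.

```python
ZERO_BYTE_GAS = 4      # cost of a zero byte in calldata
NONZERO_BYTE_GAS = 16  # cost of a non-zero byte in calldata


def calldata_gas(data: bytes) -> int:
    """Gas charged for the calldata bytes themselves."""
    return sum(ZERO_BYTE_GAS if b == 0 else NONZERO_BYTE_GAS for b in data)


zero_signature = bytes.fromhex("00000000")     # wfjizxua(uint256,uint256[])
typical_signature = bytes.fromhex("a9059cbb")  # transfer(address,uint256), for comparison

print(calldata_gas(zero_signature))     # 16
print(calldata_gas(typical_signature))  # 64
```

The difference is at most 48 gas per call, which is small but free once a suitable function name has been found.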
This data (actionFlags & actionData) is in turn passed to ape(uint256,uint256[]). This is a permissionless function that allows a user to execute any arbitrary logic. This means you can have a single contract and target various DEXs based on your calldata rather than writing custom contracts for each. You’ll see how this function is generalised later in the article.
The “data” variable in the ape function which corresponds to the “actionData” from the original wfjizxua call is looped over. Some assembly code is executed on each loop. Let’s see what happens.
First, we need to look at the calldata for the ApeBot contract call.
How are we to decipher what this data means? We can start with the function signature, which tells us a uint256 and a uint256[] are passed in.
A uint256[] is a dynamic type. The encoding for dynamic types is different from static types. For static types like “uint256 actionFlags” we pass the values directly into the calldata. We can see this value at calldata item [0] in the image above.
For dynamic types like “uint256[] actionData”, we instead pass in the offset in bytes to the start of their data area, measured from the start of the value encoding. In our case, this can be seen at calldata item [1] and has a value 0x40. This is equal to 64 in decimal and tells us the data area for the array starts 64 bytes into the encoded arguments (that is, not counting the 4-byte function signature).
An offset of 64 bytes means calldata item [2] signals the start of the data area. The first item in this data area [2] declares the length of the array, 0x25 = 37 decimal.
After this, we have 37 items from [3] to [39]. These items are the actual values in the uint256[ ] array.
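The layout described above can be sketched as a small decoder. This is an illustration only: the sample calldata is made up (a 3-item array rather than the real 37 items), and `split_words`, `decode_uint_and_array` and `word` are helper names invented for the example.

```python
WORD = 32  # the arguments are encoded as 32-byte words


def split_words(calldata: bytes):
    """Drop the 4-byte function signature and split the rest into 32-byte words."""
    body = calldata[4:]
    if len(body) % WORD != 0:
        raise ValueError("calldata body is not a whole number of words")
    return [int.from_bytes(body[i:i + WORD], "big") for i in range(0, len(body), WORD)]


def decode_uint_and_array(calldata: bytes):
    """Decode (uint256, uint256[]) following the layout described in the text."""
    words = split_words(calldata)
    action_flags = words[0]       # item [0]: the static value itself
    offset = words[1]             # item [1]: byte offset of the array's data area
    start = offset // WORD        # 0x40 // 32 = 2, so the data area starts at item [2]
    length = words[start]         # first word of the data area is the array length
    items = words[start + 1:start + 1 + length]
    return action_flags, items


def word(n: int) -> bytes:
    return n.to_bytes(WORD, "big")


# Hypothetical calldata: zero signature, flags = 7, offset = 0x40, array [11, 22, 33].
sample = bytes(4) + word(7) + word(0x40) + word(3) + word(11) + word(22) + word(33)
flags, data = decode_uint_and_array(sample)
print(flags, data)  # 7 [11, 22, 33]
```

With the real calldata the same steps give a length of 0x25 = 37 and array items at calldata positions [3] to [39].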
These groupings are useful but we still have no idea how this calldata is used. To determine this we need to look at the assembly code and work backward from there.
I’ve highlighted in green the first item in the actual array, calldata [3]. It will become apparent why when we look into the assembly code.
Assembly Code
What better place to start than the first line of the assembly code? The image below highlights this (line 219) along with some excerpts from the foundry debugger.
At the beginning of the assembly, a variable callInfo is created by running an MLOAD on memory location 0xa0. The data at this memory location, 100000000c78789120020674d380f7e1dc7408aa007744ed3af390f8a47f9b75, is the same as the calldata [3] item highlighted in green in the previous section.
Let’s now look at where this callInfo variable is used in the rest of the code.
Above is the full “Assembly code”. The highlighted boxes in the assembly code show where callInfo is being used. We can see that there are a number of bitmasks being used on callInfo to extract values and execute logic based on what is returned. If you’re not aware of how bitmasks work check out Part 3 of the EVM Deep Dives which looks at how this technique is used in storage slot packing.
By looking at the assembly we can determine that this value, 100000000c78789120020674d380f7e1dc7408aa007744ed3af390f8a47f9b75, is actually a custom encoded set of 32 bytes.
Below is the decoding of it. Note the numbers ( [1] etc.) in the assembly code image above correspond to the numbers in the image below so you can cross-reference.
callLength - encoded in bytes 4 - 5, the number of items to ingest from the calldata for the subsequent delegatecall / call
0x000c = 12 decimal
Function signature - encoded in bytes 6 - 9, the function signature for the subsequent delegatecall / call
0x78789120, which we can look up in the Ethereum Signature Database. It returns nothing, which indicates the associated contract may be unverified
callContract - encoded in bytes 13 - 32, the contract address for the subsequent delegatecall / call
0xd380f7e1dc7408aa007744ed3af390f8a47f9b75, we can look at this address on Etherscan. The contract hasn’t been verified so we’ll have to work with the contract bytecode.
Delegate Call or Call - encoded in byte 1, whether the subsequent call should be a DELEGATECALL or a CALL
0x10, execute as a DELEGATECALL
Gas Amount for Call - encoded in bytes 10 - 12, how much gas is allocated for the subsequent delegatecall / call
0x020674 = 132724 decimal
Should the returned data be used in the next call - encoded in byte 3, whether the returned data is needed for the next loop
0x00, the return data is not needed
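The decoding above can be reproduced with shifts and bitmasks. This is a sketch of the byte layout as described in this article, not a transcription of the contract’s assembly, which may use different masks to reach the same fields. The `field` and `decode_call_info` helpers are names chosen for the example.

```python
CALL_INFO = int("100000000c78789120020674d380f7e1dc7408aa007744ed3af390f8a47f9b75", 16)


def field(value: int, first_byte: int, last_byte: int) -> int:
    """Extract bytes first_byte..last_byte of a 32-byte word.

    Bytes are numbered 1 to 32 starting from the most significant byte,
    matching the numbering used in the text.
    """
    width_bits = (last_byte - first_byte + 1) * 8
    shift = (32 - last_byte) * 8
    return (value >> shift) & ((1 << width_bits) - 1)


def decode_call_info(value: int) -> dict:
    return {
        "is_delegatecall": field(value, 1, 1) == 0x10,  # byte 1
        "use_return_data": field(value, 3, 3) != 0,     # byte 3
        "call_length": field(value, 4, 5),              # bytes 4 - 5
        "signature": field(value, 6, 9),                # bytes 6 - 9
        "gas": field(value, 10, 12),                    # bytes 10 - 12
        "contract": field(value, 13, 32),               # bytes 13 - 32
    }


info = decode_call_info(CALL_INFO)
print(info["is_delegatecall"], info["use_return_data"], info["call_length"])
print(f"0x{info['signature']:08x}", info["gas"])
print(f"0x{info['contract']:040x}")
```

Running it gives a DELEGATECALL with callLength 12, function signature 0x78789120, 132724 gas and the contract address 0xd380f7e1dc7408aa007744ed3af390f8a47f9b75, matching the values listed above.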
To reinforce our understanding of how these encoded values are used we’ll quickly run through the logic that uses them and determines the execution path.
This section of the code checks for a non-zero function signature. Note how lines 226 & 234 differ. Both update the callLength but when the function signature is non-zero an extra 4 is added to the callLength. This is for the 4-byte function signature. The same applies to lines 228 & 236, both are for loops that loop over the data but the for loop on line 236 starts at 4 while the loop on 228 starts at 0. Again this is to account for the 4-byte function signature.
This section deals with the callLength. When you make a DELEGATECALL or CALL you have to specify where in memory the calldata/input for that call is located. This for loop is storing the calldata items in memory in anticipation of that call. It loads in the number of items from the calldata specified in the encoded data. If you’re unsure of how the DELEGATECALL opcode works take a look at Part 5 of the EVM Deep Dives which does an in-depth review.
This section looks at the DELEGATECALL/CALL byte. If the byte is 0x10 it executes a delegatecall on the “callContract”; for anything else, a call is made.
This looks at byte 3 and determines whether the returned data should be used in the next loop. For it to be used in the next loop it needs to be stored somewhere. If byte 3 is non-zero an MSTORE is used to save it in memory so it can be accessed later.
Now that we understand callInfo’s encoding and where it’s located in the calldata we can revisit the full calldata for the arbitrage transaction.
We’re going to organise the calldata into sections based on any encoded callInfos we can find.
We’re also going to link these sections to the parity traces of the transaction to determine which calldata is used for which call.
Parity traces enable us to see all the internal contract calls for a given transaction. This is extremely useful when debugging & attempting to piece together what happened on-chain.
The traces have a traceAddress. TraceAddresses have the notation [ 2, 1, 0, 3, 0, 2 ] where the number of items in the array shows you the depth of the call and the value of each item shows you the index of that call at that depth (i.e. its position relative to other calls at that depth).
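The notation is easy to work with in code. The sketch below treats a traceAddress as a plain Python list; the helper names are invented for the example, and the three addresses are the core calls examined later in this article.

```python
def depth(trace_address):
    """Number of nested calls below the top-level call."""
    return len(trace_address)


def parent(trace_address):
    """traceAddress of the call that made this call."""
    return trace_address[:-1]


def position(trace_address):
    """Index of this call among its parent's sub-calls."""
    return trace_address[-1]


core_calls = [
    [2, 1, 0, 3, 0, 4],
    [2, 1, 0, 3, 0, 2],
    [2, 1, 0, 3, 0, 3],
]

# Python compares lists element by element, so sorting traceAddresses
# recovers the order in which the calls started.
ordered = sorted(core_calls)

print(depth(ordered[0]), parent(ordered[0]), position(ordered[0]))
print(ordered)
```

All three calls share the parent [ 2, 1, 0, 3, 0 ], which tells us they were made one after another by the same caller.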
Here’s a quick summary of the notation from the OpenEthereum (Parity) website.
Now let’s dig into the calldata and the associated parity traces. There’s a lot going on in the image below but I promise it’s not too complicated.
The 3 sections in the parity traces above represent the 3 calls that execute the arbitrage between EtherDelta 2 and Uniswap V2.
The calldata has been grouped to show you what calldata was passed to the top call in each of the parity trace groupings ([ 2, 1, 0, 3, 0, 2 ], [ 2, 1, 0, 3, 0, 3 ] & [ 2, 1, 0, 3, 0, 4 ]).
At the top of each calldata section is the custom encoded “callInfo” value, see calldata items [3], [17] & [31]. The groupings were determined by looking at the callInfo data along with the assembly code.
The parity traces show us that multiple subsequent calls are being made in each section. Two in the first call, five in the second call and ten in the third call.
Let’s have a closer look at the 3 core calls.
The first call ([ 2, 1, 0, 3, 0, 2 ]) uses calldata item [3] as its encoded callInfo. We looked at this exact callInfo earlier. The encoding tells us to make a DELEGATECALL to address 0xd380f7e1dc7408aa007744ed3af390f8a47f9b75 and take the next 12 items from the calldata as input.
Note the first item after the callInfo in the calldata ([4]) is 0x000…0. This represents the Wei amount for the call. You won’t see it being used in the delegatecall flow of the assembly since the DELEGATECALL opcode doesn’t take in a value. You can however see it being accessed in the call flow on line 260 of the assembly code. Note this “Wei value” is not included in the “next 12 items” from the calldata. Therefore the “next 12 items” runs from [5] to [16]. This applies to all 3 core calls.
If we inspect the parity traces of the transaction on Etherscan, specifically, Action [17] TraceAddress [ 2, 1, 0, 3, 0, 2 ] we can see that the values encoded in the callInfo match what is in the trace.
The “To” address = 0xd380f7e1dc7408aa007744ed3af390f8a47f9b75
CallType = delegatecall
Gas = 132724 (Allocated Gas in decimal)
Input Data matches the 12 calldata items highlighted with a prefix of the encoded function signature
The second call ([ 2, 1, 0, 3, 0, 3 ]) uses calldata item [17] as its callInfo. It makes a DELEGATECALL to the same address 0xd380f7e1dc7408aa007744ed3af390f8a47f9b75 but with a different function signature 0x401687f4. Again the next 12 items from the calldata are used as input along with the function signature.
The third call ([ 2, 1, 0, 3, 0, 4 ]) uses calldata item [31] as its callInfo. It makes a DELEGATECALL to a different address 0xf4863028b093fdac9cf7fd67c0df6866ac3c7a60 with the function signature 0x0fd72adb. The encoded callLength is 7 so the next 7 items from the calldata are used as the input along with the function signature.
These 3 calls represent 3 loops of our assembly code. The first 2 calls handle the EtherDelta leg of the arb while the third executes the Uniswap section.
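The calldata positions of the three sections follow from the callLengths alone. Here is a short sketch of that index arithmetic, assuming (as described above) that each section is one callInfo item, one Wei value item and then callLength input items. The `call_sections` function is a name made up for the example.

```python
def call_sections(first_index, call_lengths):
    """Return (callInfo, weiValue, firstInput, lastInput) positions per section."""
    sections = []
    index = first_index
    for call_length in call_lengths:
        call_info = index                    # custom encoded callInfo
        wei_value = index + 1                # Wei amount, not counted in callLength
        first_input = index + 2
        last_input = first_input + call_length - 1
        sections.append((call_info, wei_value, first_input, last_input))
        index = last_input + 1               # the next section starts right after
    return sections


# The array starts at calldata item [3]; the callLengths are 12, 12 and 7.
sections = call_sections(3, [12, 12, 7])
for section in sections:
    print(section)
```

This gives callInfo items at [3], [17] and [31], with the last input item at [39], which accounts for all 37 items of the array.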
The 2 contracts the calls interact with, 0xd380f7e1dc7408aa007744ed3af390f8a47f9b75 & 0xf4863028b093fdac9cf7fd67c0df6866ac3c7a60, are both unverified.
This means we’re going to need to decompile their bytecode and debug via foundry at the opcode level to understand what is going on.
But that one is for next time.
Hope you enjoyed!
noxx
Twitter @noxx3xxon