Flashbots: Kings of The Mempool 🤖
Inside The Mempool, The Evolution of MEV-Geth & A Peek into the Future With MEV Boost
So you’ve heard of the Mempool, Flashbots, Bundles & Searcher Auctions, but do you truly understand what these things are and how they work?
Do you understand their inner mechanics?
This article looks to provide a deep dive into the Mempool through the lens of the Flashbots auction.
The reader will be taken on a journey through the development of MEV Geth, a custom Ethereum client that enabled Flashbots to alter the shape & structure of the Mempool.
MEV Geth, along with the Flashbots relay, provided the gateway between searchers & miners, changing the MEV game forever.
Let’s start with the protagonist of this story, “The Mempool”, as it was before the first official MEV Geth even existed.
The mempool (memory pool) is a smaller database of unconfirmed or pending transactions which every node keeps. When a transaction is confirmed by being included in a block, it is removed from the mempool.
The definition is a great place to get us started. Let’s cover a few more basics.
A node, e.g. Geth (or MEV Geth), keeps a list of unconfirmed or pending transactions.
These transactions are communicated to the nodes through their peers or are received directly through their RPC endpoint.
Each node has its own Mempool; there is no such thing as a global Mempool.
This means nodes can/do have different Mempools depending on where they are geographically and the peers they are connected to.
Nodes limit the number of transactions they keep in their Mempool; this is how they ensure they don’t get overwhelmed.
The Mempool is represented in a Geth node via a TxPool struct which can be seen below.
Two items have been highlighted. The first item, “locals”, represents a set of addresses. If a transaction in the TxPool (Mempool) is from one of these addresses it will be considered a local transaction (You’ll see why this is important in the next section).
The second highlighted item is “pending” which represents all currently processable transactions in the TxPool.
You might also notice the “queue” field on line 242, this represents transactions that are in the TxPool but aren’t processable. An example would be a Tx with a nonce that is not in sequence, indicating there is another transaction from the same address which should be processed first.
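To make the three fields concrete, here is a minimal sketch of a toy pool. It is not Geth’s actual TxPool: the Tx and Pool types, the “next” nonce map and the Add method are hypothetical simplifications, and every account is assumed to start at nonce 0.

```go
package main

import "fmt"

// Tx is a hypothetical, stripped-down transaction used only for illustration.
type Tx struct {
	From  string
	Nonce uint64
}

// Pool is a toy stand-in for Geth's TxPool that keeps only the fields discussed above.
type Pool struct {
	locals  map[string]bool   // addresses whose transactions count as local
	pending map[string][]Tx   // currently processable transactions, per account
	queue   map[string][]Tx   // known but not yet processable transactions, per account
	next    map[string]uint64 // next expected nonce per account (assumed bookkeeping)
}

// NewPool creates an empty pool with the given local addresses.
func NewPool(locals ...string) *Pool {
	p := &Pool{
		locals:  make(map[string]bool),
		pending: make(map[string][]Tx),
		queue:   make(map[string][]Tx),
		next:    make(map[string]uint64),
	}
	for _, addr := range locals {
		p.locals[addr] = true
	}
	return p
}

// Add files tx under pending when its nonce is the next one expected for the
// sender, and under queue when there is a nonce gap. Stale nonces are ignored.
// Promoting queued transactions once a gap is filled is left out for brevity.
func (p *Pool) Add(tx Tx) {
	next := p.next[tx.From]
	if tx.Nonce < next {
		return
	}
	if tx.Nonce > next {
		p.queue[tx.From] = append(p.queue[tx.From], tx)
		return
	}
	p.pending[tx.From] = append(p.pending[tx.From], tx)
	p.next[tx.From] = next + 1
}

// IsLocal reports whether tx comes from one of the configured local addresses.
func (p *Pool) IsLocal(tx Tx) bool { return p.locals[tx.From] }

func main() {
	pool := NewPool("0xalice")
	pool.Add(Tx{From: "0xalice", Nonce: 0}) // next expected nonce: pending
	pool.Add(Tx{From: "0xalice", Nonce: 2}) // nonce 1 is missing: queued
	fmt.Println(len(pool.pending["0xalice"]), len(pool.queue["0xalice"]))
}
```

The second transaction sits in the queue because nonce 1 has not been seen yet, which is the out-of-sequence case described above.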
When thinking about MEV the next question you need to ask is: how do these transactions find their way from the TxPool into a block?
Below is the Geth code that executes this functionality. You’ll see that the transactions are actually split into localTxs and remoteTxs, ordered by their gas price / nonce and that localTxs get priority over remoteTxs.
The “commitNewWork” function does many things including populating the block with transactions from the TxPool
This section gets the transactions from the TxPool and splits them into local and remote transactions
On line 954 you can see the comment “Fill the block with all available pending transactions”. After this line, we set the “pending” variable via w.eth.TxPool().Pending(true). This returns a key-value mapping from each address to that address’s list of pending transactions.
Lines 967 - 974 show us using the “locals” variable we highlighted earlier to split the transactions into local & remote. We are looping through the locals addresses and checking to see if any of the pending Txs are from a locals address. If it matches, those transactions are removed from remoteTxs and added to localTxs. (Locals is a set of addresses a node can define in its config; they get priority over remote transactions and won’t be dropped by the node.)
The end result is the pending transactions are split into localTxs which have come from “locals” addresses and remoteTxs which have come from everyone else.
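The split described above can be sketched in a few lines. This is a simplified model, not the Geth source: addresses are plain strings, Tx is a hypothetical type, and the function name splitLocalRemote is made up for this example.

```go
package main

import "fmt"

// Tx is a hypothetical, stripped-down transaction used only for illustration.
type Tx struct {
	From     string
	Nonce    uint64
	GasPrice uint64
}

// splitLocalRemote starts with every pending transaction counted as remote and
// then moves the transactions of each "locals" address over to localTxs.
func splitLocalRemote(pending map[string][]Tx, locals []string) (localTxs, remoteTxs map[string][]Tx) {
	localTxs = make(map[string][]Tx)
	remoteTxs = make(map[string][]Tx, len(pending))
	for addr, txs := range pending {
		remoteTxs[addr] = txs
	}
	for _, account := range locals {
		if txs := remoteTxs[account]; len(txs) > 0 {
			delete(remoteTxs, account)
			localTxs[account] = txs
		}
	}
	return localTxs, remoteTxs
}

func main() {
	pending := map[string][]Tx{
		"0xalice": {{From: "0xalice", Nonce: 0, GasPrice: 20}},
		"0xbob":   {{From: "0xbob", Nonce: 0, GasPrice: 90}},
	}
	localTxs, remoteTxs := splitLocalRemote(pending, []string{"0xalice"})
	fmt.Println(len(localTxs), len(remoteTxs))
}
```

Note that Alice’s transaction ends up in the local group even though Bob is paying a higher gas price; the split only looks at the sender address.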
This block represents the localTxs being sorted and committed into the block. Since the localTxs are processed first in the code they get priority over the remoteTxs.
The transactions are ordered by gas price and nonce via “NewTransactionsByPriceAndNonce” and then committed to the block via “commitTransactions”
“NewTransactionsByPriceAndNonce” takes a set of Txs, orders them by gas price and ensures account nonces are not out of order. To do this it uses a heap data structure which we can see initialised on line 443.
The heads variable is an array of TxByPriceAndTime which under the hood is just an array of transactions. It implements the heap interface which enables us to use heap calls to sort our data. It represents the “Next transaction for each unique account (price heap)”. To understand more about heaps see here.
Transactions are ordered per account by gas price but nonce order is maintained. This means a high gas-priced transaction with a nonce that isn’t next may not be at the top of the block. The below printout of a Geth test case demonstrates this.
If multiple transactions have the same price, the one that is seen earlier is prioritized to avoid network spam attacks aiming for a specific ordering (timestamp is captured here in the code).
We return the transaction set to be used in the “commitTransactions” function to commit the transactions to the block.
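Here is a minimal sketch of the price-and-nonce ordering using Go’s container/heap. It is a simplified model rather than Geth’s implementation: Tx, byPrice and orderByPriceAndNonce are hypothetical names, each account’s transactions are assumed to be pre-sorted by nonce, and the “seen earlier” tie-break is left out.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Tx is a hypothetical, stripped-down transaction used only for illustration.
type Tx struct {
	From     string
	Nonce    uint64
	GasPrice uint64
}

// byPrice is a max-heap holding the next transaction of each account.
type byPrice []Tx

func (h byPrice) Len() int            { return len(h) }
func (h byPrice) Less(i, j int) bool  { return h[i].GasPrice > h[j].GasPrice }
func (h byPrice) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *byPrice) Push(x interface{}) { *h = append(*h, x.(Tx)) }
func (h *byPrice) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// orderByPriceAndNonce repeatedly takes the highest-priced head transaction and
// then exposes the next nonce of that same account, so nonce order is never broken.
func orderByPriceAndNonce(txs map[string][]Tx) []Tx {
	heads := make(byPrice, 0, len(txs))
	rest := make(map[string][]Tx, len(txs))
	for from, list := range txs {
		if len(list) == 0 {
			continue
		}
		heads = append(heads, list[0])
		rest[from] = list[1:]
	}
	heap.Init(&heads)

	var ordered []Tx
	for heads.Len() > 0 {
		best := heap.Pop(&heads).(Tx)
		ordered = append(ordered, best)
		if next := rest[best.From]; len(next) > 0 {
			heap.Push(&heads, next[0])
			rest[best.From] = next[1:]
		}
	}
	return ordered
}

func main() {
	txs := map[string][]Tx{
		"alice": {{"alice", 0, 10}, {"alice", 1, 100}},
		"bob":   {{"bob", 0, 50}},
	}
	for _, tx := range orderByPriceAndNonce(txs) {
		fmt.Println(tx.From, tx.Nonce, tx.GasPrice)
	}
}
```

Alice’s 100 gas price transaction lands last because her nonce 0 transaction only pays 10 and has to go first. That is the behaviour the test case above describes.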
The exact same process happens with the remote transactions: they are sorted by gas price and nonce and committed to the block. The number of committed transactions is limited by the block gas limit. Remote transactions can be dropped by a node when the number of Txs in the TxPool exceeds the node’s limit.
This takes us back to the times of “Priority Gas Auctions” (PGAs) where searchers & bots would increase the gas price of their transactions to get to the top of the block.
Bots would continually bid against each other in the ~13.5 second block time to be first in the block and as a result first to an on-chain opportunity (e.g. executing a liquidation).
When you lost an auction you still had to pay the gas price for your highest bid since your transaction would still be executed but it would revert. Bots could lose a lot of money from this.
When bidding in PGAs the bots & searchers had to follow these rules to increase the gas price for a given transaction.
Now we’ve covered vanilla Geth let’s take a look at what changed with the first official release of MEV Geth.
MEV-Geth v0.1 - 31 Mar 2021
For those who don’t know, MEV Geth was a new custom client that enabled miners to receive MEV bundles.
“MEV Bundles” represented bundles of transactions that extracted value and needed to be executed atomically. All transactions within the bundle should be executed or none of them.
Flashbots provided a relay connecting Searchers with Miners. Searchers would submit the bundles to the Flashbots Relay and the relay would forward them to Miners. The system relied on a level of trust between the parties to function.
MEV Geth and the Flashbots Relay looked to take PGAs off the chain, where they were impacting everyday users, and move them to the relay.
The new auction rules were determined and made available to the Searchers. Losing bids wouldn’t be executed, saving bots their money and Ethereum its block space.
Bundles also introduced the ability to pay a miner via coinbase.transfer( ), a direct payment to the miner rather than a payment via gas fees.
For v0.1 only one bundle could be selected for inclusion per block.
Let’s take a look at the client alterations that were made to enable these MEV Bundles.
TxPool: Geth’s implementation of the Mempool, which we saw earlier.
A new field is added to the TxPool called “mevBundles” which is an array of type mevBundle. MevBundle is composed of 4 elements:
“txs” - A list of transactions in the bundle. Each transaction is signed and RLP-encoded.
“blockNumber” - The exact block number at which the bundle can be executed
“minTimestamp” - Minimum block timestamp (inclusive) at which the bundle can be executed
“maxTimestamp” - Maximum block timestamp (inclusive) at which the bundle can be executed
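The four fields can be sketched as a small struct with an eligibility check. This is an illustrative model rather than the MEV Geth type: the field types, the EligibleFor method and the rule that a zero timestamp means “no bound” are assumptions made for this example.

```go
package main

import "fmt"

// MevBundle is a hypothetical, simplified version of the bundle described above.
type MevBundle struct {
	Txs          [][]byte // signed, RLP-encoded transactions (opaque bytes here)
	BlockNumber  uint64   // exact block number the bundle targets
	MinTimestamp uint64   // inclusive lower bound; 0 is treated as "no bound" (assumption)
	MaxTimestamp uint64   // inclusive upper bound; 0 is treated as "no bound" (assumption)
}

// EligibleFor reports whether the bundle may be executed in the given block.
func (b MevBundle) EligibleFor(blockNumber, blockTimestamp uint64) bool {
	if b.BlockNumber != blockNumber {
		return false
	}
	if b.MinTimestamp != 0 && blockTimestamp < b.MinTimestamp {
		return false
	}
	if b.MaxTimestamp != 0 && blockTimestamp > b.MaxTimestamp {
		return false
	}
	return true
}

func main() {
	bundle := MevBundle{
		Txs:          [][]byte{{0x01}, {0x02}},
		BlockNumber:  12_000_000,
		MinTimestamp: 1_617_000_000,
		MaxTimestamp: 1_617_000_060,
	}
	fmt.Println(bundle.EligibleFor(12_000_000, 1_617_000_030)) // inside the window
	fmt.Println(bundle.EligibleFor(12_000_001, 1_617_000_030)) // wrong block number
}
```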
“CommitNewWork” is the same function we looked at above which handles filling a block with all pending transactions.
The “Local Transactions” & “Remote Transactions” sections order each group of Txs and commit them to the block. You’ll notice both are beneath the “Flashbots Bundles” section, meaning the bundle’s transactions will have priority and be placed at the top of the block.
This section handles the “mevBundles” received from Flashbots. Line 1148 grabs all the bundles from the TxPool. The “findMostProfitableBundle” function (which we will look at in the next section) is used to determine the most profitable bundle. Bundle info is logged on the node and then the bundle is committed to the block via “commitBundle” on Line 1155.
So how did Flashbots determine the most profitable bundle?
Most Profitable Bundle
Earlier we said, “The new auction rules were determined and made available to the Searchers”.
Below are the new auction rules. They look complicated but I can assure you they aren’t.
Let’s run through the diagram.
“s” is the “Adjusted Gas Price” which is what we want to maximise to win the auction.
Delta coinbase is the total ETH paid to the miner from the transactions in the bundle via direct payments, i.e. coinbase.transfer( ).
The sum of all the bundle’s transaction gas payments, (gas price * gas used) per transaction, represents the total gas payment for the bundle.
The sum of all gas used in the bundle.
Dividing the numerator (delta coinbase plus the total gas payment) by the denominator (total gas used) gives you an ETH price per unit of gas for the bundle (Adjusted Gas Price).
To maximise this value we need to either
Increase the numerator by paying the miner more money through coinbase.transfer( ) or Gas. We do this by reducing our profit margin or finding more lucrative opportunities
Decrease the denominator by reducing the gas used in the bundle, we do this by gas optimising our contracts
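Written out from the description above, the rule looks roughly like this. The symbols U (the set of transactions in the bundle), g_T (gas used by transaction T) and p_T (gas price of transaction T) are labels chosen for this write-up.

```latex
s = \frac{\Delta_{\text{coinbase}} + \sum_{T \in U} g_T \, p_T}{\sum_{T \in U} g_T}
```

As a made-up example, take a bundle with two transactions: one uses 100,000 gas at 50 gwei, the other uses 200,000 gas at 0 gwei and sends 0.03 ETH via coinbase.transfer( ). The numerator is 0.005 ETH + 0.03 ETH = 0.035 ETH, the denominator is 300,000 gas, so s is roughly 116.7 gwei per unit of gas.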
Ok, let’s have a look at how this calculation is implemented in MEV Geth. We start where we left off in the previous example with the “findMostProfitableBundle” function.
The “findMostProfitableBundle” function loops over bundles to determine the “maxBundle” (most profitable) and returns this bundle along with some other data that is logged on the Geth node.
Each bundle is run through the “computeBundleGas” function. This function returns “totalEth” which represents the “coinbaseDiff” (How much has the miner’s balance increased by) and “totalGasUsed” which is the gas consumed by the transactions within the bundle.
We calculate the gas used by looping over the transactions in the bundle and running “ApplyTransaction” this allows us to simulate the transaction and get the transaction receipt which contains “GasUsed”. By summing the “GasUsed” for each transaction we can get the “totalGasUsed” of the whole bundle. The updated state is used for the next transaction in the loop so that it is representative of what happens in a real block.
Remember “totalGasUsed” represents the denominator (bottom) of our “Adjusted Gas Price” calculation which determines the winner of the Flashbots auction.
On line 1245 we declared a variable “coinbaseBalanceBefore” which grabs the balance of the coinbase (miner) address before any transactions are simulated. Line 1254 declares “coinbaseBalanceAfter” which grabs the coinbase balance after all the transactions have been applied to the state. Line 1255 sets “coinbaseDiff” which subtracts “coinbaseBalanceBefore” from “coinbaseBalanceAfter” to yield the miner profit. The value is used to set totalEth which is returned.
This value takes into account both coinbase.transfer( ) payments and payments in the form of gas which occurred during the “ApplyTransaction” function. This means “coinbaseDiff” represents the entire numerator (top) of our “Adjusted Gas Price” calculation.
Given we now have the numerator and denominator, we just need to run the calculation to get the “Adjusted GasPrice”. On line 1221 a variable “mevGasPrice” is declared: we divide “totalEth” by “totalGasUsed”, which yields our “Adjusted GasPrice”.
Once the “Adjusted GasPrice” is calculated, each bundle is compared to the current max value (Lines 1222 - 1227). This ensures the bundle with the highest “Adjusted GasPrice” (“mevGasPrice”) is returned.
This was the first version of MEV Geth but there was much more to come. Over the next few months v0.2 was released. Let’s see what improvements were made.
MEV-Geth v0.2 - 12 Apr 2021 →16 Jun 2021
A number of improvements were made in v0.2 but we’ll focus on the main 3 changes. A change to the “Adjusted Gas Price” calculation, bundle merging and a tweak to the structure of a MevBundle within the client.
Adjusted Gas Price
The change to the calculation was a small one. It involved removing the gas payments from Txs that were in a bundle but were already in the TxPool. These were represented by
The purpose of this was to prevent “bundle stuffing” where searchers were stuffing their bundles with high gwei Txs from TxPool to inflate their “Adjusted Gas Price” score.
We revisit “computeBundleGas” to see what has changed with the calculation.
As before we loop through the Txs in the bundle running “ApplyTransaction” to simulate the Tx and see the change in the state
First we set “txInPendingPool” to false
Then we get all transactions pending in the TxPool from the address of the bundle Tx
Get the nonce of the bundle Tx
For each transaction in the TxPool associated with that address check if the nonces match
If they do set txInPendingPool to true and break out of the loop
If the transaction is not in the pending pool, add the associated gasFees from that Tx to the gas fees for the entire bundle. (Skipping the Txs that were already present in the TxPool is, in effect, the “minus gas fees” term in the calculation.)
Comparing the adjusted gas price calculation to the code we can now see that
“ethSentToCoinbase” maps to “delta coinbase” - payments made to the coinbase address
“gasFees” maps to the “total gas fees for the bundle” - this represents the gas fees for all Txs in the bundle minus the gas fees from Txs that were already in the mempool
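A minimal sketch of the v0.2 scoring shows how bundle stuffing stops paying off. It is a simplified model, not the MEV Geth code: the Tx type and function names are hypothetical, values are in gwei, and gas used is supplied directly instead of coming from a simulation.

```go
package main

import "fmt"

// Tx is a hypothetical, stripped-down transaction; prices and fees are in gwei.
type Tx struct {
	From     string
	Nonce    uint64
	GasPrice uint64
	GasUsed  uint64 // in the real client this comes from the simulated receipt
}

// inPendingPool reports whether the pool already holds a pending transaction
// from the same sender with the same nonce.
func inPendingPool(tx Tx, pending map[string][]Tx) bool {
	for _, p := range pending[tx.From] {
		if p.Nonce == tx.Nonce {
			return true
		}
	}
	return false
}

// scoreBundle returns the numerator (totalEth) and denominator (totalGasUsed)
// of the adjusted gas price. Gas fees only count for transactions that are
// not already pending in the pool, while gas used counts for every transaction.
func scoreBundle(bundle []Tx, pending map[string][]Tx, ethSentToCoinbase uint64) (totalEth, totalGasUsed uint64) {
	var gasFees uint64
	for _, tx := range bundle {
		totalGasUsed += tx.GasUsed
		if !inPendingPool(tx, pending) {
			gasFees += tx.GasUsed * tx.GasPrice
		}
	}
	return ethSentToCoinbase + gasFees, totalGasUsed
}

func main() {
	stuffed := Tx{From: "0xalice", Nonce: 5, GasPrice: 1000, GasUsed: 21_000} // copied from the pool
	own := Tx{From: "0xsearcher", Nonce: 0, GasPrice: 10, GasUsed: 100_000}
	pending := map[string][]Tx{"0xalice": {stuffed}}

	totalEth, totalGasUsed := scoreBundle([]Tx{stuffed, own}, pending, 1_000_000)
	fmt.Println(totalEth, totalGasUsed, totalEth/totalGasUsed)
}
```

In this made-up example the stuffed transaction now drags the score down: it adds 21,000 gas to the denominator and nothing to the numerator.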
In v0.1 we could only include 1 bundle per block, with bundle merging in v0.2 we could merge 3 bundles to maximise profit. Let’s see how.
“commitNewWork” which handles filling a block with transactions
As before we have our Flashbots bundle section, localTxs section and remoteTxs section. The “Flashbots Bundle” section has changed from v0.1, we have a new function called generateFlashbotsBundle.
“generateFlashbotsBundle” is taking in an array of bundles, simulating them, sorting them and then merging them. Let’s take a closer look.
We have a new struct called “simulatedBundle”, which contains the original bundle plus the key data required for our “Adjusted Gas Price” calculation, totalGasUsed, totalEth etc. The simulateBundles function on line 1217 takes a bundle and simulates it against the head of the chain to verify it doesn’t revert and computes the bundle gas.
The simulated bundles are sorted based on “mevGasPrice” (Adjusted Gas Price)
These sorted bundles are passed to the “mergeBundles” function. At the start of this function, a variable “finalBundle” is declared which represents a set of transactions. The “mergeBundles” function is designed to enable more than one bundle to be included in a block.
This section occurs inside of a bundle loop, each bundle is again simulated via “computeBundleGas” but instead of all being simulated against the chain head state they are instead simulated against the chain state that the previous bundle has left it in. If “computeBundleGas” returns an error or the “mevGasPrice” is below the floorGasPrice (this is a new hardcoded lower limit they introduced in v0.2 ) then the state/gasPool will be reset to what it was in the previous loop. The “continue” will put the program back to the start of the loop with the next item meaning the bundle won’t be included.
If the bundle passes the “computeBundleGas” check its Txs are added to the “finalBundle” and some key/values are updated ie totalGasUsed, totalEth etc. Finally, a counter is incremented, a default maximum of 3 bundles can be merged together. This “maxMergeBundles” value balances profit with node performance and can be updated in config.
This section checks the counter and breaks out of the for loop if the counter has reached the “maxMergeBundles”
A number of variables are returned including “finalBundle”, which is returned to generateFlashbotsBundle. Inside the “commitNewWork” function on line 1146, “bundleTxs” is set, which corresponds to “finalBundle”. This value is then committed to the block on line 1155.
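The merging flow can be sketched as a greedy loop. This is an illustrative model only: the SimulatedBundle fields, the resimulate callback (standing in for “computeBundleGas” against the state left by earlier bundles) and errConflict are hypothetical, and the state reset is implicit because a skipped bundle changes nothing here.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// SimulatedBundle is a hypothetical, reduced version of the struct described above.
type SimulatedBundle struct {
	Txs         []string // transaction labels stand in for real transactions
	MevGasPrice uint64   // adjusted gas price from the first simulation
}

// errConflict is a made-up error for a bundle that fails when re-simulated.
var errConflict = errors.New("bundle reverts on top of previously merged bundles")

// mergeBundles sorts by adjusted gas price, then greedily appends bundles that
// still simulate cleanly and stay at or above the floor, up to maxMergeBundles.
func mergeBundles(
	bundles []SimulatedBundle,
	resimulate func(SimulatedBundle) (uint64, error),
	floorGasPrice uint64,
	maxMergeBundles int,
) (finalBundle []string, merged int) {
	sorted := append([]SimulatedBundle(nil), bundles...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].MevGasPrice > sorted[j].MevGasPrice
	})
	for _, b := range sorted {
		if merged >= maxMergeBundles {
			break
		}
		price, err := resimulate(b)
		if err != nil || price < floorGasPrice {
			continue // skip this bundle and carry on from the previous state
		}
		finalBundle = append(finalBundle, b.Txs...)
		merged++
	}
	return finalBundle, merged
}

func main() {
	bundles := []SimulatedBundle{
		{[]string{"d1"}, 20}, {[]string{"a1"}, 50}, {[]string{"c1"}, 30},
		{[]string{"b1", "b2"}, 40}, {[]string{"e1"}, 10},
	}
	// Pretend bundle c conflicts with something merged before it.
	resimulate := func(b SimulatedBundle) (uint64, error) {
		if b.Txs[0] == "c1" {
			return 0, errConflict
		}
		return b.MevGasPrice, nil
	}
	finalBundle, merged := mergeBundles(bundles, resimulate, 5, 3)
	fmt.Println(finalBundle, merged)
}
```

Bundle c is skipped because it fails re-simulation, and bundle e never gets a look because the cap of 3 has already been reached.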
Reverting Tx Hashes
One other minor change was to the structure of the MevBundle as defined in the client code. An additional field, RevertingTxHashes, was added which represents a list of Tx hashes that are allowed to return status 0 (revert) on transaction receipts.
The idea of allowing reverting transactions seems strange. As a searcher you should have simulated the bundle and ensured it can execute at the top of the block, so why do we need it?
This was a valid assumption when we had a single bundle being submitted in v0.1 but we now have bundle merging enabled on the node.
Suddenly you can’t be sure of the chain state when your bundle is included. You may want to flag some transactions, e.g. liquidations, as allowed to revert in case another bundle was included before you.
If your bundle can be profitable with a reverting transaction it makes sense to include it in the list. The alternative is the bundle is discarded by the node so the searcher has nothing to lose.
Strategies that required more than one transaction to execute such as sandwiching likely wouldn’t be allowed to revert since you may end up in a situation where one leg of the trade reverts but the other doesn’t which could leave you exposed.
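The check itself is simple to sketch. The Receipt type and the bundleAcceptable function below are hypothetical stand-ins; the only things taken from the text are that status 0 means a revert and that the bundle lists the hashes allowed to revert.

```go
package main

import "fmt"

// Receipt is a hypothetical, reduced transaction receipt.
type Receipt struct {
	TxHash string
	Status uint64 // 1 = success, 0 = revert
}

// bundleAcceptable reports whether every reverted transaction in the simulated
// bundle was explicitly listed as allowed to revert.
func bundleAcceptable(receipts []Receipt, revertingTxHashes []string) bool {
	allowed := make(map[string]bool, len(revertingTxHashes))
	for _, h := range revertingTxHashes {
		allowed[h] = true
	}
	for _, r := range receipts {
		if r.Status == 0 && !allowed[r.TxHash] {
			return false // an unexpected revert discards the whole bundle
		}
	}
	return true
}

func main() {
	receipts := []Receipt{
		{TxHash: "0xarb", Status: 1},
		{TxHash: "0xliquidation", Status: 0}, // someone else got there first
	}
	fmt.Println(bundleAcceptable(receipts, []string{"0xliquidation"})) // flagged as allowed
	fmt.Println(bundleAcceptable(receipts, nil))                       // not flagged
}
```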
Next, we’re going to jump straight to version v0.4 where Megabundles were introduced. For those interested, v0.3 involved making MEV Geth compatible with EIP-1559; you can read more about it here.
MEV-Geth v0.4 - 27 Sep 2021 →10 Jan 2022
If you look at the “mergeBundles” implementation in v0.2 you can see that it is quite naive and limiting in how it goes about merging the bundles.
The reason for this is that you don’t want to implement a change in the software (MEV Geth) that completely changes the characteristics of the required hardware (CPU, memory).
Ideally, you want bundle merging to be conducted over the maximum amount of bundles with optimised hardware and algorithms. It doesn’t make sense to implement this on every node.
Megabundles enabled this. They are the precursor to the “full block” bundles we’ll see in MEV Boost.
One interesting change that Megabundles introduced was the ability to rank bundles via profitability rather than gas price.
Imagine 2 bundles one that nets you a profit of 1 ETH with an “Adjusted Gas Price” of 100 and another that nets you 10 ETH with an “Adjusted Gas Price” of 80. These numbers are exaggerated but as a miner you want the 10 ETH bundle. You don’t know when you’ll next get a block so you want to maximise profit when it is your turn.
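Using the two bundles from the example above, a small sketch shows how the winner flips depending on the ranking rule. The Bundle type and the best helper are hypothetical and only exist to compare the two scoring functions.

```go
package main

import "fmt"

// Bundle is a hypothetical summary of a simulated bundle.
type Bundle struct {
	Name             string
	ProfitEth        float64 // total ETH the miner nets from the bundle
	AdjustedGasPrice float64 // ETH paid per unit of gas, in arbitrary units
}

// best returns the bundle with the highest score under the given scoring rule.
func best(bundles []Bundle, score func(Bundle) float64) Bundle {
	var winner Bundle
	for i, b := range bundles {
		if i == 0 || score(b) > score(winner) {
			winner = b
		}
	}
	return winner
}

func main() {
	bundles := []Bundle{
		{Name: "A", ProfitEth: 1, AdjustedGasPrice: 100},
		{Name: "B", ProfitEth: 10, AdjustedGasPrice: 80},
	}
	byGasPrice := best(bundles, func(b Bundle) float64 { return b.AdjustedGasPrice })
	byProfit := best(bundles, func(b Bundle) float64 { return b.ProfitEth })
	fmt.Println("ranked by adjusted gas price:", byGasPrice.Name)
	fmt.Println("ranked by profit:", byProfit.Name)
}
```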
See this Bert Miller Tweet for some more context.
The value proposition was simple: we’ll do the computation and find you a bundle that maximises profit.
Let’s see how it worked.
Back at the TxPool struct (Mempool) we can see we have a new field called Megabundles which is a mapping of addresses to “MevBundles”. The address represents the relay the MevBundle is coming from.
The sendMegabundleArgs struct is very similar to the MevBundle struct with one additional field RelaySignature. This is the signature from the relay for that specific Megabundle.
RecoverRelayAddress takes the RelaySignature and determines the address that made it.
The authorised relay addresses are maintained through the MinerTrustedRelaysFlag. The recovered address from the Megabundle signature will be cross-referenced with the trusted relay addresses to ensure only Megabundles from trusted relays are accepted.
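The cross-reference step can be sketched as a simple allow-list lookup. Signature recovery itself is left out, so the recovered address is just passed in as a string. The function names and the comma-separated format of the trusted relay list are assumptions of this sketch, not confirmed details of MEV Geth.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTrustedRelays turns a comma-separated list of addresses into a set.
// The comma-separated format is an assumption made for this sketch.
func parseTrustedRelays(list string) map[string]bool {
	trusted := make(map[string]bool)
	for _, addr := range strings.Split(list, ",") {
		addr = strings.ToLower(strings.TrimSpace(addr))
		if addr != "" {
			trusted[addr] = true
		}
	}
	return trusted
}

// acceptMegabundle reports whether the address recovered from the relay
// signature belongs to a trusted relay. Hex case is ignored.
func acceptMegabundle(recoveredRelay string, trusted map[string]bool) bool {
	return trusted[strings.ToLower(recoveredRelay)]
}

func main() {
	trusted := parseTrustedRelays("0xAAAA000000000000000000000000000000000001, 0xbbbb000000000000000000000000000000000002")
	fmt.Println(acceptMegabundle("0xaaaa000000000000000000000000000000000001", trusted)) // trusted relay
	fmt.Println(acceptMegabundle("0xcccc000000000000000000000000000000000003", trusted)) // unknown signer
}
```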
Why do Megabundles have this additional security where the node checks that it is from a trusted relayer? Let’s look into the implementation of how a Megabundle is committed to the block to help us understand why.
Committing a Megabundle
The Megabundles are the result of a Relay merging multiple bundles together to maximise profit.
They are designed to take computation load off the MEV Geth node. They do this by removing the need for the node to merge and order bundles. For the client to skip these calculations there needs to be a level of trust.
Let’s see which steps are skipped when a Megabundle is committed to a block.
“commitNewWork” which handles filling a block with transactions
There is an if statement checking whether the received bundle is a Megabundle. If it is, the Megabundle is grabbed from the TxPool and committed to the block. Above line 1220 in the codebase is the alternate path: when the bundle is not a Megabundle, the node takes the vanilla bundles and merges them as we saw previously.
The Megabundle is committed to the block, note there is no “simulateBundles”, “mergeBundles” or “computeBundleGas”. This reduces the computation the node needs to complete.
“commitBundle” in turn calls “commitTransaction”
“commitTransaction” applies the transaction to the state, verifying it does not revert, updating the profit and returning the receipt logs.
We can see that the Megabundles implementation focuses on speed & performance.
In v0.5 the speed & performance was increased even further by processing a Megabundle immediately if it was better than the best-known block so far. You can dig into the code changes here.
In MEV Geth v0.6, the latest release of the client, private transactions were introduced. This enabled users to send transactions that would be bundled in with the remote transactions but wouldn’t be emitted to node peers for a period of time.
These private transactions aren’t sent to “the Mempool” where “the Mempool” means the TxPool on other Geth nodes.
Generalised frontrunners will snipe transactions they see in the Mempool that are profitable. A Flashbots bundle protects against this since bundles aren’t sent to the Mempool, but they are recorded on the Flashbots API.
Searchers often look through these recorded transactions for alpha. The searcher may want to disguise their Tx by putting it in with other remote transactions.
As I mentioned earlier Megabundles were the precursor to full block builds and MEV Boost.
In ETH 2 the concept of a block builder has arisen: an entity dedicated to building optimised blocks.
In this new world Validators (Not Miners) will have the block ordering power and will be able to opt-in to receiving full blocks from builders to submit for their allocated slots.
These blocks will be blinded - the validator will not be able to see the transactions until the block has been emitted to the network. This prevents validators from stealing MEV for themselves reducing the required trust between parties.
MEV Boost will be a sidecar container connecting the Validator to Relays which in turn will connect to a network of “Block Builders”.
If you’d like to learn more here is a good place to start.
That’s all for today, hope you enjoyed it.
Follow me on Twitter @noxx3xxon