The Bitcoin Mempool: What Is It For?

Bitcoin Magazine

Everyone who has used bitcoin has made use of the mempool, or a mempool. So what is the mempool?

Well, technically, there is no such thing as “the” mempool. Every individual full Bitcoin node operates its own mempool, a cache of valid bitcoin transactions that have been broadcast to the network but have yet to be confirmed in a block. Nodes exchange messages with each other to see which transactions each of them has, and then send each other the ones that are missing.

Each mempool is essentially its own independent island, with its own set of unconfirmed transactions, and sometimes its own configuration variables and settings. There is a size limit to configure, set to 300 MB by default. In addition to this there is a minimum feerate that dynamically adjusts itself, and can have a configured value. When your mempool gets full and more transactions keep coming, the lowest feerate transactions are kicked out first, and the minimum feerate rises so that anything paying less is no longer accepted. There are a few other configurable options, such as the datacarrier and datacarriersize options affecting transactions containing OP_RETURN outputs.
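To make the size limit and the self-adjusting minimum feerate concrete, here is a minimal sketch in Python. It is not how any real node implements its mempool: the `Tx` and `ToyMempool` names are made up for illustration, sizes are tiny, and the floor simply jumps to the feerate of whatever was last evicted and never decays.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Hypothetical stand-in for an unconfirmed transaction."""
    txid: str
    fee_sats: int
    vsize: int  # virtual bytes

    @property
    def feerate(self) -> float:
        # Feerate in satoshis per virtual byte.
        return self.fee_sats / self.vsize


class ToyMempool:
    """Size-limited pool that drops the lowest feerate transactions first."""

    def __init__(self, max_vbytes: int):
        self.max_vbytes = max_vbytes
        self.txs: dict[str, Tx] = {}
        self.min_feerate = 0.0  # dynamic floor (assumption: it never decays here)

    def used(self) -> int:
        return sum(tx.vsize for tx in self.txs.values())

    def add(self, tx: Tx) -> bool:
        """Try to accept a transaction; return True if it is still in the pool."""
        if tx.txid in self.txs or tx.feerate < self.min_feerate:
            return False
        self.txs[tx.txid] = tx
        # Over the limit: evict the cheapest transactions and raise the floor.
        while self.used() > self.max_vbytes:
            worst = min(self.txs.values(), key=lambda t: t.feerate)
            del self.txs[worst.txid]
            self.min_feerate = max(self.min_feerate, worst.feerate)
        return tx.txid in self.txs
```

With a 300 vbyte limit and three 100 vbyte transactions already inside, adding a fourth that pays more pushes out the cheapest one, and anything paying less than the evicted transaction is refused from then on.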

Different nodes have different reasons for running a mempool, and therefore different needs, but it is ultimately through everyone running their own mempools, in sync and interacting with each other, that those individual needs are met.

Think of each mempool as a literal pool, all connected to each other by channels in the ground. The larger a mempool is, the deeper the pool in the ground is. Miners, exchanges, block explorers, these are all going to be the deepest pools. They all have different reasons motivating them to want to know of every unconfirmed transaction that is waiting to get into a block. Miners, to be sure they have the most profitable transactions for their next block. Exchanges, to be sure they are aware of all pending transactions. Block explorers, because their entire service is displaying as complete a dataset about the blockchain and mempool as possible. Your average node only really needs to be deep enough to contain the top feerate slice of the “mempool.”

Now think of each transaction as a drop of liquid: the higher the feerate, the denser the drop. These drops flow in the channels between the pools, and upon arriving at each pool, a drop received is duplicated and then sent on through the channels to any other pool that hasn’t gotten that drop already. As pools fill up and overflow, the less dense liquids (lower feerates) spill over the edge and out of the pool first.
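The way a drop spreads from pool to pool can be sketched as a simple flood over a graph of peers. This is a rough illustration only: the `flood` function and the node names are invented for the example, every link is assumed to be equally fast, and the real announce-and-request messaging between nodes is skipped entirely.

```python
from collections import deque


def flood(peers: dict[str, list[str]], origin: str) -> dict[str, int]:
    """Return, for every reachable node, the hop count at which it first sees the tx."""
    first_seen = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            # A pool only takes the drop (and passes it on) the first time it arrives.
            if peer not in first_seen:
                first_seen[peer] = first_seen[node] + 1
                queue.append(peer)
    return first_seen


# Hypothetical five-node network: a wallet, three relay nodes, and a miner.
network = {
    "wallet": ["a"],
    "a": ["wallet", "b", "c"],
    "b": ["a", "miner"],
    "c": ["a", "miner"],
    "miner": ["b", "c"],
}
hops = flood(network, "wallet")
```

In this toy network the miner hears about the transaction after three hops, and it would still hear about it if either `b` or `c` went offline, which is the point of having many channels between the pools.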

Eventually some lucky miner gets to scoop a size-restricted amount of liquid out of the bottom of its pool, and dump that into the newest glass tank in a long snaking line of glass tanks being filled with liquid to sit there forever (the blockchain). This is just a way to think about the system intuitively and encompass most of its dynamics.
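The scoop from the bottom of the pool can be sketched as a greedy pick of the densest drops that still fit. This is a simplification under stated assumptions: the `build_block` helper is hypothetical, transactions are treated as independent of each other, and real block construction also has to keep parent transactions ahead of their children.

```python
def build_block(mempool: list[tuple[str, int, int]], max_vbytes: int) -> list[str]:
    """Pick txids by descending feerate until the size limit is reached.

    Each mempool entry is (txid, fee_sats, vsize).
    """
    chosen = []
    remaining = max_vbytes
    by_feerate = sorted(mempool, key=lambda t: t[1] / t[2], reverse=True)
    for txid, _fee, vsize in by_feerate:
        if vsize <= remaining:  # skip anything that no longer fits
            chosen.append(txid)
            remaining -= vsize
    return chosen


# Toy pool with feerates of 5, 10, 1 and 6 sat/vB respectively.
pool = [("a", 1000, 200), ("b", 3000, 300), ("c", 500, 500), ("d", 2400, 400)]
block = build_block(pool, max_vbytes=800)  # ["b", "d"]
```

Here `a` pays a better feerate than `c` but neither makes it in: once `b` and `d` are chosen only 100 vbytes remain, so both stay in the pool for the next scoop.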

This arrangement of pools interlinking serves different purposes for different users.

Transactors

Users making transactions have two uses for the mempool. First and foremost is getting their transactions to the miners. If they don’t get to a miner’s mempool, then there is no possible way for them to wind up in a block. Mempools interlinking and sharing transactions with each other guarantees that eventually, once a transaction is put into one mempool, it will wind up in the mempools of all of the miners. Having a robust and decentralized network to guarantee that transactions will eventually get from a user to all the miners regardless of changing and fragmented connections on the network is a valuable thing.

The second use is fee estimation, which is especially important for Layer 2 users who could at any time have to ensure a response transaction to an invalid state is confirmed in a timely manner. It is possible to get some degree of fee estimation just by looking at the feerates of transactions in recent blocks, but that does not tell you anything about the current state of the mempool after the most recent block. It doesn’t account for sudden spikes, or opportunistic actors flooding the mempool, or the next wave of a growing transaction spike that hasn’t finished yet. Without a view of the mempool, fee estimation cannot be sure it is taking into account the current state of pending transactions.
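As a rough illustration of what a view of the mempool adds, the sketch below walks a snapshot of pending transactions from the highest feerate down and reports the feerate sitting at the edge of the next few blocks. The `feerate_for_target` function, the 1 sat/vB floor, and the snapshot are all assumptions made for the example, and it pretends no new transactions arrive before those blocks are found.

```python
def feerate_for_target(
    mempool: list[tuple[float, int]],
    target_blocks: int,
    block_vbytes: int = 1_000_000,  # roughly one full block of virtual bytes
    floor: float = 1.0,             # assumed minimum relay feerate in sat/vB
) -> float:
    """Return the feerate at the cutoff of the next `target_blocks` blocks.

    Each mempool entry is (feerate_sat_per_vb, vsize).
    """
    budget = target_blocks * block_vbytes
    filled = 0
    for feerate, vsize in sorted(mempool, reverse=True):
        filled += vsize
        if filled >= budget:
            # This transaction sits at the edge of the target window.
            return feerate
    # Everything pending fits inside the window, so the floor is enough.
    return floor


# Hypothetical snapshot: four clusters of 600,000 vbytes each.
snapshot = [(50.0, 600_000), (20.0, 600_000), (5.0, 600_000), (2.0, 600_000)]
next_block = feerate_for_target(snapshot, target_blocks=1)    # 20.0
within_three = feerate_for_target(snapshot, target_blocks=3)  # 1.0
```

An estimator that only looked at confirmed blocks would have no way to see that 600,000 vbytes paying 50 sat/vB just showed up, which is exactly the kind of sudden spike a mempool view catches.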

Receivers

When you receive bitcoin, your node verifies that transaction as well as the entire block containing it. The transaction paying you is broadcast, winds up in a miner’s mempool, they find a block, that block is broadcast to the network, and then your node downloads and verifies it.

Except that’s not how that actually works (unless you disable your node’s mempool and run in blocksonly mode). Your node validates each transaction when it is first received in its mempool and caches that as a valid bitcoin transaction. When a miner finds a block, they actually only relay the block header and a small piece of compressed information, for lack of a better simple explanation, that can be used to figure out which transactions are in a block. Your node then grabs the pre-validated transactions, verifies the header, and if it all passes, forwards the “compact block” onwards.
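The matching step can be sketched as a lookup of short identifiers against what is already sitting in your mempool. Treat this as an illustration rather than the wire protocol: the `short_id` and `reconstruct` helpers are hypothetical, and a truncated SHA-256 stands in for the keyed short hash that real compact blocks use.

```python
import hashlib


def short_id(txid: str, salt: bytes) -> bytes:
    """Six-byte identifier for a transaction (stand-in hash, not the real one)."""
    return hashlib.sha256(salt + bytes.fromhex(txid)).digest()[:6]


def reconstruct(
    block_short_ids: list[bytes], salt: bytes, mempool_txids: list[str]
) -> tuple[list[str], list[int]]:
    """Match a block's short IDs against the mempool.

    Returns the txids found locally (in block order) and the block positions
    that are missing and would have to be requested from the sender.
    """
    index = {short_id(txid, salt): txid for txid in mempool_txids}
    found, missing = [], []
    for position, sid in enumerate(block_short_ids):
        if sid in index:
            found.append(index[sid])
        else:
            missing.append(position)
    return found, missing


# Toy example: the block holds three transactions, our mempool has two of them.
salt = b"per-block-salt"
tx_a, tx_b, tx_c = "aa" * 32, "bb" * 32, "cc" * 32
announced = [short_id(t, salt) for t in (tx_a, tx_b, tx_c)]
found, missing = reconstruct(announced, salt, [tx_a, tx_c])
```

Here only the transaction at position 1 has to be fetched. The fuller your mempool, the fewer round trips you need before you can validate the block and pass it along.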

This optimization is actually why miners no longer depend on centralized and permissioned relay networks like FIBRE, formerly maintained by Matt Corallo, and the short-lived Falcon Network. Miners used to need to connect to these to guarantee low block relay latency to other miners, because relay speed across the peer-to-peer network was so poor.

Miners

Miners obviously want to see everything. They are profit-driven entities that want to be able to select the highest fee paying transactions from the largest possible set of pending transactions. This is how they maximize profit and earn revenue to continue expanding their operation and remain competitive.

They literally get money out of the mempool. Their incentive to acquire any valid fee paying transaction is so strong that they have, historically, presently, and almost certainly in the future, built numerous systems, and even informal arrangements available socially, designed to allow users to directly submit transactions to the miners rather than through the open peer-to-peer network.

Block Explorers, Chain Analytics, Etc.

They, like miners, want to see every pending transaction that has been created and broadcast to the world. The major difference between the groups is that miners directly monetize these transactions by collecting fees, while blockchain explorers and analytics companies monetize them indirectly, by displaying and analyzing the information and selling that analysis as a product.

I can’t point to any concrete examples involving cached mempool data, but chain analytics companies have been known to regularly buy privately acquired metadata regarding transaction activity on-chain. They have also been known to operate sybil Bitcoin nodes that peer as widely as possible with nodes across the entire network to be able to narrow down which set of nodes originally broadcast a transaction.

Block explorers likewise monetize visual displays of blockchain and mempool data; their entire business model is focused around that. Access to more data to display to their users is more information to potentially monetize, if they can find useful or novel ways to display that information or information derived from it.

Information Wants To Flow

All of these different classes of users benefit from there being “a” public mempool because of one simple dynamic: information flows freely across them. As long as a transaction pays a sufficient fee to get past minimum relay filters, is consensus valid, and does not present a legitimate denial of service or resource exhaustion risk to individual nodes, it provides value for every class of user by propagating across each individual mempool in the network.

Without a functional public mempool, the only viable alternatives for all of these different uses are centralized solutions or an unmanageable chaos of slapdash and disorganized attempts at fragmented public mempools that each user will need to individually track.

That introduces the potential for manipulation of feerate data, deception of users, and Miner Extractable Value concerns caused by private relaying of transactions. Without a healthy and open public mempool, these are the types of issues that Bitcoin will have to confront.

In a follow-up article I’ll be looking at these issues, as well as different types of mempool filters and why they exist.

This post The Bitcoin Mempool: What Is It For? first appeared on Bitcoin Magazine and is written by Shinobi.