Polygon's zkEVM has the potential to revolutionize Ethereum scaling and drive wider adoption of zero-knowledge technology. However, as with any new complex system, thorough security validation is required before a full production launch. This was the goal of a comprehensive audit conducted by Hexens on the core components of zkEVM, including the smart contracts, the prover, and the ROM.
For the uninitiated, zkEVM is Polygon's solution for scaling Ethereum using zero-knowledge proofs while maintaining compatibility with the EVM execution layer. This allows existing Ethereum smart contracts, developer tools, and wallets to function seamlessly while benefiting from the speed and cost advantages of zk-SNARK validity proofs.
The audited codebase consisted mostly of the zkProver, which comprises the following components:
With such a complex system, the potential attack surface is significant, requiring expertise across smart contract security, zero knowledge proofs, and EVM specifics. Hexens brought these skills to the table in the audit, the results of which serve as an informative case study for the due diligence required when building novel blockchain systems.
One of the most serious issues uncovered was an ERC777 token reentrancy vulnerability in the zkEVM bridge contract responsible for asset deposits and withdrawals. For background, ERC777 tokens enhance the standard ERC20 implementation by introducing new callbacks that can be triggered on token transfers.
Hexens discovered that the bridge contract did not properly account for these callbacks when handling ERC777 deposits, opening the door to potential reentrancy attacks. Specifically, the ERC777 "tokensToSend" callback can be triggered before the bridge contract updates its own balance.
By recursively calling back into the deposit function, an attacker could artificially inflate the recorded deposit amounts. For example, with 3 levels of reentrancy, depositing 1 token could be recorded as a deposit of 3 tokens.
The recursion is bounded only by the transaction gas limit, so that limit is the only cap on the number of fake deposits. Once withdrawals are enabled, the attacker could immediately drain far more from the bridge than they actually deposited.
The report notes that this attack requires only a single transaction, meaning funds could be jeopardized immediately.
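The mechanics can be illustrated with a toy model. This is a sketch under stated assumptions, not the bridge's actual code: it assumes the bridge credits the difference in its own token balance measured before and after the transfer, and that the token fires a sender-side hook before any balance moves. The names `HookToken`, `ToyBridge`, and `run_attack` are hypothetical.

```python
class HookToken:
    """Toy token with an ERC777-style sender hook (hypothetical model)."""

    def __init__(self):
        self.balances = {}
        self.hooks = {}  # sender -> callback fired before balances move

    def balance_of(self, who):
        return self.balances.get(who, 0)

    def transfer_from(self, sender, recipient, amount):
        hook = self.hooks.get(sender)
        if hook is not None:
            hook(sender, recipient, amount)  # runs BEFORE the balance update
        assert self.balance_of(sender) >= amount, "insufficient balance"
        self.balances[sender] = self.balance_of(sender) - amount
        self.balances[recipient] = self.balance_of(recipient) + amount


class ToyBridge:
    """Credits deposits by balance difference (assumed accounting scheme)."""

    def __init__(self, token):
        self.token = token
        self.credited = []  # amount recorded for each deposit call

    def deposit(self, sender, amount):
        before = self.token.balance_of("bridge")
        self.token.transfer_from(sender, "bridge", amount)
        after = self.token.balance_of("bridge")
        self.credited.append(after - before)


def run_attack(depth):
    """Nest `depth` one-token deposits; return (total credited, real balance)."""
    token = HookToken()
    token.balances["attacker"] = depth
    bridge = ToyBridge(token)
    remaining = [depth - 1]  # re-entries still to perform

    def hook(sender, recipient, amount):
        if remaining[0] > 0:
            remaining[0] -= 1
            bridge.deposit("attacker", 1)  # re-enter before balances change

    token.hooks["attacker"] = hook
    bridge.deposit("attacker", 1)
    return sum(bridge.credited), token.balance_of("bridge")
```

In this model, `run_attack(3)` records deposits of 1, 2, and 3 tokens (the outermost 1-token deposit is recorded as 3), while the bridge actually holds only 3 tokens. Snapshotting the balance after the hook has run, or adding a reentrancy guard to `deposit`, removes the discrepancy in the toy model.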
Another critical finding involved incorrect handling of CTX, the virtual address spaces used by zkEVM to manage contract call contexts. Specifically, the security engineers found issues in the identity precompile contract's context switching logic.
The identity precompile simply echoes its input back to the caller. When called directly, it should effectively charge only intrinsic gas costs. However, a bug was identified where the context was incorrectly set to the global state instead of the specific call context.
Because variable offsets in the global and call-context namespaces overlap, state was loaded from the wrong variables. Most alarmingly, the transaction gas limit variable, which should be fairly small, was instead loaded with the global state root hash, which is almost guaranteed to be an enormous number.
With the huge state root value and some additional manipulation, this could allow the sequencer to massively inflate any caller’s (or their own) ether balance in the verified state. Given the sequencer's special status, this is a serious vulnerability that could lead to loss of bridge funds.
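The namespace mixup can be sketched with a toy memory model. Everything here is an illustrative assumption rather than the real ROM layout: the offsets, the window size `CTX_STRIDE`, and the variable names are hypothetical. The only property that matters is that both namespaces number their variables from the same offsets.

```python
GLOBAL_BASE = 0
CTX_STRIDE = 0x40000  # hypothetical size of one call context's address window

# Hypothetical offsets: both namespaces start counting at 0, so they overlap.
GLOBAL_OFFSETS = {"stateRoot": 0, "sequencerAddr": 1}
CTX_OFFSETS = {"txGasLimit": 0, "txValue": 1}


def ctx_base(ctx):
    """Base address of call context `ctx` (contexts are numbered from 1)."""
    return ctx * CTX_STRIDE


def load_ctx_var(memory, base, name):
    """Resolve a context variable against whatever base is currently active."""
    return memory[base + CTX_OFFSETS[name]]


def demo():
    memory = {}
    state_root = 2**255 + 12345  # stand-in for a 256-bit hash value
    memory[GLOBAL_BASE + GLOBAL_OFFSETS["stateRoot"]] = state_root
    memory[ctx_base(1) + CTX_OFFSETS["txGasLimit"]] = 30_000

    correct = load_ctx_var(memory, ctx_base(1), "txGasLimit")
    # Bug: the active base points at the global namespace, not the call context.
    buggy = load_ctx_var(memory, GLOBAL_BASE, "txGasLimit")
    return correct, buggy, state_root
```

With the correct base the load returns the small gas limit; with the global base the same offset resolves to the state root stand-in instead.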
One of the most interesting attack vectors involved missing constraints in the PIL (Polynomial Identity Language) state machines used to generate the zk-SNARK circuits for verifying zkEVM proofs. Specifically, the audit revealed insufficient constraints for jump instructions.
This allowed attackers to craft malicious proofs that hijack execution flow and redirect it to arbitrary code segments. In one example exploit, attackers could credit themselves with huge ether balances by jumping to a specific gadget chain that ends in balance-updating logic.
The root cause lies in the lack of constraints to validate the special selector of jump opcodes used throughout the zkEVM PIL codebase. Without proper validation, attackers can supply crafted inputs that lead execution to unintended code areas.
Introducing additional constraints on this selector removes the attack vector.
In addition to the jump instruction issues, the audit also revealed missing constraints around free input handling in the PIL state machines.
Free inputs are values fed into the state machines directly, either as user-supplied data or as precomputed results that avoid complex in-circuit calculations. Because they are not derived inside the circuit, every one of them must be pinned down by constraints.
Tightening the constraints on the allowable range of free inputs is therefore crucial. However, the finding demonstrates how difficult it is to identify and cover all edge cases.
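A generic illustration of why free inputs need range constraints (this is a textbook-style sketch, not the specific finding from the report): instead of computing a division inside the circuit, the prover supplies a quotient `q` and remainder `r`, and the circuit only checks `q * b + r = a` in the field. The function name `accepts_division` and the 16-bit bound are assumptions for the example; the modulus is the Goldilocks prime.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime, used here as the toy field modulus


def accepts_division(a, b, q, r, range_check):
    """Check a claimed division a = q * b + r with free inputs q and r.

    Assumes a and b are 16-bit values. With range_check=False only the
    field identity is verified, which leaves q and r underdetermined.
    """
    if (q * b + r - a) % P != 0:
        return False
    if range_check and not (0 <= q < 2**16 and 0 <= r < b):
        return False
    return True


# Honest witness for 17 / 5: quotient 3, remainder 2.
honest = accepts_division(17, 5, 3, 2, range_check=True)

# Forged witnesses satisfy the identity but violate the ranges.
forged_loose = accepts_division(17, 5, 0, 17, range_check=False)
forged_wrap = accepts_division(17, 5, 4, P - 3, range_check=False)
forged_strict = accepts_division(17, 5, 0, 17, range_check=True)
```

Without the range check, both forged witnesses pass because the identity alone admits many solutions; with it, only the honest pair is accepted.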
Another critical issue stems from a missing constraint in the PIL code that handles key inclusion checks when retrieving data from the sparse Merkle tree (SMT). The SMT uses a tree structure where key paths are constructed using the least significant bits of the key to traverse down to the relevant leaf node.
To verify correct key-value binding, the PIL code reconstructs the key path by prepending successive key bits to a remaining key value (rkey). However, the polynomial representing the next key bit lacks a binary constraint to restrict it to 0 or 1 values.
By manipulating this next key bit value, an attacker could traverse an incorrect path down the tree and fake the binding between a key of their choosing and a leaf value. If the prerequisites are met to modify all rkey registers, the root check for proof of value inclusion can also be bypassed.
To exploit this, the attacker needs to insert a leaf at a suitable depth with 1111 as the least significant bits of the key. This allows modifying all rkey registers when traversing back up the tree later. The main impact would be faking the balance value bound to an attacker's account address.
This vulnerability exposed the integrity of the SMT data structure to potential corruption. Adding the missing constraint and modifying the ROM to validate next bit values resolves the issue.
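The role of the missing binary constraint can be sketched in a simplified model. The assumptions are loud ones: this collapses the several rkey registers into a single field element, models each step as `rkey = 2 * rkey + bit`, and uses the Goldilocks prime as the modulus. The function names are hypothetical, and the real exploit additionally requires the leaf placement described above.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime, used here as the toy field modulus


def climb(rkey, bits, enforce_binary):
    """Rebuild a key by prepending one key bit per tree level.

    With enforce_binary=True each bit must satisfy bit * (1 - bit) == 0,
    which in a field holds only for 0 and 1.
    """
    for bit in bits:
        if enforce_binary and (bit * (1 - bit)) % P != 0:
            raise ValueError("non-binary key bit")
        rkey = (2 * rkey + bit) % P
    return rkey


def forge_bit(rkey, target_key):
    """Pick an unconstrained 'bit' so that one step lands on target_key."""
    return (target_key - 2 * rkey) % P


def forgery_accepted(rkey, target_key, enforce_binary):
    """Return True if the forged step reconstructs target_key unchallenged."""
    try:
        return climb(rkey, [forge_bit(rkey, target_key)], enforce_binary) == target_key
    except ValueError:
        return False
```

In this model a prover holding any `rkey` can reach an arbitrary `target_key` in one step when the bit is unconstrained, while the check `bit * (1 - bit) == 0` rejects the forged value.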
The vulnerabilities described above are all of critical severity, meaning they could lead to loss of funds, broken consensus, or denial of service.
The implications are stark. Reentrancy could instantly drain bridge assets. Incorrect CTX handling enables the sequencer to credit a huge amount of ether to any address. Insufficient PIL constraints make it possible to bypass the soundness of the system and prove incorrect statements.
While Polygon addressed the findings, this highlights the risks inherent in pioneering blockchain innovations. Security is essential with novel systems.
In addition to the critical vulnerabilities, the audit revealed other concerns that could disrupt system operations:
While rated lower in severity individually, these findings still warrant priority. They erode consistency with the EVM, which is critical for smooth cross-compatibility. Even if not directly exploitable, these issues may undermine core functions and require diligent attention.
The range of issues uncovered by this audit highlights the tremendous ongoing effort required to build and maintain complex blockchain solutions like zkEVM. Being the first to implement a technology like zero-knowledge rollups with EVM compatibility brings an array of challenges.
The findings run the gamut from subtle compiler bugs to convoluted exploit chains spanning contract logic. Identifying and resolving these low-level intricacies is non-trivial. It is a testament to the skill and dedication of the Polygon zkEVM team that all discovered issues have been addressed responsibly.
However, continued internal reviews, external audits, and bug bounties are recommended to validate zkEVM as it moves to production. Ongoing maintenance and performance tuning will be required as usage increases and new edge cases emerge.
The process of rigorously battle testing and incrementally strengthening new blockchain innovations is challenging but rewarding. Polygon is advancing the state of zero-knowledge technology and providing valuable research for the community. But as this audit clearly demonstrates, realizing the full vision of zkEVM will demand extensive continued effort.
Hexens' audit of zkEVM was invaluable, revealing vulnerabilities spanning from consensus risks to subtle compatibility gaps. Polygon’s swift resolution demonstrates their commitment to security.
However, this is just one step towards hardening zkEVM for production. Expanded testing and incremental deployments should continue as usage grows. Securing novel systems is challenging but essential work.
The path forward demands extensive ongoing diligence against emergent threats. With coordinated effort, zkEVM can provide scalability and enable wider zero knowledge adoption. But pioneering new cryptographic frontiers requires substantial maintenance.
The full report detailing all the findings from Hexens' audit of the zkEVM codebase can be found on GitHub.