The Ethereum Proximity Prize: Reed–Solomon Codes and MCA

Zero Knowledge

Sep 8, 2025


The Ethereum Foundation has announced an ambitious cryptography bounty: up to $1 million for resolving the up-to-capacity proximity gaps conjecture for Reed–Solomon codes, specifically its strengthened variant known as mutual correlated agreement.

This might sound like an esoteric question buried deep in coding theory, but the stakes couldn't be higher. This mathematical conjecture forms the bedrock of modern zero-knowledge proofs, particularly in FRI and WHIR used in hash-based SNARKs. These proof systems are essential to Ethereum's scaling roadmap, enabling succinct and verifiable computation at massive scale.

The resolution of this conjecture will have profound implications for the entire blockchain ecosystem. If the conjecture is proven true, it would cement the security and efficiency assumptions already built into Ethereum's zk-rollup ecosystem, ensuring that current proof sizes and computational costs remain viable for years to come. However, if the conjecture is disproven, the consequences could be severe. It might invalidate current efficiency claims, forcing proof sizes to grow significantly, increasing verification times, and potentially requiring entire classes of zero-knowledge protocols to be completely redesigned.

This exploration takes you through the conjecture itself, examines our current understanding, explains its cryptographic significance, and reveals why Ethereum is willing to put serious money on the table to see this mathematical puzzle finally resolved.

Understanding Reed–Solomon Codes

What Makes a Linear Code

To understand why this conjecture matters, we need to start with the fundamentals of error-correcting codes. An $[n,k,\ell]_q$ linear code $C$ represents a sophisticated mathematical structure: it's a $k$-dimensional linear subspace within the vector space $\mathbb{F}_q^n$, designed with a minimum distance of $\ell$ between any two codewords.

Let's break down what each parameter means. The block length $n$ tells us the total length of each codeword after encoding. Think of this as how much data you actually transmit, including both the original message and the redundant information needed for error correction.

The dimension $k$ represents the number of original message symbols you want to encode. Since $C$ forms a $k$-dimensional subspace over the finite field $\mathbb{F}_q$, it contains exactly $q^k$ unique codewords. The ratio $\rho = k/n$ gives us the rate of the code, which measures the efficiency of our encoding: how much of the transmitted data carries useful information versus redundancy.

The minimum distance $\ell$ determines the error-correcting power of our code. To understand this, we first need to consider the Hamming distance between two words $x$ and $y$ in $\mathbb{F}_q^n$. This distance $d(x,y)$ simply counts the number of coordinate positions where the two words differ. The absolute minimum distance of a code $C$ is then

$$d_{\min}(C) = \min_{x \neq y \in C} d(x,y),$$

and we often work with the relative minimum distance

$$d_{\text{rel}}(C) = \frac{d_{\min}(C)}{n}.$$

For linear codes, we denote this minimum distance by $\ell$.

Finally, $q$ represents the size of our finite field, essentially determining our alphabet. Larger fields give us more flexibility but also increase computational complexity.

The Beauty of Reed–Solomon Codes

Reed–Solomon codes work through an elegant mathematical principle. We start with a collection $D = (\alpha_1, \ldots, \alpha_n)$ of $n$ distinct points from our finite field $\mathbb{F}_q$, where $n \leq q$ so that enough distinct evaluation points exist.

The encoding process transforms a message into a polynomial. We take our message and represent it as coefficients of a polynomial $P(X)$ of degree less than $k$. Then we evaluate this polynomial at our $n$ chosen points, creating $n$ values. This evaluation process is the heart of Reed–Solomon encoding: we're essentially oversampling our polynomial by evaluating it at more points than strictly necessary to determine it uniquely.

This oversampling creates built-in redundancy. Even if some of these evaluation points get corrupted during transmission, we can still reconstruct the original polynomial (and hence recover our message) as long as we don't lose too many values.

A concrete RS example

  • Field: $\mathbb{F}_{11}=\{0,\dots,10\}$ (arithmetic mod $11$)
  • RS code: length $n=7$, dimension $k=3$ ⇒ degree $<3$
  • Domain: $D=\{0,1,2,3,4,5,6\}$
  • Distance: $d_{\min}=n-k+1=5$

Encoding

Pick a message polynomial of degree $<3$, say

$$P(X)=3+4X+2X^2 \in \mathbb{F}_{11}[X].$$

Evaluate on $D$:

$$\begin{array}{c|ccccccc} x & 0 & 1 & 2 & 3 & 4 & 5 & 6\\ \hline P(x) & 3 & 9 & 8 & 0 & 7 & 7 & 0 \end{array}$$

This vector $[3,9,8,0,7,7,0]$ is the codeword.
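
As a sanity check, the encoding above can be reproduced in a few lines of Python. This is a minimal sketch with our own helper names, not a production encoder:

```python
# Evaluate the example message polynomial P(X) = 3 + 4X + 2X^2 over F_11
# at the domain D = {0, ..., 6} to reproduce the codeword above.

q = 11                      # field size
coeffs = [3, 4, 2]          # P(X) = 3 + 4X + 2X^2, degree < k = 3
domain = range(7)           # evaluation points, n = 7

def evaluate(coeffs, x, q):
    """Horner evaluation of a polynomial over F_q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

codeword = [evaluate(coeffs, x, q) for x in domain]
print(codeword)  # [3, 9, 8, 0, 7, 7, 0]
```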


The Singleton Bound: A Fundamental Limit

In coding theory, there's a fundamental trade-off between how efficiently we can encode information and how many errors we can correct. This trade-off is captured by the Singleton bound, which states that for any $[n,k,\ell]_q$ code, the minimum distance must satisfy

$$\ell \leq n-k+1.$$

We can understand this bound through a clever argument called puncturing. Imagine we have a code $C$ with minimum distance $\ell$, meaning any two distinct codewords differ in at least $\ell$ positions. Now consider what happens if we delete the first $\ell - 1$ coordinates from every codeword, creating a puncturing map $\pi$ that takes $n$-coordinate vectors and produces $(n-(\ell-1))$-coordinate vectors.

The key insight is that this puncturing operation remains injective when restricted to our code. If two original codewords $c$ and $c'$ were different, they differed in at least $\ell$ positions. After deleting only $\ell - 1$ coordinates, at least one differing position must remain, so $\pi(c) \neq \pi(c')$. This means the punctured code $\pi(C)$ has the same number of codewords as the original code $C$.

But the punctured codewords now have length $n-\ell+1$, and there are only $q^{\,n-\ell+1}$ possible vectors of this length over $\mathbb{F}_q$. Since $|\pi(C)| = |C|$, we must have $|C| \leq q^{\,n-\ell+1}$. For a linear code with $|C| = q^k$, this gives us $k \leq n-\ell+1$, which rearranges to the Singleton bound:

$$\ell \leq n-k+1.$$

The intuition behind this bound is elegant. Because any two codewords differ in at least $\ell$ places, you can erase up to $\ell - 1$ coordinates and the codewords remain distinguishable. Only if you erase $\ell$ or more coordinates could different codewords become indistinguishable. This captures the fundamental tension between information rate and error-correcting capability.

Why Reed–Solomon Codes Are Special

Reed–Solomon codes meet the Singleton bound with equality: a nonzero polynomial of degree less than $k$ has at most $k-1$ roots, so two distinct codewords can agree in at most $k-1$ positions, giving

$$\ell = n-k+1.$$

This makes them Maximum Distance Separable (MDS) codes, representing the optimal balance between information rate and error correction capability. You literally cannot do better than Reed–Solomon codes in terms of this fundamental trade-off.

This optimality is why Reed–Solomon codes appear everywhere in practice, from CD players to satellite communications to modern cryptographic protocols. They squeeze the maximum error-correcting power out of every bit of redundancy, making them incredibly efficient for protecting data against corruption.
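
For the toy $[7,3]$ code from the example, this optimality can be verified by brute force. The sketch below uses our own helper names and is only feasible at toy scale; it enumerates all $q^k = 1331$ codewords and checks that the minimum distance really equals $n-k+1$:

```python
# Brute-force check that the toy [7, 3] RS code over F_11 from the earlier
# example is MDS: its minimum distance equals n - k + 1 = 5.
from itertools import product

q, n, k = 11, 7, 3

def encode(msg):
    """Evaluate the polynomial with coefficients `msg` on the domain {0..6}."""
    return tuple(sum(c * x**i for i, c in enumerate(msg)) % q for x in range(n))

# For a linear code, the minimum distance equals the minimum Hamming
# weight over all nonzero codewords.
min_weight = min(
    sum(1 for s in encode(msg) if s != 0)
    for msg in product(range(q), repeat=k)
    if any(msg)
)
print(min_weight)  # 5, meeting the Singleton bound with equality
```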

The Language of Proximity

When we receive a potentially corrupted word, we need to determine whether it could plausibly have originated from our code. This requires introducing precise terminology for measuring how "close" a received word is to our code.

For any fixed error fraction δ\delta, we can classify received words into two categories based on their proximity to the nearest codeword.

A received word $w$ is $\delta$-close to code $C$ when there exists at least one codeword $u \in C$ satisfying

$$d(w,u) \leq \delta n.$$

In this case, our list of candidates is non-empty: we have at least one plausible explanation for the received word.

Conversely, a received word $w$ is $\delta$-far from $C$ when every codeword $u \in C$ satisfies

$$d(w,u) > \delta n.$$

Here, our candidate list is empty; no codeword lies within our acceptable error radius.

This terminology allows us to pose the essential geometric question cleanly: given a received word $w$ and error tolerance $\delta$, is $w$ close to our code or far from it?

Unique Decoding Radius and Capacity

Because the minimum distance between any two distinct codewords is $\ell$, their protective bubbles can have a radius of up to (but not including) $\ell/2$ before they risk touching or overlapping. This critical value defines the code's error-correction limit.

The maximum number of errors that a code is guaranteed to correct is denoted by $t$ and equals

$$t = \left\lfloor \frac{\ell - 1}{2} \right\rfloor.$$

Any received word with $t$ or fewer errors will always be closer to the original codeword than to any other. This $t$ is often called the unique decoding radius of the code.

Geometric intuition. Think of Hamming balls of radius $t < \ell/2$ around each codeword. These balls are disjoint because the centers are at distance at least $\ell$. Any $w$ inside a ball is uniquely decoded to that ball's center; points outside every ball are ambiguous.

Capacity note. Pushing error tolerance toward $\ell/2$ corresponds to approaching the Singleton bound $\ell = n-k+1$: with larger $\ell$ (more redundancy), the guaranteed unique-decoding radius increases; with higher rate $\rho=k/n$, it decreases. This is the fundamental rate–distance trade-off.
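
Continuing with the toy $[7,3]$ code (where $\ell = 5$, so $t = 2$), a brute-force sketch can confirm that two errors always decode back to the original codeword. The helper names are ours, and the exhaustive nearest-codeword search only works at toy scale:

```python
# Corrupt the example codeword in t = floor((l-1)/2) = 2 positions and check,
# by brute-force nearest-codeword search, that unique decoding recovers it.
from itertools import product

q, n, k = 11, 7, 3

def encode(msg):
    return tuple(sum(c * x**i for i, c in enumerate(msg)) % q for x in range(n))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

original = encode((3, 4, 2))          # the codeword [3, 9, 8, 0, 7, 7, 0]
received = list(original)
received[1] = (received[1] + 5) % q   # two errors: still within t = 2
received[4] = (received[4] + 1) % q

# Exhaustive minimum-distance decoding over all q^k codewords.
nearest = min((encode(m) for m in product(range(q), repeat=k)),
              key=lambda c: hamming(c, received))
print(nearest == original)  # True: at most t errors decode uniquely
```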

Embracing Ambiguity: A More Pragmatic Approach

Traditional unique decoding demands certainty: it insists on producing exactly one unambiguous answer. When faced with ambiguity, it simply fails, leaving us with no useful information. List decoding takes a more pragmatic stance by asking a fundamentally different question: instead of desperately seeking the one true codeword, why not provide a short list of plausible candidates?

This approach transforms our decoding problem into something more manageable. Given a received word $w$, we aim to produce all codewords $u \in C$ that lie within some prescribed distance. The beauty of this approach lies not just in its flexibility, but in its ability to remain useful even when perfect certainty is impossible.

The critical question that emerges is obvious yet profound: how large might this list of candidates become? If the list could grow arbitrarily large, our promise of providing a manageable set of options would become meaningless. This concern brings us to one of the most important theoretical results in coding theory: the Johnson bound.

The Johnson Bound: Mathematics Meets Practicality

The Johnson bound provides a universal safety net, guaranteeing that under certain conditions, our list of candidates will remain small and manageable. To understand this result, we need to think in terms of relative parameters, which give us a cleaner perspective on the fundamental trade-offs involved.

Consider the relative minimum distance

$$\mu = \ell/n,$$

which tells us about the inherent error-correcting capability of our code. We also define the relative decoding radius

$$\delta = e/n,$$

representing the fraction of symbol positions where we're willing to tolerate errors.

The Johnson bound states something remarkable: for any code with relative minimum distance $\mu$, and for any received word $w$, the number of codewords lying within relative distance $\delta$ remains bounded by a polynomial in $n$, provided that

$$\delta < 1 - \sqrt{1 - \mu}.$$

This condition defines what we call the Johnson radius. Within this radius, our candidate list stays small and manageable, polynomial in size. What makes this result particularly powerful is its universality: the bound depends only on the fundamental parameters $\mu$ and $\delta$, not on the specific structure of the code.

For Reed–Solomon codes over large fields, sophisticated algorithms like Guruswami–Sudan can achieve list decoding right up to this Johnson radius. Since Reed–Solomon codes have $\mu \approx 1-\rho$ where $\rho = k/n$ is the rate, the Johnson radius becomes

$$\delta_J = 1 - \sqrt{\rho}.$$
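
Plugging the toy $[7,3]$ code into these formulas gives concrete numbers for the regimes discussed next. One caveat baked into the sketch: for this small MDS code $\mu = (n-k+1)/n$ exactly, so the exact Johnson radius is $1-\sqrt{1-\mu} = 1-\sqrt{(k-1)/n}$, which only approaches $1-\sqrt{\rho}$ for large $n$:

```python
# Numeric regime boundaries for the toy [7, 3] RS code from earlier.
import math

n, k = 7, 3
rho = k / n                     # rate, about 0.429
mu = (n - k + 1) / n            # relative minimum distance, about 0.714

unique_radius = mu / 2                    # unique decoding works up to here
johnson_radius = 1 - math.sqrt(1 - mu)    # list decoding works up to here
capacity = 1 - rho                        # conjectured outer limit for gaps

print(round(unique_radius, 3), round(johnson_radius, 3), round(capacity, 3))
# 0.357 0.465 0.571
```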


Three Worlds of Decoding

The landscape of decoding naturally divides into three distinct regimes, each characterized by the fraction of errors $\delta$ we're trying to handle.

The Safe Zone of Unique Decoding occurs when

$$\delta < \mu/2.$$

Here, we remain in the comfortable world where at most one codeword can lie within our decoding radius of any received word. This is the traditional domain of classical error correction, where algorithms are simple and answers are unambiguous.

The Realm of List Decoding emerges when

$$\mu/2 \leq \delta < 1 - \sqrt{1-\mu}.$$

We've left the safe zone of uniqueness behind, but we haven't entered chaos. The Johnson bound guarantees that our list of candidates remains small, and remarkably, efficient algorithms exist to find all candidates within this range. The Guruswami–Sudan algorithm for Reed–Solomon codes exemplifies how sophisticated mathematics can tame what initially appears to be an intractable problem.

The Combinatorial Wilderness begins when

$$\delta \geq 1 - \sqrt{1-\mu}.$$

Here, the number of potential candidates can explode exponentially, making decoding computationally infeasible for general codes. This regime represents the fundamental limits of what's possible with polynomial-time algorithms.


Proximity Testing and the Gap Phenomenon

Imagine you're a verifier trying to check whether a bunch of vectors are valid codewords. Checking each one individually? That's going to be expensive. So here's a clever trick that cryptographers love: instead of testing each vector separately, ask the prover to mix them all together into a single random linear combination:

$$u' = r_0 u_0 + r_1 u_1 + \cdots + r_t u_t, \qquad r_i \in \mathbb{F}_q.$$

Now you only need to test this one combined vector $u'$ to see if it's $\delta$-close to your Reed–Solomon code. Pretty neat optimization, right? But here's the million-dollar question: does this shortcut actually work? If even one of the original vectors $u_i$ is garbage (meaning $\delta$-far from the code), will the random combination $u'$ still look bad with high probability?

The answer turns out to involve a beautiful mathematical phenomenon called proximity gaps.

The All-or-Nothing Nature of Structured Collections

In computer science, we love "gaps" – situations where things fall cleanly into one of two buckets with no messy middle ground. Think about classic examples like the PCP theorem (either your formula is satisfiable or it's really far from being satisfiable) or primality testing with Miller–Rabin.

Here, we're looking at the property "being a valid Reed–Solomon codeword," and we'll see this gap behavior emerge when we look at structured collections of inputs – specifically affine subspaces of $\mathbb{F}_q^n$.

Let's get concrete. Take a Reed–Solomon code $V = \mathrm{RS}_q(D,k) \subseteq \mathbb{F}_q^n$ built from polynomials of degree less than $k$ evaluated over some domain $D$ of size $n$. Now consider a collection $\mathcal{S}$ of subsets of $\mathbb{F}_q^n$ – typically affine subspaces like lines or planes.

We say $\mathcal{S}$ has a $(\delta,\varepsilon)$-proximity gap with respect to $V$ if every subset $S$ in $\mathcal{S}$ behaves in one of two extreme ways:

  • All-In: Every single vector in $S$ is close to the code:
$$\Pr_{u\in S}[d(u,V)\le \delta n] = 1$$
  • All-Out: Almost no vectors in $S$ are close:
$$\Pr_{u\in S}[d(u,V)\le \delta n] \le \varepsilon$$

The smaller $\varepsilon$ gets, the sharper this all-or-nothing behavior becomes.
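
The dichotomy can be observed directly on the toy $[7,3]$ code over $\mathbb{F}_{11}$. In the sketch below (brute force, our own helper names), the "far" direction vector is hand-picked so that it agrees with every codeword on at most four of the seven coordinates; we count how many points on each line fall within the unique-decoding radius of 2:

```python
# Toy illustration of the proximity gap on a line {u0 + z*u1 : z in F_11}
# for the [7, 3] RS code: either every point is close, or almost none are.
from itertools import product

q, n, k = 11, 7, 3

def encode(msg):
    return tuple(sum(c * x**i for i, c in enumerate(msg)) % q for x in range(n))

codewords = [encode(m) for m in product(range(q), repeat=k)]

def dist_to_code(u):
    return min(sum(a != b for a, b in zip(u, c)) for c in codewords)

def close_count(u0, u1, radius=2):   # radius 2 = unique-decoding radius t
    return sum(
        dist_to_code(tuple((a + z * b) % q for a, b in zip(u0, u1))) <= radius
        for z in range(q)
    )

u0 = encode((3, 4, 2))
print(close_count(u0, encode((1, 2, 3))))      # 11: u1 in the code, all-in
print(close_count(u0, (0, 0, 0, 0, 1, 1, 1)))  # 1: u1 far, almost all-out
```

The first line is all-in (all 11 points are codewords); the second is essentially all-out (only the $z=0$ point, where the far direction drops out, is close).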

What We Actually Know About Reed–Solomon Codes

Here's the main result that makes all this practical. For Reed–Solomon codes $V = \mathrm{RS}_q(D,k) \subseteq \mathbb{F}_q^n$ with rate $\rho = k/n$, we get proximity gaps for error fractions $\delta$ up to the famous Johnson radius $1 - \sqrt{\rho}$. The collection of all affine subspaces exhibits a $(\delta,\varepsilon)$-proximity gap, where $\varepsilon$ depends on which regime we're in:

  • Unique-decoding regime (when $0 < \delta < \frac{1-\rho}{2}$):
$$\varepsilon = \varepsilon_U(q,n) = \frac{n}{q}.$$
  • Johnson regime (when $\frac{1-\rho}{2} \le \delta < 1-\sqrt{\rho}$):
    If we define the safety margin $\eta := (1-\sqrt{\rho})-\delta > 0$, then:
$$\varepsilon \le O\!\left(\frac{n^{2}}{q\,(\eta\,\rho)^{O(1)}}\right).$$

Why should you care? This completely justifies that random linear combination trick I mentioned earlier. Pick random coefficients $r_i \in \mathbb{F}_q$ and form $u'=\sum_i r_i u_i$. If any of your input vectors $u_i$ is $\delta$-far from the code, then with probability at least $1-\varepsilon$, your combination $u'$ will also be $\delta$-far. One test catches everything!
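
Here is a small empirical check of that claim on the toy $[7,3]$ code (our own helpers; brute-force distance computation). Two of the batched vectors are honest codewords and one sits at distance 3 from the code; the random mix then lands within the unique-decoding radius exactly when the far vector's coefficient happens to be zero, which a random draw hits with probability only $1/q$:

```python
# Sanity-check the batching claim on the toy [7, 3] RS code: combine two
# codewords with one vector that is 3-far from the code, and observe that
# the random mix is close only when the far vector drops out of the mix.
import random
from itertools import product

q, n, k = 11, 7, 3
random.seed(0)

def encode(msg):
    return tuple(sum(c * x**i for i, c in enumerate(msg)) % q for x in range(n))

codewords = [encode(m) for m in product(range(q), repeat=k)]

def dist_to_code(u):
    return min(sum(a != b for a, b in zip(u, c)) for c in codewords)

u0, u1 = encode((3, 4, 2)), encode((1, 2, 3))   # honest codewords
u2 = (0, 0, 0, 0, 1, 1, 1)                      # distance 3 from the code

ok = True
for _ in range(100):
    r = [random.randrange(q) for _ in range(3)]
    mix = tuple((r[0]*a + r[1]*b + r[2]*c) % q
                for a, b, c in zip(u0, u1, u2))
    # Close (distance <= 2) exactly when u2's coefficient is 0.
    ok = ok and ((dist_to_code(mix) <= 2) == (r[2] == 0))
print(ok)  # True
```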

A Few Technical Notes

  • About that $\delta$ range: The maximum $\delta = 1-\sqrt{\rho}$ isn't arbitrary – it's exactly the Johnson/Guruswami–Sudan list-decoding radius. Some suspect the gap phenomenon might extend even further (maybe all the way to the capacity bound $1-\rho$), but that's still conjecture.

  • Field size matters: For the Johnson regime to work cleanly, you need $q$ to be roughly quadratic in $n$ (think $q \gtrsim n^2$). But for the unique-decoding regime, $q = \Theta(n)$ is plenty.

  • These bounds are tight: Especially in the unique-decoding case, that $\varepsilon = n/q$ bound is essentially as good as it gets. There really do exist affine lines that don't give you anything stronger.

Correlated Agreement: When Closeness Isn't Just Luck

Here's a fascinating insight: suppose you take two vectors and form lots of random linear combinations, and many of those combinations happen to look close to your Reed–Solomon code. This closeness can't just be due to lucky cancellations of errors – there must be some deep structure at play. Specifically, there must be a large set of coordinates where both original vectors already agree with some codewords from your code. This shared structure is what we call correlated agreement.

The Basic Setup

Let's work with our Reed–Solomon code $V = \mathrm{RS}_q(D,k) \subseteq \mathbb{F}_q^n$ over evaluation domain $D$. I'll think of codewords as functions on $D$, so $u \in \mathbb{F}_q^D$. We'll use $d(\cdot,\cdot)$ for Hamming distance and write $d(u,V)$ for the distance from $u$ to the nearest codeword.

Fix your proximity parameter $\delta$ and the error parameter $\varepsilon$ from the proximity gap theorem we just discussed.

How It Works for Lines

Take two vectors $u_0, u_1 \in \mathbb{F}_q^D$ and look at the affine line they generate:

$$A = \{\, u_0 + z\cdot u_1 : z \in \mathbb{F}_q \,\}.$$

Now suppose a significant fraction of points on this line are $\delta$-close to the code:

$$\Pr_{z\in\mathbb{F}_q}\big[\, d(u_0 + z\,u_1,\, V) \le \delta n \,\big] > \varepsilon.$$

Then something remarkable happens: there must exist a large subdomain $D' \subset D$ and actual codewords $v_0, v_1 \in V$ such that:

  • Most coordinates are good:
$$\frac{|D'|}{|D|} \ge 1 - \delta$$
  • Both vectors agree with the code on these coordinates:
$$u_0|_{D'} = v_0|_{D'} \quad \text{and} \quad u_1|_{D'} = v_1|_{D'}$$

In plain English: if many combinations $u_0 + z u_1$ look close to your code, it's because $u_0$ and $u_1$ each agree with some codeword on a common large fraction of coordinates.
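
The converse direction is the easy one, and it can be watched directly on the toy $[7,3]$ code: plant the same agreement region $D'$ into both vectors, and every point of the line inherits it. A brute-force sketch with our own helper names:

```python
# Correlated agreement in action: if u0 and u1 each match a codeword on the
# SAME 5-of-7 coordinates, every combination u0 + z*u1 is 2-close to the code.
from itertools import product

q, n, k = 11, 7, 3

def encode(msg):
    return tuple(sum(c * x**i for i, c in enumerate(msg)) % q for x in range(n))

codewords = [encode(m) for m in product(range(q), repeat=k)]

def dist_to_code(u):
    return min(sum(a != b for a, b in zip(u, c)) for c in codewords)

v0, v1 = encode((3, 4, 2)), encode((1, 2, 3))
# Corrupt both on the same two coordinates {5, 6}, so D' = {0, ..., 4}.
u0 = v0[:5] + ((v0[5] + 1) % q, (v0[6] + 4) % q)
u1 = v1[:5] + ((v1[5] + 2) % q, (v1[6] + 3) % q)

# Every point u0 + z*u1 agrees with the codeword v0 + z*v1 on D'.
dists = [
    dist_to_code(tuple((a + z * b) % q for a, b in zip(u0, u1)))
    for z in range(q)
]
print(all(d <= 2 for d in dists))  # True: the whole line is delta-close
```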

The General Picture

This extends naturally to higher-dimensional affine spaces. Take vectors $u_0, u_1, \ldots, u_\ell \in \mathbb{F}_q^D$ and consider:

$$U = u_0 + \mathrm{span}\{u_1,\ldots,u_\ell\} \subset \mathbb{F}_q^D.$$

If enough points in $U$ are close to the code (more than an $\varepsilon$ fraction), then there's a large coordinate set $D'$ where all the generators $u_0, \ldots, u_\ell$ simultaneously agree with corresponding codewords $v_0, \ldots, v_\ell$.

Why This Matters

Correlated agreement gives us a sufficient condition for those proximity gaps we want. Intuitively, if a batch of linear combinations often looks close to the code, that's only possible when all the underlying vectors share a large "agreement region" with the code.

The big open question? We don't know if correlated agreement is also necessary for proximity gaps. Maybe there are other ways to get gaps that work beyond the Johnson radius (for $\delta > 1-\sqrt{\rho}$) without going through this agreement structure.

The Gap Theorem Falls Out

Once you have correlated agreement, the proximity gap theorem becomes almost obvious. Every line $A$ falls into one of two categories:

  • All-in: Every point on $A$ is $\delta$-close to the code (which happens exactly when $A$ agrees with some code-line on a large coordinate set).
  • All-out: At most an $\varepsilon$-fraction of points on $A$ are $\delta$-close.

Mutual Correlated Agreement: The Next Level

Here's where things get really interesting for modern cryptographic protocols. Systems like WHIR don't just check a single random linear combination per round. They create several different mixtures of the same base vectors – think folding operations, shifts, different combination patterns. To maintain tight soundness when you're batching all these checks together, you need something stronger than ordinary correlated agreement.

You need the same large coordinate set to witness agreement simultaneously across your entire family of mixture operations. This is Mutual Correlated Agreement (MCA).

The General Idea

Let $\mathcal{M}$ be your family of affine-linear mixture operations that your verifier uses in a given round. For each operation $M \in \mathcal{M}$, suppose you have:

$$\Pr_{\theta}\big[\Delta\big(M(\theta),V\big)\le \delta\big] \ge \varepsilon_M$$

for some reasonable measure over the mixing parameters $\theta$.

The mutual correlated agreement conjecture says there should exist a single coordinate set $D' \subseteq D$ with $|D'| \ge (1-\delta)n$ and a consistent family of codewords such that simultaneously for all $M \in \mathcal{M}$ and all parameters $\theta$:

$$M(\theta)\big|_{D'} \in V\big|_{D'}$$

The key word is "mutual" – one common "good" coordinate set $D'$ that works for your entire mixture family, not separate sets for each operation.


Where We Stand

  • Proved for unique decoding: The WHIR paper establishes MCA when $\delta < (1-\rho)/2$, which enables their "no soundness degradation" result when batching multiple checks per round.

  • The big conjecture: Extending MCA to the Johnson regime ($\delta < 1-\sqrt{\rho}$) or even further ("up to capacity" at $\delta < 1-\rho$) is now one of the central open problems in the field.

  • Why it matters: A proof would give us the tightest possible soundness analysis for the fastest transparent Reed–Solomon-based protocols. A counterexample would force us to redesign parameters or protocols.

Conclusions

The Ethereum Foundation's million-dollar bounty represents more than just a mathematical challenge: it's a strategic investment in the cryptographic foundations of scalable blockchain infrastructure. The mutual correlated agreement conjecture for Reed–Solomon codes sits at the intersection of pure mathematics and practical cryptography, where theoretical breakthroughs directly translate into real-world impact.

Our exploration reveals why this particular conjecture matters so much. Reed–Solomon codes already provide the optimal trade-off between information rate and error correction (achieving the Singleton bound), making them the natural choice for cryptographic protocols. The proximity gap phenomenon allows efficient batch verification of multiple statements simultaneously, dramatically reducing the computational overhead of zero-knowledge proofs. But to push these efficiencies to their theoretical limits, especially in protocols like WHIR that perform complex mixture operations, we need the stronger guarantee provided by mutual correlated agreement.

The stakes are clear. A positive resolution would validate the security assumptions underlying Ethereum's scaling roadmap, confirming that current zk-rollup architectures can maintain their efficiency guarantees as they scale to global adoption. The mathematical proof would demonstrate that batched verification techniques can be pushed right up to the Johnson radius (or potentially even the capacity bound) without sacrificing soundness.

Conversely, a negative result would force a fundamental reassessment of how we design transparent zero-knowledge protocols. Proof sizes might need to grow, verification times could increase, and entire classes of optimizations might prove unsound. The cryptographic community would need to develop new approaches that work within whatever limitations a counterexample reveals.

Acknowledgments

This exploration builds upon the foundational work of countless researchers in coding theory, cryptography, and complexity theory. The theoretical foundations of proximity gaps for Reed–Solomon codes are comprehensively treated in "Proximity Gaps for Reed–Solomon Codes", which established many of the fundamental results discussed in our exploration of the Johnson radius and correlated agreement phenomena. The practical cryptographic applications and the mutual correlated agreement conjecture are detailed in "WHIR: Reed–Solomon Proximity Testing with Super-Fast Verification", whose authors identified this mathematical bottleneck as crucial for achieving optimal soundness in batched verification protocols. We are particularly grateful to Thomas Coratger from the Ethereum Foundation for his ongoing work in this area, including his comprehensive repository on proximity proofs which provides valuable pedagogical materials that helped inform our exposition.