Zero Knowledge in STARKs

Zero Knowledge

Jun 6, 2025


STARKs are one of the most exciting proof systems in the cryptography space today. They're transparent, scalable, post-quantum secure, and deeply mathematical. But one aspect of STARKs that can feel underexplained is how zero-knowledge is actually achieved. This blog post is here to fix that.

We dive into the elegant techniques behind zero-knowledge in STARKs. We'll learn why just adding randomness isn't enough and why the DEEP composition polynomial needs special attention.

Whether you're building a STARK prover or just trying to understand what happens under the hood, this post aims to give you the clearest and most accurate breakdown of how zero-knowledge is implemented in practice.

Notation

Before we get into the mechanics, let's define some recurring notation and parameters:

  • $\mathbb{F}$ — finite field of size $p$; $\mathbb{F}^*$ — the multiplicative group of $\mathbb{F}$.
  • $\mathbb{K}$ — extension field of degree $e$, $|\mathbb{K}| = p^e$.
  • $\mathbb{G}$ — cyclic subgroup of $\mathbb{F}^*$ with generator $g$, $|\mathbb{G}| = n$.
  • $\mathbb{H}$ — coset of a cyclic subgroup of $\mathbb{F}^*$, $|\mathbb{H}| = m$, $m > n$.
  • $Z_{\mathbb{G}}(X) = X^n - 1$ — vanishing polynomial of $\mathbb{G}$.

Overview

STARKs (Scalable Transparent ARguments of Knowledge) are cryptographic proof systems that allow a prover to convince a verifier that a certain computation was performed correctly, without revealing the computation's internal details or requiring the verifier to re-execute it. They are particularly attractive due to their transparency (no trusted setup) and post-quantum security.

At their core, STARKs reduce computational integrity to statements about polynomials over finite fields. The prover constructs a low-degree polynomial encoding the execution trace of a computation, commits to it using a Merkle tree, and then proves its validity through algebraic checks and interactive proofs. The verifier only needs to inspect a few values and Merkle paths to be convinced with high confidence.

A typical STARK proof involves the following steps:

  • Trace Generation — Evaluate the state of the computation at each step and record it in a trace table.

    $$\vec{r}_i = \left[r_i^{(0)}, r_i^{(1)}, \dots, r_i^{(k-1)}\right] \in \mathbb{F}^k$$
  • Polynomial Interpolation — Represent trace columns as low-degree polynomials.

    $$T_j(X) \in \mathbb{F}[X], \quad \deg(T_j) < n$$
  • The Quotient Polynomial — Express transition and boundary constraints as polynomial identities and divide by the appropriate vanishing polynomials to form the quotient polynomial $Q(X)$.

  • Low-Degree Extension (LDE) — Extend trace and quotient polynomials to a larger evaluation domain to enable probabilistic low-degree testing.

    $$T_j(x),\quad Q(x) \quad \text{evaluated at } x \in \mathbb{H}$$
  • Commitment — Hash LDE evaluations into a Merkle tree to create a succinct commitment.

    $$\text{Commit}\left(T_j|_{\mathbb{H}}\right),\quad \text{Commit}\left(Q|_{\mathbb{H}}\right)$$
  • Trace Consistency Check — Evaluate at points outside the original domain to strengthen soundness, checking that the quotient polynomial is consistent with the constraints applied to the trace polynomials. Evaluate

    $$T_j(z),\quad Q(z) \quad \text{for random } z \in \mathbb{K} \setminus (\mathbb{H} \cup \mathbb{G})$$

    and check that

    $$Q(z) = \sum_i \epsilon_i \cdot \frac{C_i\big(T_1(z), \dots, T_k(z), T_1(gz), \dots, T_k(gz)\big)}{Z_i(z)}$$

    where $Z_i(X)$ is the vanishing polynomial of the $i$th constraint's domain.

  • FRI Protocol — Prove that the DEEP composition polynomial has low degree.

    $$F(X) = \sum_i \alpha_i \cdot \frac{T_i(X) - T_i(z)}{X - z} + \sum_i \beta_i \cdot \frac{T_i(gX) - T_i(gz)}{X - gz} + \sum_j \gamma_j \cdot \frac{Q_j(X) - Q_j(z^S)}{X - z^S}$$

    where $Q(X) = \sum_{j=0}^{S-1} X^j \cdot Q_j(X^S)$ is the splitting of the quotient polynomial into $S$ components, described below.

  • Verification — The verifier checks a small number of evaluations and Merkle paths to validate the proof.
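As a small illustration of the commitment step, here is a minimal Merkle-root computation over a list of evaluations. The helper names and the 8-byte leaf encoding are our own simplifications; production provers salt leaves, use domain separation, and pair each opening with an authentication path.

```python
import hashlib

def hash_leaf(value: int) -> bytes:
    # Hash a single field element (illustrative fixed-width encoding).
    return hashlib.sha256(value.to_bytes(8, "big")).digest()

def merkle_root(evaluations: list[int]) -> bytes:
    # Build the tree bottom-up; each parent hashes its two children.
    layer = [hash_leaf(v) for v in evaluations]
    while len(layer) > 1:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # duplicate the last node on odd layers
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

# Committing to 16 LDE evaluations yields a single 32-byte root.
root = merkle_root(list(range(16)))
assert len(root) == 32
```

The verifier later checks individual evaluations against this root via Merkle paths, which is what keeps the proof succinct.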

Adding Zero Knowledge

Soundness ensures that false claims are rejected with high probability — but it doesn't prevent the verifier from learning something unintended about the prover’s witness. For full privacy, we want zero knowledge: the verifier should learn nothing about the internal computation beyond the fact that it was carried out correctly.

This is subtle in the context of STARKs. Even though no part of the witness is revealed directly, certain components — like evaluations in the Low-Degree Extension (LDE) domain, or out-of-domain (DEEP) queries — can unintentionally leak information.

To address this, modern STARK systems incorporate two distinct layers of masking:

  • Witness Masking — Before committing to the trace polynomials, the prover adds a random low-degree polynomial (drawn independently for each column) to each trace polynomial. These masks are chosen to vanish on the original evaluation domain $\mathbb{G}$, ensuring they don’t affect constraint satisfaction. But outside $\mathbb{G}$ — especially on the larger evaluation domain $\mathbb{H}$ used for the LDE — they act as effective noise, hiding the true values.

  • DEEP Composition Polynomial Masking — Even after masking the witness, the constraint system can leak information through the DEEP composition polynomial. Since this polynomial encodes evaluations over $\mathbb{H}$, and the verifier queries it at random points, it must also be masked. The prover adds a random low-degree polynomial to the final DEEP composition polynomial. Care must be taken to ensure that this additive mask maintains the degree bounds required for soundness and zero knowledge.

In summary: adding randomness alone isn’t enough — it must be structured randomness, carefully aligned with the algebraic structure of the proof system. Masking both the trace and the DEEP composition polynomial is essential for turning a sound STARK into a zero-knowledge STARK.

Masking the Witness Polynomials

The verifier interacts with the trace polynomials in two key ways: during the FRI phase, which tests their low-degree structure over the evaluation domain, and during the trace consistency check, which evaluates constraint expressions at random points outside the evaluation domain. Both of these can potentially leak information about the execution trace.

To achieve zero knowledge, we mask each trace column polynomial by adding structured randomness that hides its values outside the original trace domain, while preserving correctness within it. This is done by multiplying a random low-degree polynomial with the vanishing polynomial of the trace domain.

Let $T_i(X) \in \mathbb{F}[X]^{<n}$ denote the $i$th trace polynomial, interpolated over the domain $\mathbb{G} \subset \mathbb{F}^*$. The masked version is defined as:

$$\hat{T}_i(X) = T_i(X) + Z_{\mathbb{G}}(X) \cdot R_i(X), \quad \text{where } Z_{\mathbb{G}}(X) = \prod_{x \in \mathbb{G}} (X - x),$$

and $R_i(X) \leftarrow \mathbb{F}[X]^{<h}$ is a random masking polynomial of degree less than $h$.

This masking ensures the following:

  • Correctness over the trace domain: For any $x \in \mathbb{G}$, we have $Z_{\mathbb{G}}(x) = 0$, so the masked polynomial agrees with the original:

    $$\hat{T}_i(x) = T_i(x) \quad \forall\, x \in \mathbb{G}.$$

    Thus, all transition and boundary constraints remain satisfied over the trace domain.

  • Randomness outside the domain: For any $z \notin \mathbb{G}$, the vanishing polynomial satisfies $Z_{\mathbb{G}}(z) \ne 0$, so:

    $$\hat{T}_i(z) = T_i(z) + Z_{\mathbb{G}}(z) \cdot R_i(z),$$

    which is uniformly random in $\mathbb{F}$, provided $R_i$ is drawn uniformly. This ensures that the masked trace reveals nothing about the original trace at any queried point outside $\mathbb{G}$.

  • Query privacy: Since all FRI queries and trace consistency checks occur at points outside $\mathbb{G}$, each such evaluation of $\hat{T}_i$ appears statistically independent of the original $T_i$, protecting the witness from leakage.
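The masking recipe above can be sketched in a few lines of Python over a toy field. The field size, the helper names, and the small mask degree are illustrative only; a real prover works over a cryptographically large field and chooses $h$ according to the bound derived later in this post.

```python
import random

P = 97  # toy field modulus

def poly_eval(coeffs, x):
    # Horner evaluation of a polynomial (ascending coefficients), mod P.
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def mask_trace(trace_coeffs, domain, h):
    # For a multiplicative subgroup G of order n, Z_G(X) = X^n - 1, so
    # T_hat(X) = T(X) + (X^n - 1) * R(X) can be built coefficient-wise.
    n = len(domain)
    r = [random.randrange(P) for _ in range(h)]  # random mask R(X), deg < h
    masked = list(trace_coeffs) + [0] * (n + h - len(trace_coeffs))
    for i, c in enumerate(r):
        masked[i + n] = (masked[i + n] + c) % P  # + X^n * R(X)
        masked[i] = (masked[i] - c) % P          # -       R(X)
    return masked

# 64 = 5^12 generates an order-8 subgroup of the multiplicative group of F_97.
domain = [pow(64, i, P) for i in range(8)]
T = [1, 2, 3, 4, 5, 6, 7, 8]        # some trace polynomial, deg < 8
T_hat = mask_trace(T, domain, h=4)
# The mask vanishes on G: T_hat agrees with T at every domain point.
assert all(poly_eval(T_hat, x) == poly_eval(T, x) for x in domain)
```

Outside the domain, `T_hat` and `T` generally disagree, which is exactly the hiding property the protocol relies on.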


Masking the DEEP Composition Polynomial

Even after masking the witness polynomials, zero-knowledge is not guaranteed. The issue arises during the FRI protocol, where the DEEP composition polynomial $F(X)$ is tested for low degree. In this process, the verifier repeatedly folds $F(X)$ to reduce its degree — and with each fold, some randomness from the original masking polynomials is also folded and lost.

A Note on Splitting the Quotient Polynomial

Before applying the FRI protocol, the quotient polynomial $Q(X)$ must be brought into a form suitable for low-degree testing. Specifically, we require all polynomials passed to FRI to have degree bounded by a power of two, which allows efficient recursive folding.

We begin by multiplying $Q(X)$ by an appropriate factor to raise its degree to the next power of two:

$$\deg(Q) < D, \quad \text{where } D = 2^{\lceil \log_2(\deg(Q)) \rceil}.$$

We then split $Q(X)$ into $S = D / n$ parts, where $n = |\mathbb{G}|$ is the size of the trace domain. This ensures that each component has degree strictly less than $n$, so that it can be committed and checked individually by FRI.

The decomposition is written as:

$$Q(X) = \sum_{j=0}^{S-1} X^j \cdot Q_j(X^S), \quad \text{with each } Q_j(X) \in \mathbb{F}[X]^{<n}.$$

This structure allows the verifier to check all $Q_j$ in parallel, using a single low-degree protocol with $\deg(Q_j) < n$, and supports efficient Merkle tree commitments and folding.
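The split is a simple coefficient interleaving: $Q_j$ collects every $S$th coefficient of $Q$. A minimal sketch over a toy field (helper names and the sample polynomial are ours) checks the identity $Q(X) = \sum_j X^j \cdot Q_j(X^S)$ pointwise:

```python
P = 97  # toy field modulus

def poly_eval(coeffs, x):
    # Horner evaluation, ascending coefficients, mod P.
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def split_poly(q_coeffs, S):
    # Q_j collects the coefficients of Q at indices congruent to j mod S.
    return [q_coeffs[j::S] for j in range(S)]

def recombine_eval(parts, x, S):
    # Evaluate sum_j x^j * Q_j(x^S) mod P.
    return sum(pow(x, j, P) * poly_eval(qj, pow(x, S, P))
               for j, qj in enumerate(parts)) % P

Q = [3, 1, 4, 1, 5, 9, 2, 6]    # a toy Q(X) of degree < 8
parts = split_poly(Q, S=2)      # two components, each of degree < 4
# The identity Q(X) = sum_j X^j * Q_j(X^S) holds at every point of the field.
assert all(poly_eval(Q, x) == recombine_eval(parts, x, 2) for x in range(P))
```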

After splitting $Q(X)$ into $S$ components $Q_j(X)$, the prover constructs the DEEP composition polynomial $F(X)$, which checks that:

  • $T_i(X)$ agrees with its claimed value at $z$,
  • $T_i(gX)$ agrees with its claimed value at $gz$,
  • each quotient component $Q_j(X)$ agrees with its claimed value at $z^S$.

This gives the final expression:

$$F(X) = \sum_i \alpha_i \cdot \frac{T_i(X) - T_i(z)}{X - z} \,+\, \sum_i \beta_i \cdot \frac{T_i(gX) - T_i(gz)}{X - gz} \,+\, \sum_j \gamma_j \cdot \frac{Q_j(X) - Q_j(z^S)}{X - z^S}$$

where $\alpha_i, \beta_i, \gamma_j \in \mathbb{F}$ are verifier-chosen randomizers that bind the prover to all constraint evaluations simultaneously.

This polynomial $F(X)$ is now subject to a low-degree check via FRI, and its correctness implies that all DEEP queries were answered consistently — across both the trace and the quotient parts.

Why masking the DEEP composition polynomial is necessary

Each trace polynomial $T_i(X)$ is masked by a polynomial of the form $Z_{\mathbb{G}}(X) \cdot R_i(X)$, and the DEEP composition polynomial $F(X)$ is computed from these masked traces. This means the randomness from $R_i(X)$ is already embedded in $F(X)$.

However, FRI recursively folds $F(X)$ — and in each fold, the degree is halved. Crucially, this also halves the effective entropy of the randomness inside $F(X)$. After several rounds, very little randomness remains, and the verifier may begin to observe statistical biases correlated with the underlying witness.

To prevent this leakage, we add fresh randomness directly to the polynomial before the FRI protocol begins.

Adding entropy via masking

Let $|\mathbb{G}| = n$ be the size of the trace domain, and let $h$ be the masking degree used for the witness polynomials. We define the final masked DEEP composition polynomial as:

$$H(X) = F(X) + R(X),$$

where $R(X) \leftarrow \mathbb{F}[X]^{<n + h - 1}$ is a uniformly random masking polynomial of sufficiently high degree.

This fresh term $R(X)$ does not affect any constraint checks (since FRI only verifies low-degree structure), but it ensures that the folded evaluations of $H(X)$ remain statistically indistinguishable from random — even after multiple rounds of degree halving.

Choosing the masking degree $h$

The degree of $R(X)$ must be high enough to ensure that the prover’s responses remain statistically hidden even after many FRI rounds and trace consistency checks.

Let’s carefully account for the total number of degrees of freedom that could be leaked:

  • Trace consistency queries: Each check at a point $z \in \mathbb{K}$ (an extension field of degree $e$) reveals $e$ independent scalars over $\mathbb{F}$. The number of such queries is $n_D$.
  • Shifted constraints: For each query, the verifier evaluates both $\hat{T}_i(z)$ and $\hat{T}_i(gz)$, effectively doubling the leakage. This introduces a factor of 2.
  • Quotient splitting and implicit queries: After splitting $Q(X)$ into $S$ component polynomials $Q_j(X)$, each evaluation $Q_j(z^S)$ leaks information about the values of $Q(X)$ at the $S$ points $z \cdot U$, where $U$ is the group of $S$th roots of unity. Hence, the masking budget must scale with $S$.
  • FRI-root quotient dependency: During FRI, each round operates on a domain composed of cosets of a subgroup of size $S$, meaning that evaluating any one $Q_j(X)$ at a FRI point implicitly depends on the values of $Q(X)$ at $S$ positions. Since quotient values are computed from the trace evaluations, and FRI folds over these positions recursively, the trace masking must account for $2 \cdot S \cdot n_F$ evaluations: one for each FRI query point and its $g$-shift across the $S$-root coset.
  • Direct FRI spot-checks: FRI also performs $n_F$ spot-checks in its folded tables, and these are generally disjoint from the previous queries. Thus, they must be independently masked, contributing an additional $n_F$ degrees.

Therefore, to protect against all forms of leakage, the masking degree must satisfy:

$$h \;\ge\; 2 \cdot S \cdot (e \cdot n_D + n_F) \;+\; n_F.$$
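As a sanity check, the bound is straightforward to tabulate; the helper name below is hypothetical, and the parameters match the toy example later in this post:

```python
def masking_degree_bound(S: int, e: int, n_D: int, n_F: int) -> int:
    # h >= 2*S*(e*n_D + n_F) + n_F: DEEP queries and their g-shifts, scaled by
    # the S-fold quotient splitting, plus the direct FRI spot-checks.
    return 2 * S * (e * n_D + n_F) + n_F

# Toy-example parameters: splitting factor S = 2, extension degree e = 2,
# one DEEP query, three FRI queries.
assert masking_degree_bound(S=2, e=2, n_D=1, n_F=3) == 23
```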

Why this works

Lemma: Let $\hat{T}_1(X), \dots, \hat{T}_k(X)$ be masked trace polynomials as defined earlier, and suppose we split the quotient polynomial $Q(X)$ into $S$ components:

$$Q(X) = \sum_{j=0}^{S-1} X^j \cdot Q_j(X^S), \quad \text{with each } Q_j(X) \in \mathbb{F}[X]^{<n}.$$

Let $Q_{DEEP} \subset \mathbb{K} \setminus (\mathbb{H} \cup \mathbb{G})$ and $Q_{FRI} \subset \mathbb{H}$ be the sets of DEEP and FRI query points respectively, with

$$|Q_{FRI}| \le n_F, \quad |Q_{DEEP}| \le n_D.$$

Assume the masking degree $h$ satisfies the entropy bound:

$$h \ge 2 \cdot S \cdot (e \, n_D + n_F) + n_F.$$

Then the joint distribution of all queried values:

$$\left\{ \begin{aligned} &\left( \hat{T}_1(z),\; \hat{T}_1(gz),\; \dots,\; \hat{T}_k(z),\; \hat{T}_k(gz),\; Q_0(z^S),\; \dots,\; Q_{S-1}(z^S) \right) \quad \text{for } z \in Q_{DEEP}, \\ &\left( \hat{T}_1(z),\; \dots,\; \hat{T}_k(z),\; Q_0(z),\; \dots,\; Q_{S-1}(z) \right) \quad \text{for } z \in Q_{FRI} \end{aligned} \right\}$$

is statistically independent of the original (unmasked) witness polynomials $T_1(X), \dots, T_k(X) \in \mathbb{F}[X]$.

In other words, even after observing all DEEP queries, their $g$-shifts, and FRI spot-checks on the quotient components, the verifier learns nothing about the witness values — provided the masking degree is large enough.

Toy Example

To make the abstract steps of zero-knowledge STARKs concrete, we will walk through a small, fully specified example over a finite field. This example will serve as a running reference, illustrating each part of the protocol with concrete values.

We work over the prime field:

$$\mathbb{F} = \mathbb{F}_{97}.$$

The multiplicative group $\mathbb{F}_{97}^*$ has order 96 and is cyclic. Let $g = 5 \in \mathbb{F}_{97}$ be a generator. We define two evaluation domains:

  • Trace domain: $\mathbb{G} = \langle g^{12} \rangle$, a subgroup of order 8,
  • LDE domain: $\mathbb{H} = 21 \cdot \langle g^6 \rangle$, a coset of order 16.

This mirrors a typical STARK setup: we interpolate the trace over $\mathbb{G}$, then extend it to $\mathbb{H}$ for low-degree testing and constraint evaluation.
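These group-theoretic claims are easy to verify programmatically with a direct sketch over $\mathbb{F}_{97}$:

```python
P = 97
g = 5  # generator of the multiplicative group of F_97

# Trace domain G = <g^12>: a subgroup of order 8.
G = {pow(g, 12 * i, P) for i in range(8)}
assert len(G) == 8

# The subgroup <g^6> has order 16; H = 21 * <g^6> is a proper coset of it.
subgroup = {pow(g, 6 * i, P) for i in range(16)}
H = {(21 * x) % P for x in subgroup}
assert len(H) == 16

# G sits inside <g^6>, while the coset H is disjoint from both.
assert G <= subgroup
assert G.isdisjoint(H)
```

The disjointness matters: DEEP and FRI queries land outside the trace domain, so masked evaluations there never constrain the masks on $\mathbb{G}$.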

The Computation

We model a simple computation: an 8-step Fibonacci sequence, defined recursively as:

$$t_{i+2} = t_{i+1} + t_i \mod 97.$$

With initial values:

$$t_0 = 89, \quad t_1 = 90.$$

Evaluating the recurrence gives the following execution trace:

| Step $i$ | $t_i$ |
|----------|-------|
| 0        | 89    |
| 1        | 90    |
| 2        | 82    |
| 3        | 75    |
| 4        | 60    |
| 5        | 38    |
| 6        | 1     |
| 7        | 39    |
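Generating this trace takes only a few lines (the function name is ours):

```python
P = 97

def fibonacci_trace(t0: int, t1: int, steps: int) -> list[int]:
    # t_{i+2} = t_{i+1} + t_i (mod 97)
    trace = [t0, t1]
    while len(trace) < steps:
        trace.append((trace[-1] + trace[-2]) % P)
    return trace

assert fibonacci_trace(89, 90, 8) == [89, 90, 82, 75, 60, 38, 1, 39]
```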

These values represent the internal state of our computation over time. In a STARK proof, we will interpolate these values into a trace polynomial over the trace domain $\mathbb{G}$, then extend it to the larger domain $\mathbb{H}$ as part of the low-degree extension (LDE) phase.

Let $T(X)$ be the unique polynomial of degree less than 8 that satisfies:

$$T(g^{12 i}) = t_i \quad \text{for } i = 0, \dots, 7.$$

Interpolating the Fibonacci values:

$$[89,\ 90,\ 82,\ 75,\ 60,\ 38,\ 1,\ 39]$$

we obtain the trace polynomial:

$$T(X) = 55X^7 + 71X^6 + 18X^5 + 23X^4 + 34X^3 + 91X^2 + 53X + 35 \in \mathbb{F}_{97}[X].$$

This polynomial encodes the entire trace over $\mathbb{G}$, and will serve as the starting point for witness masking and constraint verification in the next steps.
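The interpolation can be reproduced with textbook Lagrange interpolation over $\mathbb{F}_{97}$. This is a naive $O(n^2)$ sketch; real provers use FFT-based interpolation over the subgroup:

```python
P = 97

def poly_mul(a, b):
    # Multiply two polynomials (ascending coefficients) mod P.
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def lagrange_interpolate(xs, ys):
    # Unique polynomial of degree < len(xs) through the points (xs[i], ys[i]).
    coeffs = [0] * len(xs)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [1], 1
        for j, xj in enumerate(xs):
            if i != j:
                basis = poly_mul(basis, [-xj % P, 1])  # factor (X - xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P             # modular inverse
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % P
    return coeffs

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

domain = [pow(5, 12 * i, P) for i in range(8)]  # G = <5^12>
trace = [89, 90, 82, 75, 60, 38, 1, 39]
T = lagrange_interpolate(domain, trace)
# Evaluating back on the domain recovers the trace exactly.
assert all(poly_eval(T, x) == t for x, t in zip(domain, trace))
```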

Masking the Trace Polynomial

To ensure zero knowledge, we mask the trace polynomial by adding a multiple of the vanishing polynomial over the trace domain $\mathbb{G}$. This preserves all values on $\mathbb{G}$, but randomizes evaluations elsewhere.

Let $Z_{\mathbb{G}}(X) = X^8 - 1$ be the vanishing polynomial over $\mathbb{G}$, and let $R(X) \in \mathbb{F}_{97}[X]^{<h}$ be a random masking polynomial. For this example, we choose:

$$\begin{aligned} R(X) =\; &71X^{23} + 51X^{22} + 79X^{21} + 31X^{20} + 22X^{19} + 45X^{18} + 3X^{17} + 30X^{16} \\ &+ 57X^{15} + 48X^{14} + 94X^{13} + 44X^{12} + 20X^{11} + 65X^{10} + 9X^9 + 86X^8 \\ &+ 19X^6 + 37X^5 + 32X^4 + 54X^3 + 26X^2 + 88X + 65. \end{aligned}$$

where the degree is chosen as follows: the number of FRI queries is $n_F = 3$, the number of DEEP queries is $n_D = 1$, the extension degree is $e = 2$, and the splitting factor for the quotient polynomial is $S = 2$. With these numbers, the lower bound for $h$ is $h \ge 2 \cdot 2 \cdot (2 \cdot 1 + 3) + 3 = 23$.

The masked trace polynomial is:

$$\hat{T}(X) = T(X) + (X^8 - 1) \cdot R(X).$$

Substituting in the expression for $T(X)$, we get:

$$\begin{aligned} \hat{T}(X) =\; &55X^7 + 71X^6 + 18X^5 + 23X^4 + 34X^3 + 91X^2 + 53X + 35 \\ &+ (X^8 - 1)\cdot(71X^{23} + 51X^{22} + 79X^{21} + 31X^{20} + 22X^{19} + 45X^{18} + 3X^{17} + 30X^{16} \\ &\qquad + 57X^{15} + 48X^{14} + 94X^{13} + 44X^{12} + 20X^{11} + 65X^{10} + 9X^9 + 86X^8 \\ &\qquad + 19X^6 + 37X^5 + 32X^4 + 54X^3 + 26X^2 + 88X + 65). \end{aligned}$$

This polynomial agrees with $T(X)$ at all points in $\mathbb{G}$, but appears pseudorandom at any $z \notin \mathbb{G}$, such as the points used in trace consistency checks or FRI.
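Putting the toy numbers together, both properties can be confirmed numerically. The coefficient lists below transcribe $T(X)$ and $R(X)$ from above in ascending order:

```python
P = 97

def poly_eval(coeffs, x):
    # Horner evaluation, ascending coefficients, mod P.
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

T = [35, 53, 91, 34, 23, 18, 71, 55]   # T(X), ascending
R = [65, 88, 26, 54, 32, 37, 19, 0,    # R(X), ascending (note: no X^7 term)
     86, 9, 65, 20, 44, 94, 48, 57,
     30, 3, 45, 22, 31, 79, 51, 71]

def T_hat(x):
    # T_hat(X) = T(X) + (X^8 - 1) * R(X), evaluated pointwise.
    return (poly_eval(T, x) + (pow(x, 8, P) - 1) * poly_eval(R, x)) % P

domain = [pow(5, 12 * i, P) for i in range(8)]  # G = <5^12>
# On the trace domain the mask vanishes ...
assert all(T_hat(x) == poly_eval(T, x) for x in domain)
# ... while off the domain the two polynomials generally differ.
assert any(T_hat(x) != poly_eval(T, x) for x in range(P) if x not in domain)
```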

Conclusion

Zero‑knowledge in STARKs rests on two complementary masks:

  • Witness masking, which hides the raw execution trace outside $\mathbb{G}$.
  • DEEP composition polynomial masking, which protects the polynomial tested by FRI, ensuring its entropy is never depleted by folding.

Together, these ensure that a STARK proof reveals nothing beyond the truth of the statement, while preserving transparency, scalability, and post‑quantum security.

Acknowledgments

This post is based on and inspired by the excellent 2024 technical note “A Note on Adding Zero-Knowledge to STARKs” by Ulrich Haböck and Al Kindi. Their paper offers a clear and rigorous treatment of witness masking, quotient decomposition, and entropy accounting in small-field STARK systems. Many of the ideas and formal guarantees discussed here — especially those involving FRI, DEEP queries, and the structure of masking polynomials — are direct adaptations or intuitive explanations of results presented in that work.