Gröbner Bases in Cryptanalysis Part2: Attacking Poseidon

Cryptography

Apr 2, 2026

groebner-basis-part2-image

This is Part 2 of a two-part series on Gröbner bases and algebraic cryptanalysis. Part 1 developed the algebraic machinery: polynomial rings, ideals, monomial orderings, Buchberger's algorithm, and solving polynomial systems via elimination. If you have not read Part 1, the key takeaway is this: given a system of polynomial equations, computing a Gröbner basis with respect to lexicographic order reduces it to a triangular system from which the solutions can be read off by back-substitution. Part 2 puts that machinery to work.

The target is Poseidon, a cryptographic hash function designed for use inside zero-knowledge proof systems. We show how the Poseidon permutation can be transcribed into a polynomial system over a prime field, and demonstrate a complete preimage attack on a reduced-round instance: given a known hash digest, recover the input that produced it. The attack is implemented in SageMath.

The article closes with a discussion of the algorithmic improvements, specifically F4, F5, and the FGLM basis conversion, that make these attacks practical beyond toy parameters.

Arithmetic Ciphers

Why Arithmetic Ciphers

Classical symmetric-key primitives such as AES are designed to be efficient on general-purpose hardware. Their round functions mix bitwise operations (XOR, shifts, table lookups) with modular arithmetic, and their security arguments are statistical: a good cipher should behave like a random permutation with respect to differential and linear distinguishers. The price of this design philosophy is that the cipher, when expressed as a polynomial system over a large field, produces equations of extremely high degree and high density. Algebraic attacks are therefore computationally hopeless before they begin.

A new class of cryptographic protocols changes this picture entirely. Zero-knowledge proof systems (ZK-SNARKs, ZK-STARKs), fully homomorphic encryption (FHE), and secure multi-party computation (MPC) all require the prover or evaluating party to compute over a native algebraic domain, typically the scalar field Fp\mathbb{F}_p of an elliptic curve, and to express every computation as an arithmetic circuit over that field. The cost of such a protocol is dominated by the number of multiplication gates in the circuit, since additions are essentially free. When a hash function or cipher is used inside such a protocol, as it frequently is in Merkle tree constructions and commitment schemes for ZK proofs, the internal operations of the cipher become part of the circuit. A cipher built from XOR and bit rotations, like AES, incurs an enormous overhead when arithmetised: each bitwise operation requires many field multiplications to simulate. A cipher built natively from field multiplications does not.

This observation motivated the design of arithmetization-oriented ciphers (AOCs): symmetric-key primitives defined entirely over Fp\mathbb{F}_p, using only field addition and low-degree power maps as operations, so that the number of multiplication gates is minimised. The statistical hardness arguments that protect AES rely on the complexity of the cipher's algebraic description. An AOC has a simple algebraic description by design, and it is therefore vulnerable to exactly the algebraic attacks that AES resists.

MiMC

MiMC (Minimising Multiplicative Complexity) was introduced by Albrecht et al. in 2016 as one of the first AOCs. It operates over a prime field Fp\mathbb{F}_p (or a binary extension field, though we consider only the prime field case here) and uses the power map xx3x \mapsto x^3 as its sole nonlinear operation, chosen because cubing requires only one multiplication over a field where gcd(3,p1)=1\gcd(3, p-1) = 1.

Construction. Fix a prime pp with gcd(3,p1)=1\gcd(3, p-1) = 1, a number of rounds rr, and a sequence of round constants c0,c1,,cr1Fpc_0, c_1, \ldots, c_{r-1} \in \mathbb{F}_p with c0=0c_0 = 0. The MiMC encryption of a plaintext x0Fpx_0 \in \mathbb{F}_p under key kFpk \in \mathbb{F}_p is defined by the iteration

xi+1=(xi+k+ci)3,i=0,1,,r1,x_{i+1} = (x_i + k + c_i)^3, \qquad i = 0, 1, \ldots, r-1,

with ciphertext y=xr+ky = x_r + k. The decryption direction uses the inverse map xx31mod(p1)x \mapsto x^{3^{-1} \bmod (p-1)}, which exists precisely because gcd(3,p1)=1\gcd(3, p-1) = 1.

The number of rounds is set to r=log3pr = \lceil \log_3 p \rceil to ensure that the composition of round functions has degree at least p1p-1 as a univariate polynomial over Fp\mathbb{F}_p, matching the degree of a random permutation polynomial on Fp\mathbb{F}_p.

MiMC used as a hash function follows the sponge or Miyaguchi-Preneel construction using the cipher as its core permutation. In the sponge setting, a message is absorbed into a state via repeated application of the permutation interleaved with XOR.

Algebraic structure. The entire encryption map is a composition of degree-3 polynomials, so as a univariate polynomial in x0x_0 it has degree 3r3^r. As a polynomial in the key kk however, MiMC's encryption map has degree only r+1r + 1. This asymmetry is what makes the algebraic attack tractable: rather than solving a univariate polynomial of degree 3r3^r in the key, we introduce intermediate variables and work with a multivariate system of low degree.

Poseidon

Poseidon was introduced by Grassi et al. and presented at USENIX Security 2021. It was designed explicitly for ZK proof systems and addresses some of the algebraic vulnerabilities of MiMC by using a wider state and a more structured round function inspired by the HADES design strategy.

Construction. Poseidon operates on a state of tt field elements over Fp\mathbb{F}_p, where t2t \geq 2 and the prime pp is typically the scalar field prime of a pairing-friendly elliptic curve such as BLS12-381. The round function consists of three layers applied in sequence:

  1. AddRoundConstants (ARC): Add a vector of public round constants c(i)Fptc^{(i)} \in \mathbb{F}_p^t to the state.
  2. SubWords (S-box layer): Apply the power map xxαx \mapsto x^\alpha to the state, where α\alpha is the smallest positive integer with gcd(α,p1)=1\gcd(\alpha, p-1) = 1. Common choices are α{3,5,7,11}\alpha \in \{3, 5, 7, 11\}.
  3. MixLayer (linear layer): Multiply the state by a fixed t×tt \times t MDS matrix MM over Fp\mathbb{F}_p.

Poseidon uses two types of rounds. In the RFR_F full rounds, the S-box is applied to all tt state elements. In the RPR_P partial rounds, the S-box is applied to only one state element and the remaining t1t - 1 elements pass through the linear layer unchanged. The total number of rounds is R=RF+RPR = R_F + R_P. A single Poseidon permutation π:FptFpt\pi: \mathbb{F}_p^t \to \mathbb{F}_p^t applies RF/2R_F/2 full rounds, then RPR_P partial rounds, then RF/2R_F/2 full rounds.

Formally, let s(0)Fpt\mathbf{s}^{(0)} \in \mathbb{F}_p^t denote the initial state. After round ii, the state is

s(i+1)=MS(i) ⁣(s(i)+c(i))\mathbf{s}^{(i+1)} = M \cdot S^{(i)}\!\left(\mathbf{s}^{(i)} + c^{(i)}\right)

where S(i)S^{(i)} applies xxαx \mapsto x^\alpha to all coordinates in full rounds and to only the first coordinate in partial rounds.

Poseidon is used as a hash function via the sponge construction. The state is split into a rate portion of rr elements and a capacity portion of c=trc = t - r elements, where the capacity determines the security level: a collision attack requires 2c/22^{c/2} work and a preimage attack requires 2c2^c work against the sponge. Typically cc is chosen so that c2λ/log2pc \geq 2\lambda/\log_2 p for a security parameter λ\lambda.

Design rationale against algebraic attacks. The parameters RFR_F and RPR_P are chosen such that the degree of regularity of the multivariate system is pushed above the computational threshold of a Gröbner basis attacker. Concretely, the designers require RF6R_F \geq 6 to ensure security against Gröbner basis attacks on the full-round permutation.

Turning a Cipher into a Polynomial System

We now describe the general procedure for transcribing an AOC into a polynomial system over Fp\mathbb{F}_p, which is the first step of the attack.

Variables. Introduce a variable for each unknown. In a preimage or key-recovery attack on MiMC, the unknown is the key kFpk \in \mathbb{F}_p. The intermediate state values x1,,xr1x_1, \ldots, x_{r-1} are also unknown, so we introduce a variable for each. The plaintext x0x_0 and ciphertext yy are known and treated as constants. The full variable set is

{k,x1,x2,,xr1}Fp[k,x1,,xr1].\{k, x_1, x_2, \ldots, x_{r-1}\} \subset \mathbb{F}_p[k, x_1, \ldots, x_{r-1}].

Round equations. Each round relation becomes a polynomial equation by rearranging to zero. For MiMC, the round relation xi+1=(xi+k+ci)3x_{i+1} = (x_i + k + c_i)^3 gives the polynomial

fi=xi+1(xi+k+ci)3Fp[k,x1,,xr1],i=0,,r1,f_i = x_{i+1} - (x_i + k + c_i)^3 \in \mathbb{F}_p[k, x_1, \ldots, x_{r-1}], \qquad i = 0, \ldots, r-1,

where x0x_0 is the known plaintext and xrx_r is recovered from the output equation y=xr+ky = x_r + k, giving the additional relation fr=yxr1+1kf_r = y - x_{r-1+1} - k (suitably indexed). Each fif_i has degree 3 in the ring variables.

Field equations. Over Fp\mathbb{F}_p, every element satisfies ap=aa^p = a, equivalently apa=0a^p - a = 0. Adding the field equations

ϕj=xjpxj,j{k,x1,,xr1}\phi_j = x_j^p - x_j, \qquad j \in \{k, x_1, \ldots, x_{r-1}\}

to the system forces all solutions of the ideal to lie in Fp\mathbb{F}_p rather than in an algebraic closure Fp\overline{\mathbb{F}}_p. This makes the ideal zero-dimensional, which is a necessary condition for the Gröbner basis computation to yield a finite solution set and for the Elimination Theorem to apply.

The ideal. The polynomial system for a key-recovery attack on rr-round MiMC with known plaintext-ciphertext pair (x0,y)(x_0, y) is

I=xi+1(xi+k+ci)3  |  i=0,,r1+xjpxj  |  xj{k,x1,,xr1}Fp[k,x1,,xr1].I = \left\langle\, x_{i+1} - (x_i + k + c_i)^3 \;\middle|\; i = 0, \ldots, r-1 \,\right\rangle + \left\langle\, x_j^p - x_j \;\middle|\; x_j \in \{k, x_1, \ldots, x_{r-1}\} \,\right\rangle \subset \mathbb{F}_p[k, x_1, \ldots, x_{r-1}].

The variety V(I)FprV(I) \subset \mathbb{F}_p^r is the set of all (k,x1,,xr1)(k, x_1, \ldots, x_{r-1}) consistent with the known plaintext-ciphertext pair. Computing a Gröbner basis for II with respect to lex order, with kk as the first variable, eliminates the intermediate states and yields a univariate polynomial in kk whose roots are the candidate keys.

The Poseidon system. For Poseidon, the procedure is the same in structure but involves a state vector rather than a scalar. Each round contributes tt polynomial equations, one per state component, expressing the next state in terms of the current one:

For a full round:

sj(i+1)=l=0t1Mjl(sl(i)+cl(i))α,j=0,,t1.s_j^{(i+1)} = \sum_{l=0}^{t-1} M_{jl} \cdot \left(s_l^{(i)} + c_l^{(i)}\right)^\alpha, \qquad j = 0, \ldots, t-1.

For a partial round, only l=0l = 0 gets the S-box:

sj(i+1)=Mj0(s0(i)+c0(i))α+l=1t1Mjl(sl(i)+cl(i)),j=0,,t1.s_j^{(i+1)} = M_{j0} \cdot \left(s_0^{(i)} + c_0^{(i)}\right)^\alpha + \sum_{l=1}^{t-1} M_{jl} \cdot \left(s_l^{(i)} + c_l^{(i)}\right), \qquad j = 0, \ldots, t-1.

Each full round contributes tt equations of degree α\alpha. Each partial round contributes tt equations of which the one involving the S-box input has degree α\alpha and the remaining t1t - 1 are linear.

The Attack

Poseidon is used as a hash function via the sponge construction. In the single-block setting we target here, a state of tt field elements over Fp\mathbb{F}_p is initialised to zero. A block of rate field elements, the preimage, is added into the rate portion of the state. The Poseidon permutation is applied once. The first rate elements of the resulting state are output as the digest.

Concretely, with t=2t = 2 and rate=1\text{rate} = 1:

  • Input (preimage): a single unknown field element x0Fpx_0 \in \mathbb{F}_p
  • Initial state: (x0,0)(x_0, 0)
  • Permutation: apply Poseidonπ^\pi to (x0,0)(x_0, 0)
  • Output (digest): the first element of the resulting state

The attack is a preimage attack: given the digest, find x0x_0. We model the permutation as a polynomial system in the unknown x0x_0 and a set of auxiliary variables representing intermediate state values, compute its Gröbner basis, and extract x0x_0 from the variety.

The implementation presented here uses the official Poseidon reference implementation from the IAIK TU Graz hadeshash repository, combined with the polynomial system construction from the AS Discrete Mathematics Gröbner basis benchmarks repository.

The Poseidon Permutation

The Poseidon class below implements the permutation for any choice of parameters. It defaults to the full production instance x3x^3-Poseidonπ^\pi with (n,t,RF,RP)=(64,24,8,42)(n, t, R_F, R_P) = (64, 24, 8, 42) over Fp\mathbb{F}_p where p=264281p = 2^{64} - 2^8 - 1, whose round constants and MDS matrix are hardcoded and generated by the Grain LFSR as described in the Poseidon paper. This full instance is secure, as of the time of writing, against the attack presented here. The attack targets reduced-round instances instantiated with the same class but with RFR_F and RPR_P set far below the recommended minimums.

class Poseidon:
    def __init__(self, prime=2^64-2^8-1, R_F=8, R_P=42, t=24):
        F = GF(prime)
        self.t, self.R_F, self.R_P, self.prime, self.F = t, R_F, R_P, prime, F

        # [Round constants and MDS matrix omitted for brevity]

        self.MDS_matrix_field = matrix(F, [[F(int(MDS_matrix[i][j], 16)) for j in range(t)] for i in range(t)])
        self.round_constants_field = [F(int(r, 16)) for r in round_constants[:(R_F + R_P) * t]]
        self.round_constants_field_new = self._calc_equivalent_constants()
        self.round_constants_field = [self.round_constants_field[index:index+t]
                                       for index in range(0, len(self.round_constants_field), t)]
        self.M_i, self.v_collection, self.w_hat_collection, self.M_0_0 = self._calc_equivalent_matrices()

The constructor converts all constants to field elements and calls two private methods to precompute the data structures needed for an optimised equivalent form of the permutation described in Appendix B of the Poseidon paper. The result is mathematically identical to the reference permutation perm_original but significantly faster for large tt. The details of the optimisation are not needed to follow the attack.

Permutations

The class provides two implementations of the permutation. The reference implementation perm_original follows the specification directly. The optimised permutation perm is mathematically equivalent.

    def perm_original(self, input_words):
        t, R_F, R_P = self.t, self.R_F, self.R_P
        MDS_matrix_field = self.MDS_matrix_field
        round_constants_field = self.round_constants_field
        state_words = list(input_words)
        for r in range(R_F + R_P):
            state_words = [state_words[i] + round_constants_field[r][i] for i in range(t)]
            if r < (R_F // 2) or r >= (R_F // 2 + R_P):  # full round
                state_words = [state_words[i]^3 for i in range(t)]
            else:                                           # partial round
                state_words[0] = state_words[0]^3
            state_words = list(MDS_matrix_field * vector(state_words))
        return state_words

Each round applies AddRoundConstants, then SubWords (either to all tt state elements in a full round, or to only the first element in a partial round), then MixLayer. The round structure is RF/2R_F/2 full rounds, then RPR_P partial rounds, then RF/2R_F/2 full rounds.

Setting Up the Polynomial System

The function poseidon_last_squeeze_poly_system constructs the polynomial system whose variety is the set of preimages consistent with the known digest. We describe its construction in detail.

Variable count. The polynomial ring is declared with tRF+RPcapt \cdot R_F + R_P - \text{cap} variables, where cap=trate\text{cap} = t - \text{rate} is the capacity. The variables fall into four groups:

  • rate\text{rate} variables for the unknown preimage elements,
  • rate\text{rate} variables for the post-S-box state in the first round (capacity post-S-box values are constants since the capacity input is zero),
  • one variable per S-box output in each subsequent intermediate round (full rounds contribute tt variables each, partial rounds contribute 11),
  • cap\text{cap} variables for the unknown capacity portion of the final output.

The total is 2rate+(RF2)t+RP+cap=tRF+RPcap2 \cdot \text{rate} + (R_F - 2) \cdot t + R_P + \text{cap} = t \cdot R_F + R_P - \text{cap}.

First round. The initial state has the unknown preimage x0,,xrate1x_0, \ldots, x_{\text{rate}-1} in the rate positions and zero in the capacity positions. After AddRoundConstants and SubWords (a full round), the rate elements become (xi+ci(0))3(x_i + c_i^{(0)})^3 and the capacity elements become (ci(0))3(c_i^{(0)})^3, the latter being constants. Introducing variables xrate,,x2rate1x_{\text{rate}}, \ldots, x_{2\cdot\text{rate}-1} for the post-S-box rate values gives the first batch of equations:

fi=xi+rate(xi+ci(0))3,i=0,,rate1.f_i = x_{i + \text{rate}} - (x_i + c_i^{(0)})^3, \qquad i = 0, \ldots, \text{rate}-1.
system = [xs[i + rate] - (xs[i] + round_constants_field[0][i])^3 for i in range(rate)]
curr_state = [xs[i + rate] for i in range(rate)] \
           + [round_constants_field[0][i]^3 for i in range(rate, t)]
curr_state = list(MDS_matrix_field * vector(curr_state))

After the MDS matrix is applied (a linear operation that introduces no new variables), curr_state is a vector of polynomials in the current variables describing the state at the start of round 1.

Intermediate rounds. For rounds r=1,,RF+RP2r = 1, \ldots, R_F + R_P - 2, the state is advanced through AddRoundConstants and SubWords. New variables are introduced for the post-S-box values to keep each polynomial at degree 3:

for r in range(1, num_rounds - 1):
    curr_state = [curr_state[i] + round_constants_field[r][i] for i in range(t)]
    if r < (R_F // 2) or r >= (R_F // 2) + R_P:  # full round
        curr_state = [curr_state[i]^3 for i in range(t)]
        system += [xs[xs_idx + i] - curr_state[i] for i in range(t)]
        curr_state = xs[xs_idx:xs_idx + t]
        xs_idx += t
    else:                                           # partial round
        curr_state[0] = curr_state[0]^3
        system += [xs[xs_idx] - curr_state[0]]
        curr_state[0] = xs[xs_idx]
        xs_idx += 1
    curr_state = list(MDS_matrix_field * vector(curr_state))

Last round. The last round is handled by inverting the MDS matrix on the known output. The known digest fills the rate portion of the final state; the unknown capacity portion introduces cap\text{cap} new variables. Applying M1M^{-1} to this vector gives the pre-MDS state, which must equal the post-S-box state propagated forward from the penultimate round:

curr_state = [curr_state[i] + round_constants_field[-1][i] for i in range(t)]
curr_state = [curr_state[i]^3 for i in range(t)]
final_state = hash_digest + [xs[xs_idx + i] for i in range(cap)]
final_state = list(MDS_matrix_field.inverse() * vector(final_state))
system += [final_state[i] - curr_state[i] for i in range(t)]

This avoids introducing another layer of variables for the last round's S-box outputs, reducing the variable count by tt compared to a naive forward-only construction.

The complete ideal is:

I=f1,,fmFp[x0,,xtRF+RPcap1]I = \left\langle f_1, \ldots, f_m \right\rangle \subset \mathbb{F}_p[x_0, \ldots, x_{t \cdot R_F + R_P - \text{cap} - 1}]

where each generator has degree at most 3. The variety V(I)V(I) over Fp\mathbb{F}_p is the set of all intermediate state traces consistent with the known digest.

The Trace Function

Before running the Gröbner basis computation, it is worth verifying that the polynomial system has been constructed correctly. Since the attack is demonstrated on a known test case where the preimage is available, we can check the system by substituting the correct answer into every equation and confirming that each evaluates to zero. The function poseidon_last_squeeze_trace produces this known-good solution by running the permutation forward from the preimage and recording the concrete value of every variable at the exact point the polynomial system introduces it. In a real attack the preimage is unknown and the trace cannot be computed. It is purely a verification tool.

def poseidon_last_squeeze_trace(poseidon, preimage):
    t, R_F, R_P = poseidon.t, poseidon.R_F, poseidon.R_P
    round_constants_field = poseidon.round_constants_field
    MDS_matrix_field = poseidon.MDS_matrix_field
    num_rounds = R_F + R_P
    rate = len(preimage)
    cap = t - rate
    state = matrix([preimage + [0]*cap]).transpose()
    trace = copy(preimage)
    # first round
    state += matrix([[round_constants_field[0][i] for i in range(t)]]).transpose()
    state = matrix([[el[0]^3 for el in state.rows()]]).transpose()
    trace += [el[0] for el in state.rows()[0:rate]]
    state = MDS_matrix_field * state
    # all other rounds but last
    for r in range(1, num_rounds - 1):
        state += matrix([round_constants_field[r][i] for i in range(t)]).transpose()
        if r < (R_F // 2) or r >= (R_F // 2) + R_P:  # full round
            state = matrix([el[0]^3 for el in state.rows()]).transpose()
            trace.extend(el[0] for el in state.rows())
        else:                                           # partial round
            state[0,0] = state[0,0]^3
            trace.extend([state[0,0]])
        state = MDS_matrix_field * state
    # last round: record only the unknown capacity elements of the output
    state += matrix([[round_constants_field[-1][i] for i in range(t)]]).transpose()
    state = matrix([el[0]^3 for el in state.rows()]).transpose()
    state = MDS_matrix_field * state
    trace.extend(el[0] for el in state.rows()[rate:t])
    return trace

Running the Attack

We instantiate Poseidon with a small prime and a reduced round count well below the secure minimum of RF6R_F \geq 6. For this demonstration the preimage is [0], padded with a zero capacity element to form the full initial state [0, 0]. We hash it to obtain the digest, which is then treated as the only known quantity. The goal is to recover the preimage [0] from the digest alone.

prime, R_F, R_P, t, rate = 9973, 2, 2, 2, 1
cap = t - rate
input_sequence = [0] + [0]*cap      # preimage [0] with zero capacity padding
poseidon = Poseidon(prime=prime, R_F=R_F, R_P=R_P, t=t)
hash_digest = poseidon(input_sequence)[:rate]
print(f"Digest to invert: {hash_digest}")   # the only thing the attacker knows

ring = PolynomialRing(GF(prime), 'x', t*R_F + R_P - cap)
system = poseidon_last_squeeze_poly_system(poseidon, ring.gens(), hash_digest)

gb = Ideal(system).groebner_basis()
variety = Ideal(gb).variety()

## The variety contains multiple candidates. Verify each against the digest.

for solution in variety:
    candidate = solution[ring.gen(0)]
    if poseidon([candidate] + [0]*cap)[:rate] == hash_digest:
        print(f"Preimage recovered: {candidate}")

The system is passed to SageMath's .groebner_basis() method, which calls the Singular backend and returns the reduced Gröbner basis with respect to the default grevlex order. The variety is extracted with .variety(), which performs the root-finding internally. Since the degree-3 equations can have multiple roots over Fp\mathbb{F}_p, the variety contains several candidates, and the correct preimage is identified by hashing each one and checking which matches the known digest:

Preimage recovered: 728
Preimage recovered: 0
Preimage recovered: 6038
Preimage recovered: 9287
Preimage recovered: 5974
Preimage recovered: 1478

Improvements

The attack in the previous section uses SageMath's default Gröbner basis computation, which calls the Singular backend and applies a variant of Buchberger's algorithm. For the small toy instances demonstrated there, this is entirely adequate. For anything approaching a realistic parameter set, however, three improvements are essential: a faster basis algorithm, a smarter choice of monomial ordering during computation, and an efficient conversion between orderings.

The F4 and F5 Algorithms

Buchberger's algorithm processes S-polynomials one at a time. Each S-polynomial reduction is a polynomial division, and a significant fraction of these reductions produce zero remainders, contributing nothing to the basis. The F4 algorithm, introduced by Faugère in 1999, reformulates the Gröbner basis computation as a sequence of sparse matrix reductions over Fp\mathbb{F}_p. Instead of reducing one S-polynomial at a time, F4 collects all S-polynomials up to a given degree into a single Macaulay matrix, whose rows are the S-polynomials and their multiples, and row-reduces the entire matrix at once using efficient linear algebra. This amortises the cost of the reductions and allows the algorithm to exploit the sparsity of the system. In practice, F4 is orders of magnitude faster than Buchberger for the polynomial systems arising from AOC attacks.

The F5 algorithm, introduced by Faugère in 2002, goes further by attaching a signature to each polynomial that tracks how it was derived. These signatures allow F5 to detect and discard zero reductions before performing them, eliminating the redundant work entirely rather than simply batching it. F5 is the state of the art for dense systems over finite fields, and for the polynomial systems arising from Poseidon it achieves significantly lower memory consumption than F4 by avoiding the construction of unnecessarily large Macaulay matrices.

In SageMath, F4 is accessible via:

gb = Ideal(system).groebner_basis(algorithm='libsingular:groebner')

For larger systems where SageMath's Singular interface becomes a bottleneck, Faugère's FGb library provides a standalone C implementation of F4 with a SageMath interface via the fgb_sage package:

import fgb_sage
gb = fgb_sage.groebner_basis(Ideal(system))

Computing in Grevlex and Converting to Lex

A fundamental tension underlies the attack. The lex ordering is what we ultimately need: by the Elimination Theorem, a lex Gröbner basis with the preimage variables first automatically contains a univariate polynomial in those variables, from which the preimage is recovered by root-finding. However, computing a Gröbner basis directly in lex order is substantially more expensive than computing one in grevlex order. This is because lex forces the algorithm to perform variable elimination incrementally as it builds the basis, generating intermediate polynomials with very large leading monomials. Grevlex, by contrast, produces the smallest leading monomials on average and keeps the Macaulay matrices sparse throughout the computation.

The standard strategy is therefore to separate the two phases: compute the Gröbner basis in grevlex cheaply, then convert to lex at a lower cost than computing in lex directly. For a zero-dimensional ideal, this conversion can be performed by the FGLM algorithm (Faugère, Gianni, Lazard, Mora, 1993). In SageMath, the conversion is available via the transformed_basis method:

gb_drl = Ideal(system).groebner_basis(algorithm='libsingular:groebner')  # grevlex
R_lex = PolynomialRing(GF(prime), ring.variable_names(), order='lex')
gb_lex = Ideal([R_lex(f) for f in gb_drl]).transformed_basis('fglm')

For larger systems where FGLM itself becomes expensive, the Sparse-FGLM algorithm offers further improvements by exploiting the sparsity of the multiplication matrices. Together, the grevlex computation followed by FGLM conversion is the practical pipeline used in all serious Gröbner basis attacks on AOCs. The full basis computation in lex is avoided entirely.

Conclusion

This article demonstrated a complete Gröbner basis preimage attack on a reduced-round instance of the Poseidon hash function. Starting from a known digest, we transcribed the permutation into a polynomial system over Fp\mathbb{F}_p, computed its Gröbner basis using SageMath, and recovered the preimage from the variety. The gap between this demonstration and a practical attack is bridged by the algorithmic pipeline outlined in the improvements section: grevlex computation with F4 or F5, followed by FGLM conversion to lex.

Poseidon's security against this attack class rests on the degree of regularity growing beyond computational reach as the round count increases. The requirement RF6R_F \geq 6 is where that threshold currently sits. How it moves as algorithms and hardware improve is an open question, and one that the designers of any arithmetization-oriented primitive need to keep revisiting.