Introduction
On June 17th, 2024, Linea requested an audit of the wizard crypto and math libraries, as part of their zkEVM project. The audit was conducted by two zkSecurity engineers over a period of three weeks. The audit focused on the correctness and security of parts of the libraries, as well as the safety of their interface.
We found the code to be well documented and organized.
Scope
The scope of the audit included two inner components of the zkEVM repository.
The `math/` component, which consisted of:
- Definition of the scalar field of the BLS12-377 curve.
- An implementation of FFT functions and related polynomial utility functions.
- An implementation of Smartvectors, an abstraction over field element vectors.
The `crypto/` component, which consisted of:
- Three cryptographic hash implementations: Keccak, MiMC, and RingSIS.
- A sparse merkle tree implementation, and an abstraction for a key-value store built as a wrapper around the sparse merkle tree.
- A code-based polynomial commitment scheme: Vortex.
Below, we provide an overview of some of the subcomponents, before diving into findings and recommendations.
The Accumulator and Sparse Merkle Tree
In order to enable verifiable accesses to the state of the world (in Linea), the zkEVM repository implements a key-value store on top of a sparse Merkle tree implementation. In this section we describe how this works.
A doubly-linked list for a key-value store. An append-only array is used to emulate a contiguous region of memory that is written in order from left to right, in a write-once fashion.
This region of memory is used to store a doubly-linked list. To do that, the first two entries are initialized with the head and the tail of the doubly-linked list. An element of the doubly-linked list is a tuple `(hash(key), hash(value))`, and the list is kept ordered by the hashed keys of its elements.
Update to the key-value store. The doubly-linked list serves as a nice way to provide verifiable updates to the key-value store. For example, insertions can prove that they are inserting a key in the right place in the doubly-linked list, by proving that:
- the previous element in the list has a smaller hashed key
- the next element in the list has a bigger hashed key
- these “sandwiching” elements point at each other, and can be updated to point at the newly inserted element instead
In addition, the key-value store can make its entries unique by enforcing strict comparisons between the hashed keys.
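As a concrete illustration, here is a minimal sketch of the “sandwich” check for an insertion. This is hypothetical code with made-up names, using `uint64` stand-ins for the hashes, not the actual Linea implementation.

```go
package main

import "fmt"

// node is a stand-in for one doubly-linked-list entry in the write-once array.
type node struct {
	hKey, hVal uint64 // stand-ins for hash(key) and hash(value)
	prev, next int    // indices into the write-once array
}

// checkInsertion verifies that inserting newKey between indices prev and next
// is valid: the sandwiching elements point at each other and strictly
// surround the new hashed key (strictness also enforces key uniqueness).
func checkInsertion(mem []node, prev, next int, newKey uint64) bool {
	p, n := mem[prev], mem[next]
	return p.next == next && n.prev == prev &&
		p.hKey < newKey && newKey < n.hKey
}

func main() {
	// head (index 0) and tail (index 1) sentinels, then one stored element.
	mem := []node{
		{hKey: 0, prev: -1, next: 2},          // head
		{hKey: ^uint64(0), prev: 2, next: -1}, // tail
		{hKey: 100, hVal: 7, prev: 0, next: 1},
	}
	fmt.Println(checkInsertion(mem, 0, 2, 50))  // valid: 0 < 50 < 100
	fmt.Println(checkInsertion(mem, 0, 2, 100)) // invalid: duplicate hashed key
}
```

The strict inequalities are what make the uniqueness argument work: a duplicate hashed key can never be sandwiched between its predecessor and itself.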
Merkelizing the memory region. In order to record the state of the world between proved state transitions, and verifiably access and update the doubly-linked-list, the memory containing the elements of the list is merkelized.
This way, its root can be passed in between state transitions, and its state can be deconstructed to verifiably access entries of the heap and update them. We provide an example of such a merkelized memory region (containing a simple doubly-linked list) below.
Implementation details. Note that the implementation accesses and updates the Merkle tree in multiple steps rather than in a single atomic update. We illustrate this in the following diagram.
The Vortex polynomial commitment scheme
In this section we briefly explain how the Vortex PCS works.
First, Vortex is based on error-correcting codes, so let’s briefly recall how Reed-Solomon encoding works. Imagine that you want to encode a message into a larger codeword. The idea is that an invalid message is easier to spot once it has been stretched into a larger array of values, which amplifies its anomalies.
The Reed-Solomon code does this by viewing the message as the coefficients, or the evaluations, of a polynomial, and then evaluating that polynomial at more points. For example:
In the previous example we chose to view the message as a polynomial in Lagrange form (that is, we have its evaluations over some domain). Since the larger evaluation domain contains the points of the original domain, we obtain a systematic code: the original message is embedded in the codeword.
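To make this concrete, here is a toy systematic Reed-Solomon encoder over the small prime field F_17. All parameters (the modulus 17, the generator 2, the domain sizes 4 and 8) are illustrative; the real code works over the BLS12-377 scalar field and uses FFTs rather than naive Lagrange evaluation.

```go
package main

import "fmt"

const p = 17 // toy prime field; 8 divides p-1, so F_17 has 8th roots of unity

func powm(a, e int) int {
	r := 1
	a %= p
	for ; e > 0; e >>= 1 {
		if e&1 == 1 {
			r = r * a % p
		}
		a = a * a % p
	}
	return r
}

func inv(a int) int { return powm(a, p-2) }

// evalLagrange evaluates, at point x, the unique polynomial taking the values
// vals[i] on the domain points dom[i] (naive O(n^2) Lagrange formula).
func evalLagrange(dom, vals []int, x int) int {
	res := 0
	for i := range dom {
		num, den := 1, 1
		for j := range dom {
			if j == i {
				continue
			}
			num = num * (((x-dom[j])%p + p) % p) % p
			den = den * (((dom[i]-dom[j])%p + p) % p) % p
		}
		res = (res + vals[i]*num%p*inv(den)) % p
	}
	return res
}

// encode reads the message as evaluations over the size-4 subdomain
// (powers of g^2) and outputs the same polynomial evaluated over the
// size-8 domain (powers of g): a blowup factor of 2.
func encode(msg []int) []int {
	g := 2 // generator of the order-8 subgroup of F_17^*
	small := make([]int, 4)
	for i := range small {
		small[i] = powm(g, 2*i)
	}
	code := make([]int, 8)
	for i := range code {
		code[i] = evalLagrange(small, msg, powm(g, i))
	}
	return code
}

func main() {
	msg := []int{3, 1, 4, 1}
	code := encode(msg)
	fmt.Println(code)
	// systematic: msg[i] reappears at code[2*i], the embedded subdomain points
	for i, m := range msg {
		fmt.Println(code[2*i] == m)
	}
}
```

Since every point of the small domain is also a point of the large domain, the message values reappear verbatim inside the codeword.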
In ZK, we deal with arithmetic circuits, so we are interested in domains that are multiplicative subgroups of a field, specifically subgroups whose order is a power of two. This plays nicely with FFTs: an FFT lets us quickly evaluate a polynomial over such a domain, while an inverse FFT lets us quickly interpolate a polynomial from its evaluations.
In our previous example, the original “message domain” had size n. So what happens if we change the domain? The original elements of the message are still embedded in the codeword, just at different positions: this is still a systematic code.
This is because of how we find embedded subgroups of order powers of two in our field:
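Concretely (a sketch using standard notation, not symbols from the report): if $2^k$ divides $q-1$ and $g$ generates the multiplicative group $\mathbb{F}_q^*$, the power-of-two subgroups are nested:

```latex
% each power-of-two subgroup has a canonical generator
\omega_{2^j} \;=\; g^{(q-1)/2^j}
\quad\text{generates the subgroup of order } 2^j,
\qquad\text{and}\qquad
\omega_{2n}^2 \;=\; \omega_n .
```

In particular, the powers of $\omega_n$ are exactly the even powers of $\omega_{2n}$, so a message encoded over the size-$n$ domain reappears at the even-indexed positions of the size-$2n$ codeword.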
Back to what Vortex provides: it allows a prover to convince a verifier of the evaluations of a number of polynomials at the same point.
So we start with a number of polynomials in Lagrange form (their evaluations over some domain), which you can stack as the rows of a matrix:
In Vortex, the prover encodes all the rows of the matrix, getting a larger matrix. The greater the “blowup factor” of the Vortex parameters, the more columns we get. And since we have a systematic code, it preserves the original messages on each row:
From there, the verifier samples a random challenge, which the prover uses to take a random linear combination of all the encoded rows:
In the paper, the non-encoded linear combination is sent instead, and the verifier encodes it themselves. This is equivalent to interpolating the encoded linear combination and checking that it is a polynomial of low enough degree (which is what the implementation does):
Writing $f_i$ for the committed polynomials, $r$ for the random challenge, and $u$ for the linear combination, we know that $u$ interpolates to the linear combination of all the polynomials, i.e. $u(X) = \sum_i r^i \cdot f_i(X)$. As such, if we want to check that for some point $z$ we have $f_i(z) = y_i$ for all $i$, we can check that $u(z) = \sum_i r^i \cdot y_i$.
The verifier can then query random columns, and check that the linear combination was correctly formed at the matching positions.
Implementation note: as the verifier needs to dynamically index into the linear combination vector, it places the vector’s values in an indexed lookup table and retrieves entries by performing table lookups.
The verifier obviously wants to verifiably query the encoded matrix in an efficient way (without having to know the whole matrix). To do that, the RS-encoded matrix is committed to (using a Merkle tree), and only the columns that are queried are sent to the verifier (accompanied by merkle proofs):
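The following toy sketch shows the prover-side linear combination and the verifier’s per-column spot check. The names are ours, the arithmetic is over F_17 instead of the real scalar field, and the Merkle proofs are omitted; it only illustrates the consistency equation, not the actual Vortex code.

```go
package main

import "fmt"

const q = 17 // toy field modulus, illustrative only

// combine computes the prover-side random linear combination of the rows of
// M, weighting row i by r^i (all arithmetic mod q).
func combine(M [][]int, r int) []int {
	u := make([]int, len(M[0]))
	ri := 1
	for _, row := range M {
		for j, v := range row {
			u[j] = (u[j] + ri*v) % q
		}
		ri = ri * r % q
	}
	return u
}

// checkColumn is the verifier-side spot check: for an opened column col
// (the j-th entry of every encoded row) and the claimed combination entry uj,
// verify that sum_i r^i * col[i] == uj.
func checkColumn(col []int, r, uj int) bool {
	acc, ri := 0, 1
	for _, v := range col {
		acc = (acc + ri*v) % q
		ri = ri * r % q
	}
	return acc == uj
}

func main() {
	M := [][]int{ // toy "encoded" matrix: 3 rows, 6 columns
		{3, 1, 4, 1, 5, 9},
		{2, 6, 5, 3, 5, 8},
		{9, 7, 9, 3, 2, 3},
	}
	r := 5 // verifier's random challenge
	u := combine(M, r)
	col := []int{M[0][2], M[1][2], M[2][2]}      // prover opens column j = 2
	fmt.Println(checkColumn(col, r, u[2]))       // true for an honest prover
	fmt.Println(checkColumn(col, r, (u[2]+1)%q)) // false: tampered combination
}
```

In the real scheme the opened columns come with Merkle proofs against the committed encoded matrix, so the prover cannot choose them after seeing the challenge.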
A note on RingSIS reduction modulo X^n − 1
The RingSIS implementation is mostly a wrapper over a core implementation. It does, however, contain a few additional helper functions, including one that reduces the result of the hash modulo the ring polynomial.
The function’s goal is to compute the product $a(X) \cdot b(X) \bmod (X^n - 1)$, which is the same as computing the remainder $r(X)$ in
$$a(X) \cdot b(X) = q(X) \cdot (X^n - 1) + r(X)$$
for some quotient polynomial $q(X)$.
Notice that $X^n - 1$ is the vanishing polynomial for the $n$-th roots of unity, so for any $n$-th root of unity $\omega$ we have:
$$a(\omega) \cdot b(\omega) = r(\omega)$$
So we can get evaluations of $r$ by evaluating the left-hand side on all $n$-th roots of unity, then interpolating (since $r$ is of degree at most $n-1$, we just need $n$ evaluations).
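The evaluate-then-interpolate trick can be checked on a toy example: naive DFTs over F_17 with n = 4, compared against the direct cyclic reduction. The real implementation uses FFTs over the BLS12-377 scalar field; everything below is illustrative.

```go
package main

import "fmt"

const p = 17     // toy modulus; 4 divides p-1, so F_17 has 4th roots of unity
const omega = 4  // a primitive 4th root of unity mod 17 (4^2 ≡ -1, 4^4 ≡ 1)

func powm(a, e int) int {
	r := 1
	a %= p
	for ; e > 0; e >>= 1 {
		if e&1 == 1 {
			r = r * a % p
		}
		a = a * a % p
	}
	return r
}

func inv(a int) int { return powm(a, p-2) }

// evalAt evaluates a polynomial (coefficients, lowest degree first) at x.
func evalAt(poly []int, x int) int {
	res := 0
	for i := len(poly) - 1; i >= 0; i-- {
		res = (res*x + poly[i]) % p
	}
	return res
}

// mulModEval computes a*b mod (X^n - 1) as described above: evaluate both
// factors on the n-th roots of unity (where X^n - 1 vanishes), multiply
// pointwise, then interpolate the remainder via a naive inverse DFT.
func mulModEval(a, b []int, n int) []int {
	evals := make([]int, n)
	for k := 0; k < n; k++ {
		w := powm(omega, k)
		evals[k] = evalAt(a, w) * evalAt(b, w) % p
	}
	r := make([]int, n)
	wInv, nInv := inv(omega), inv(n)
	for j := 0; j < n; j++ {
		for k := 0; k < n; k++ {
			r[j] = (r[j] + evals[k]*powm(wInv, j*k)) % p
		}
		r[j] = r[j] * nInv % p
	}
	return r
}

// mulModNaive is the direct reduction: since X^n ≡ 1, the product is a
// cyclic convolution of the coefficient vectors.
func mulModNaive(a, b []int, n int) []int {
	r := make([]int, n)
	for i := range a {
		for j := range b {
			r[(i+j)%n] = (r[(i+j)%n] + a[i]*b[j]) % p
		}
	}
	return r
}

func main() {
	a := []int{1, 2, 3, 4}
	b := []int{5, 6, 7, 8}
	fmt.Println(mulModEval(a, b, 4))  // [15 0 15 9]
	fmt.Println(mulModNaive(a, b, 4)) // [15 0 15 9]
}
```

Both paths agree because the remainder has degree below n, so its n evaluations on the roots of unity determine it uniquely.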
Using SmartVectors for Optimizing Mathematical Operations
We further provide some background on smartvectors, an important abstraction used for optimizing math operations in Linea’s implementation.
The Linea prover defines a specialized abstraction over vectors to optimize both memory consumption and performance when working with vectors of field elements. This abstraction is smartvectors, which are implemented by four different types designed to handle specific common vector shapes: `Regular`, `Constant`, `Rotated`, and `PaddedCircularWindow`.
Each smartvector type implements specific functions such as retrieving its length, accessing a specific element, producing a subvector, and rotating the vector.
Regular represents a normal vector (i.e., slice) of field elements. This is the default type, and every other smartvector implements a function to convert and create a regular smartvector.
```go
type Regular []field.Element
```
Constant is a smartvector obtained by repeating the same value for the specified length. This implementation provides maximum memory optimization and speeds up some operations.
```go
type Constant struct {
	val    field.Element
	length int
}
```
Rotated works by abstractly storing an offset and only applying the rotation when the vector is written, sub-vectored, or evaluated for specific operations. This makes rotations essentially free.
```go
type Rotated struct {
	v      Regular
	offset int
}
```
PaddedCircularWindow represents a vector padded up to a certain length and possibly rotated. It combines optimizations from all the previous smartvectors and implements extensive optimizations when performing arithmetic operations (cf. `processWindowedOnly` in `windowed.go`).
```go
type PaddedCircularWindow struct {
	window         []field.Element
	paddingVal     field.Element
	totLen, offset int
}
```
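To illustrate how these representations can answer reads without materializing a full slice, here is a hypothetical minimal sketch. It uses ints instead of field elements, and the method names and index arithmetic are ours, not the actual Linea code.

```go
package main

import "fmt"

// Regular is a plain slice: reads are direct indexing.
type Regular []int

func (r Regular) Get(i int) int { return r[i] }

// Constant stores one value and a length: O(1) memory for any length.
type Constant struct{ val, length int }

func (c Constant) Get(i int) int { return c.val }

// Rotated stores an offset and applies the rotation lazily on read,
// which makes rotations essentially free.
type Rotated struct {
	v      Regular
	offset int
}

func (r Rotated) Get(i int) int { return r.v[(i+r.offset)%len(r.v)] }

// PaddedCircularWindow stores only a window of explicit values; positions
// outside the (circularly shifted) window read the padding value.
type PaddedCircularWindow struct {
	window         []int
	paddingVal     int
	totLen, offset int
}

func (p PaddedCircularWindow) Get(i int) int {
	j := ((i-p.offset)%p.totLen + p.totLen) % p.totLen
	if j < len(p.window) {
		return p.window[j]
	}
	return p.paddingVal
}

func main() {
	fmt.Println(Constant{val: 7, length: 1 << 20}.Get(12345))      // 7
	fmt.Println(Rotated{v: Regular{0, 1, 2, 3}, offset: 1}.Get(0)) // 1
	pw := PaddedCircularWindow{window: []int{9, 9}, paddingVal: 0, totLen: 8, offset: 2}
	fmt.Println(pw.Get(2), pw.Get(3), pw.Get(4)) // 9 9 0
}
```

The point of the sketch is the asymptotics: a constant vector of a million entries costs two words of memory, and a rotation never copies the underlying slice.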
The following figure visualizes the different types of smartvectors.
Usage and Benefits
Smartvectors are used for memory and performance optimizations. They play a crucial role in performing FFTs on the Linea prover, representing polynomials, and performing optimized operations such as batch interpolation by reducing computations through pre-processing constant vectors in advance.
The main operations involving smartvectors are defined in `arithmetic_basic.go`, `arithmetic_gen.go`, `arithmetic_op.go`, `fft.go`, and `polynomial.go`. A critical function is `processOperator` in `arithmetic_gen.go`, which handles various operations and performs optimizations based on the types of the input smartvectors.
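As an illustration of this kind of type-based dispatch (hypothetical code, not the actual processOperator), adding two constant smartvectors can stay O(1) while mixed inputs fall back to element-wise work:

```go
package main

import "fmt"

// Toy smartvector types: ints stand in for field elements.
type Regular []int
type Constant struct{ val, length int }

// add returns the sum of two smartvectors, picking the cheapest result type
// based on the operand types.
func add(a, b interface{}) interface{} {
	if x, ok := a.(Constant); ok {
		if y, ok := b.(Constant); ok {
			// constant + constant stays constant: O(1) work and memory
			return Constant{val: x.val + y.val, length: x.length}
		}
	}
	// fallback: materialize both operands and add element-wise
	ra, rb := materialize(a), materialize(b)
	out := make(Regular, len(ra))
	for i := range ra {
		out[i] = ra[i] + rb[i]
	}
	return out
}

// materialize converts any toy smartvector into a plain slice.
func materialize(v interface{}) Regular {
	switch x := v.(type) {
	case Regular:
		return x
	case Constant:
		out := make(Regular, x.length)
		for i := range out {
			out[i] = x.val
		}
		return out
	}
	panic("unknown smartvector type")
}

func main() {
	fmt.Println(add(Constant{2, 4}, Constant{3, 4}))   // stays a Constant
	fmt.Println(add(Constant{2, 3}, Regular{1, 2, 3})) // falls back to Regular
}
```

The real dispatch handles many more type combinations (rotations, windows, mixed paddings), but the design principle is the same: never pay for a full-length slice unless the operand types force it.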
Precautions and Testing
While smartvectors are designed for optimization purposes, they should be used with care. The API is designed to prevent mutating smartvectors directly; instead, users should produce new instances. During our audit, we looked for potential issues arising from API misuse, identified corner cases that could cause unexpected behavior, and ensured that the API was clear and not prone to misuse. We also ran multiple test cases and tweaked the existing test suite to validate that smartvectors work as expected.
In summary, despite the complexity of some optimizations, the code is well-documented and thoroughly tested. During our audit, we identified the following issues: