Audit of Aleo Upgradability Update

June 14th, 2025 • Technical Report

Introduction

Between June 1st and June 14th, zkSecurity was tasked by Provable to audit the upcoming “Program Upgradability” update for Aleo. Two consultants worked over two weeks to review the codebase for potential security issues and provide design feedback and recommendations.

Prior to the engagement, the team spent two weeks getting acquainted with the snarkVM codebase and the Aleo ecosystem. In the last several days of the audit, we also verified the fixes for all findings reported in this assessment.

Aleo Program Upgradability

Prior to this upgrade, all Aleo programs were immutable: a program was deployed once and could not be modified or upgraded thereafter. With the upcoming “Program Upgradability” update, programs may now change after initial deployment. This has interesting implications throughout the system, parts of which may have relied on the immutability of programs up until this point. We explore these and note a number of security implications and considerations with the proposed design and implementation (prior to the release).

Glossary

For the reader’s convenience, we include a brief glossary of central terms used within snarkVM:

Program : Collection of functions, mappings, records, closures. Called a “contract” in other systems.
Program ID : Unique program identifier, composed of a name (e.g. example.aleo) and a network identifier.
Transition : Call to a single function in a program.
Execution : Sequence of transitions, for the root call and any internal calls.
Deployment : Deployment of a new program or (after this update) an upgrade of an existing program.
Transaction : Execution or a deployment.
Constructor (new) : Function run during deployment; restricts upgradability of the program.

Constructors

This upgrade introduces the ability to upgrade programs on-chain by redeploying them. Every time a program is upgraded, the edition of the program must increment by one. Prior to the update, the edition was not exposed to snarkVM and was internally fixed to zero. When and under which conditions a program can be upgraded is controlled by a method added to all newly deployed programs, called “the constructor”.

For instance, the following constructor disallows any upgrades by requiring the edition of the new program to be zero:

program example.aleo;

constructor:
  assert.eq example.aleo/edition 0u16;

Note that the constructor is also run during the initial deployment of the program, and that the constructor above can only be satisfied during the initial deployment. As a result, a program whose constructor can never be satisfied is undeployable, e.g.

program example.aleo;

constructor:
  assert.eq false true;

The following constructor requires that the program and any upgrades are deployed by a specific address:

program example.aleo;

constructor:
  assert.eq example.aleo/program_owner <ADDRESS>;

Constructors have access to the mappings of a program, and hence the rules for upgrading a program can be controlled dynamically by manipulating the mappings using the other functions in the program. However, the constructor itself is immutable and cannot be modified or upgraded. All legacy programs, which do not have a constructor, remain immutable.
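
The rules described in this section can be modelled in a few lines of Rust. This is a minimal sketch and not snarkVM code: the DeployContext type, the function names and the placeholder addresses are hypothetical, and the edition rule is assumed to be "zero on first deployment, then exactly one higher per upgrade", as described above.

```rust
/// Hypothetical view of what a constructor can inspect during a deployment.
struct DeployContext<'a> {
    edition: u16,
    program_owner: &'a str,
}

/// Edition rule from the text: 0 on first deployment, then +1 per upgrade.
fn edition_is_valid(current: Option<u16>, new: u16) -> bool {
    match current {
        None => new == 0,
        // checked_add returns None at u16::MAX, so no further upgrade is possible.
        Some(cur) => cur.checked_add(1) == Some(new),
    }
}

/// Models `assert.eq example.aleo/edition 0u16;`.
fn no_upgrades(ctx: &DeployContext<'_>) -> bool {
    ctx.edition == 0
}

/// Models `assert.eq example.aleo/program_owner <ADDRESS>;`.
fn owner_only(ctx: &DeployContext<'_>, admin: &str) -> bool {
    ctx.program_owner == admin
}

fn main() {
    // "aleo1admin" and "aleo1other" are placeholder strings, not real addresses.
    let first = DeployContext { edition: 0, program_owner: "aleo1admin" };
    let upgrade = DeployContext { edition: 1, program_owner: "aleo1other" };
    println!("edition 0 -> 1 valid: {}", edition_is_valid(Some(0), 1));
    println!("no_upgrades on first deploy: {}", no_upgrades(&first));
    println!("no_upgrades on upgrade: {}", no_upgrades(&upgrade));
    println!("owner_only on upgrade: {}", owner_only(&upgrade, "aleo1admin"));
}
```

In this model a program sitting at edition u16::MAX can never be upgraded again, whatever its constructor allows.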

Permissible Upgrades

Upgrades are only allowed to expand or leave unchanged the interface of a program, e.g. by only adding new functions or new mappings (which can be read externally). This is important to avoid breaking any dependent programs which call methods of the upgraded program: such programs would no longer “type” after the upgrade, referencing e.g. functions which no longer exist. Note, however, that functions can be “functionally” deleted by making them trivially unsatisfiable, and may otherwise change their behavior in arbitrary ways. Thus, there is no guarantee that a dependent program will remain satisfiable after the upgrade.
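
The "expand or leave unchanged" rule amounts to a subset check on the interface. The sketch below is a simplified model under that assumption and does not reproduce snarkVM's actual upgrade validation: the Interface type is hypothetical and tracks names only, ignoring input and output types.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified view of a program's external interface.
#[derive(Clone, Debug, Default)]
struct Interface {
    functions: BTreeSet<String>,
    mappings: BTreeSet<String>,
}

impl Interface {
    fn new(functions: &[&str], mappings: &[&str]) -> Self {
        Self {
            functions: functions.iter().map(|s| s.to_string()).collect(),
            mappings: mappings.iter().map(|s| s.to_string()).collect(),
        }
    }
}

/// In this model an upgrade is permissible if no function or mapping is removed.
fn is_permissible_upgrade(old: &Interface, new: &Interface) -> bool {
    old.functions.is_subset(&new.functions) && old.mappings.is_subset(&new.mappings)
}

fn main() {
    let v0 = Interface::new(&["transfer"], &["balances"]);
    let v1 = Interface::new(&["transfer", "mint"], &["balances"]);
    let broken = Interface::new(&["mint"], &["balances"]);
    println!("adding a function: {}", is_permissible_upgrade(&v0, &v1));
    println!("removing a function: {}", is_permissible_upgrade(&v0, &broken));
}
```

The check says nothing about behavior: a function that keeps its name but becomes unsatisfiable still passes.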

Program Owner

The program owner is the address which deployed (the latest edition of) a program. Depending on the constructor logic, this party may have special privileges: the constructor uses the program owner to identify the party deploying the upgrade and to check whether that party is eligible to deploy the program. Cryptographically, the program owner is bound to the deployment by signing the “deployment id”, which is meant to uniquely identify the program being deployed in the transition.

Findings

The findings from the engagement are listed below. High-severity findings can be seen as so-called "priority 0" issues that need fixing (potentially urgently). Medium-severity findings are most often serious findings that have less impact (or are harder to exploit) than high-severity findings. Low-severity findings are most often exploitable in contrived scenarios, if at all, but still warrant reflection. Findings marked as informational are general comments that did not fit any of the other criteria.

ID	Component	Name	Risk
#00	snarkVM	Unstable Program Load Order Can Stall Node Bootup	High
#01	snarkVM	Edition of a Deployment is Malleable	High
#02	snarkVM	`Operand::Edition` and `Operand::Checksum` Can Return Stale Values in the Function Scope	Medium
#03	snarkVM	No explicit binding between requests and programs	Medium
#04	synthesizer/src/vm/finalize.rs	Program Upgrade and Constructor Execution In Deployment Transaction Are Not Atomic	Medium
#05	synthesizer/process/src/finalize.rs	Non-Deterministic Constructor Execution	Low

#00 - Unstable Program Load Order Can Stall Node Bootup

Severity: High Location: snarkVM

During node bootup, programs are loaded in order of block height. However, within a single block, the load order of multiple programs is not stable. This instability can cause loading failures and stall node bootup.

/// Initializes the VM from storage.
#[inline]
pub fn from(store: ConsensusStore<N, C>) -> Result<Self> {
    [...]
    // Retrieve the list of deployment transaction IDs and their associated block heights.
    let deployment_ids = transaction_store.deployment_transaction_ids().collect::<Vec<_>>();
    let mut deployment_ids = cfg_into_iter!(deployment_ids)
        .map(|transaction_id| {
            // Retrieve the height.
            let height =
                match block_store.find_block_hash(&transaction_id)?.map(|hash| block_store.get_block_height(&hash))
                {
                    Some(Ok(Some(height))) => height,
                    _ => {
                        bail!("Block height for deployment transaction '{transaction_id}' is not found in storage.")
                    }
                };
            Ok((transaction_id, height))
        })
        .collect::<Result<Vec<_>>>()?;
    // Sort the deployment transaction IDs by their block heights.
    deployment_ids.sort_unstable_by(|(_, a), (_, b)| a.cmp(b));

    // Load the deployments in order of their block heights.
    const PARALLELIZATION_FACTOR: usize = 256;
    for (i, chunk) in deployment_ids.chunks(PARALLELIZATION_FACTOR).enumerate() {
        // Load the deployments.
        let deployments = cfg_iter!(chunk)
            .map(|(transaction_id, _)| {
                // Retrieve the deployment from the transaction ID.
                match transaction_store.get_deployment(transaction_id)? {
                    Some(deployment) => Ok(deployment),
                    None => bail!("Deployment transaction '{transaction_id}' is not found in storage."),
                }
            })
            .collect::<Result<Vec<_>>>()?;
        // Add the deployments to the process.
        // Note: This iterator must be serial, to ensure deployments are loaded in the order of their dependencies.
        deployments.iter().try_for_each(|deployment| process.load_deployment(deployment))?;
    }
    [...]
}

SnarkVM enforces restrictions on finalize_cost and number_of_calls for programs, which are checked during program initialization. If Program B imports and calls Program A, an upgrade to Program A may cause Program B’s functions to exceed these restrictions. This is not an issue during deployment execution, since Program B is not re-checked after Program A is upgraded. However, during node bootup, every program is re-checked, and because the program load order within a block is not stable, Program B may be loaded after Program A’s upgrade. This can trigger a restriction check failure and prevent the node from booting.

Example sequence:

Block 1: Deploy Program A.
Block 2: Deploy Program B (which imports A) and upgrade Program A, increasing its calls or finalize instructions.
During node bootup, when loading programs in Block 2, if Program B is loaded after Program A’s upgrade, the restriction check fails and node bootup is stalled.

Recommendation

It is recommended to load programs in the order of their transaction index within each block during node bootup to ensure a stable and deterministic load order.

Client Response

The client fixed this by sorting the deployment transactions according to block height and transaction index.

let mut deployment_ids = cfg_into_iter!(deployment_ids)
        .map(|transaction_id| {
            // Retrieve the block hash for the deployment transaction ID.
            let Some(hash) = block_store.find_block_hash(&transaction_id)? else {
                bail!("Deployment transaction '{transaction_id}' is not found in storage.")
            };
            // Retrieve the height.
            let Some(height) = block_store.get_block_height(&hash)? else {
                bail!("Block height for deployment transaction '{transaction_id}' is not found in storage.")
            };
            // Get the corresponding block's transactions.
            let Some(transactions) = block_store.get_block_transactions(&hash)? else {
                bail!("Transactions for deployment transaction '{hash}' is not found in storage.")
            };
            // Find the index of the deployment transaction ID in the block's transactions.
            let Some(index) = transactions.transactions().get_index_of(transaction_id.deref()) else {
                bail!("Transaction for deployment transaction '{transaction_id}' is not found in storage.")
            };
            Ok((transaction_id, (height, index)))
        })
        .collect::<Result<Vec<_>>>()?;
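
The excerpt above ends before the sort itself. A minimal sketch of ordering by the (height, index) pair follows, assuming simplified stand-in types (String transaction IDs, u32 heights, usize indices); the sort_deployments name is hypothetical and the client's actual sort call is not shown in this report.

```rust
/// (transaction id, (block height, index of the transaction within its block)).
type DeploymentKey = (String, (u32, usize));

/// Orders deployments by block height, then by position within the block.
fn sort_deployments(mut ids: Vec<DeploymentKey>) -> Vec<DeploymentKey> {
    // The (height, index) pair is unique per transaction, so the resulting
    // order is fully determined regardless of the order of the input.
    ids.sort_by_key(|(_, position)| *position);
    ids
}

fn main() {
    let ids = vec![
        ("upgrade_a".to_string(), (2, 1)),
        ("deploy_a".to_string(), (1, 0)),
        ("deploy_b".to_string(), (2, 0)),
    ];
    for (id, (height, index)) in sort_deployments(ids) {
        println!("{id} at height {height}, index {index}");
    }
}
```

Because no two entries share a key, the choice between a stable and an unstable sort no longer affects the result.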

#01 - Edition of a Deployment is Malleable

Severity: High Location: snarkVM

When a program is deployed, the deployment structure contains an edition field that tracks the version number:

Deployment {
    edition: 0,
    program: PROGRAM_A,
    verification_keys: VKS_A,
    program_checksum: CHECKSUM_A,
}

The program owner signs this deployment, and subsequent upgrades increment the edition number. However, since the edition field is not included in the signature, an attacker can:

Take an old deployment with its valid signature.
Modify the edition number to be higher than the current version.
Redeploy the old program version with the manipulated edition number.

Attack

Consider this sequence of events:

Initial deployment (edition 0) with PROGRAM_A - signed by the program owner
Upgrade deployment (edition 1) with PROGRAM_B - signed by the program owner
Attacker takes the old deployment, changes edition to 2, and redeploys PROGRAM_A

This works because the edition field is not included in the signature, and results in a potentially unauthorized (as defined by the constructor) rollback to an older version. Note that the rollback must satisfy the conditions in check_upgrade_is_valid, which means that the old version of the program PROGRAM_A must have the same interface as the new version PROGRAM_B; for instance, PROGRAM_B might be an updated version of PROGRAM_A which includes a bug fix, but otherwise has the same functionality. In the case where a program is only deployed once, an attacker can still cause a denial of service by redeploying the program with the maximum edition number u16::MAX. This makes the program unupgradable regardless of the conditions in the constructor.

Recommendation

Include the edition field in the program owner’s signature (or add it to the deployment id).

We recommend making the deployment id dependent on the contents of the whole deployment to avoid any possible malleability issues.
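
The difference between the two identifier designs can be illustrated with a toy model. This is only a sketch: the Deployment struct is a simplified stand-in, both function names are hypothetical, and Rust's DefaultHasher is used purely as a non-cryptographic placeholder for the real hash functions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for a deployment; real deployments carry far more data.
#[derive(Clone, Hash)]
struct Deployment {
    edition: u16,
    program: String,
    verification_keys: Vec<String>,
}

/// Malleable model: the signed identifier ignores the edition.
fn id_without_edition(d: &Deployment) -> u64 {
    let mut h = DefaultHasher::new();
    d.program.hash(&mut h);
    d.verification_keys.hash(&mut h);
    h.finish()
}

/// Recommended model: the signed identifier covers the whole deployment.
fn id_with_edition(d: &Deployment) -> u64 {
    let mut h = DefaultHasher::new();
    d.hash(&mut h);
    h.finish()
}

fn main() {
    let original = Deployment {
        edition: 0,
        program: "PROGRAM_A".to_string(),
        verification_keys: vec!["VKS_A".to_string()],
    };
    // The attacker only changes the edition of the old deployment.
    let replayed = Deployment { edition: 2, ..original.clone() };

    let old_sig_reusable = id_without_edition(&original) == id_without_edition(&replayed);
    let new_sig_reusable = id_with_edition(&original) == id_with_edition(&replayed);
    println!("signature reusable without edition binding: {old_sig_reusable}");
    println!("signature reusable with edition binding: {new_sig_reusable}");
}
```

A signature over the first identifier stays valid for any edition, which is exactly what the replay needs.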

Client Response

The fix implemented by Provable changes the computation of the deployment_tree (of which the deployment_id is the root) into:

pub fn deployment_tree_v2(deployment: &Deployment<N>) -> Result<DeploymentTree<N>> {
    // Ensure the number of leaves is within the Merkle tree size.
    Self::check_deployment_size(deployment)?;

    // Compute a hash of the deployment bytes.
    let deployment_hash = N::hash_sha3_256(&to_bits_le!(deployment.to_bytes_le()?))?;

    // Prepare the header for the hash.
    let header = to_bits_le![deployment.version()? as u8, deployment_hash];

    // Prepare the leaves.
    let leaves = deployment.program().functions().values().enumerate().map(|(index, function)| {
        // Construct the transaction leaf.
        Ok(TransactionLeaf::new_deployment(
            u16::try_from(index)?,
            N::hash_bhp1024(&to_bits_le![header, function.to_bytes_le()?])?,
        )
        .to_bits_le())

    });

    // Compute the deployment tree.
    N::merkle_tree_bhp::<TRANSACTION_DEPTH>(&leaves.collect::<Result<Vec<_>>>()?)
}

This means every leaf in the tree (one per function) is bound to deployment_hash, the hash of the entire deployment, which includes the edition (and the verification keys as well). As a result, the owner signature is computed over the whole deployment, as recommended.

#02 - `Operand::Edition` and `Operand::Checksum` Can Return Stale Values in the Function Scope

Severity: Medium Location: snarkVM

The new operands Operand::Edition and Operand::Checksum are designed to retrieve the edition and checksum of a given program. Currently, they are valid in both the function scope (off-chain execution) and the finalize scope (on-chain execution). Since the edition and checksum of a program can change after an upgrade, these operands are expected to always provide the latest values. However, in the function scope, they are assigned as constants in the circuit:

match operand {
    // If the operand is the checksum, retrieve the checksum from the stack.
    Operand::Checksum(program_id) => {
        let checksum = match program_id {
            Some(program_id) => *self.get_external_stack(program_id)?.program_checksum(),
            None => *self.program_checksum(),
        };
        Ok(circuit::Value::Plaintext(circuit::Plaintext::from(checksum.map(circuit::U8::constant))))
    }
    // If the operand is the edition, retrieve the edition from the stack.
    Operand::Edition(program_id) => {
        let edition = match program_id {
            Some(program_id) => *self.get_external_stack(program_id)?.program_edition(),
            None => *self.program_edition(),
        };
        Ok(circuit::Value::Plaintext(circuit::Plaintext::from(circuit::Literal::U16(
            circuit::U16::new(circuit::Mode::Constant, edition),
        ))))
    }
}

The verifying key of the circuit is fixed at program deployment. As a result, in the function scope, the edition and checksum values are also fixed. Even if another program is upgraded, these values remain unchanged, so the operands may return stale information. For example:

Deploy program foo.aleo, which retrieves the edition of bar.aleo using the Operand::Edition operand. The current edition of bar.aleo is 0.
Upgrade bar.aleo, increasing its edition to 1.
Call foo.aleo to get the edition of bar.aleo. It still returns 0, which is now outdated.
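
The sequence above can be reproduced in a toy model. This is a sketch only: Registry and SynthesizedFunction are hypothetical types that stand in for on-chain state and for a circuit whose constants are frozen at deployment time.

```rust
use std::collections::HashMap;

/// Toy registry mapping a program name to its current edition.
struct Registry {
    editions: HashMap<String, u16>,
}

/// Toy model of a synthesized function: the edition it read is frozen into
/// the circuit (and hence the verifying key) when the function is synthesized.
struct SynthesizedFunction {
    baked_edition: u16,
}

impl Registry {
    /// Function-scope model: the value is captured once, as a constant.
    fn synthesize_reader(&self, target: &str) -> SynthesizedFunction {
        SynthesizedFunction { baked_edition: self.editions[target] }
    }

    /// Finalize-scope model: the value is read from current state on each call.
    fn finalize_edition(&self, target: &str) -> u16 {
        self.editions[target]
    }

    fn upgrade(&mut self, target: &str) {
        *self.editions.get_mut(target).expect("unknown program") += 1;
    }
}

fn main() {
    let mut registry = Registry { editions: HashMap::from([("bar.aleo".to_string(), 0)]) };
    let foo = registry.synthesize_reader("bar.aleo");
    registry.upgrade("bar.aleo");
    println!("function scope sees edition {}", foo.baked_edition);
    println!("finalize scope sees edition {}", registry.finalize_edition("bar.aleo"));
}
```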

Recommendation

It is recommended to disallow the use of Operand::Edition and Operand::Checksum in the function scope to prevent returning stale values.

Client Response

The client opted to remove both Operand::Edition and Operand::Checksum from the set of allowed operands for in-circuit Aleo instructions. They remain accessible from “finalize”. A user who needs these values within the circuit can witness them in-circuit and return them to finalize, which then checks that the values exported from the function agree with Operand::Edition and Operand::Checksum as obtained in finalize.

#03 - No explicit binding between requests and programs

Severity: Medium Location: snarkVM

Aleo allows the delegation of SNARK computation to a third party by creating “requests”, which are subsequently proved by that third party. These are signatures on the inputs/outputs of every function called (“transition”) during the execution of the transaction. Prior to the upgradability update, all programs were immutable, meaning the transitions of a request were guaranteed to lead to the execution of a specific and static set of instructions. With the upgradability update, programs can now be upgraded. Since a request only signs the inputs/outputs of the transitions it calls and not the program itself, there is no binding between a request and the current version of the program being invoked, or any of its dependencies.

This means that the semantics of a request can change between the point of creation (when the user signs the request) and the point of execution (when the request is proved). This is most obvious when the callgraph remains the same but the functions in the callgraph are upgraded. However, because there is no explicit binding between a parent transition (with is_root = True) and its child transitions (from functions invoked by the parent transition), a malicious prover could theoretically “stitch together” a request which proves the execution of a newer version of the root program. For example, suppose the root program contains a function of the form:

import foo.aleo;

program bar.aleo;

function root:
  ...
  call foo.aleo/sub r0 into r1;
  ...

Here there is one call from bar.aleo/root to foo.aleo/sub. The user creates two transactions calling bar.aleo/root, which together include two requests invoking foo.aleo/sub. The bar.aleo program is now upgraded to a newer version:

import foo.aleo;

program bar.aleo;

function root:
  ...
  call foo.aleo/sub r0 into r1;
  call foo.aleo/sub r2 into r3;
  ...

Here there are two calls from bar.aleo/root to foo.aleo/sub. Note that this upgrade is allowed, as the interfaces of both foo.aleo and bar.aleo remain unchanged. However, the malicious prover can still “stitch together” a request which proves the execution of the newer version of the root program, assuming the two original calls to foo.aleo/sub have the same inputs as the calls in the new version of bar.aleo/root.

Recommendation

A request should be bound to the (versions of) all programs in the callgraph: by including a hash of all the checksums into the signature, ensuring that if any of the programs involved in the transaction changes, the request becomes invalid. Additionally, the monotonically increasing “edition” of all the programs should be included in the hash as well, such that an invalid request cannot become valid again at a later time due to a program rolling back to an older version. Observe that since the request is verified in-circuit, this requires feeding the hash as public input to the circuit. For security, this hash is only required to be included for the root transition.

We believe that this is the most straightforward semantics for the user to reason about.
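
As an illustration of the recommended binding, the sketch below derives one digest from the checksum and edition of every program in the callgraph. It is a toy model: ProgramVersion and callgraph_digest are hypothetical names, the checksums are dummy byte arrays, and DefaultHasher stands in for a real cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical per-program version data to bind into a request.
#[derive(Clone, Hash, PartialEq, Eq, PartialOrd, Ord)]
struct ProgramVersion {
    program_id: String,
    checksum: [u8; 32],
    edition: u16,
}

/// Digest over every program in the callgraph, independent of input order.
fn callgraph_digest(mut programs: Vec<ProgramVersion>) -> u64 {
    // Sort first so that the same set of programs always yields the same digest.
    programs.sort();
    let mut h = DefaultHasher::new();
    programs.hash(&mut h);
    h.finish()
}

fn main() {
    let foo = ProgramVersion { program_id: "foo.aleo".into(), checksum: [1; 32], edition: 0 };
    let bar = ProgramVersion { program_id: "bar.aleo".into(), checksum: [2; 32], edition: 0 };
    // Digest the user would sign together with the request.
    let signed = callgraph_digest(vec![bar.clone(), foo.clone()]);

    // bar.aleo is upgraded: new checksum and a higher edition.
    let bar_v1 = ProgramVersion { checksum: [3; 32], edition: 1, ..bar };
    let current = callgraph_digest(vec![bar_v1, foo]);
    println!("request still matches the deployed programs: {}", signed == current);
}
```

Including the edition means that a later deployment with the old checksum but a higher edition also changes the digest.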

Client Response

The chosen mitigation is different from the suggested mitigation above, but successfully mitigates the issue as well. The signature of every request is computed over a message which includes:

The checksum of the program to which the called function belongs.
The root_tvk (the root transition view key).

The checksum is exposed from the SNARK as public input directly, whereas the root_tvk is exposed from the SNARK as public input indirectly via the scm, which is a commitment to the root_tvk:

let root_tvk = root_tvk.unwrap_or(tvk);
let scm = N::hash_psd2(&[signer.deref().to_x_coordinate(), root_tvk])?;

This means that the root_tvk acts as a per-authorization nonce (an authorization consists of multiple transitions/requests). It binds each request to a unique authorization, which prevents “cut-and-paste” attacks, where a malicious prover constructs a new valid authorization from requests in multiple different authorizations. Observe that this alone does not prevent “cut” attacks, where a malicious prover might try to simply remove requests from an authorization.

The overall security argument is then fairly straightforward:

The checksum of a request uniquely identifies the program and its version.
Since the Aleo VM is deterministic, the set of child calls is uniquely determined by:
- The checksum of the parent
- The arguments to the parent

Applying this observation inductively over the callgraph from the root, we conclude that the execution is uniquely identified by the set of checksums. Finally, the set of checksums cannot be mauled across different authorizations because the checksum of each request is bound to the authorization via a unique nonce, the root_tvk, which separates the domain of signatures across different authorizations. An honestly produced authorization has exactly one signed request per call in the callgraph, as determined by the arguments to the root function and the set of checksums. Since the set of calls is uniquely determined, removing any request from such an authorization would result in a call with a missing request.

#04 - Program Upgrade and Constructor Execution In Deployment Transaction Are Not Atomic

Severity: Medium Location: synthesizer/src/vm/finalize.rs

During a program upgrade, snarkVM first executes the program constructor and then replaces the old program code (Stack) with the new code. Ideally, the replacement of Stack should happen immediately after constructor execution to ensure atomicity. However, currently, these steps are performed separately: the program replacement occurs at the end of the entire block execution. As a result, other transactions can be executed in between. This lack of atomicity allows these transactions to read the updated map (modified by the constructor) but still see the old edition/checksum of the program.

For example:

Program foo.aleo imports bar.aleo and reads its edition and map.
The upgrade transaction for bar.aleo is executed; the constructor runs and updates the map.
A transaction for foo.aleo is executed, which reads the updated map but the old edition of bar.aleo (since the program Stack has not been replaced yet).
The old program bar.aleo is then replaced with the new version.

Recommendation

It is recommended to replace the program Stack immediately after executing the constructor to ensure atomicity.
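
The inconsistent view described in this finding can be reproduced with a toy state machine. This is a sketch under simplifying assumptions: the Vm type and its methods are hypothetical, each program has a single mapping value, and the "Stack replacement" is reduced to updating the visible edition.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct Vm {
    /// Edition currently visible to other transactions, per program.
    editions: HashMap<String, u16>,
    /// A single on-chain mapping value per program (toy state).
    mapping: HashMap<String, u64>,
    /// Upgrades whose constructor ran but whose Stack is not yet replaced.
    pending: Vec<(String, u16)>,
}

impl Vm {
    /// Non-atomic model: constructor effects land now, the Stack swap is deferred.
    fn upgrade_deferred(&mut self, program: &str, new_edition: u16, constructor_value: u64) {
        self.mapping.insert(program.to_string(), constructor_value);
        self.pending.push((program.to_string(), new_edition));
    }

    /// Atomic model: constructor effects and the Stack swap happen together.
    fn upgrade_atomic(&mut self, program: &str, new_edition: u16, constructor_value: u64) {
        self.mapping.insert(program.to_string(), constructor_value);
        self.editions.insert(program.to_string(), new_edition);
    }

    /// End of block: apply the deferred Stack replacements.
    fn end_block(&mut self) {
        for (program, edition) in self.pending.drain(..) {
            self.editions.insert(program, edition);
        }
    }

    /// What a dependent transaction observes: (edition, mapping value).
    fn observe(&self, program: &str) -> (u16, u64) {
        (self.editions[program], self.mapping[program])
    }
}

fn main() {
    let mut vm = Vm::default();
    vm.editions.insert("bar.aleo".to_string(), 0);
    vm.mapping.insert("bar.aleo".to_string(), 10);

    vm.upgrade_deferred("bar.aleo", 1, 20);
    // A foo.aleo transaction in the same block sees new state with the old edition.
    println!("mid-block view: {:?}", vm.observe("bar.aleo"));
    vm.end_block();
    println!("end-of-block view: {:?}", vm.observe("bar.aleo"));
}
```

With upgrade_atomic there is no point at which the mapping value and the edition disagree.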

Client Response

The client fixed this by replacing the program Stack immediately after executing the constructor for the deployment transaction.

#05 - Non-Deterministic Constructor Execution

Severity: Low Location: synthesizer/process/src/finalize.rs

For deployments, snarkVM’s constructor finalization process uses the transition ID from the fee transition to seed the ChaCha random number generator. Since there is no cryptographic binding between the deployment transition (in particular, the program_owner signature) and the fee transition paying for the deployment, an attacker can manipulate the “randomness” produced by rand.chacha inside the constructor finalize by creating a new transaction with a different fee transition while reusing the same deployment.

During program deployment, when a constructor exists, the system executes:

if deployment.program().contains_constructor() {
    let operations = finalize_constructor(state, store, &stack, *fee.transition_id())?;
    finalize_operations.extend(operations);
    lap!(timer, "Execute the constructor");
}

The fee.transition_id() is passed to the constructor finalization process and subsequently used in the ChaCha random number generator’s seed computation. The seed preimage includes the transition ID as a key component:

let preimage = if (ConsensusVersion::V1..=ConsensusVersion::V2).contains(&consensus_version) {
    to_bits_le![
        registers.state().random_seed(),
        **registers.transition_id(),  // This comes from fee.transition_id()
        stack.program_id(),
        registers.function_name(),
        self.destination.locator(),
        self.destination_type.type_id(),
        seeds
    ]
} else {
    // Similar structure with additional nonce field
}

Attack

An attacker can exploit this vulnerability through the following steps:

Create Initial Deployment: The attacker creates a legitimate program deployment transaction with a constructor that uses rand.chacha operations.
Extract Deployment Transition: The attacker extracts the deployment transition from the original transaction, leaving it completely unchanged.
Generate New Fee Transitions: The attacker creates multiple new transactions, each containing:
- The same unchanged deployment transition
- A different fee transition with a different transition ID
Grind rand.chacha Outputs: By controlling the fee transition ID, the attacker can influence the ChaCha seed and potentially affect the execution of the constructor.

Recommendation

The constructor execution should be deterministic based upon:

The new deployment.
The current program state.

To achieve this, the constructor should use a deterministic seed that cannot be manipulated by changing fee transitions.

Two obvious solutions exist:

Use Deployment ID: Replace fee.transition_id() with the deployment ID instead.
Use Constant Seed: Use a constant transition ID for all deployment finalizations.
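
The effect of each option on the seed can be sketched as follows. This is a toy model rather than the snarkVM seed computation: DeployTx, toy_hash, the three seed functions and the DEFAULT_TRANSITION_ID placeholder are all hypothetical, and DefaultHasher stands in for the real hash over the full preimage.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical placeholder for a constant transition ID.
const DEFAULT_TRANSITION_ID: &str = "default";

/// Toy deployment transaction: one deployment plus the fee paying for it.
struct DeployTx<'a> {
    deployment_id: &'a str,
    fee_transition_id: &'a str,
}

/// Non-cryptographic stand-in for the hash over the seed preimage.
fn toy_hash<T: Hash>(value: T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Grindable model: the attacker-chosen fee transition ID feeds the seed.
fn seed_from_fee(block_seed: u64, tx: &DeployTx<'_>) -> u64 {
    toy_hash((block_seed, tx.fee_transition_id))
}

/// Option 1: the seed depends on the deployment ID instead.
fn seed_from_deployment(block_seed: u64, tx: &DeployTx<'_>) -> u64 {
    toy_hash((block_seed, tx.deployment_id))
}

/// Option 2: the seed uses a constant ID for all deployment finalizations.
fn seed_from_constant(block_seed: u64) -> u64 {
    toy_hash((block_seed, DEFAULT_TRANSITION_ID))
}

fn main() {
    let block_seed = 42;
    // Same deployment, re-wrapped with two different fee transitions.
    let a = DeployTx { deployment_id: "deployment", fee_transition_id: "fee_1" };
    let b = DeployTx { deployment_id: "deployment", fee_transition_id: "fee_2" };
    let fee_equal = seed_from_fee(block_seed, &a) == seed_from_fee(block_seed, &b);
    let dep_equal = seed_from_deployment(block_seed, &a) == seed_from_deployment(block_seed, &b);
    println!("fee-based seeds equal: {fee_equal}");
    println!("deployment-based seeds equal: {dep_equal}");
    println!("constant-based seed: {}", seed_from_constant(block_seed));
}
```

In both alternatives the seed no longer depends on anything the attacker can vary without changing the signed deployment.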

Client Response

The client decided to seed rand.chacha using the default transition ID.
