0x Staking System - Epoch 67 Finalization

Summary

The ZRX staking system cannot advance beyond epoch 67 until a patch is deployed due to an issue in the epoch finalization logic.

  • Staked ZRX is NOT at risk.
  • User funds are NOT at risk.
  • We have two potential paths forward.

At the end of each 7 day epoch, an epoch finalization transaction must be executed to pay out accumulated protocol fees (ETH) to each active staking pool before initiating the next epoch. The staking system assumes that the number of active staking pools for a given epoch is equivalent to the number of pools that generate a 0x protocol fee. However, the staking system incorrectly assumes that the 0x protocol fee will always be greater than zero (because the gas price will always be greater than zero). During the past epoch, a number of 0x trades were mined with zero gas price. One of these transactions caused the staking system to double count a staking pool, believing the pool hadn’t been initialized prior. This non-existent pool could not be finalized, so the epoch could not be finalized until a fix is deployed.

Solution #1: Hot fix, ZRX locked for 2 weeks

Hot fix. The epoch remains stuck and ZRX cannot be moved until a hot fix is deployed and passes the 2 week time lock. At this point, the issue is fixed and cannot happen again. The next epoch is initiated. Protocol fees will continue to be collected as usual during the 2 week time lock period, resulting in a single extended epoch that is ~3 weeks long.

Pros
All staking pools remain intact.
All staked ZRX remains delegated.

Cons
Users cannot move their staked ZRX for 2 additional weeks.

Solution #2: Shutdown and redeploy the staking system

Pros
Users may immediately withdraw their staked ZRX.

Cons
All staking pools must be redeployed. Costly.
All users must withdraw and re-delegate their ZRX. Costly.
Accounting to determine payouts from old staking contract is very complex.

Our recommendation

We have already started working on solution #1, which in our opinion minimizes disruption and overall community cost of the operation. We will communicate when we are ready with the hotfix for this approach. Triggering an emergency shutdown feels like overkill since no funds are at risk.

Relevant Code

The issue exists in the payProtocolFee (https://github.com/0xProject/protocol/blob/development/contracts/staking/contracts/src/fees/MixinExchangeFees.sol) function. Here is the relevant code block:

if (feesCollectedByPool == 0) {
    // Compute member and total weighted stake.
    (uint256 membersStakeInPool, uint256 weightedStakeInPool) = _computeMembersAndWeightedStake(poolId, poolStake);
    poolStatsPtr.membersStake = membersStakeInPool;
    poolStatsPtr.weightedStake = weightedStakeInPool;

    // Increase the total weighted stake.
    aggregatedStatsPtr.totalWeightedStake = aggregatedStatsPtr.totalWeightedStake.safeAdd(weightedStakeInPool);

    // Increase the number of pools to finalize.
    aggregatedStatsPtr.numPoolsToFinalize = aggregatedStatsPtr.numPoolsToFinalize.safeAdd(1);

    // Emit an event so keepers know what pools earned rewards this epoch.
    emit StakingPoolEarnedRewardsInEpoch(currentEpoch_, poolId);
}

This code block is only intended to run once per pool per epoch. It incorrectly assumes that the protocol fee paid is always greater than 0, and therefore will always increment the numPoolsToFinalize when this code is executed.

Each pool must be finalized a single time by calling finalizePool (https://github.com/0xProject/protocol/blob/development/contracts/staking/contracts/src/sys/MixinFinalizer.sol#L90), which will decrement numPoolsToFinalize when it fully executes. Once each pool has been finalized, the epoch can advance if numPoolsToFinalize is 0. However if the above code block executed more than once, numPoolsToFinalize will be larger than it should be and will never be decremented to 0. This will indefinitely prevent the epoch from advancing.

Transactions which caused the issue
(poolId 23/0000000000000000000000000000000000000000000000000000000000000017):
0 gas price Ethereum Transaction Hash (Txhash) Details | Etherscan
0 gas price Ethereum Transaction Hash (Txhash) Details | Etherscan
0 gas price Ethereum Transaction Hash (Txhash) Details | Etherscan
0 gas price Ethereum Transaction Hash (Txhash) Details | Etherscan
0 gas price Ethereum Transaction Hash (Txhash) Details | Etherscan
1st tx with non-zero gas price for the epoch Ethereum Transaction Hash (Txhash) Details | Etherscan

Patch Diff:

The patch to prevent this issue from occurring again is easy enough.

diff --git a/contracts/staking/contracts/src/fees/MixinExchangeFees.sol b/contracts/staking/contracts/src/fees/MixinExchangeFees.sol
index 63f3aacf8..6eba6d199 100644
--- a/contracts/staking/contracts/src/fees/MixinExchangeFees.sol
+++ b/contracts/staking/contracts/src/fees/MixinExchangeFees.sol
@@ -53,6 +53,10 @@ contract MixinExchangeFees is
     {
         _assertValidProtocolFee(protocolFee);

        if (protocolFee == 0) {
            return;
        } 

         // Transfer the protocol fee to this address if it should be paid in
         // WETH.
         if (msg.value == 0) {

In addition, some of the corrupted state will need to be reset. We are working on and testing a proposal, which we expect to be ready on May 3.

9 Likes

Solution #1 hot fix seems better overall than redeploying the entire staking system. Leaning on upgradability in order to maintain both history and the current stake position is the better solution. It’s very likely that the majority of ZRX stakers do not move their stake every 2 weeks anyways. So the 2 week lock up is bound to only affect a smaller % of stakers. Whereas solution #2 may likely affect every single ZRX staker.

Flashbots has become very popular the last few months. Zero gas price transactions typically come from traders capturing either a sandwich trade or an arb. They pay a fee to the miner directly, and the tx shows up with a 0 Gwei gas price. This essentially bypasses the protocol fee entirely, which is not ideal.

10 Likes

Thanks for the transparency, detailed explanation, and presentation of options @_wwarren (and the rest of the 0x team) and for quickly figuring this out. I agree with @VolleyFire that given the current information we have, the Solution #1 hotfix is far better than the alternative. Thinking a bit further down the road, IMO the emergence and increased usage of flashbots is something of “canary in a coalmine” given that the 0x protocol fee is directly related to the price of gas. Combined with the progress towards a multi-chain model, I’m of a mind that we should accelerate our efforts to re-visit tokenomics, and I intend to strongly advocate for prioritizing work related to this through the 0xDAO.

7 Likes

I support solution #1. In general users staking ZRX are long term so add this additional cost on solution #2 in times of high gas fees don’t seem a good approach, as the funds are not at risk.

7 Likes

The patch for option #1 has been merged, deployed, and submitted to the 0x Governor contract to begin the 2 week timelock. The PR with the patch can be found here.

This patch:

  • Prevents the issue from occurring again by making payProtocolFee no-op when the protocol fee is 0.
  • Resets numPoolsToFinalize in epoch 67 to 0.
  • Advances to epoch 69.
  • Finalizes all pools in epoch 68, just in case the same issue were to occur for a pool that has not yet collected protocol fees while waiting for the 2 week timelock to pass.
  • Sets numPoolsToFinalize in epoch 68 to 0.

The patched staking implementation contract can be found on mainnet here.

Note that this bug also resulted in all payouts for epoch 67 being diluted, as the stake for the “ghost” pools was counted multiple times. However, the excess rewards roll over to future epochs and will result in a larger payout for epoch 68.

8 Likes