Thursday, December 7, 2023

Miner Autopause Vulnerability Disclosure

Core-Geth Security Vulnerability Update

TLDR

Miners that restart and reconnect to the network quickly enough (before the network produces a new block) are exposed to a potential DoS attack that pauses their mining operations.

Solutions to this issue are:

Upgrade to Core-Geth 1.12.17.
As a workaround, allow enough time between restarts for the network to produce at least one new block, generally around 15-30 seconds. Block production intervals follow a long-tail distribution. Over the last 12 hours:
- 99% of blocks have been produced in less than 69 seconds,
- 75% in 19 seconds or less,
- and 25% have been produced in 4 seconds or less.
- The average block time is about 13 seconds.
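
For readers who want to check figures like these against their own node's data, here is a minimal sketch of a nearest-rank percentile calculation in Go. The function name and the sample intervals are made up for illustration; they are not the data behind the numbers above.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100) of the
// given block intervals, in seconds. It returns 0 for an empty sample.
func percentile(intervals []float64, p float64) float64 {
	if len(intervals) == 0 {
		return 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), intervals...)
	sort.Float64s(sorted)

	// Nearest-rank method: the smallest value with at least p% of the
	// sample at or below it.
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical block intervals in seconds (NOT real chain data).
	sample := []float64{2, 4, 7, 9, 13, 15, 19, 26, 41, 69}
	for _, p := range []float64{25, 75, 99} {
		fmt.Printf("p%.0f: %.0f seconds\n", p, percentile(sample, p))
	}
}
```

A restart delay chosen near a high percentile makes it likely, though never certain, that at least one block arrives before the node reconnects.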

Summary

On November 14, 2023, user @Ewemek reported unexpected intermittent pausing of their mining work commits.

Investigation of this issue led to our discovery of an unhandled edge case in a previous security vulnerability resolution.

In October 2020, go-ethereum implemented a security patch preventing possible DoS attacks on miners, in which an attacker could feed miners fake total difficulties (TDs), causing their mining operations to pause while attempting to download a nonexistent better chain.

The resolution of this concern was built around the downloader sync process, and assumed that the local node would need to initially sync with the network to bring it to the best available chain.

Investigating Issue 586, however, showed that this is not always the case. On a quick restart, a miner's node may already be synced with the canonical network at startup, provided no new blocks were produced in the time it took to restart and reconnect. In that case the downloader never needs to complete a synchronization for the local node to be at the network's canonical head.

This edge case was unhandled by the previous patch, and resulted in the same potential DoS scenario as before.
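
The edge case can be illustrated with a toy state machine. This is a simplified sketch, not the actual miner code: the type, field, and event names below are hypothetical, and the model assumes that the miner stops reacting to sync events only after one sync round has completed successfully.

```go
package main

import "fmt"

// syncEvent is a hypothetical stand-in for the downloader's start, failed,
// and done notifications.
type syncEvent int

const (
	syncStarted syncEvent = iota
	syncFailed
	syncDone
)

// miner is a toy model of the auto-pause state, not the real miner type.
type miner struct {
	mining       bool // currently committing sealing work
	shouldResume bool // was mining when a sync attempt interrupted it
	ignoreSync   bool // set once a sync round has completed successfully
}

// handle applies one downloader event to the miner state.
func (m *miner) handle(ev syncEvent) {
	if m.ignoreSync {
		return // assumed 2020 behavior: after one good sync, never pause again
	}
	switch ev {
	case syncStarted:
		if m.mining {
			m.shouldResume = true
		}
		m.mining = false
	case syncFailed, syncDone:
		if m.shouldResume {
			m.mining = true
		}
		if ev == syncDone {
			m.ignoreSync = true
		}
	}
}

func main() {
	// Normal start: the node syncs once, so a later bogus sync is ignored.
	a := &miner{mining: true}
	a.handle(syncStarted)
	a.handle(syncDone)
	a.handle(syncStarted) // triggered by a fake TD
	fmt.Println("synced once, still mining:", a.mining)

	// Quick restart: already at head, so no sync round ever completes.
	b := &miner{mining: true}
	b.handle(syncStarted) // triggered by a fake TD
	fmt.Println("quick restart, still mining:", b.mining)
	b.handle(syncFailed) // the advertised chain cannot be downloaded
	fmt.Println("after failed sync, mining again:", b.mining)
}
```

In the second scenario nothing ever sets the flag that protects the miner, so each bogus sync attempt pauses mining until the attempt fails.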

It Was Me

The report of Issue 586 coincided with my development of a node crawler I had been working on.

With this, I had (stupidly) thought it would be a good idea to advertise an arbitrarily high total difficulty to try to make friends on the network.

That wasn't a good idea mostly because it didn't serve its purpose at all, but also because I was able to knock some miners off their rig for a few minutes. Sorry @Ewemek.

Of course, once I understood the correlation I immediately modified the crawler to behave politely instead, and thankfully @Ewemek was clever enough to figure out the workaround even before that.

Resolution

Our patch (which is now only relevant to ETC; ETH is using PoS and has no miners) resolves this edge case by adding an event, fetcher.ChainInsertEvent, which the miner listens for. Upon receiving it, the miner disables its auto-pause-on-sync logic.

The BlockFetcher system, which causes this event to be emitted, handles the reception of NewBlock (0x07) and NewBlockHashes (0x01) block propagation protocol messages.

On these messages, the BlockFetcher attempts to append the received or fetched block to the local chain, a process which can only succeed if that block builds directly on top of the local chain, and thus indicates a network-canonical synced status.

In this way the miner is able to use the fetcher's block insertion operation as indicative of complete synchronization relative to its peers.

This logic corresponds with pre-existing logic for transaction pool message handling, which also depends on complete chain synchronization.
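
The idea behind the resolution can be sketched as follows. This is an illustrative model only: the block, chain, fetcher, and miner types are hypothetical simplifications, and a plain callback stands in for the miner's subscription to fetcher.ChainInsertEvent. The real BlockFetcher does considerably more (queuing, fetching announced blocks, validation).

```go
package main

import "fmt"

// block is a hypothetical, stripped-down block header.
type block struct {
	number uint64
	hash   string
	parent string
}

// chain tracks only the current local head.
type chain struct {
	head block
}

// fetcher models the insertion step for propagated blocks. onInsert stands
// in for emitting the chain insert event to subscribers.
type fetcher struct {
	chain    *chain
	onInsert func(block)
}

// handleNewBlock appends b only if it builds directly on the local head,
// and reports whether the insertion happened.
func (f *fetcher) handleNewBlock(b block) bool {
	head := f.chain.head
	if b.parent != head.hash || b.number != head.number+1 {
		return false
	}
	f.chain.head = b
	if f.onInsert != nil {
		f.onInsert(b)
	}
	return true
}

// miner is a toy model holding only the auto-pause switch.
type miner struct {
	mining    bool
	autoPause bool
}

// onSyncStarted pauses mining only while auto-pause is still enabled.
func (m *miner) onSyncStarted() {
	if m.autoPause {
		m.mining = false
	}
}

func main() {
	c := &chain{head: block{number: 100, hash: "0xaa", parent: "0x99"}}
	m := &miner{mining: true, autoPause: true}
	f := &fetcher{chain: c, onInsert: func(block) { m.autoPause = false }}

	// A block that does not extend the local head is not inserted.
	fmt.Println("inserted:", f.handleNewBlock(block{number: 105, hash: "0xff", parent: "0xee"}))

	// A block extending the head is inserted, which disables auto-pause.
	fmt.Println("inserted:", f.handleNewBlock(block{number: 101, hash: "0xbb", parent: "0xaa"}))

	m.onSyncStarted() // a later bogus sync attempt no longer pauses mining
	fmt.Println("still mining:", m.mining)
}
```

Tying the switch to a successful insertion means a fake TD alone cannot trigger it, since an attacker would have to supply a valid block that extends the local head.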

References

Node does not commit new sealing work when it should: https://github.com/etclabscore/core-geth/issues/586
Core-Geth 1.12.17.
Miner: exit loop when downloader Done or Failed: https://github.com/ethereum/go-ethereum/pull/21653
Miner: don't interrupt mining after successful sync: https://github.com/ethereum/go-ethereum/pull/21701
Tupari (v1.9.23): https://github.com/ethereum/go-ethereum/releases/tag/v1.9.23
- Geth v1.9.23 is a maintenance release containing security fixes. This update is recommended for all users.
- Security issues fixed in this release:
- Mining no longer stops due to sync after the first successful sync round
Iberian Sun (v1.11.16): https://github.com/etclabscore/core-geth/releases/tag/v1.11.16