Stellar Development Foundation releases update on node downtime episode
The Stellar blockchain recently faced a technical issue that left some validator nodes unable to process transactions. However, the Stellar Development Foundation (SDF) engineering team was able to restore the Horizon API cluster and the SDF validators. To address any remaining concerns, the organization has now released a report with more information.
As clarified in the blog post, Stellar’s network remained “online” thanks to validators that were unaffected by the failure and continued to process transactions on the blockchain without problems. The SDF said,
“(…) which is just the way a decentralized network is intended to work, and many of those validators continued to publish archives that keep track of ledger history, and that allow the halted nodes to fill in gaps when they need to recover from downtime.”
Although Stellar provided a status page updated in real time, exchanges such as Bitfinex stopped XLM withdrawals, as reported by its CTO Paolo Ardoino via Twitter. Bitstamp also temporarily halted deposits and withdrawals of XLM, stating,
“We’ve temporarily stopped $XLM deposits and withdrawals due to issues on the StellarOrg network. We are monitoring the situation and will keep you updated.”
Despite the report, however, the exact cause of the glitch has not yet been identified. The preliminary investigation pointed to an initial problem caused by a particular ledger or by an operation in that ledger. According to the SDF, the “majority” of nodes on the network did not experience the failure, but some operated by the SDF and Lobstr did.
Overall, the affected nodes were down for around 10 hours. The failure was detected by the SDF’s infrastructure monitoring, which consists of Runscope and Prometheus alerts, and the team responded almost immediately.
The organization repeatedly stressed that the network as a whole never stopped. Validators that remained in sync continued to process transactions in less than 5 seconds. The SDF clarified,
“However, some nodes, including those run by SDF and Lobstr, ceased to process transactions for about 9 hours. If you access the network via one of the affected nodes, you were not able to access the network to submit transactions during that time. If, however, you rely on one of the many unaffected nodes, your network access continued unabated.”
During the outage, the history of some ledgers was lost, but the engineering team worked to restore it with support from tier 1 organizations. These organizations are responsible for publishing the complete network history in anticipation of such situations. An estimated 43 ledgers, or about 5 minutes of network history, needed to be reinstated, and the SDF team restored all of them successfully.
Meanwhile, many block explorers were quick to report on their public platforms that the network had been unable to process transactions for several hours.
Anton Cashchin, a managing partner at U.K.-based CEX exchange, sounded the alarm when he tweeted that “several validators” went offline, leading Binance, Bitfinex, and Bitstamp to halt withdrawals.
Another crypto enthusiast, Archon (@archon_ch), also tweeted about the matter on April 6, 2021.