Manual Node Recovery Guide

From Internet Computer Wiki
Revision as of 19:28, 4 December 2025 by Andrew.battat (talk | contribs) (Add most of docs)

This runbook describes the steps that node providers must take during an NNS recovery.

Security warning

⚠️⚠️⚠️ Don’t get tricked into compromising your nodes. Only complete a manual node recovery if all of the following conditions are met:

  • A subnet recovery has been announced on the Internet Computer Status Page.
  • The DFINITY team has reached out on the dedicated Matrix channel #ic-node-providers-incident-response:matrix.org.
    • Normally, only the DFINITY team can send messages on this channel. During an incident, permissions are changed so that everyone can send messages.

Prerequisites

  • The recovery coordinator should have communicated the following to you:
    • The recovery parameters:
      • The VERSION: the commit ID of the recovery-GuestOS update image.
      • The VERSION-HASH: the SHA256 sum of the recovery-GuestOS update image.
      • The RECOVERY-HASH: the SHA256 sum of the recovery.tar.zst file.
    • The node(s): which specific nodes managed by the node provider (NP) or node operator (NO) are part of the target subnet.
  • Obtain console access to all nodes you run that are part of the target subnet.
    • Note that the recovery can be completed from a physical console or from the node's remote BMC virtual console view.
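For background only: VERSION-HASH and RECOVERY-HASH are SHA256 sums, meaning fingerprints computed over the full contents of a file. The sketch below shows how such a sum could be computed and compared on an ordinary workstation. It is not part of the recovery procedure, and the function names and any file path you pass in are hypothetical, chosen for illustration.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large image does not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string.

    Case and surrounding whitespace in `expected_hex` are ignored.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Because the digest depends on every byte of the file, a single wrong character in a typed hash cannot match, which is why the parameters must be entered exactly.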

Recovery Steps

For each node to recover, you should perform the following process.

Obtain console access

As noted in the prerequisites, the recovery can be completed from a physical console or from the node's remote BMC virtual console view.

screenshot

You should see the limited-console> prompt. Type help to see the full list of limited-console commands.

Initiate manual recovery

Type manual-recovery to initiate the manual recovery.

screenshot

You should then be taken to the manual recovery text-user-interface:

screenshot

Input recovery parameters

screenshot

Enter the VERSION, VERSION-HASH, and RECOVERY-HASH provided by the recovery coordinator.

Please take great care to type in the characters precisely. If a single character is wrong, the recovery will not succeed and you will have to restart.

Note: some BMCs offer a Virtual Clipboard within the Console Controls for pasting text into the console, which you may find useful.

screenshot
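If you prepare the parameters on a workstation before typing or pasting them, a quick format check can catch truncated or mistyped values early. The sketch below is only an illustration and makes two assumptions that this runbook does not state: that the commit ID is a full 40-character hexadecimal Git hash, and that each SHA256 sum is written as 64 hexadecimal characters. The function name check_parameters is hypothetical.

```python
import re

# Assumption: the commit ID is a full 40-character hexadecimal Git hash.
COMMIT_HEX = re.compile(r"[0-9a-fA-F]{40}")
# Assumption: a SHA-256 sum is written as 64 hexadecimal characters.
SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def check_parameters(version, version_hash, recovery_hash):
    """Return a list of format problems; an empty list means none were found."""
    problems = []
    if not COMMIT_HEX.fullmatch(version):
        problems.append(
            f"VERSION is not 40 hexadecimal characters (length {len(version)})"
        )
    if not SHA256_HEX.fullmatch(version_hash):
        problems.append(
            f"VERSION-HASH is not 64 hexadecimal characters (length {len(version_hash)})"
        )
    if not SHA256_HEX.fullmatch(recovery_hash):
        problems.append(
            f"RECOVERY-HASH is not 64 hexadecimal characters (length {len(recovery_hash)})"
        )
    return problems
```

A passing format check only shows that the values have a plausible shape. It does not show that they match what the recovery coordinator sent.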

Confirm recovery parameters

screenshot

Please take a moment to verify that the recovery parameters you entered exactly match those provided by the recovery coordinator. Again, if a single character is wrong, the recovery will not succeed and you will have to restart.
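Long hexadecimal strings are hard to compare by eye. As a minimal sketch of one way to make the check easier on a workstation, the helpers below split a value into short groups and report the position of the first differing character. Both function names are hypothetical, and nothing here runs on the node itself.

```python
def grouped(value, size=8):
    """Split a string into space-separated groups of `size` characters."""
    return " ".join(value[i:i + size] for i in range(0, len(value), size))


def first_mismatch(expected, entered):
    """Return the 0-based index of the first difference, or None if identical."""
    for index, (a, b) in enumerate(zip(expected, entered)):
        if a != b:
            return index
    if len(expected) != len(entered):
        # One string is a prefix of the other; they differ where the shorter ends.
        return min(len(expected), len(entered))
    return None
```

For example, first_mismatch("abc", "abd") returns 2, and grouped("abcdefghij", 4) returns "abcd efgh ij". Reading the value group by group against the coordinator's message is usually less error-prone than reading it as one unbroken string.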

Monitor the recovery process

Once you have initiated the recovery process, monitor the recovery logs.

screenshot

After about 30 seconds, you should see the following log message:


========================================================================
SUCCESS: Recovery completed successfully!
========================================================================

screenshot

screenshot

screenshot