Checking node CPU and memory speed

From Internet Computer Wiki
Revision as of 16:31, 26 June 2024 by Andrew.battat (talk | contribs) (Created page with "Some server machines run slower than they should, and they may also become slower after certain events (such as power loss) due to firmware bugs, they may have a faulty power...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Some server machines run slower than they should, and they may also become slower after certain events (such as power loss) due to firmware bugs, they may have a faulty power supply, insufficient power supply redundancy, etc.. If you suspect that this is the case, you can run the following test on the machine. You can prepare a live Ubuntu USB stick and boot the server from it. Make sure you don't install Ubuntu on the machine and wipe the disks, since you will have to redeploy your node if you do this. You only want to try Ubuntu.

Once you boot from the live Ubuntu image, you can install some packages to it. They will live in memory only and will be gone once you reboot the machine. The test that we found particularly valuable to determine if the problem is present was sysbench. Install it with sudo apt install sysbench and then

sysbench --test=memory run

on the machine (HostOS) and look at the memory transfer speed. Memory speed should be at least 5.6GB/s. If you get less than that, please consult your vendor how to increase the speed to the appropriate level. For instance, with some Dell servers we were seeing 2.6GB/s memory speed and had to upgrade the CPLD firmware to resolve the performance issue. For some SuperMicro servers we have seen improvements by power cycling the server & changing the BIOS setting, Advanced > ACPI Settings > ACPI SRAT L3 Cache As NUMA Domain to Disabled.