Current limitations of the Internet Computer
This article lists common limitations that anyone developing on the IC should be familiar with. In some cases, it is not yet clear how best to address a limitation, and further design discussion is needed before a protocol or tool improvement can be developed.
Bugs in pre_upgrade hooks
If a bug in your pre_upgrade hook causes it to panic, the canister can no longer be upgraded. The pre_upgrade hook is part of the currently deployed wasm module, and the system always executes it before deploying the new wasm module. If the pre_upgrade hook fails, the system fails the whole upgrade.
Currently there is no good mitigation for this issue other than testing the pre_upgrade hook thoroughly to make sure that it is bug free.
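The ordering described above can be illustrated with a toy model. This is only a sketch in plain Rust: `Canister`, `UpgradeResult`, `good_hook` and `buggy_hook` are hypothetical names invented for illustration, and a returned `Err` stands in for a panic in the real hook. It is not how the IC is implemented.

```rust
/// Outcome of one simulated upgrade attempt.
#[derive(Debug, PartialEq)]
enum UpgradeResult {
    /// The new module was installed.
    Upgraded,
    /// The currently installed pre_upgrade hook failed, so nothing changed.
    PreUpgradeFailed(String),
}

/// A toy canister: a module version plus the pre_upgrade hook that
/// ships with that version (hypothetical model, not an IC API).
struct Canister {
    version: u32,
    pre_upgrade: fn() -> Result<(), String>,
}

impl Canister {
    /// Runs the *currently installed* hook first and installs the new
    /// module, together with its own hook, only if that hook succeeds.
    fn upgrade(
        &mut self,
        new_version: u32,
        new_hook: fn() -> Result<(), String>,
    ) -> UpgradeResult {
        match (self.pre_upgrade)() {
            Ok(()) => {
                self.version = new_version;
                self.pre_upgrade = new_hook;
                UpgradeResult::Upgraded
            }
            Err(reason) => UpgradeResult::PreUpgradeFailed(reason),
        }
    }
}

/// A hook that always succeeds.
fn good_hook() -> Result<(), String> {
    Ok(())
}

/// A hook with a bug: it fails every time it runs.
fn buggy_hook() -> Result<(), String> {
    Err("index out of bounds".to_string())
}
```

In this model, a canister created with `buggy_hook` stays at its original version no matter how many times `upgrade` is called, because the fixed hook in the new module never gets a chance to replace the broken one.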
Long running upgrades
Generally speaking, when a canister is upgraded, the logic in the pre_upgrade hook serialises state from the wasm heap to stable memory, and the logic in the post_upgrade hook deserialises it from stable memory back to the wasm heap. There is a bound on the number of instructions that the upgrade process can execute. If the canister has too much state, or the [de]serialising logic is inefficient, the whole process may not finish within that bound.
The recommended mitigation here is to ensure that the state that needs to be persisted across upgrades does not exceed what the canister can [de]serialise during the upgrade process.
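One way to reason about this mitigation is a back-of-the-envelope budget check. The sketch below assumes you have measured an average per-byte instruction cost for your own [de]serialiser. The function names and all numbers passed to them are hypothetical, and the actual instruction bound is not stated here, so it is left as a parameter.

```rust
/// Largest state size (in bytes) that can be [de]serialised within
/// `instruction_budget`, given a measured cost per byte.
/// Returns `None` if `instructions_per_byte` is zero.
fn max_serialisable_bytes(instruction_budget: u64, instructions_per_byte: u64) -> Option<u64> {
    instruction_budget.checked_div(instructions_per_byte)
}

/// Returns true if serialising `state_bytes` is estimated to stay
/// within `instruction_budget`. Arithmetic overflow counts as "does not fit".
fn upgrade_fits(state_bytes: u64, instructions_per_byte: u64, instruction_budget: u64) -> bool {
    match state_bytes.checked_mul(instructions_per_byte) {
        Some(needed) => needed <= instruction_budget,
        None => false,
    }
}
```

A canister could compare its current state size against `max_serialisable_bytes` and refuse to grow past it, leaving some margin because a per-byte average is only an approximation of the real cost.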
[de]serialiser requiring additional wasm memory
Related issue in Motoko. Generally speaking, the serialising logic may require some additional wasm heap to run. Suppose the canister has 3.5GiB of wasm heap in use and the serialising logic requires an additional 600MiB to serialise the data. Because the wasm heap is limited to 4GiB, the upgrade process will again fail. Note that this issue is also present for canisters written in Rust.
The recommended mitigation here is to again ensure that the state that needs to be persisted across upgrades does not exceed what the canister can [de]serialise during the upgrade process.
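The arithmetic in the example above can be written out as a small check. This is a sketch only: `serialisation_headroom` is a hypothetical helper, and the only figure taken from the text is the 4GiB wasm heap limit.

```rust
const MIB: u64 = 1024 * 1024;
const GIB: u64 = 1024 * MIB;

/// Wasm heap limit quoted in the text above.
const WASM_HEAP_LIMIT: u64 = 4 * GIB;

/// Checks whether the serialiser's scratch space fits on top of the heap
/// already in use. Returns `Ok(bytes left over)` if it fits, or
/// `Err(bytes missing)` if the heap limit would be exceeded.
fn serialisation_headroom(heap_in_use: u64, scratch_needed: u64) -> Result<u64, u64> {
    let total = heap_in_use.saturating_add(scratch_needed);
    if total <= WASM_HEAP_LIMIT {
        Ok(WASM_HEAP_LIMIT - total)
    } else {
        Err(total - WASM_HEAP_LIMIT)
    }
}
```

With 3.5GiB (3584MiB) in use and 600MiB of scratch space, the total is 4184MiB, which is 88MiB over the 4096MiB limit, so `serialisation_headroom(3584 * MIB, 600 * MIB)` returns `Err(88 * MIB)`.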
Only upgrading stopped canisters
Generally speaking, it is only safe to upgrade stopped canisters, i.e. canisters that do not have any outstanding responses. Otherwise, a response that refers to an earlier state of the canister may arrive after the upgrade, and the new wasm module may not be compatible with it, which could corrupt the current state of the canister. The current implementation of the IC nevertheless allows canisters that are not stopped to be upgraded, on the assumption that the canister developer has taken sufficient precautions to ensure that the above-mentioned corruption cannot happen.
Calling potentially malicious or buggy canisters can prevent canisters from upgrading
If a canister can only be safely upgraded when it is stopped, it can run into the following issue. If a canister `A` sends a `Request` to another canister `B`, and `B` is buggy or malicious, `B` can create a situation where it does not send a `Response` back to `A` for an arbitrarily long time. `B` can keep the call context alive as long as it keeps sending out `Request`s of its own. As long as `A` has an outstanding `Response`, upgrading it may not be safe.
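The dependency between outstanding responses and stopping can be sketched as a small state machine. This is a toy model, not the IC's implementation: the `Canister` struct, the `Status` enum and every method name below are hypothetical.

```rust
/// Lifecycle states of the toy canister.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Status {
    Running,
    Stopping,
    Stopped,
}

/// A toy canister that tracks how many responses it is still waiting for.
struct Canister {
    status: Status,
    outstanding_responses: u32,
}

impl Canister {
    fn new() -> Self {
        Canister { status: Status::Running, outstanding_responses: 0 }
    }

    /// Sending a request means one more response is owed to this canister.
    fn send_request(&mut self) {
        self.outstanding_responses += 1;
    }

    /// A response arrived; the canister may now be able to finish stopping.
    fn receive_response(&mut self) {
        self.outstanding_responses = self.outstanding_responses.saturating_sub(1);
        self.try_finish_stopping();
    }

    /// Asks a running canister to stop; it only reaches `Stopped`
    /// once no responses are outstanding.
    fn request_stop(&mut self) {
        if self.status == Status::Running {
            self.status = Status::Stopping;
        }
        self.try_finish_stopping();
    }

    fn try_finish_stopping(&mut self) {
        if self.status == Status::Stopping && self.outstanding_responses == 0 {
            self.status = Status::Stopped;
        }
    }

    /// In this model, upgrading is considered safe only when fully stopped.
    fn safe_to_upgrade(&self) -> bool {
        self.status == Status::Stopped
    }
}
```

If `A` calls `send_request` and then `request_stop`, it stays in `Stopping` until `receive_response` is called. A callee that never responds therefore keeps `safe_to_upgrade` returning false indefinitely.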
Loops in call graphs
The current implementation of the IC allows canisters to be called in a loop. For example, A can call B, B can call C, and C can call A. There can be subtle issues when one attempts to do canister management in a loop though. If A sends a request (Req1) to B and, while processing Req1, B sends a StopCanister message to stop A, the canisters will effectively deadlock. When the StopCanister message is processed, A will be put in the Stopping state. It will only fully stop when it has received a response for Req1. It will not receive a response for Req1 until B gets a response for its StopCanister message, and B will not get that response until A is stopped.
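The deadlock above is a cycle in a "waits for" relation, which can be made explicit with a small sketch. The function `has_wait_cycle` and the event labels used with it are hypothetical, and the model assumes each event waits for at most one other event.

```rust
use std::collections::HashMap;

/// Follows "waits for" edges starting at `start` and returns true if the
/// chain comes back to an event it has already visited (a deadlock).
/// Returns false if the chain ends at an event that waits for nothing.
fn has_wait_cycle(waits_for: &HashMap<&str, &str>, start: &str) -> bool {
    let mut seen = Vec::new();
    let mut current = start;
    while let Some(&next) = waits_for.get(current) {
        if seen.contains(&current) {
            return true;
        }
        seen.push(current);
        current = next;
    }
    false
}
```

For the scenario above, the edges would be "A stopped" waits for "Req1 response", "Req1 response" waits for "StopCanister response", and "StopCanister response" waits for "A stopped". Starting from any of these, `has_wait_cycle` returns true. If B does not send StopCanister while processing Req1, the last two edges disappear and the chain terminates.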