Best practices for a high traffic dapp launch

From Internet Computer Wiki
Revision as of 14:46, 27 February 2023 by Ais (talk | contribs)
Jump to: navigation, search

Summary

Starting the project

Choosing a programming language[1]

The IC is a Wasm-based programming environment so any language that can compile to Wasm can be deployed and instantiated as a canister. In practice, there is a lot of tooling, libraries, and boilerplate that make the following two languages the most baked choices for programming on the IC: Motoko (programming language) and Rust .

Motoko

  • Strongly typed, statically compiled language (also common to Rust)
  • Motoko is purpose-built for the Internet Computer; as such, it will coevolve with the IC itself (unlike Rust).
  • Motoko is designed to be readable and familiar to JavaScript developers (unlike Rust).
  • Motoko aims to be a natural choice for application-layer developers.

Rust

  • Strongly typed, statically compiled language (also common to Motoko).
  • Rust aims to be a natural choice for a broad class of system-layer developers (unlike Motoko).
  • Because of its broad use for systems development, some of its capabilities may not be very useful for developing IC apps.
  • Rust is about 10 years old; consequently, documentation and libraries are mature (unlike Motoko).


In practice, Since Rust is a more mature language, the current experience is that Rust dapps scale more than Motoko-based dapps (the reason has to do with ongoing improvements with Motoko garbage collection). In practice, Motoko is an easier language to use. This is of course not a new trade-off for developers, it is 'analogous' to how C may be a more performant language than Ruby, but Ruby is easier to use.

In short, if you already know Rust, then going with Rust makes sense. If you are a JavaScript developer, Motoko might be an easier path. If you expect extremely high traffic, Rust may be a better option. If you are still learning IC development, Motoko may be a better option while Motoko performance improves under the hood. You can find Motoko examples in the dfinity/examples repository.

Language Resources

Writing the dapp

Getting help

There are few resources where developers get help while building a dapp:

Testing the dapp

Stress testing

The IC does not have a testnet. Instead developers are encouraged to use the mainnet for test purposes. Before developers deploy the production ready dapp to mainnet and enable access to it to the world, they should first deploy test versions to mainnet. They should then stress it by emulating the anticipated load on it and see how well the dapp and the platform in general fares. If the desired performance is not achieved, developers are encouraged to get help by using the links provided above. Often times, the desired performance can be achieved by performing minor tweaks to the dapps.

Deploying the dapp

Launching the dapp to the world

Monitoring the dapp

Lessons from past launches

Motoko version

Motoko is a language being actively developed so it is recommended to use the latest version of Motoko to get maximal stability and performance.

Inter-canister calls[2]

The Internet Computer system provides inter-canister communication that follows the actor model: Inter-canister 'calls' are implemented via two asynchronous 'messages', one to initiate the call, and one to return the response. Canisters process messages atomically (and roll back upon certain error conditions), but not complete calls. This makes programming with inter-canister calls error-prone. Possible common sources for bugs, vulnerabilities or simply unexpected behavior are:

  • Reading global state before issuing an inter-canister call, and assuming it to still hold when the call comes back.
  • Changing global state before issuing an inter-canister call, changing it again in the response handler, but assuming nothing else changes the state in between (reentrancy).
  • Changing global state before issuing an inter-canister call, and not handling failures correctly, e.g. when the code handling the callback rolls backs.

If you find such a pattern in your code, you should analyze if a malicious party can trigger them, and assess the severity that effect. These issues apply to all canisters and are not Motoko-specific.

Rollbacks[2]

Even in the absence of inter-canister calls the behavior of rollbacks can be surprising. In particular, rejecting (i.e. throw) does 'not' rollback state changes done before, while trapping (e.g. Debug.trap, assert…, out of cycle conditions) does.

Therefore, one should check all public update call entry points for unwanted state changes or unwanted rollbacks. In particular, look for methods (or rather, messages, i.e. the code between commit points) where a state change is followed by a throw).

This issues apply to all canisters, and are not Motoko-specific, although other CDKs may not turn exceptions into rejects (which don’t roll back).

Talking to malicious canisters[2]

Talking to untrustworthy canisters can be risky, for the following (likely incomplete) reasons:

  • The other canister can withhold a response. Although the bidirectional messaging paradigm of the Internet Computer was designed to guarantee a response 'eventually', the other party can busy-loop for as long as they are willing to pay for before responding. Worse, there are ways to deadlock a canister.
  • The other canister can respond with invalidly encoded Candid. This will cause a Motoko-implemented canister to trap in the reply handler, with no easy way to recover. Other CDKs may give you better ways to handle invalid Candid, but even then you will have to worry about Candid cycle bombs that will cause your reply handler to trap.

Many canisters do not even do inter-canister calls, or only call other trustworthy canisters. For the others, the impact of this needs to be carefully assessed.

Time is not 'strictly' monotonous[2]

The timestamps for “current time” that the Internet Computer provides to its canisters is guaranteed to be monotonous, but not 'strictly' monotonous. It can return the same values, even in the same messages, as long as they are processed in the same block. It should therefore not be used to detect “happens-before” relations.

Instead of using and comparing timestamps to check whether Y has been performed after X happened last, introduce an explicit var y_done : Bool state, which is set to False by X and then to True by Y. When things become more complex, it will be easier to model that state via an enumeration with speaking tag names, and update this “state machine” along the way.

Another solution to this problem is to introduce a var v : Nat counter that you bump in every update method, and after each await. Now v is your canister’s state counter, and can be used like a timestamp in many ways.

While talking about time: The system time (typically) changes across an await. So if you do {{{1}}} and then await, the value in now may no longer be what you want.

Wrapping arithmetic[2]

The Nat64 data type, and the other fixed-width numeric types provide opt-in wrapping arithmetic (e.g. +%, fromIntWrap). Unless explicitly required by the current application, this should be avoided, as usually a too large or negative value is a serious, unrecoverable logic error, and trapping is the best one can do.

Shadowing of msg or caller[2]

Don’t use the same name for the “message context” of the enclosing actor and the methods of the canister: It is dangerous to write shared(msg) actor, because now msg is in scope across all public methods. As long as these also use public shared(msg) func …, and thus shadow the outer msg, it is correct, but it if one accidentally omits or mistypes the msg, no compiler error would occur, but suddenly msg.caller would now be the original controller, likely defeating an important authorization step.

Instead, write shared(init_msg actor or {{{1}}} actor to avoid using msg.

Possible Attacks

Cycle balance drain attacks[2]

Because of the IC’s “canister pays” model, all canisters are prone to DoS attacks by draining their cycle balance, and this risk needs to be taken into account.

The most elementary mitigation strategy is to monitor the cycle balance of canisters and keep it far from the (configurable) freezing threshold.

On the raw IC-level, further mitigation strategies are possible:

  • If all update calls are authenticated, perform this authentication as quickly as possible, possibly before decoding the caller’s argument. This way, a cycle drain attack by an unauthenticated attacker is less effective (but still possible).
  • Additionally, implementing the canister_inspect_message system method allows the above checks to be performed before the message even is accepted by the Internet Computer. But it does not defend against inter-canister messages and is therefore not a complete solution.
  • If an attack from an authenticated user (e.g. a stakeholder) is to be expected, the above methods are not effective, and an effective defense might require relatively involved additional program logic (e.g. per-caller statistics) to detect such an attack, and react (e.g. rate-limiting).
  • Such defenses are pointless if there is only a single method where they do not apply (e.g. an unauthenticated user registration method). If the application is inherently attackable this way, it is not worth the bother to raise defenses for other methods.

Related: A justification why the Internet Identity does not use canister_inspect_message)

A Motoko-implemented canister currently cannot perform most of these defenses: Argument decoding happens unconditionally before any user code that may reject a message based on the caller, and canister_inspect_message is not supported. Furthermore, Candid decoding is not very cycle defensive, and one should assume that it is possible to construct Candid messages that require many instructions to decode, even for “simple” argument type signatures.

The conclusion for the audited canisters is to rely on monitoring to keep the cycle balance up, even during an attack, if the expense can be born, and maybe pray for IC-level DoS protection to kick in.

Large data attacks[2]

Another DoS attack vector exists if public methods allow untrustworthy users to send data of unlimited size that is persisted in the canister memory. Because of the translation of async-await code into multiple message handlers, this applies not only to data that is obviously stored in global state, but also local data that is live across an await point.

The effectiveness of such attacks is limited by the Internet Computer’s message size limit, which is in the order of a few megabytes, but many of those also add up.

The problem becomes much worse if a method has an argument type that allows a Candid space bomb: It is possible to encode very large vectors with all values null in Candid, so if any method has an argument of type [Null] or [?t], a small message will expand to a large value in the Motoko heap.

Other types to watch out:

  • Nat and Int: This is an unbounded natural number, and thus can be arbitrarily large. The Motoko representation will however not be much larger than the Candid encoding (so this does not qualify as a space bomb).

It is still advisable to check if the number is reasonable in size before storing it or doing an await. For example, when it denotes an index in an array, throw early if it exceeds the size of the array; if it denotes a token amount to transfer, check it against the available balance, if it denotes time, check it against reasonable bounds.

Principal: A Principal is effectively a Blob. The Interface specification says that principals are at most 29 bytes in length, but the Motoko Candid decoder does not check that currently (fixed in the next version of Motoko). Until then, a Principal passed as an argument can be large (the principal in msg.caller is system-provided and thus safe). If you cannot wait for the fix to reach you, manually check the size of the princial (via Principal.toBlob) before doing the await.

References

Template:Reflist