Difference between revisions of "The Internet Computer for Computer Scientists"

From Internet Computer Wiki
Jump to: navigation, search
 
(13 intermediate revisions by one other user not shown)
Line 32: Line 32:
 
of these replicas are faulty and these faults may be Byzantine.
 
of these replicas are faulty and these faults may be Byzantine.
 
(Note that the different subnets in the IC may have different sizes.)
 
(Note that the different subnets in the IC may have different sizes.)
[TODO: say something about static vs adaptive, pro-active?]
 
  
 
Protocols for consensus and for implementing replicated state machines also typically make
 
Protocols for consensus and for implementing replicated state machines also typically make
 
assumptions about the <b>communication model</b>, which characterizes the ability of an adversary  
 
assumptions about the <b>communication model</b>, which characterizes the ability of an adversary  
 
to delay the delivery of messages between replicas.
 
to delay the delivery of messages between replicas.
At opposite ends of the spectrum, we have the following models:
+
At opposite ends of the spectrum, there are the following models:
* In the <b>synchronous model</b>, there exists some known finite time bound <math>\Delta</math>, such that for any message sent, the adversary can delay its delivery by at most <math>\Delta</math>.
+
* In the <b>synchronous model</b>, there exists some known finite time bound <math>\delta</math>, such that for any message sent, it will be delivered in less than time <math>\delta</math>.
 
* In the <b>asynchronous model</b>, for any message sent, the adversary can delay its delivery by any finite amount of time, so that there is no bound on the time to deliver a message; however, each message must <b>eventually</b> be delivered.
 
* In the <b>asynchronous model</b>, for any message sent, the adversary can delay its delivery by any finite amount of time, so that there is no bound on the time to deliver a message; however, each message must <b>eventually</b> be delivered.
  
Line 49: Line 48:
 
Such partial synchrony models can be formulated in various ways.
 
Such partial synchrony models can be formulated in various ways.
 
The partial synchrony assumption used by the IC says, roughly speaking,
 
The partial synchrony assumption used by the IC says, roughly speaking,
that for each subnet, communication among replicas in that subnet is synchronous for short periods of time
+
that for each subnet, communication among replicas in that subnet is periodically synchronous for short intervals of time;
every now and then;
+
moreover, the synchrony bound <math>\delta</math> does not need to be known in advance.
moreover, the synchrony bound <math>\Delta</math> does not need to be known in advance.
 
 
This partial synchrony assumption is only needed to ensure that the consensus protocol makes progress
 
This partial synchrony assumption is only needed to ensure that the consensus protocol makes progress
 
(the so-called liveness property) &mdash; it is not needed to ensure correct behavior of consensus
 
(the so-called liveness property) &mdash; it is not needed to ensure correct behavior of consensus
Line 59: Line 57:
 
on the number of faults is optimal.
 
on the number of faults is optimal.
  
Related topics (TODO):
+
Related topics:
* Canisters
+
* [[IC architecture overview]]
* The NNS
+
* [[IC canisters]]
* ICP tokens
+
* [[IC network nervous system (NNS)]]
* Use of threshold sigs
+
* [[IC threshold signatures and distributed key generation]]
* Deep dive into consensus
+
* [[IC crossnet communication]]
* Deep dive into P2P
+
* [[IC catchup packages]]
* Deep dive into Execution and Message Routing
+
* [[IC fault model]]
* deep dive into CUPs
+
* [[IC tokens]]

Latest revision as of 14:12, 27 February 2023

To a first approximation, the IC (Internet Computer) is a network of replicated state machines.

Each replicated state machine comprises a subnet of replicas. Subnets may communicate with one another, but otherwise they operate (for the most part) independently of each other.

As in any replicated state machine, a series of transaction requests is processed. A transaction request may come from either an external client or from another state machine in the IC. The replicas in a subnet must run a consensus protocol to order the incoming transaction requests, so that each replica processes the transaction requests in the same order. Each replica processes the transaction requests in the agreed-upon order. In processing a transaction request, each replica will update its internal state according to a deterministic function that maps the pair (current state, transaction request) to a new state. Because all replicas in a subnet process transaction requests in the same order, their internal states will evolve over time in exactly the same way. In response to processing a transaction, a subnet may also generate an outgoing message, which can be sent to either an external client or to another state machine in the IC.

The reason for using a replicated state machine, rather than just a single state machine, is to achieve fault tolerance: a subnet should continue to function — and to function correctly — even if some replicas are faulty. Generally in this area, one considers two types of replica failures: crash failure and Byzantine failures. A crash failure occurs when a replica abruptly stops and does not resume. Byzantine failures are failures in which a replica may deviate in an arbitrary way from its prescribed protocol. Moreover, with Byzantine failures, one or possibly several replicas may be directly under the control of a malicious adversary. Of the two types of failures, Byzantine failures are potentially far more disruptive.

Protocols for consensus and for implementing replicated state machines typically make assumptions about how many replicas may be faulty and to to what degree (crash or Byzantine) they may be faulty. In the IC, the assumption is that if a given subnet has [math]\displaystyle{ n }[/math] replicas, then less than [math]\displaystyle{ n/3 }[/math] of these replicas are faulty and these faults may be Byzantine. (Note that the different subnets in the IC may have different sizes.)

Protocols for consensus and for implementing replicated state machines also typically make assumptions about the communication model, which characterizes the ability of an adversary to delay the delivery of messages between replicas. At opposite ends of the spectrum, there are the following models:

  • In the synchronous model, there exists some known finite time bound [math]\displaystyle{ \delta }[/math], such that for any message sent, it will be delivered in less than time [math]\displaystyle{ \delta }[/math].
  • In the asynchronous model, for any message sent, the adversary can delay its delivery by any finite amount of time, so that there is no bound on the time to deliver a message; however, each message must eventually be delivered.


Since the replicas in an IC-subnet are typically distributed around the globe, a synchronous communication model would be highly unrealistic. In this setting, the most robust model is the asynchronous model. Unfortunately, there are no known consensus protocols in this model that are truly practical. So like most other practical Byzantine fault tolerant systems that cannot rely on synchronous communication, the IC opts for a compromise: a partial synchrony communication model. Such partial synchrony models can be formulated in various ways. The partial synchrony assumption used by the IC says, roughly speaking, that for each subnet, communication among replicas in that subnet is periodically synchronous for short intervals of time; moreover, the synchrony bound [math]\displaystyle{ \delta }[/math] does not need to be known in advance. This partial synchrony assumption is only needed to ensure that the consensus protocol makes progress (the so-called liveness property) — it is not needed to ensure correct behavior of consensus (the so-called safety property), nor is it needed anywhere else in the IC protocol stack.

Under the assumption of partial synchrony and Byzantine faults, it is known that our bound of [math]\displaystyle{ f \lt n/3 }[/math] on the number of faults is optimal.

Related topics: