Difference between revisions of "IC state manager"

From Internet Computer Wiki
Jump to: navigation, search
(Created page with "== Overview == The Internet Computer (IC) achieves its security and fault tolerance by replicating computation across node machines located in various independent data center...")
 
Line 13: Line 13:
 
* We do not give a comprehensive type definition of the replicated state in this document, but leave the definition of the relevant parts to the components that modify the state, e.g., [[IC message routing layer|message routing]]. Sometimes we may rely on definitions made there.
 
* We do not give a comprehensive type definition of the replicated state in this document, but leave the definition of the relevant parts to the components that modify the state, e.g., [[IC message routing layer|message routing]]. Sometimes we may rely on definitions made there.
 
* The documentation provided in this page may slightly deviate from the current implementation in terms of API as well as naming of functions, variables, etc. However, it still conveys the high-level ideas required to understand how the component itself works and how it interacts with other components.
 
* The documentation provided in this page may slightly deviate from the current implementation in terms of API as well as naming of functions, variables, etc. However, it still conveys the high-level ideas required to understand how the component itself works and how it interacts with other components.
 +
 +
=== Versioning of State ===
 +
The state is versioned in a way that is in line with the batch numbers of the batches processed by the deterministic state machine. In particular this means that a state labeled with height <math>h</math> is exactly the state resulting from a processing cycle of the deterministic state machine processing a batch h using the replicated state labeled with Height <math>h - 1</math>. For an overview of such a processing cycle, we refer the reader to the page on the [[IC message routing layer|message routing layer]].
 +
 +
== Abstract State Representation ==
 +
An implementation of the state manager distinguishes multiple representations of replicated state. In this section we give an abstract description of the various representations and discuss the requirements we need to impose on their relations.
 +
 +
* <code>ReplicatedState</code>, which is the implementation’s internal representation of replicated state.
 +
 +
* <code>(Partial)CanonicalState</code> which is a representation of (parts of) replicated state, universally understandable across implementations.
 +
 +
* <code>StateHashTree</code> which is a structured digest of (parts of) canonical state, universally understandable across implementations.
 +
 +
The relation between the different representations is subsumed below. The space of instances of type <code>ReplicatedState</code> can be partitioned into multiple equivalence classes. The criterion for whether two <code>ReplicatedState</code> instances belong to the same equivalence class is whether or not they map to the same <code>CanonicalState</code>. More formally, we use the equivalence relation <math>R</math> defined as follows:
 +
 +
<math>
 +
(x, y) \in R \Longleftrightarrow \texttt{to_canonical}(x) = \texttt{to_canonical}(y).
 +
</math>
 +
 +
This equivalence relation defines a type <code>ReplicatedState_R</code> that contains all the equivalence classes <code>[x]</code> over <code>ReplicatedState</code> with respect to <math>R</math>, i.e.,
 +
 +
<math>
 +
[x] = \{ y~|~y ∈ \texttt{ReplicatedState}~∧~(x, y) ∈ R  \}.
 +
</math>
 +
 +
[[File:State-types.svg|thumb]]

Revision as of 07:46, 3 November 2022

Overview

The Internet Computer (IC) achieves its security and fault tolerance by replicating computation across node machines located in various independent data centers across the world. For scalability reasons, the Internet Computing Protocol (ICP) composes the IC of multiple independent subnets. Each subnet can be viewed as an independent replicated state machine that replicates its state over a subset of all the available nodes.

Roughly speaking, replication is achieved by having the two lower ICP layers (P2P & Consensus) agree on blocks containing batches of messages to be executed, and then having the two upper ICP layers (Message Routing & Execution) execute them. Blocks are organized as a chain, where each block builds on the previous block. Each Block has an associated height in the chain and one can look at execution of a batch of messages corresponding to the agreed upon Block at height [math]\displaystyle{ x }[/math] by the upper layers as taking the replicated state of version [math]\displaystyle{ x-1 }[/math], and "applying" the batch to it to obtain replicated state of version [math]\displaystyle{ x }[/math].

The state manager is the component that maintains a versioned repository of replicated state and takes care of its certification so that other stakeholders can verify the authenticity of (parts of) the state. This page presents an abstract model of the state manager and the replicated state, and formally defines how the abstract model and the implementation need to relate to each other. We also rely on this model to define the behavior of other components that modify the replicated state, e.g., message routing.

The state manager also provides capabilities to sync state with other replicas in the same subnet so that replicas that have fallen behind can catch up. The latter is not yet covered in this page.

Remarks and Required Prior Knowledge

  • The goal of this document is to provide the next level of detail compared to the material available for message routing in the "How it works" section of internetcomputer.org. So it is recommended to study the material available there first.
  • We do not give a comprehensive type definition of the replicated state in this document, but leave the definition of the relevant parts to the components that modify the state, e.g., message routing. Sometimes we may rely on definitions made there.
  • The documentation provided in this page may slightly deviate from the current implementation in terms of API as well as naming of functions, variables, etc. However, it still conveys the high-level ideas required to understand how the component itself works and how it interacts with other components.

Versioning of State

The state is versioned in a way that is in line with the batch numbers of the batches processed by the deterministic state machine. In particular this means that a state labeled with height [math]\displaystyle{ h }[/math] is exactly the state resulting from a processing cycle of the deterministic state machine processing a batch h using the replicated state labeled with Height [math]\displaystyle{ h - 1 }[/math]. For an overview of such a processing cycle, we refer the reader to the page on the message routing layer.

Abstract State Representation

An implementation of the state manager distinguishes multiple representations of replicated state. In this section we give an abstract description of the various representations and discuss the requirements we need to impose on their relations.

  • ReplicatedState, which is the implementation’s internal representation of replicated state.
  • (Partial)CanonicalState which is a representation of (parts of) replicated state, universally understandable across implementations.
  • StateHashTree which is a structured digest of (parts of) canonical state, universally understandable across implementations.

The relation between the different representations is subsumed below. The space of instances of type ReplicatedState can be partitioned into multiple equivalence classes. The criterion for whether two ReplicatedState instances belong to the same equivalence class is whether or not they map to the same CanonicalState. More formally, we use the equivalence relation [math]\displaystyle{ R }[/math] defined as follows:

[math]\displaystyle{ (x, y) \in R \Longleftrightarrow \texttt{to_canonical}(x) = \texttt{to_canonical}(y). }[/math]

This equivalence relation defines a type ReplicatedState_R that contains all the equivalence classes [x] over ReplicatedState with respect to [math]\displaystyle{ R }[/math], i.e.,

[math]\displaystyle{ [x] = \{ y~|~y ∈ \texttt{ReplicatedState}~∧~(x, y) ∈ R \}. }[/math]

State-types.svg