Difference between revisions of "IC state manager"

From Internet Computer Wiki
Jump to: navigation, search
Line 39: Line 39:
  
 
[[File:State-types.svg|thumb]]
 
[[File:State-types.svg|thumb]]
 +
 +
An object of type <code>ReplicatedState</code>, i.e., a representative of one of the equivalence classes in <code>ReplicatedState_R</code>, is created whenever the deterministic state machine implemented by the message routing and the execution layers calls the <code>commit_and_certify</code> function on the state manager. This state can then be internally converted to <code>CanonicalState</code> to be provided to other replicas within the state sync protocol or to be further converted to <code>PartialCanonicalState</code> and, e.g., made available to block makers on other subnets who need to include the subnet-to-subnet streams in blocks.
 +
 +
To have a compact and easy to transfer representation that is useful to verify the integrity or compare different versions of <code>(Partial)CanonicalState</code>, it can be converted into a <code>StateHashTree</code>. The conversion function is a mapping that needs to be collision resistant, meaning that, for some <code>cs1 : (Partial)CanonicalState</code> corresponding to some <code>h : StateHashTree</code> it is intractable to find a different <code>cs2 : (Partial)CanonicalState</code> that hashes to the same <code>h</code>. Intuitively, this means that a replica that gets hold of some <code>h : StateHashTree</code> can verify whether a certain piece of <code>cs : (Partial)CanonicalState</code> actually corresponds to the respective <code>h</code>. If <code>h</code> is authentic, one can use this mechanism to verify the authenticity of <code>c</code>.
 +
 +
From a <code>StateHashTree</code>, one can derive a <code>Digest</code>, representing a <code>StateHashTree</code> even more compactly, as well as a <code>Witness</code>. Deriving the latter requires additional information on the <code>(Partial)CanonicalState</code> the witness should be valid for, i.e., so that one can use the <code>Witness</code> and the <code>(Partial)CanonicalState</code> to recompute the <code>Digest</code> corresponding to the original <code>StateHashTree</code>.

Revision as of 07:54, 3 November 2022

Overview

The Internet Computer (IC) achieves its security and fault tolerance by replicating computation across node machines located in various independent data centers across the world. For scalability reasons, the Internet Computing Protocol (ICP) composes the IC of multiple independent subnets. Each subnet can be viewed as an independent replicated state machine that replicates its state over a subset of all the available nodes.

Roughly speaking, replication is achieved by having the two lower ICP layers (P2P & Consensus) agree on blocks containing batches of messages to be executed, and then having the two upper ICP layers (Message Routing & Execution) execute them. Blocks are organized as a chain, where each block builds on the previous block. Each Block has an associated height in the chain and one can look at execution of a batch of messages corresponding to the agreed upon Block at height [math]\displaystyle{ x }[/math] by the upper layers as taking the replicated state of version [math]\displaystyle{ x-1 }[/math], and "applying" the batch to it to obtain replicated state of version [math]\displaystyle{ x }[/math].

The state manager is the component that maintains a versioned repository of replicated state and takes care of its certification so that other stakeholders can verify the authenticity of (parts of) the state. This page presents an abstract model of the state manager and the replicated state, and formally defines how the abstract model and the implementation need to relate to each other. We also rely on this model to define the behavior of other components that modify the replicated state, e.g., message routing.

The state manager also provides capabilities to sync state with other replicas in the same subnet so that replicas that have fallen behind can catch up. The latter is not yet covered in this page.

Remarks and Required Prior Knowledge

  • The goal of this document is to provide the next level of detail compared to the material available for message routing in the "How it works" section of internetcomputer.org. So it is recommended to study the material available there first.
  • We do not give a comprehensive type definition of the replicated state in this document, but leave the definition of the relevant parts to the components that modify the state, e.g., message routing. Sometimes we may rely on definitions made there.
  • The documentation provided in this page may slightly deviate from the current implementation in terms of API as well as naming of functions, variables, etc. However, it still conveys the high-level ideas required to understand how the component itself works and how it interacts with other components.

Versioning of State

The state is versioned in a way that is in line with the batch numbers of the batches processed by the deterministic state machine. In particular this means that a state labeled with height [math]\displaystyle{ h }[/math] is exactly the state resulting from a processing cycle of the deterministic state machine processing a batch h using the replicated state labeled with Height [math]\displaystyle{ h - 1 }[/math]. For an overview of such a processing cycle, we refer the reader to the page on the message routing layer.

Abstract State Representation

An implementation of the state manager distinguishes multiple representations of replicated state. In this section we give an abstract description of the various representations and discuss the requirements we need to impose on their relations.

  • ReplicatedState, which is the implementation’s internal representation of replicated state.
  • (Partial)CanonicalState which is a representation of (parts of) replicated state, universally understandable across implementations.
  • StateHashTree which is a structured digest of (parts of) canonical state, universally understandable across implementations.

The relation between the different representations is subsumed below. The space of instances of type ReplicatedState can be partitioned into multiple equivalence classes. The criterion for whether two ReplicatedState instances belong to the same equivalence class is whether or not they map to the same CanonicalState. More formally, we use the equivalence relation [math]\displaystyle{ R }[/math] defined as follows:

[math]\displaystyle{ (x, y) \in R \Longleftrightarrow \texttt{to_canonical}(x) = \texttt{to_canonical}(y). }[/math]

This equivalence relation defines a type ReplicatedState_R that contains all the equivalence classes [x] over ReplicatedState with respect to [math]\displaystyle{ R }[/math], i.e.,

[math]\displaystyle{ [x] = \{ y~|~y ∈ \texttt{ReplicatedState}~∧~(x, y) ∈ R \}. }[/math]

State-types.svg

An object of type ReplicatedState, i.e., a representative of one of the equivalence classes in ReplicatedState_R, is created whenever the deterministic state machine implemented by the message routing and the execution layers calls the commit_and_certify function on the state manager. This state can then be internally converted to CanonicalState to be provided to other replicas within the state sync protocol or to be further converted to PartialCanonicalState and, e.g., made available to block makers on other subnets who need to include the subnet-to-subnet streams in blocks.

To have a compact and easy to transfer representation that is useful to verify the integrity or compare different versions of (Partial)CanonicalState, it can be converted into a StateHashTree. The conversion function is a mapping that needs to be collision resistant, meaning that, for some cs1 : (Partial)CanonicalState corresponding to some h : StateHashTree it is intractable to find a different cs2 : (Partial)CanonicalState that hashes to the same h. Intuitively, this means that a replica that gets hold of some h : StateHashTree can verify whether a certain piece of cs : (Partial)CanonicalState actually corresponds to the respective h. If h is authentic, one can use this mechanism to verify the authenticity of c.

From a StateHashTree, one can derive a Digest, representing a StateHashTree even more compactly, as well as a Witness. Deriving the latter requires additional information on the (Partial)CanonicalState the witness should be valid for, i.e., so that one can use the Witness and the (Partial)CanonicalState to recompute the Digest corresponding to the original StateHashTree.