|
|
(33 intermediate revisions by 5 users not shown) |
Line 1: |
Line 1: |
− | The job of the consensus layer of the IC is to order transaction requests so that all replicas in a subnet will process
| + | Please see [https://internetcomputer.org/how-it-works/consensus/ How Consensus Layer works] |
− | such requests in the same order.
| |
− | There are many protocols in the literature for this problem.
| |
− | The IC uses a new consensus protocol, which is described here at a high level.
| |
− | For more details, see the paper https://eprint.iacr.org/2021/1330 (in particular, Protocol ICC1 in that paper).
| |
− | | |
− | Any secure consensus protocol should guarantee two properties, which (roughly stated) are:
| |
− | * <b>safety</b>: all replicas in fact agree on the same ordering of transaction requests, and
| |
− | * <b>liveness</b>: all replicas should make steady progress.
| |
− | The paper https://eprint.iacr.org/2021/1330 proves that the IC consensus protocol satisfies both of these properties
| |
− | | |
− | The IC consensus protocol is design to be
| |
− | * extremely simple, and
| |
− | * robust (performance degrades gracefully when some replicas are malicious).
| |
− | | |
− | As discussed [[The Internet Computer for Computer Scientists|in the introduction]], we assume
| |
− | * a subnet of <math>n</math> replicas, and
| |
− | * at most <math>f < n/3</math> of the replicas are faulty.
| |
− | Faulty replicas may exhibit arbitrary, malicious (i.e., Byzantine) behavior.
| |
− | To simplify the presentation, we assume here that <math>n=3f+1</math>.
| |
− | | |
− | | |
− | We assume that communication is <b>asynchronous</b>, with no <i>a priori</i>
| |
− | bound on the delay of messages sent between replicas.
| |
− | In fact, the scheduling of message delivery may be completely under adversarial control.
| |
− | The IC consensus protocol guarantees safety under this very weak communication assumption.
| |
− | However, to guarantee liveness, we need to assume a form of <b>partial synchrony</b>,
| |
− | which (roughly stated) says that the network will be periodically <b>timely</b> .
| |
− | Somewhat more precisely, there exists a bound <math>\Delta</math> such that
| |
− | periodically all undelivered
| |
− | messages will be delivered within time <math>\Delta</math>.
| |
− | The bound <math>\Delta</math> does not have to be known in advance
| |
− | (the protocol is initialized with a bound, but will dynamically
| |
− | adapt and increase this bound if it is too small).
| |
− | Regardless of whether we are assuming an asynchronous
| |
− | or a partially synchronous network,
| |
− | we assume that every message sent from one honest
| |
− | party to another will <i>eventually</i> be delivered.
| |
− | | |
− | Like a number of consensus protocols,
| |
− | the IC consensus protocol is based in a [[wikipedia:blockchain|blockchain]].
| |
− | As the protocol progresses, a tree of blocks is grown,
| |
− | starting from a special <i>genesis block</i> that is the root of the tree.
| |
− | Each non-genesis block in the tree contains (among other things)
| |
− | a <i>payload</i>, consisting of a sequence of transaction requests,
| |
− | and a hash of the block's parent in the tree.
| |
− | The honest replicas have a consistent view of this tree:
| |
− | while each replica may have a different, partial view
| |
− | of this tree, all the replicas have a view of the <i>same</i> tree.
| |
− | In addition, as the protocol progresses,
| |
− | there is always a path of <i>finalized</i> blocks in this tree.
| |
− | Again, the honest replicas have a consistent view of this path:
| |
− | while each replica may have a different, partial view
| |
− | of this path, all the parties have a view of the <i>same</i> path.
| |
− | The transaction requests in the payloads of the blocks along this
| |
− | path are the ordered transaction requests will be processed by the [[IC execution layer|execution layer]]
| |
− | of the Internet Computer.
| |
− | | |
− | The protocol proceeds in <b>rounds</b>.
| |
− | In round <math>h</math> of the protocol,
| |
− | one or more blocks of <b>height</b> <math>h</math> are added to the tree.
| |
− | That is, the blocks added in round <math>h</math> are always at a distance
| |
− | of exactly <math>h</math> from the root.
| |
− | In each round, a pseudo-random process is used to assign
| |
− | each replica a unique <b>rank</b>, which is an
| |
− | integer in the range <math>0,\ldots,h-1</math>.
| |
− | This is pseudo-random process is implemented using a <b>random beacon</b> (see below).
| |
− | The party of lowest rank is the <b>leader</b> of that round.
| |
− | When the leader is honest and the network is timely,
| |
− | the leader will propose a block, which will be added to the tree;
| |
− | moreover, this will be the <i>only</i> block added to the tree
| |
− | in this round and it will extend the finalized path.
| |
− | If the leader is not honest or the network is not timely,
| |
− | some other parties of higher rank may also propose blocks,
| |
− | and also have their blocks added to the tree.
| |
− | In any case, the logic of the protocol gives highest priority
| |
− | to the leader's proposed block
| |
− | and <i>some</i> block or blocks will be added to this tree in this round.
| |
− | | |
− | When the network is timely,
| |
− | the message complexity in a given round of the IC consensus protocol is <math>O(n^2)</math>
| |
− | with overwhelming probability.
| |
− | The protocol also enjoys the property known as <b>optimistic responsiveness</b>,
| |
− | which means that
| |
− | so long as the leader is honest, the protocol will proceed at the pace
| |
− | of the actual network delay, rather than some upper bound on
| |
− | the network delay.
| |
− | | |
− | | |
− | To implement the protocol, each replica is associated with a
| |
− | public key for the [[wikipedia:BLS_digital_signature|BLS signature scheme]], and each replica also holds
| |
− | the corresponding secret signing key.
| |
− | The association of replicas to public keys is maintained by the
| |
− | [[IC network nervous system (NNS)|network nervous system (NNS) of the Internet Computer]].
| |
− | These BLS signatures will be used to authenticate messages, also called <b>artifacts</b>,
| |
− | sent by replicas.
| |
− | The protocol also uses the <b>signature aggregation</b> feature of BLS signatures,
| |
− | which allows many signatures on the same message to be aggregated into
| |
− | a compact multi-signature.
| |
− | FIXME: citation, discussion of rogue key attack mitigation via PoPs.
| |
− | | |
− | <b>Random beacon.</b>
| |
− | In addition to BLS signatures and multi-signatures as discussed above,
| |
− | the protocol makes use of a BLS [[wikipedia:Threshold_cryptosystem|threshold signature scheme]] to implement the
| |
− | above-mentioned <b>random beacon</b>.
| |
− | The random beacon for height <math>h</math> is a <math>(f+1)</math>-threshold signature
| |
− | on a message unique to height <math>h</math>.
| |
− | In each round of the protocol, each replica broadcasts its share of the beacon for
| |
− | the next round, so that when the next round begins, all replicas should have enough shares
| |
− | to reconstruct the beacon for that round.
| |
− | As discussed above, the random beacon at height <math>h</math> is used to
| |
− | assign a pseudo-random rank to each replica that will be used in round <math>h</math> of the protocol.
| |
− | Because of the security properties of the threshold signature,
| |
− | an adversary will not be able to predict the ranking of the replicas more than one round in advance,
| |
− | and these rankings will effectively be as good as random.
| |
− | To set up such a threshold signature scheme, the Internet Computer runs a
| |
− | [[wikipedia:Distributed_key_generation|distributed key generation (DKG)]] protocol.
| |
− | There are many DKG protocols in the literature.
| |
− | The Internet Computer implements a new [[non-interactive DKG (NiDKG) protocol]].
| |
− | | |
− | | |
− | <b>Block making.</b>
| |
− | Each replica may at different points in time play the role of a <b>block maker</b>.
| |
− | As a block maker in round <math>h</math>,
| |
− | the replica proposes a block <math>B</math> of height <math>h</math>
| |
− | that to be child of a block <math>B'</math> of height <math>h-1</math>
| |
− | in the tree of blocks.
| |
− | To do this, the block maker first gathers together a <b>payload</b>
| |
− | consisting of all transaction requests it knows about
| |
− | (but not including those already included in payloads in blocks in the
| |
− | path through the tree ending at <math>B'</math>).
| |
− | The block <math>B</math> consists of
| |
− | * the payload,
| |
− | * the hash of <math>B'</math>,
| |
− | * the rank of the block maker,
| |
− | * the height <math>h</math> of the block.
| |
− | After forming the block <math>B</math>, the block maker forms
| |
− | a <b>block proposal</b>, consisting of
| |
− | * the block <math>B</math>,
| |
− | * the block maker's identity <math>\textit{BMID}</math>, and
| |
− | * the block maker's signature on <math>B</math>.
| |
− | A block maker will broadcast its block proposal to all other replicas.
| |
− | | |
− | <b>Notarization.</b>
| |
− | A block is only added to the tree of blocks when it becomes <b>notarized</b>.
| |
− | For a block to become notarized, it must be notarized by <math>2f+1</math>
| |
− | distinct replicas.
| |
− | To notarize a proposed block <math>B</math> at height <math>h</math>,
| |
− | a replica will verify a number of conditions,
| |
− | and if these tests pass, the replica will notarize the block
| |
− | by broadcasting a <b>notarization share</b> for <math>B</math>,
| |
− | consisting of
| |
− | * the hash of <math>B</math>,
| |
− | * the height <math>h</math> of <math>B</math>,
| |
− | * the identity <math>\textit{NID}</math> of the notarizing replica, and
| |
− | * the notarizing replica's signature on a message comprising the hash of <math>B</math> and the height <math>h</math>.
| |
− | Any set of <math>2f+1</math> notarization shares on <math>B</math>
| |
− | may be aggregated together to form a <b>notarization</b> for <math>B</math>,
| |
− | consisting of
| |
− | * the hash of <math>B</math>,
| |
− | * the height <math>h</math> of <math>B</math>,
| |
− | * a set of <math>2f+1</math> distinct identities <math>NIDSet</math> of the notarizing replicas, and
| |
− | * an aggregate signature of the notarizing replicas in <math>NIDSet</math> on a message comprising the hash of <math>B</math> and the height <math>h</math>.
| |
− | Each replica will finish round <math>h</math>
| |
− | as soon as it obtains a notarized block of height <math>h</math>,
| |
− | and will not subsequently issue any notarization shares at height <math>h</math>.
| |
− | | |
− | <b>Finalization.</b>
| |
− | There may be more than one notarized block at a given height <math>h</math>.
| |
− | However, if a block is <b>finalized</b>, then we can be sure that there
| |
− | is no other notarized block at height <math>h</math>.
| |
− | Let us call this the <b>finalization invariant</b>.
| |
− | One consequence of the finalization invariant is the following.
| |
− | Suppose one replica sees a finalized block <math>B</math> at height <math>h</math>,
| |
− | and that another replica sees a finalized block
| |
− | <math>B'</math> at height <math>h' \ge h</math>.
| |
− | Then the finalization invariant implies that the path in the tree of notarized blocks
| |
− | ending at <math>B</math> is a prefix of the path ending at <math>B</math>
| |
− | (if not, then there would be two notarized blocks at height <math>h</math>,
| |
− | contradicting the finalization invariant).
| |
− | | |
− | For a block to become finalized, it must be finalized by <math>2f+1</math>
| |
− | distinct replicas.
| |
− | Recall that round <math>h</math> ends for a replica when
| |
− | it obtains a notarized block <math>B</math> of height <math>h</math>.
| |
− | At that point in time, such a replica will broadcast a
| |
− | <b>finalization share</b> for <math>B</math> if it has not previously
| |
− | issued a notarization share for any block other than <math>B</math>.
| |
− | A finalization share has exactly the same format as a notarization
| |
− | (but are tagged in such a way notarization shares
| |
− | and finalization shares cannot be confused with one another).
| |
− | Any set of <math>2f+1</math> finalization shares on <math>B</math>
| |
− | may be aggregated together to form a <b>finalization</b> for <math>B</math>,
| |
− | which has exactly the same format as a notarization.
| |
− | | |
− | <!--
| |
− | One can easily see that the logic for generating finalization shares
| |
− | implies that the above finalization invariant must hold.
| |
− | Here is a proof:
| |
− | # Suppose that block <math>B</math> is finalized but another block <math>B'</math> is also notarized.
| |
− | # Suppose that the number of corrupt parties is exactly <math>f' \le f</math>.
| |
− | # Let <math>N</math> be the set of honest replicas that notarized <math>B</math> and let <math>N'</math> be the set of honest replicas that notarized <math>B'</math>.
| |
− | # By the security properties of aggregate signatures, each of the sets <math>N</math> and <math>N'</math> must have size at least at least <math>(2f+1)-f'</math>.
| |
− | # A finalization on <math>B</math> means a set <math>S</math> of <math>(2f+1)-f=f+1</math> honest replicas issued a finalization share on <math>B</math> (by the security properties of BLS signature aggregation).
| |
− | # Each replica in <math>S</math> issued a notarization share on no other block besides <math>B</math>.
| |
− | -->
| |