add sips with stubs and links to new loc

This commit is contained in:
Jason Schrader
2021-02-15 09:39:56 -07:00
parent 5878d46bca
commit 30b3715319
9 changed files with 26 additions and 8058 deletions

@@ -1,848 +1,5 @@
# SIP-000 Stacks Improvement Proposal Process
# Preamble
This document formerly contained SIP-000 before the Stacks 2.0 mainnet launched.
Title: Stacks Improvement Proposal Process
Author: Ken Liao <yukanliao@gmail.com>, Jude Nelson <jude@blockstack.com>
Status: Draft
Consideration: Governance
Type: Meta
Created: 2020-06-23
License: BSD-2-Clause
Sign-off:
# Abstract
A Stacks Improvement Proposal (SIP) is a design document that provides
information to the greater Stacks ecosystem's participants concerning the design
of the Stacks blockchain and its ongoing operation. Each SIP shall provide a
clear and concise description of features, processes, and/or standards for the
Stacks blockchain and its operators to adopt, with sufficient details provided
such that a reasonable practitioner may use the document to create an
independent but compatible implementation of the proposed improvement.
SIPs are the canonical medium by which new features are proposed and described,
and by which input from the Stacks ecosystem participants is collected. The SIP
Ratification Process is also described in this document, and provides the means
by which SIPs may be proposed, vetted, edited, accepted, rejected, implemented,
and finally incorporated into the Stacks blockchain's design, governance, and
operational procedures. The set of SIPs that have been ratified shall
sufficiently describe the design, governance, and operationalization of the
Stacks blockchain, as well as the means by which future changes to its official
design, implementation, operation, and governance may be incorporated.
# License and Copyright
This SIP is made available under the terms of the BSD-2-Clause license,
available at https://opensource.org/licenses/BSD-2-Clause. This SIP's copyright
is held by the Stacks Open Internet Foundation.
# Specification
Each SIP shall adhere to the same general formatting and shall be ratified
through the processes described by this document.
## Introduction
Blockchains are unique among distributed systems in that they also
happen to encode a social contract. By running a blockchain node, a user
implicitly agrees to be bound to the social contract's terms embedded within the
blockchain's software. These social contracts are elaborate constructions that
contain not only technical terms (e.g. "a block may be at most 1MB"), but also
economic terms (e.g. "only 21 million tokens may exist") and social terms (e.g.
"no money can leave this account" or "this transaction type was supported
before, but will now be ignored by the system") which the user agrees to uphold
by running a blockchain node.
It stands to reason that the Stacks blockchain is made of more than just
software; it is also made of the people who run it. As such, the act of
developing and managing the Stacks blockchain network includes the act of
helping its people coordinate and agree on what the blockchain is and what it
should do. To this end, this document proposes a process by which the Stacks
blockchain's users can conduct themselves to be the stewards of the blockchain
network in perpetuity.
The goals of this process are to ensure that anyone may submit a SIP in good
faith, that each SIP will receive fair and speedy good-faith consideration by
other people with the relevant expertise, and that any discussions and
decision-making on each SIP's ratification shall happen in public. To achieve
these ends, this document proposes a standard way of presenting a Stacks
Improvement Proposal (SIP), and a standard way of ratifying one.
Each SIP document contains all of the information needed to propose a
non-trivial change to the way in which the Stacks blockchain operates. This
includes both technical considerations, as well as operational and governance
considerations. This document proposes a formal document structure based on both
request-for-comments (RFC) practices in the Internet Engineering Task Force
(IETF), as well as existing blockchain networks.
SIPs must be ratified in order to be incorporated into the definition of what
the Stacks blockchain is, what it does, and how it operates. This document
proposes a ratification process based on existing governance processes from
existing open source projects (including Python, Bitcoin, Ethereum, and Zcash),
and makes provisions for creating and staffing various roles that people must
take on to carry out ratification (e.g. committees, editors, working groups and
so on).
This document uses the word “users” to refer specifically to people who
participate in the greater Stacks ecosystem. This includes, but is not limited
to, people who mine blocks, people who contribute code, people who run nodes,
people who develop applications that rely on the Stacks blockchain, people who
use such applications, people involved in the project governance, and people
involved in operating software deployments.
## SIP Format
All SIPs shall be formatted as markdown files. Each section shall be
annotated as a 2nd-level header (e.g. `##`). Subsections may be added with
lower-level headers.
Each SIP shall contain the following sections, in the given order:
- _Preamble_. This section shall provide fields useful for categorizing the SIP.
The required fields in all cases shall be:
- _SIP Number_. Each SIP receives a unique number once it has been accepted
for consideration for ratification (see below). This number is assigned to
a SIP; its author does not provide it.
- _Title_. A concise description of the SIP, no more than 20 words long.
- _Author_. A list of names and email addresses of the SIP's author(s).
- _Consideration_. What class of SIP this is (see below).
- _Type_. The SIP track for consideration (see below).
- _Status_. This SIP's point in the SIP workflow (see below).
- _Created_. The ISO 8601 date when this SIP was created.
- _License_. The content license for the SIP (see below for permitted
licenses).
- _Sign-off_. The list of relevant persons and their titles who have worked to
ratify the SIP. This field is not filled in entirely until ratification,
but is incrementally filled in as the SIP progresses through the ratification
process.
- Additional SIP fields, which are sometimes required, include:
- _Layer_. The logical layer of the Stacks blockchain affected. Must be one
of the following:
- _Consensus (soft fork)_. For backwards-compatible proposals for
transaction-processing.
- _Consensus (hard fork)_. For backwards-incompatible proposals for
transaction-processing.
- _Peer Services_. For proposals to the peer-to-peer network protocol
stack.
- _API/RPC_. For proposals to the Stacks blockchain's official
programmatic interfaces.
- _Traits_. For proposals for new standardized Clarity trait definitions.
- _Applications_. For proposals for standardized application protocols
that interface with the Stacks blockchain.
- _Discussions-To_. A mailing list where ongoing discussion of the SIP takes
place.
- _Comments-Summary_. The summary tone of the comments.
- _Comments-URI_. A link to the Stacks blockchain wiki for comments.
- _License-Code_. Abbreviation for code under a different license than the SIP
proposal.
- _Post-History_. Dates of posting the SIP to the Stacks mailing list, or a
link to a thread with the mailing list.
- _Requires_. A list of SIPs that must be implemented prior to this SIP.
- _Replaces_. A list of SIPs that this SIP replaces.
- _Superseded-By_. A list of SIPs that replace this SIP.
- _Abstract_. This section shall provide a high-level summary of the proposed
improvement. It shall not exceed 5000 words.
- _Copyright_. This section shall provide the copyright license that governs the
use of the SIP content. It must be one of the approved set of licenses (see
below).
- _Introduction_. This section shall provide a high-level summary of the
problem(s) that this SIP proposes to solve, as well as a high-level
description of how the proposal solves them. This section shall emphasize its
novel contributions, and briefly describe how they address the problem(s). Any
motivational arguments and example problems and solutions belong in this
section.
- _Specification_. This section shall provide the detailed technical
specification. It may include code snippets, diagrams, performance
evaluations, and other supplemental data to justify particular design decisions.
However, a copy of all external supplemental data (such as links to research
papers) must be included with the SIP, and must be made available under an
approved copyright license.
- _Related Work_. This section shall summarize alternative solutions that address
the same or similar problems, and briefly describe why they are not adequate
solutions. This section may reference alternative solutions in other blockchain
projects, in research papers from academia and industry, other open-source
projects, and so on. This section must be accompanied by a bibliography of
sufficient detail such that someone reading the SIP can find and evaluate the
related works.
- _Backwards Compatibility_. This section shall address any
backwards-incompatibility concerns that may arise with the implementation of
this SIP, as well as describe (or reference) technical mitigations for breaking
changes. This section may be left blank for non-technical SIPs.
- _Activation_. This section shall describe the timeline, falsifiable criteria,
and process for activating the SIP once it is ratified. This applies to both
technical and non-technical SIPs. This section is used to unambiguously
determine whether or not the SIP has been accepted by the Stacks users once it
has been submitted for ratification (see below).
- _Reference Implementations_. This section shall include one or more references
to one or more production-quality implementations of the SIP, if applicable.
This section is only informative — the SIP ratification process is independent
of any engineering processes (or other processes) that would be followed to
produce implementations. If a particular implementation process is desired,
then a detailed description of the process must be included in the Activation
section. This section may be updated after a SIP is ratified in order to
include an up-to-date listing of any implementations or embodiments of the SIP.
Additional sections may be included as appropriate.
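To illustrate the preamble format described above, a hypothetical Standard-type
SIP might begin as follows (the SIP number, title, author, and date are
invented for this example):

```markdown
SIP Number: 123
Title: Example Improvement to Peer Discovery
Author: Jane Doe <jane.doe@example.com>
Consideration: Technical
Type: Standard
Layer: Peer Services
Status: Draft
Created: 2021-02-01
License: BSD-2-Clause
Sign-off:
```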
### Supplemental Materials
A SIP may include any supplemental materials as
appropriate (within reason), but all materials must have an open format
unencumbered by legal restrictions. For example, a LibreOffice `.odp`
slide-deck file may be submitted as supplementary material, but not a Keynote
`.key` file.
When submitting the SIP, supplementary materials must be present within the same
directory, and must be named as `SIP-XXXX-YYY.ext`, where:
- `XXXX` is the SIP number,
- `YYY` is the serial number of the file, starting with 1,
- `.ext` is the file extension.
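The naming scheme above can be checked mechanically. The following sketch is
illustrative only (this document does not specify digit widths for `XXXX` or
`YYY`, so the pattern below accepts any run of digits):

```python
import re

# Pattern for the supplemental-material naming scheme "SIP-XXXX-YYY.ext".
# Digit widths are not fixed by the SIP process document, so any run of
# digits is accepted for the SIP number and the serial number.
SUPPLEMENT_RE = re.compile(r"^SIP-(\d+)-(\d+)\.(\w+)$")

def is_valid_supplement_name(filename: str) -> bool:
    """Return True if the file name follows the SIP-XXXX-YYY.ext scheme."""
    m = SUPPLEMENT_RE.match(filename)
    # The serial number starts with 1, so a serial of 0 is rejected.
    return m is not None and int(m.group(2)) >= 1

print(is_valid_supplement_name("SIP-0012-001.odp"))  # True
print(is_valid_supplement_name("SIP-12.key"))        # False
```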
## SIP Types
The types of SIPs are as follows:
- _Consensus_. This SIP type means that all Stacks blockchain implementations
would need to adopt this SIP to remain compatible with one another. If this is
the SIP type, then the SIP preamble must have the Layer field set to either
_Consensus (soft fork)_ or _Consensus (hard fork)_.
- _Standard_. This SIP type means that the proposed change affects one or more
implementations, but does not affect network consensus. If this is the SIP
type, then the SIP preamble must have the Layer field set to indicate which
aspect(s) of the Stacks blockchain are affected by the proposal.
- _Operation_. This SIP type means that the proposal concerns the operation of the
Stacks blockchain -- in particular, it concerns node operators and miners.
The difference between this SIP type and the Standard type is that this type
does not change any existing protocols.
- _Meta_. This SIP type means that the proposal concerns the SIP ratification
process. Such a SIP is a proposal to change the way SIPs are handled.
- _Informational_. This is a SIP type that provides useful information, but does
not require any action to be taken on the part of any user.
New types of SIPs may be created with the ratification of a Meta-type SIP under
the governance consideration (see below). SIP types may not be removed.
## SIP Considerations
A SIP's consideration determines the particular steps needed to ratify the SIP
and incorporate it into the Stacks blockchain. Different SIP considerations have
different criteria for ratification. A SIP can have more than one consideration,
since a SIP may need to be vetted by different users with different domains of
expertise.
- _Technical_. The SIP is technical in nature, and must be vetted by users with
the relevant technical expertise.
- _Economic_. The SIP concerns the blockchain's token economics. This not only
includes the STX token, but also any on-chain tokens created within smart
contracts. SIPs that are concerned with fundraising methods, grants, bounties,
and so on also belong in this SIP track.
- _Governance_. The SIP concerns the governance of the Stacks blockchain,
including the SIP process. This includes amendments to the SIP Ratification
Process, as well as structural considerations such as the creation (or removal)
of various committees, editorial bodies, and formally recognized special
interest groups. In addition, governance SIPs may propose changes to the way by
which committee members are selected.
- _Ethics_. This SIP concerns the behaviors of office-holders in the SIP
Ratification Process that can affect its widespread adoption. Such SIPs
describe what behaviors shall be deemed acceptable, and which behaviors shall be
considered harmful to this end (including any remediation or loss of privileges
that misbehavior may entail). SIPs that propose formalizations of ethics like
codes of conduct, procedures for conflict resolution, criteria for involvement
in governance, and so on would belong in this SIP consideration.
- _Diversity_. This SIP concerns proposals to grow the set of users, with an
emphasis on including users who are traditionally not involved with
open-source software projects. SIPs that are concerned with evangelism,
advertising, outreach, and so on must have this consideration.
Each SIP consideration shall have a dedicated Advisory Board that ultimately
vets SIPs under their consideration for possible ratification in a timely
fashion (see below). New considerations may be created via the ratification of
a Meta-type SIP under the governance consideration.
## SIP Workflow
As a SIP is considered for ratification, it passes through multiple statuses as
determined by one or more committees (see next section). A SIP may have exactly
one of the following statuses at any given time:
- _Draft_. The SIP is still being prepared for formal submission. It does not yet
have a SIP number.
- _Accepted_. The SIP text is sufficiently complete that it constitutes a
well-formed SIP, and is of sufficient quality that it may be considered for
ratification. A SIP receives a SIP number when it is moved into the Accepted
state by SIP Editors.
- _Recommended_. The people responsible for vetting the SIPs under the
consideration(s) in which they have expertise have agreed that this SIP should
be implemented. A SIP must be Accepted before it can be Recommended.
- _Activation-In-Progress_. The SIP has been tentatively approved by the Steering
Committee for ratification. However, not all of the criteria for ratification
have been met according to the SIP's Activation section. For example, the
Activation section might require miners to vote on activating the SIP's
implementations, which would occur after the SIP has been transferred into
Activation-In-Progress status but before it is transferred to Ratified status.
- _Ratified._ The SIP has been activated according to the procedures described in
its Activation section. Once ratified, a SIP remains ratified in perpetuity,
but a subsequent SIP may supersede it. If the SIP is a Consensus-type SIP,
then all Stacks blockchain implementations must implement it. A SIP must be
Recommended before it can be Ratified. Moving a SIP into this state may be done
retroactively, once the SIP has been activated according to the terms in its
Activation section.
- _Rejected_. The SIP does not meet at least one of the criteria for ratification
in its current form. A SIP can become Rejected from any state, except
Ratified. If a SIP is moved to the Rejected state, then it may be re-submitted
as a Draft.
- _Obsolete_. The SIP is deprecated, but its candidacy for ratification has not
been officially withdrawn (e.g. it may warrant further discussion). An
Obsolete SIP may not be ratified, and will ultimately be Withdrawn.
- _Replaced_. The SIP has been superseded by a different SIP. Its preamble must
have a Superseded-By field. A Replaced SIP may not be ratified, nor may it be
re-submitted as a Draft-status SIP. It must be transitioned to a Withdrawn
state once the SIP(s) that replace it have been processed.
- _Withdrawn_. The SIP's authors have ceased working on the SIP. A Withdrawn SIP
may not be ratified, and may not be re-submitted as a Draft. It must be
re-assigned a SIP number if taken up again.
The act of ratifying a SIP is the act of transitioning it to the Ratified status
-- that is, moving it from Draft to Accepted, from Accepted to Recommended, and
Recommended to Activation-In-Progress, and from Activation-In-Progress to
Ratified, all without the SIP being transitioned to Rejected, Obsolete,
Replaced, or Withdrawn status. A SIP's current status is recorded in its Status
field in its preamble.
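The workflow above can be summarized as a state machine. The transition table
below is an illustrative sketch inferred from the prose (e.g. "A SIP can
become Rejected from any state, except Ratified"), not a normative part of the
process:

```python
# Allowed status transitions in the SIP workflow, as inferred from the
# descriptions above. Rejected, Obsolete, Replaced, and Withdrawn are
# reachable from any non-Ratified state via the terminal set below.
TERMINAL = {"Rejected", "Obsolete", "Replaced", "Withdrawn"}

TRANSITIONS = {
    "Draft": {"Accepted"} | TERMINAL,
    "Accepted": {"Recommended"} | TERMINAL,
    "Recommended": {"Activation-In-Progress"} | TERMINAL,
    "Activation-In-Progress": {"Ratified"} | TERMINAL,
    "Ratified": set(),          # ratified in perpetuity; only superseded later
    "Rejected": {"Draft"},      # may be re-submitted as a Draft
    "Obsolete": {"Withdrawn"},  # will ultimately be Withdrawn
    "Replaced": {"Withdrawn"},  # withdrawn once its replacement is processed
    "Withdrawn": set(),         # taking it up again requires a new SIP number
}

def can_transition(current: str, new: str) -> bool:
    """Return True if the workflow permits moving from `current` to `new`."""
    return new in TRANSITIONS.get(current, set())
```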
## SIP Committees
The act of deciding the status of a SIP is handled by a set of designated
committees. These committees are composed of users who dedicate their time and
expertise to curate the blockchain, ratifying SIPs on behalf of the rest of the
ecosystem's users.
There are three types of committee:
- _Steering Committee (SC)_. The roles of the SC are to select Recommended-status
SIPs to be activated, to determine whether or not a SIP has been activated and
thus ratified, and to formally recognize Consideration Advisory Boards (see
below).
- _Consideration Advisory Boards_. The roles of the Consideration Advisory Boards
are to provide expert feedback on SIPs that have been moved to Accepted status
in a timely manner, and to transition SIPs to Recommended status if they meet
the Board's consideration criteria, and Rejected status otherwise.
- _SIP Editors_. The role of the SIP Editors is to identify SIPs in the Draft
status that can be transitioned to Accepted status. A SIP editor must be able
to vet a SIP to ensure that it is well-formed, that it follows the ratification
workflow faithfully, and that it does not overlap with any already-Accepted SIPs
or SIPs that have since become Recommended or Ratified.
Any user may serve on a committee. However, all Stacks committee members must
abide by the SIP Code of Conduct and must have a history of adhering to it.
Failure to adhere to the Code of Conduct shall be grounds for immediate removal
from a committee, and a prohibition against serving on any future committees.
### Compensation
Compensation for carrying out committee duties is outside of the scope of this
document. This document does not create a provision for compensation for
committee participation, but it does not forbid it either.
### Steering Committee Duties
The Steering Committee's overarching duty is to oversee the evolution of the
Stacks blockchain's design, operation, and governance, in a way that is
technically sound and feasible, according to the rules and procedures described
in this document. The SC shall be guided by and held accountable by the greater
community of users, and shall make all decisions with the advice of the relevant
Consideration Advisory Boards.
The SC's role is that of a steward. The SC shall select SIPs for ratification
based on how well they serve the greater good of the Stacks users. Given the
nature of blockchains, the SC's particular responsibilities pertaining to
upgrading the blockchain network are meant to ensure that upgrades happen in a
backwards-compatible fashion if at all possible. While this means that more
radical SIPs may be rejected or may spend a long amount of time in Recommended
status, it also minimizes the chances of an upgrade leading to widespread
disruption (the minimization of which itself serves the greater good).
#### Membership
The initial Steering Committee shall be comprised of at least three members:
two from the Stacks Open Internet Foundation, and one
from the greater Stacks blockchain community (independent of the Stacks
Foundation).
A provisional Steering Committee will be appointed by the Stacks Open Internet Foundation Board
before the launch of the Stacks blockchain's mainnet (see the "Activation" section).
Once this SIP activates, the Stacks Open Internet Foundation shall select its
representatives in a manner of their choosing within 90 days after activation.
The committee may be expanded later to include more seats. Once this SIP
activates, the provisional SC will consult with the community to
ratify a SIP that implements a voting procedure whereby
Stacks community members can select the individual who will serve on the
community SC seat.
#### Qualifications
Members of this committee must have deep domain expertise
pertinent to blockchain development, and must have excellent written
communication skills. It is highly recommended that members should have authored
at least one ratified technical-consideration SIP before joining this committee.
#### Responsibilities
The Steering Committee shall be responsible for the following
tasks.
##### Recognizing Consideration Advisory Boards.
The members of the Steering Committee
must bear in mind that they are not infallible, and that they do not know everything
there is to know about what is best for the broader user community. To the
greatest extent practical, the SC shall create and foster the development of
Consideration Advisory Boards in order to make informed decisions on subjects
in which they may not be experts.
Any group of users can form an unofficial working group to help provide feedback
to SIPs, but the SC shall have the power to recognize such groups formally as a
Consideration Advisory Board via at least a two-thirds majority vote. The SC
shall simultaneously recognize one of its members to serve as the interim
chairperson while the Advisory Board forms. A SC member cannot normally serve on
a Consideration Advisory Board concurrently with serving on the SC, unless
granted a limited exception by a unanimous vote by the SC (e.g. in order to
address the Board's business while a suitable chairperson is found). Formally
recognizing Consideration Advisory Boards shall occur in Public Meetings (see
below) no more than once per quarter.
Once recognized, Consideration Advisory Boards may not be dissolved or
dismissed, unless there are no Accepted or Recommended SIPs that request their
consideration. If this is the case, then the SC may vote to rescind recognition
of a Consideration Advisory Board with a two-thirds majority at one of its
Public Meetings.
In order to identify users who would form a Consideration Advisory Board, users
should organize into an unofficial working group and submit a SIP to petition
that the SC recognize the working group as a Consideration Advisory Board. This
petition must take the form of a Meta-type SIP, and may be used to select the
initial chairperson and define the Board's domain(s) of expertise, bylaws,
membership, meeting procedures, communication channels, and so on, independent
of the SC. The SC would only be able to ratify or reject the SIP.
The SC shall maintain a public index of all Consideration Advisory Boards that
are active, including contact information for the Board and a summary of what
kinds of expertise the Board can offer. This index is meant to be used by SIP
authors to help route their SIPs towards the appropriate reviewers before being
taken up by the SC.
##### Voting on Technical SIPs
The Steering Committee shall select Recommended SIPs
for ratification by moving them to Activation-In-Progress status. All
technical-consideration SIPs shall require an 80% vote. If it is a
Consensus-type SIP for a hard fork, then a unanimous vote shall be required. If
a SIP is voted on and is not moved to Activation-in-Progress, then it shall be
moved to Rejected status, and the SC shall provide a detailed explanation as to
why they made their decision (see below).
##### Voting on Non-technical SIPs
Not all SIPs are technical in nature. All
non-technical SIPs shall require only a two-thirds majority vote to transition
it to Activation-In-Progress status. The SC must provide a public
explanation for the way it voted as supplementary materials with the ratified
non-technical SIP (see below). If the SC votes on moving a non-technical SIP to
Activation-In-Progress status, but the motion does not receive the requisite
number of votes, then the SIP shall be transferred to Rejected status, and the
SC shall provide a detailed explanation as to why they made their decision (see below).
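The voting thresholds above (80% for technical SIPs, unanimity for
Consensus-type hard forks, two-thirds for non-technical SIPs) can be sketched
as a single check. This is illustrative only; in particular, how abstentions
are counted is not specified in this document, and here they count against the
motion:

```python
from fractions import Fraction

def sc_vote_passes(yes: int, total: int, technical: bool,
                   hard_fork: bool = False) -> bool:
    """Sketch of the Steering Committee thresholds described above.

    `total` is the number of SC members eligible to vote; abstentions are
    assumed to count against the motion (an assumption, not part of the SIP).
    """
    if total == 0:
        return False
    if hard_fork:
        return yes == total  # Consensus (hard fork) SIPs require unanimity
    # 80% for technical-consideration SIPs, two-thirds for non-technical ones.
    threshold = Fraction(4, 5) if technical else Fraction(2, 3)
    return Fraction(yes, total) >= threshold
```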
##### Overseeing SIP Activation and Ratification
Once a SIP is in Activation-In-Progress status,
the SC shall be responsible for overseeing the procedures and criteria in the
SIP's Activation section. The Activation section of a SIP can be thought of as
an “instruction manual” and/or “checklist” for the SC to follow to determine if
the SIP has been accepted by the Stacks users. The SC shall strictly adhere to
the process set forth in the Activation section. If the procedure and/or
criteria of the Activation section cannot be met, then the SC may transfer the
SIP to Rejected status and ask the authors to re-submit the SIP with an updated
Activation section.
Once all criteria have been unambiguously met and all activation procedures
have been followed, the SC shall transition the SIP to Ratified status. The SC
shall keep a log and provide a record of the steps they took in following a
SIP's Activation section once the SIP is in Activation-In-Progress status, and
publish them alongside the Ratified SIP as supplemental material.
Due to the hands-on nature of the Activation section, the SC may deem it
appropriate to reject a SIP solely on the quality of its Activation section.
Reasonable grounds for rejection include, but are not limited to, ambiguous
instructions, insufficiently-informative activation criteria, too much work on
the SC members' parts, the lack of a prescribed activation timeout, and so on.
Before the Stacks mainnet launches, the SC shall ratify a SIP that, when
activated according to the procedures outlined in its Activation section, will
allow Stacks blockchain miners to signal their preferences for the activation of
particular SIPs within the blocks that they mine. This will enable the greater
Stacks community of users to have the final say as to which SIPs activate and
become ratified.
##### Feedback on Recommended SIPs
The Steering Committee shall give a full, fair,
public, and timely evaluation to each SIP transitioned to Recommended status by
Consideration Advisory Boards. A SIP shall only be considered by the SC if the
Consideration Advisory Board chairpeople for each of the SIP's considerations
have signed-off on the SIP (by indicating as such on the SIP's preamble).
The SC may transition a SIP to Rejected status if it disagrees with the
Consideration Advisory Boards' recommendation. The SC may transition a SIP to
Obsolete status if it finds that the SIP no longer addresses a relevant concern.
It may transition the SIP to a Replaced status if it considers a similar,
alternative SIP that is more likely to succeed. In all cases, the SC shall
ensure that a SIP does not remain in Recommended status for an unreasonable
amount of time.
The SC shall maintain a public record of all feedback provided for each SIP it
reviews.
If a SIP is moved to Rejected, Obsolete, or Replaced status, the SIP authors may
appeal the process by re-submitting it in Draft status once the feedback has
been addressed. The appealed SIP must cite the SC's feedback as supplemental
material, so that SIP Editors and Consideration Advisory Boards are able to
verify that the feedback has, in fact, been addressed.
##### Public Meetings
The Steering Committee shall hold and record regular public
meetings at least once per month. The SC may decide the items of business for
these meetings at its sole discretion, but it shall prioritize business
pertaining to the ratification of SIPs, the recognition of Consideration
Advisory Boards, and the needs of all outstanding committees. That said, any
user may join these meetings as an observer, and the SC shall make a good-faith
effort to address public comments from observers as time permits.
The SC shall appoint up to two dedicated moderators from the user community for
its engineering meetings, who shall vet questions and commentary from observers
in advance (possibly well before the meeting begins). If there is more than one
moderator, then the moderators may take turns. In addition, the SC shall appoint
a dedicated note-taker to record the minutes of the meetings. All of these
appointees shall be eligible to receive a fixed, regular bounty for their work.
### Consideration Advisory Board Duties
There is an Advisory Board for each SIP consideration, with a designated
chairperson responsible for maintaining copies of all discussion and feedback on
the SIPs under consideration.
#### Membership
All Consideration Advisory Boards begin their life as unofficial
working groups of users who wish to review inbound SIPs according to their
collective expertise. If they wish to be recognized as an official
Consideration Advisory Board, they shall submit a SIP to the Steering Committee
per the procedure described in the Steering Committee's duties. Each
Consideration Advisory Board shall be formally created by the SC with a
designated member serving as its first interim chairperson. After this, the
Consideration Advisory Board may adopt its own bylaws for selecting members and
chairpeople. However, members should have domain expertise relevant to the
consideration.
Members shall serve on their respective Consideration Advisory Boards so long
as they are in good standing with the SIP Code of Conduct and in accordance
with the individual Board's bylaws. A user may serve on at most three Consideration
Advisory Boards concurrently.
#### Qualifications
Each Consideration Advisory Board member shall have sufficient
domain expertise to provide the Steering Committee with feedback pertaining to a
SIP's consideration. Members shall possess excellent written communication
skills.
#### Responsibilities
Each Consideration Advisory Board shall be responsible for the
following.
##### Chairperson
Each Consideration Advisory Board shall appoint a chairperson, who
shall serve as the point of contact between the rest of the Board and the
Steering Committee. If the chairperson becomes unresponsive, the SC may ask the
Board to appoint a new chairperson (alternatively, the Board may appoint a new
chairperson on its own and inform the SC). The chairperson shall be responsible
for maintaining the Board's public list of members' names and contact
information as a supplementary document to the SIP that the SC ratified to
recognize the Board.
##### Consideration Track
Each Consideration Advisory Board shall provide a clear and
concise description of what expertise it can offer, so that SIP authors may
solicit it with confidence that it will be helpful. The chairperson shall make
this description available to the Steering Committee and to the SIP Editors, so
that both committees can help SIP authors ensure that they receive the most
appropriate feedback.
The description shall be provided and updated by the chairperson to the SC so
that the SC can provide a public index of all considerations a SIP may possess.
##### Feedback to SIP Authors
Each Consideration Advisory Board shall provide a full,
fair, public, and timely evaluation of any Accepted-status SIP that lists the
Board's consideration in its preamble. The Board may decide to move each SIP to
a Recommended status or a Rejected status based on whether or not the Board
believes that the SIP is feasible, practical, and beneficial to the greater
Stacks ecosystem.
Any feedback created shall be made public. It is the responsibility of the Board
to store and publish all feedback for the SIPs it reviews. It shall forward
copies of this feedback to both the SIP authors and the Steering Committee.
##### Consultation with the Steering Committee
The Steering Committee may need to
follow up with the Consideration Advisory Board in order to clarify its position
or solicit its advice on a particular SIP. For example, the SC may determine
that a Recommended SIP needs to be considered by one or more additional Boards
that have not yet been consulted by the SIP authors.
The Board shall respond to the SC's request for advice in a timely manner, and
shall prioritize feedback on SIPs that are under consideration for ratification.
### SIP Editor Duties
By far the largest committee in the SIP process is the SIP Editor Committee.
The SIP Editors are responsible for maintaining the "inbound funnel" for SIPs
from the greater Stacks community. SIP Editors ensure that all inbound SIPs are
well-formed, relevant, and do not duplicate prior work (including rejected
SIPs).
#### Membership
Anyone may become a SIP Editor by recommendation from an existing SIP
Editor, subject to the “Recruitment” section below.
#### Qualifications
A SIP Editor must demonstrate proficiency in the SIP process and
formatting requirements. A candidate SIP Editor must demonstrate to an existing
SIP Editor that they can independently vet SIPs.
#### Responsibilities
SIP Editors are concerned with shepherding SIPs from Draft
status to Accepted status, and for mentoring community members who want to get
involved with the SIP processes (as applicable).
##### Getting Users Started
SIP Editors should be open and welcoming towards
enthusiastic users who want to help improve the greater Stacks ecosystem. As
such, SIP Editors should encourage users to submit SIPs if they have good ideas
that may be worth implementing.
In addition, SIP Editors should respond to public requests for help from
community members who want to submit a SIP. They may point them towards this
document, or towards other supplemental documents and tools to help them get
started.
##### Feedback
When a SIP is submitted in Draft status, a SIP Editor that takes the
SIP into consideration should provide fair and full feedback on how to make the
SIP ready for its transition to Accepted status.
To do this, the SIP Editor should:
- Verify that the SIP is well-formed according to the criteria in this document
- Verify that the SIP has not been proposed before
- Verify as best that they can that the SIP is original work
- Verify that the SIP is appropriate for its type and consideration
- Recommend additional Considerations if appropriate
- Ensure that the text is clear, concise, and grammatically-correct English
- Ensure that there are appropriate avenues for discussion of the SIP listed in
the preamble.
The SIP Editor does not need to provide public feedback to the SIP authors, but
should add their name(s) to the Signed-off field in the SIP preamble once the
SIP is ready to be Accepted.
##### Acceptance
Once a SIP is moved to Accepted, the SIP Editor shall assign it the
smallest positive number not currently used to identify any other SIP. Once that
number is known, the SIP Editor shall set the SIP's status to Accepted, set the
number, and commit the SIP to the SIP repository in order to make it visible to
other SIP Editors and to the Consideration Advisory Boards.
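The "smallest positive number not currently used" rule amounts to a first-gap search; a small illustrative sketch (the helper name is an assumption, not part of this SIP):

```python
def next_sip_number(assigned: set[int]) -> int:
    """Return the smallest positive integer not already used by another SIP."""
    n = 1
    while n in assigned:
        n += 1
    return n

# e.g., with SIPs 1, 2, 3, and 5 already assigned, the next Accepted SIP is number 4
```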
##### Recruitment
Each SIP Editor must list their name and contact information in an
easy-to-find location in the SIP repository, as well as a list of each SIP Editor
they recommended. In so doing, the SIP Editors shall curate an “invite tree”
that shows which Editors recommended which other Editors.
A SIP Editor may recommend another user to be a SIP Editor no more than once per
month, and only if they have faithfully moved at least one SIP to Accepted
status in the last quarter. If a SIP Editor does not participate in editing a
SIP for a full year and a day, then they may be removed from the SIP Editor
list. The SC may remove a SIP Editor (and some or all of the users he or she
recommended) if they find that the SIP Editor has violated the SIP Code of
Conduct.
Newly-Accepted SIPs, new SIP Editor recruitment, and SIP Editor retirement shall
be submitted as pull requests by SIP Editors to the SIP repository.
## SIP Workflow
The lifecycle of a SIP is summarized in the flow-chart below:
```
 ------------------
 |      Draft     | <------------------------------------.  Revise and resubmit
 ------------------                                      |
         |                        --------------------   |
  Submit to SIP Editor ---------> |     Rejected     | --+
         |                        --------------------   |
         V                                               |
 ------------------                                      |
 |    Accepted    | ------------------------------------------------------.
 ------------------                                      |                |
         |                        --------------------   |                |
  Review by Consideration ------> |     Rejected     | --+                |
  Advisory Board(s)               --------------------   |                |
         |                                               |                |
         V                                               |                |
 -------------------------                               |                |
 |      Recommended      | ---------------------------------------------->|
 -------------------------                               |                |
         |                        --------------------   |                |
  Vote by the Steering ---------> |     Rejected     | --+                |
  Committee for activation        --------------------   |                |
         |                                               |                |
         V                                               |                |
 --------------------------                              |                |
 | Activation-in-Progress | --------------------------------------------->|
 --------------------------                              |                |
         |                        --------------------   |                |
  All activation ---------------> |     Rejected     | --+                |
  criteria are met                --------------------                    |
         |                                                                |
         V                                                                |
 ------------------               ------------------                     |
 |    Ratified    | -----*------> |    Obsolete    |                     |
 ------------------      |        ------------------                     |
                         |        ------------------                     |
                         '------> |    Replaced    |                     |
                                  ------------------                     V
                                                                -------------------
                                                                |    Withdrawn    |
                                                                -------------------
```
When a SIP is transitioned to Rejected, it is not deleted, but is preserved in
the SIP repository so that it can be referenced as related or prior work by
other SIPs. Once a SIP is Rejected, it may be re-submitted as a Draft at a later
date. SIP Editors may decide how often to re-consider rejected SIPs as an
anti-spam measure, but the Steering Committee and Consideration Advisory Boards
may opt to independently re-consider rejected SIPs at their own discretion.
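The transitions in the flow chart can also be read as a small table; the model below is an illustrative reading of this SIP's statuses, not a normative part of the process:

```python
# Allowed SIP status transitions, as read from the flow chart above.
TRANSITIONS = {
    "Draft": {"Accepted", "Rejected"},
    "Accepted": {"Recommended", "Rejected", "Withdrawn"},
    "Recommended": {"Activation-in-Progress", "Rejected", "Withdrawn"},
    "Activation-in-Progress": {"Ratified", "Rejected", "Withdrawn"},
    "Ratified": {"Obsolete", "Replaced"},
    "Rejected": {"Draft"},  # revise and resubmit
}

def can_transition(current: str, target: str) -> bool:
    """True if a SIP in status `current` may move directly to status `target`."""
    return target in TRANSITIONS.get(current, set())
```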
## Public Venues for Conducting Business
The canonical set of SIPs in all states shall be recorded in the same medium that
the canonical copy of this SIP is. Right now, this is in the Github repository
https://github.com/stacksorg/sips, but may be changed before this SIP is
ratified. New SIPs, edits to SIPs, comments on SIPs, and so on shall be
conducted through Github's facilities for the time being.
In addition, individual committees may set up and use public mailing lists for
conducting business. The Stacks Open Internet Foundation shall provide a means
for doing so. Any discussions on the mailing lists that lead to non-trivial
contributions to SIPs should be referenced by these SIPs as supplemental
material.
### Github-specific Considerations
All SIPs shall be submitted as pull requests, and all SIP edits (including status
updates) shall be submitted as pull requests. The SC, or one or more
individuals or entities appointed by the SC, shall be responsible for merging
pull requests to the main branch.
## SIP Copyright & Licensing
Each SIP must identify at least one acceptable license in its preamble. Source
code in the SIP can be licensed differently than the text. SIPs whose reference
implementation(s) touch existing reference implementation(s) must use the same
license as the existing implementation(s) in order to be considered. Below is a
list of recommended licenses.
- BSD-2-Clause: OSI-approved BSD 2-clause license
- BSD-3-Clause: OSI-approved BSD 3-clause license
- CC0-1.0: Creative Commons CC0 1.0 Universal
- GNU-All-Permissive: GNU All-Permissive License
- GPL-2.0+: GNU General Public License (GPL), version 2 or newer
- LGPL-2.1+: GNU Lesser General Public License (LGPL), version 2.1 or newer
# Related Work
The governance process proposed in this SIP is inspired by the Python PEP
process [1], the Bitcoin BIP2 process [2], the Ethereum Improvement Proposal [3]
processes, the Zcash governance process [4], and the Debian GNU/Linux
distribution governance process [5]. This SIP describes a governance process
where top-level decision-making power is vested in a committee of elected
representatives, which distinguishes it from Debian (which has a single elected
project leader), Python (which has a benevolent dictator for life), and Bitcoin
and ZCash (which vest all decision ratification power solely in the blockchain
miners). The reason for a top-level steering committee is to ensure that
decision-making power is not vested in a single individual, but also to ensure
that the individuals responsible for decisions are accountable to the community
that elects them (as opposed to only those who have the means to participate
in mining). This SIP differs from Ethereum's governance
process in that the top-level decision-making body (the "Core Devs" in Ethereum,
and the Steering Committee in Stacks) is not only technically proficient to evaluate
SIPs, but also held accountable through an official governance
process.
[1] https://www.python.org/dev/peps/pep-0001/
[2] https://github.com/bitcoin/bips/blob/master/bip-0002.mediawiki
[3] https://eips.ethereum.org/
[4] https://www.zfnd.org/governance/
[5] https://debian-handbook.info/browse/stable/sect.debian-internals.html
# Activation
This SIP activates once the following tasks have been carried out:
- The provisional Steering Committee must be appointed by the Stacks Open Internet
Foundation Board.
- Mailing lists for the initial committees must be created.
- The initial Consideration Advisory Boards must be formed, if there is interest
in doing so before this SIP activates.
- A public, online SIP repository must be created to hold all non-Draft SIPs, their edit
histories, and their feedback.
- A directory of Consideration Advisory Boards must be established (e.g. within
the SIP repository).
- A SIP Code of Conduct should be added as a supplemental document.
- The Stacks blockchain mainnet must launch.
# Reference Implementation
Not applicable.
# Frequently Asked Questions
NOTE: this section will be expanded as necessary before ratification
This SIP is now located in the [stacksgov/sips repository](https://github.com/stacksgov/sips/blob/main/sips/sip-000/sip-000-stacks-improvement-proposal-process.md) as part of the [Stacks Community Governance organization](https://github.com/stacksgov).

# SIP-002 Smart Contract Language

This document formerly contained SIP-002 before the Stacks 2.0 mainnet launched.

# Abstract

In order to support applications which require validation of some
pieces of their logic, we present a smart contracting language for use
with the Stacks blockchain. This smart contracting language can be
used on the Stacks blockchain to support programmatic control over
digital assets within the Stacks blockchain (e.g., BNS names, Stacks
tokens, etc.).
While application-chains may use any smart-contract language that they
like, this smart contracting language's VM will be a part of
blockstack-core, and, as such, any blockstack-core node will be able to
validate application chains using this smart contracting language with
a simple configuration change.
This smart contracting language permits static analysis of any legal
smart contract to determine runtime costs. This smart contracting
language is not only Turing-incomplete (a requirement for such static
analysis to be guaranteed successful), but readily permits other kinds
of proofs to be made about the code as well.
# Design
A smart contract is composed of two parts:
1. A data-space, which is a set of tables of data which only the
smart contract may modify
2. A set of functions which operate within the data-space of the
smart contract, though they may call public functions from other smart
contracts.
Users call smart contracts' public functions by broadcasting a
transaction on the blockchain which invokes the public function.
This smart contracting language differs from most other smart
contracting languages in two important ways:
1. The language _is not_ intended to be compiled. The LISP language
described in this document is the specification for correctness.
2. The language _is not_ Turing complete. This allows us to guarantee
that static analysis of programs to determine properties like
runtime cost and data usage can complete successfully.
## Specifying Contracts
A smart contract definition is specified in a LISP language with the
following limitations:
1. Recursion is illegal and there is no `lambda` function.
2. Looping may only be performed via `map`, `filter`, or `fold`
3. The only atomic types are booleans, integers, fixed length
buffers, and principals
4. There is additional support for lists of the atomic types, however
the only variable length lists in the language appear as function
inputs (i.e., there is no support for list operations like append
or join).
5. Variables may only be created via `let` binding and there
is no support for mutating functions like `set`.
6. Defining constants and functions is allowed for simplifying
code using the `define-private` statement. However, these are purely
syntactic. If a definition cannot be inlined, the contract will be
rejected as illegal. These definitions are also _private_, in that
functions defined this way may only be called by other functions
defined in the given smart contract.
7. Functions specified via `define-public` statements are _public_
functions.
8. Functions specified via `define-read-only` statements are _public_
functions and perform _no_ state mutations. Any attempts to
modify contract state by these functions or functions called by
these functions will result in an error.
Public functions return a Response type result. If the function returns
an `ok` type, then the function call is considered valid, and any changes
made to the blockchain state will be materialized. If the function
returns an `err` type, it will be considered invalid, and will have _no
effect_ on the smart contract's state. So if function `foo.A` calls
`bar.B`, and `bar.B` returns an `ok`, but `foo.A` returns an `err`, no
effects from calling `foo.A` materialize--- including effects from
`bar.B`. If, however, `bar.B` returns an `err` and `foo.A` returns an `ok`,
there may be some database effects which are materialized from
`foo.A`, but _no_ effects from calling `bar.B` will materialize.
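The commit-or-discard rule can be modeled with a per-call state checkpoint; a toy sketch of the `foo.A`/`bar.B` scenario above (the `Vm` class and dict-backed state are assumptions, not the actual VM):

```python
class Vm:
    """Toy model: each public-function call checkpoints state, rolling back on err."""
    def __init__(self):
        self.state = {}

    def call(self, fn):
        snapshot = dict(self.state)   # checkpoint before the call
        ok = fn(self)                 # fn mutates self.state; True models ok, False models err
        if not ok:
            self.state = snapshot     # err: discard every effect made since the checkpoint
        return ok

vm = Vm()

def bar_b(vm):
    vm.state["bar"] = 1
    return True                       # bar.B returns ok

def foo_a(vm):
    vm.call(bar_b)                    # nested call succeeds...
    vm.state["foo"] = 1
    return False                      # ...but foo.A returns err

vm.call(foo_a)
assert vm.state == {}                 # nothing materialized, including bar.B's write
```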
Unlike functions created by `define-public`, which may only return
Response types, functions created with `define-read-only` may return
any type.
## List Operations
* Lists may be multi-dimensional (i.e., lists may contain other lists), however each
entry of this list must be of the same type.
* `filter`, `map`, and `fold` functions may only be called with user-defined functions
(i.e., functions defined with `(define-private ...)`, `(define-read-only ...)`, or
`(define-public ...)`) or simple native functions (e.g., `+`, `-`, `not`).
* Functions that return lists of a different size than the input size
(e.g., `(append-item ...)`) take a required _constant_ parameter that indicates
the maximum output size of the function. This is enforced with a runtime check.
## Inter-Contract Calls
A smart contract may call functions from other smart contracts using a
`(contract-call?)` function.
This function returns a Response type result-- the return value of the called smart
contract function. Note that if a called smart contract returns an
`err` type, it is guaranteed to not alter any smart contract state
whatsoever. Of course, any transaction fees paid for the execution
of that function will not be returned.
We distinguish two types of `contract-call?`:
* Static dispatch: the callee is a known, invariant contract available
on-chain when the caller contract is being deployed. In this case, the
callee's principal is provided as first argument, followed by the name
of the method and its arguments:
```scheme
(contract-call?
'SC3H92H297DX3YDPFHZGH90G8Z4NPH4VE8E83YWAQ.registrar
register-name
name-to-register)
```
This approach must always be preferred, when adequate.
It makes static analysis easier, and eliminates the
potential for reentrancy bugs when the contracts are
being published (versus when being used).
* Dynamic dispatch: the callee is passed as an argument, and typed
as a trait reference (<A>).
```scheme
(define-public (swap (token-a <can-transfer-tokens>)
                     (amount-a uint)
                     (owner-a principal)
                     (token-b <can-transfer-tokens>)
                     (amount-b uint)
                     (owner-b principal))
  (begin
    (unwrap! (contract-call? token-a transfer-from? owner-a owner-b amount-a))
    (unwrap! (contract-call? token-b transfer-from? owner-b owner-a amount-b))))
```
Traits can either be locally defined:
```scheme
(define-trait can-transfer-tokens (
  (transfer-from? (principal principal uint) (response uint))))
```
Or imported from an existing contract:
```scheme
(use-trait can-transfer-tokens
'SC3H92H297DX3YDPFHZGH90G8Z4NPH4VE8E83YWAQ.contract-defining-trait.can-transfer-tokens)
```
Looking at trait conformance, callee contracts have two different paths.
They can either be "compatible" with a trait by defining methods
matching some of the methods defined in a trait, or explicitly declare
conformance using the `impl-trait` statement:
```scheme
(impl-trait 'SC3H92H297DX3YDPFHZGH90G8Z4NPH4VE8E83YWAQ.contract-defining-trait.can-transfer-tokens)
```
Explicit conformance should be preferred when adequate.
It acts as a safeguard by helping the static analysis system to detect
deviations in method signatures before contract deployment.
The following limitations are imposed on contract calls:
1. On static dispatches, callee smart contracts _must_ exist at the time of creation.
2. No cycles may exist in the call graph of a smart contract. This
prevents recursion (and re-entrancy bugs). Such structures can
be detected with static analysis of the call graph, and will be
rejected by the network.
3. `contract-call?` is for inter-contract calls only. Situations
where the caller is also the callee will abort
the ongoing transaction.
## Principals and Owner Verification
The language provides a primitive for checking whether or not the
smart contract transaction was signed by a particular
_principal_. Principals are a specific type in the smart contracting
language which represent a spending entity (roughly equivalent to a
Stacks address). The signature itself is not checked by the smart
contract, but by the VM. A smart contract function can use a globally
defined variable to obtain the current principal:
```scheme
tx-sender
```
The `tx-sender` variable does not change during inter-contract
calls. This means that if a transaction invokes a function in a given
smart contract, that function is able to make calls into other smart
contracts without that variable changing. This enables a wide variety
of applications, but it comes with some dangers for users of smart
contracts. However, as mentioned before, the static analysis
guarantees of our smart contracting language allow clients to know a
priori which functions a given smart contract will ever call.
Another global variable, `contract-caller`, _does_ change during
inter-contract calls. In particular, `contract-caller` is the contract
principal corresponding to the most recent invocation of `contract-call?`.
In the case of a "top-level" invocation, this variable is equal to `tx-sender`.
Assets in the smart contracting language and blockchain are
"owned" by objects of the principal type, meaning that any object of
the principal type may own an asset. For the case of public-key hash
and multi-signature Stacks addresses, a given principal can operate on
their assets by issuing a signed transaction on the blockchain. _Smart
contracts_ may also be principals (represented by the smart
contract's identifier), however, there is no private key associated
with the smart contract, and it cannot broadcast a signed transaction
on the blockchain.
In order to allow smart contracts to operate on assets it owns, smart
contracts may use the special function:
```scheme
(as-contract (...))
```
This function will execute the closure (passed as an argument) with
the `tx-sender` and `contract-caller` set to the _contract's_
principal, rather than the current sender. It returns the return value
of the provided closure. A smart contract may use the special variable
`contract-principal` to refer to its own principal.
For example, a smart contract that implements something like a "token
faucet" could be implemented as so:
```scheme
(define-public (claim-from-faucet)
  (if (is-none? (map-get claimed-before (tuple (sender tx-sender))))
      (let ((requester tx-sender)) ;; set a local variable requester = tx-sender
        (map-insert! claimed-before (tuple (sender requester)) (tuple (claimed true)))
        (as-contract (stacks-transfer! requester 1)))
      (err 1)))
```
Here, the public function `claim-from-faucet`:
1. Checks if the sender has claimed from the faucet before
2. Assigns the tx sender to a requester variable
3. Adds an entry to the tracking map
4. Uses `as-contract` to send 1 microstack
The primitive function `is-contract?` can be used to determine
whether a given principal corresponds to a smart contract.
## Stacks Transfer Primitives
To interact with Stacks balances, smart contracts may call the
`(stacks-transfer!)` function. This function will attempt to transfer
from the current principal to another principal:
```scheme
(stacks-transfer!
to-send-amount
recipient-principal)
```
This function itself _requires_ that the operation have been signed by
the transferring principal. The `integer` type in our smart contracting
language is a 16-byte signed integer, which allows it to specify the
maximum amount of microstacks spendable in a single Stacks transfer.
Like any other public smart contract function, this function call
returns an `ok` if the transfer was successful, and `err` otherwise.
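Since the `integer` type is 16 bytes (128 bits) and signed, the largest expressible microstack amount follows from simple arithmetic:

```python
BITS = 128                             # 16 bytes, as specified above
MAX_MICROSTACKS = 2**(BITS - 1) - 1    # largest positive value of a signed 128-bit integer

# MAX_MICROSTACKS is about 1.7e38, far above any realistic token supply
```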
## Data-Space Primitives
Data within a smart contract's data-space is stored within
`maps`. These stores relate a typed-tuple to another typed-tuple
(almost like a typed key-value store). As opposed to a table data
structure, a map will only associate a given key with exactly one
value. Values in a given mapping are set or fetched using:
1. `(map-get map-name key-tuple)` - This fetches the value
associated with a given key in the map, or returns `none` if there
is no such value.
2. `(map-set! map-name key-tuple value-tuple)` - This will set the
value of `key-tuple` in the data map
3. `(map-insert! map-name key-tuple value-tuple)` - This will set
the value of `key-tuple` in the data map if and only if an entry
does not already exist.
4. `(map-delete! map-name key-tuple)` - This will delete `key-tuple`
from the data map
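These four primitives behave like a guarded key-value store; an illustrative Python model of their semantics (the class and method names are assumptions, and keys would be typed tuples in practice):

```python
class DataMap:
    """Toy model of a smart contract data map (type checks omitted)."""
    def __init__(self):
        self._entries = {}

    def get(self, key):
        return self._entries.get(key)    # None models `none`

    def set(self, key, value):
        self._entries[key] = value       # unconditional write

    def insert(self, key, value):
        if key in self._entries:         # writes only if the key is absent
            return False
        self._entries[key] = value
        return True

    def delete(self, key):
        return self._entries.pop(key, None) is not None
```

Here `insert` returning `False` mirrors `map-insert!` refusing to overwrite an existing entry.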
We chose to use data maps as opposed to other data structures for two
reasons:
1. The simplicity of data maps allows for both a simple implementation
within the VM, and easier reasoning about functions. By inspecting a
given function definition, it is clear which maps will be modified and
even within those maps, which keys are affected by a given invocation.
2. The interface of data maps ensures that the return types of map
operations are _fixed length_, which is a requirement for static
analysis of smart contracts' runtime, costs, and other properties.
A smart contract defines the data schema of a data map with the
`define-map` call. The `define-map` function may only be called in the
top-level of the smart-contract (similar to `define-private`). This
function accepts a name for the map, and a definition of the structure
of the key and value types. Each of these is a list of `(name, type)`
pairs, and they specify the input and output type of `map-get`.
Types are either the values `'principal`, `'integer`, `'bool` or
the output of a call to `(buffer n)`, which defines an n-byte
fixed-length buffer.
This interface, as described, disallows range-queries and
queries-by-prefix on data maps. Within a smart contract function,
you cannot iterate over an entire map.
### Record Type Syntax
To support the use of _named_ fields in keys and values, our language
allows the construction of named tuples using a function `(tuple ...)`,
e.g.,
```
(define-constant imaginary-number-a (tuple (real 1) (i 2)))
(define-constant imaginary-number-b (tuple (real 2) (i 3)))
```
This allows for creating named tuples on the fly, which is useful for
data maps where the keys and values are themselves named tuples. To
access a named value of a given tuple, the function `(get #name
tuple)` will return that item from the tuple.
### Time-shifted Evaluations
The Stacks language supports _historical_ data queries using the
`(at-block)` function:
```
(at-block 0x0101010101010101010101010101010101010101010101010101010101010101
; returns owner principal of name represented by integer 12013
; at the time of block 0x010101...
(map-get name-map 12013))
```
This function evaluates the supplied closure as if evaluated at the end of
the supplied block, returning the resulting value. The supplied
closure _must_ be read-only (is checked by the analysis).
The supplied block hash must correspond to a known block in the same
fork as the current block, otherwise a runtime error will occur and the
containing transaction will _fail_. Note that if the supplied block
pre-dates any of the data structures being read within the closure (i.e.,
the block is before the block that constructed a data map), a runtime
error will occur and the transaction will _fail_.
## Library Support and Syntactic Sugar
There are a number of ways that the developer experience can be
improved through the careful addition of syntactic sugar. For example,
the only atomic types supported by the smart contract language
are integers, buffers, booleans, and principals; if a developer
wishes to use a buffer to represent a fixed-length string, we should
support syntax for representing a buffer literal using something like
an ASCII string.
generation libraries, where buffer arguments may be supplied strings
which are then automatically converted to buffers. There are many
possible syntactic improvements and we expect that over the course
of developing the prototype, we will have a better sense for which
of those improvements we should support. Any such syntactic changes
will appear in an eventual language specification, but we believe
them to be out of scope for this proposal.
# Static Analysis
One of the design goals of our smart contracting language was the
ability to statically analyze smart contracts to obtain accurate
upper-bound estimates of transaction costs (i.e., runtime and storage
requirements) as a function of input lengths. By limiting the types
supported, the ability to recurse, and the ability to iterate, we
believe that the language as presented is amenable to such static
analysis based on initial investigations.
The essential step in demonstrating the possibility of accurate and
useful analysis of our smart contract definitions is demonstrating
that any function within the language specification has an output
length bounded by a constant factor of the input length. If we can
demonstrate this, then statically computing runtime or space
requirements involves merely associating each function in the language
specification with a way to statically determine cost as a function of
input length.
Notably, the fact that the cost functions produced by static analysis
are functions of _input length_ means the following things:
1. The cost of a cross-contract call can be "memoized", such
that a static analyzer _does not_ need to recompute any
static analysis on the callee when analyzing a caller.
2. The cost of a given public function on a given input size
_is always the same_, meaning that smart contract developers
do not need to reason about different cases in which a given
function may cost more or less to execute.
## Bounding Function Output Length
Importantly, our smart contracting language does not allow the
creation of variable length lists: there are no `list` or
`cons` constructors, and buffer lengths must be statically
defined. Under such requirements (and given that recursion is
illegal), determining the output lengths of functions is rather
directly achievable. To see this, consider computing the
output lengths for the only functions allowed to iterate in the
language:
```
outputLen(map f list<t>) := Len(list<t>) * outputLen(f t)
outputLen(filter f list<t>) := Len(list<t>)
outputLen(fold f list<t> s) := Len(s)
```
Many functions within the language will output values larger than the
function's input, _however_, these outputs will be bound by
statically inferable constants. For example, the data function
_map-get_ will always return an object whose size is equal
to the specified value type of the map.
A complete proof for the static runtime analysis of smart contracts
will be included with the implementation of the language.
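The three bounding rules above can be transcribed directly; an illustrative sketch in which `f_bound` stands for the statically inferred constant bound on `outputLen(f t)` (the function names are assumptions):

```python
def output_len_map(input_len: int, f_bound: int) -> int:
    # outputLen(map f list<t>) = Len(list<t>) * outputLen(f t)
    return input_len * f_bound

def output_len_filter(input_len: int) -> int:
    # outputLen(filter f list<t>) = Len(list<t>): filter never grows a list
    return input_len

def output_len_fold(accumulator_len: int) -> int:
    # outputLen(fold f list<t> s) = Len(s): the accumulator's size bounds the result
    return accumulator_len
```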
# Deploying the Smart Contract
Smart contracts on the Stacks blockchain will be deployed directly as
source code. The goal of the smart contracting language is that the
code of the contract defines the _ground truth_ about the intended
functionality of the contract. While seemingly banal, many systems
chose instead to use a compiler to translate from a friendly
high-level language to a lower-level language deployed on the
blockchain. Such an architecture is needlessly dangerous. A bug in
such a compiler could lead to a bug in a deployed smart contract when
no such bug exists in the original source. This is problematic for
recovery --- a hard fork to "undo" any should-have-been invalid
transactions would be contentious and potentially create a rift in the
community, especially as it will not be easy to deduce which contracts
exactly were affected and for how long. In contrast, bugs in the VM
itself present a more clear case for a hard fork: the smart contract
was defined correctly, as everyone can see directly on the chain, but
illegal transactions were incorrectly marked as valid.
# Virtual Machine API
From the perspective of other components of `blockstack-core`, the
smart contracting VM will provide the following interface:
```
connect-to-database(db)
publish-contract(
contract-source-code)
returns: contract-identifier
execute-contract(
contract-identifier,
transaction-name,
sender-principal,
transaction-arguments)
returns: true or false if the transaction executed successfully
```
## Invocation and Static Analysis
When processing a client transaction, a `blockstack-core` node will do
one of two things, depending on whether that transaction is a contract
function invocation, or is attempting to publish a new smart contract.
### Contract function invocation
Any transaction which invokes a smart contract will be included in the
blockchain. This is true even for transactions which are
_invalid_. This is because _validating_ an invalid transaction is not
a free operation. The only exceptions to this are transactions which
do not pay more than either a minimum fee or a storage fee
corresponding to the length of the transaction. Transactions which do
not both pay a storage fee and clear the minimum transaction fee are
dropped from the mempool.
To process a function invocation, `blockstack-core` does the following:
1. Get the balance of the sender's account. If it's less than the tx fee,
then `RETURN INVALID`.
2. Otherwise, debit the user's account by the tx fee.
3. Look up the contract by hash. If it does not exist, then `RETURN
INVALID`.
4. Look up the contract's `define-public` function and compare the
tx's arguments against it. If the tx does not call an existing
method, or supplies invalid arguments, then `RETURN INVALID`.
5. Look up the cost to execute the given function, and if it is greater
than the paid tx fee, `RETURN INVALID`.
6. Execute the public function code and commit the effects of running
the code and `RETURN OK`
### Publish contract
A transaction which creates a new smart contract must pay a fee which
funds the static analysis required to determine the cost of the new
smart contract's public functions. To process such a transaction,
`blockstack-core` will:
1. Check the sender's account balance. If zero, then `RETURN INVALID`
2. Check the tx fee against the user's balance. If it's higher, then `RETURN INVALID`
3. Debit the tx fee from the user's balance.
4. Check the syntax, calculating the fee of verifying each code
item. If the cost of checking the next item exceeds the tx fee, or
if the syntax is invalid, then `RETURN INVALID`.
5. Build the AST, and assign a fee for adding each AST item. If the
cost of adding the next item to the tree exceeds the tx fee (or if
the AST gets too big), then `RETURN INVALID`.
6. Walk the AST. Each step in the walk incurs a small fee. Do the
following while the tx fee is higher than the total cost incurred
by walking to the next node in the AST:
a. If the next node calls a contract method, then verify that
the contract exists and the method arguments match the contract's
`define-public` signature. If not, then `RETURN INVALID`.
b. Compute the runtime cost of each node in the AST, adding it
to the function's cost analysis.
7. Find all `define-map` calls to find all tables that need to
exist. Each step in this incurs a small fee.
8. Create all the tables if the cost of creating them is smaller than
the remaining tx fee. If not, then `RETURN INVALID`.
9. `RETURN OK`
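Steps 4-6 above describe a fee-metered analysis: each unit of work is charged against the remaining tx fee, and processing aborts the moment the budget runs out. A minimal sketch of that metering pattern (names and per-step costs are illustrative assumptions, not the real cost schedule):

```python
# Hedged sketch of fee-metered static analysis: every unit of work is
# charged against the remaining tx fee, and analysis stops as soon as
# the budget is exhausted.
class BudgetExhausted(Exception):
    pass

class FeeBudget:
    def __init__(self, fee):
        self.remaining = fee

    def charge(self, cost):
        # abort before doing work the fee cannot pay for
        if cost > self.remaining:
            raise BudgetExhausted()
        self.remaining -= cost

def analyze(ast_nodes, fee, cost_per_node=1):
    budget = FeeBudget(fee)
    try:
        for node in ast_nodes:
            budget.charge(cost_per_node)  # one small fee per AST step
        return "OK"
    except BudgetExhausted:
        return "INVALID"
```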
## Database Requirements and Transaction Accounting
The smart contract VM needs to interact with a database somewhat
directly: the effects of a `map-insert!` or `map-set!` call are
realized later in the execution of the same transaction. The database
will need to support fairly fine-grained rollbacks as some contract
calls within a transaction's execution may fail, triggering a
rollback, while the transaction execution continues and successfully
completes other database operations.
The database API provided to the smart contract VM, therefore, must be
capable of quickly responding to `map-get` queries, which are
essentially key-value _gets_ on the materialized view of the
operation log. The operation log itself is simply a log of the
`map-insert!` and `map-set!` calls. In addition to these
operations, the smart contract VM will be making token transfer calls.
The database log should track those operations as well.
In order to aid in accounting for the database operations created by a
given transaction, the underlying database should store, with each
operation entry, the corresponding transaction identifier. This will
be expanded in a future SIP to require the database to store enough
information to reconstruct each block, such that the blocks can be
relayed to bootstrapping peers.
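As a rough sketch of the rollback and logging requirements above, consider a key/value store with nested savepoints and a transaction-tagged operation log (all names here are hypothetical, not the actual database API):

```python
# Sketch of the fine-grained rollback behavior described above: nested
# savepoints over a key/value materialized view, plus an operation log
# tagged with the transaction identifier. Purely illustrative.
class ContractDB:
    def __init__(self):
        self.view = {}         # materialized view of the operation log
        self.op_log = []       # (txid, op, key, value) entries
        self._savepoints = []

    def begin(self):
        # one savepoint per nested contract call, so a failed inner
        # call can roll back while the outer transaction continues
        self._savepoints.append((dict(self.view), len(self.op_log)))

    def set(self, txid, key, value):
        self.view[key] = value
        self.op_log.append((txid, "set", key, value))

    def get(self, key):
        return self.view.get(key)

    def commit(self):
        self._savepoints.pop()

    def rollback(self):
        # restore the view and truncate the log to the savepoint
        self.view, log_len = self._savepoints.pop()
        del self.op_log[log_len:]
```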
# Clarity Type System
## Types
The Clarity language uses a strong static type system. Function arguments
and database schemas require specified types, and use of types is checked
during contract launch. The type system does _not_ have a universal
super type. The type system contains the following types:
* `(tuple (key-name-0 key-type-0) (key-name-1 key-type-1) ...)` -
a typed tuple with named fields.
* `(list max-len entry-type)` - a list of maximum length `max-len`, with
entries of type `entry-type`
* `(response ok-type err-type)` - object used by public functions to commit
their changes or abort. May be returned or used by other functions as
well, however, only public functions have the commit/abort behavior.
* `(optional some-type)` - an option type for objects that can either be
`(some value)` or `none`
* `(buff max-len)` := byte buffer of maximum length `max-len`.
* `principal` := object representing a principal (whether a contract principal
or standard principal).
* `bool` := boolean value (`true` or `false`)
* `int` := signed 128-bit integer
* `uint` := unsigned 128-bit integer
## Type Admission
**UnknownType**. The Clarity type system does not allow for specifying
an "unknown" type, however, in type analysis, unknown types may be
constructed and used by the analyzer. Such unknown types are used
_only_ in the admission rules for `response` and `optional` types
(i.e., the variant types).
Type admission in Clarity follows the following rules:
* Types will only admit objects of the same type, i.e., lists will only
admit lists, tuples only admit tuples, bools only admit bools.
* A tuple type `A` admits another tuple type `B` iff they have the exact same
key names, and every key type of `A` admits the corresponding key type of `B`.
* A list type `A` admits another list type `B` iff `A.max-len >= B.max-len` and
`A.entry-type` admits `B.entry-type`.
* A buffer type `A` admits another buffer type `B` iff `A.max-len >= B.max-len`.
* An optional type `A` admits another optional type `B` iff:
* `A.some-type` admits `B.some-type` _OR_ `B.some-type` is an unknown type:
this is the case if `B` only ever corresponds to `none`
* A response type `A` admits another response type `B` if one of the following is true:
* `A.ok-type` admits `B.ok-type` _AND_ `A.err-type` admits `B.err-type`
* `B.ok-type` is unknown _AND_ `A.err-type` admits `B.err-type`
* `B.err-type` is unknown _AND_ `A.ok-type` admits `B.ok-type`
* Principals, bools, ints, and uints only admit types of the exact same type.
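The admission rules above can be sketched directly. The type encoding below (tuples like `("buff", 20)`, with the string `"unknown"` standing in for the analyzer's unknown type) is an illustrative assumption, not Clarity's internal representation:

```python
# Sketch of the admission rules above for representative types.
# Encoding: ("int",), ("buff", max_len), ("list", max_len, entry_type),
# ("optional", some_type), ("response", ok_type, err_type);
# "unknown" marks the analyzer's unknown type.
def admits(a, b):
    if a[0] != b[0]:
        return False           # only same-kind types admit each other
    kind = a[0]
    if kind in ("int", "uint", "bool", "principal"):
        return True
    if kind == "buff":         # A.max-len >= B.max-len
        return a[1] >= b[1]
    if kind == "list":         # length bound plus entry-type admission
        return a[1] >= b[1] and admits(a[2], b[2])
    if kind == "optional":     # unknown covers the none-only case
        return b[1] == "unknown" or admits(a[1], b[1])
    if kind == "response":
        ok = b[1] == "unknown" or admits(a[1], b[1])
        err = b[2] == "unknown" or admits(a[2], b[2])
        # at least one arm of B must be concretely admitted
        concrete = b[1] != "unknown" or b[2] != "unknown"
        return ok and err and concrete
    raise ValueError(kind)
```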
Type admission is used for determining whether an object is a legal argument for
a function, or for insertion into the database. Type admission is _also_ used
during type analysis to determine the return types of functions. In particular,
a function's return type is the least common supertype of each type returned from any
control path in the function. For example:
```
(define-private (if-types (input bool))
(if input
(ok 1)
(err false)))
```
The return type of `if-types` is the least common supertype of `(ok
1)` and `(err false)` (i.e., the most restrictive type that contains
all returns). In this case, that type is `(response int bool)`. Because
Clarity _does not_ have a universal supertype, it may be impossible to
determine such a type. In these cases, the functions are illegal, and
will be rejected during type analysis.
# Measuring Transaction Costs for Fee Collection
Our smart contracting language admits static analysis to determine
many properties of transactions _before_ executing those
transactions. In particular, it allows for the VM to count the total
number of runtime operations required, the maximum amount of database
writes, and the maximum number of calls to any expensive primitive
functions like database reads or hash computations. Translating that
information into transaction costs, however, requires more than simply
counting those operations. It requires translating the operations into
a single cost metric (something like gas in Ethereum). Then, clients
can set the fee rate for that metric, and pay the corresponding
transaction fee. Notably, unlike Turing-complete smart contracting
languages, any such fees are known _before_ executing the transaction,
such that clients will no longer need to estimate gas fees. They will,
however, still need to estimate fee rates (much like Bitcoin clients
do today).
Developing such a cost metric is an important task that has
significant consequences. If the metric is a bad one, it could open up
the possibility of denial-of-service attacks against nodes in the
Stacks network. We leave the development of a cost metric to another
Stacks Improvement Proposal, as we believe that such a metric should
be designed by collecting real benchmarking data from something close
to a real system (such measurements will likely be collected through
a combination of hand-crafted benchmarks and fuzzing test suites).
### Maximum Operation Costs and Object Sizes
Even with a cost metric, it is a good idea to set maximums for the
cost of an operation, and the size of objects (like
buffers). Developing good values for constants such as maximum number
of database reads or writes per transaction, maximum size of buffers,
maximum number of arguments to a tuple, maximum size of a smart
contract definition, etc. is a process much like developing a
cost metric --- this is something best done in tandem with the
production of a prototype. However, we should note that we do intend
to set such limits.
# Example: Simple Naming System
To demonstrate the expressiveness of this smart contracting language,
let's look at an example smart contract which implements a simple
naming system with just two kinds of transactions: _preorder_ and
_register_. The requirements of the system are as follows:
1. Names may only be owned by one principal
2. A register is only allowed if there is a corresponding preorder
with a matching hash
3. A register transaction must be signed by the same principal who
paid for the preorder
4. A preorder must have paid at least the price of the name. Names
are represented as integers, and any name less than 100000 costs
1000 microstacks, while all other names cost 100 microstacks.
5. Preorder hashes are _globally_ unique.
In this simple scheme, names are represented by integers, but in
practice, a buffer would probably be used.
```scheme
(define-constant burn-address '1111111111111111111114oLvT2)
(define-private (price-function name)
  (if (< name 100000) 1000 100))

(define-map name-map
  { name: int } { buyer: principal })
(define-map preorder-map
  { name-hash: (buff 20) }
  { buyer: principal, paid: int })

(define-public (preorder
                (name-hash (buff 20))
                (name-price int))
  (if (and (is-ok? (stacks-transfer!
                     name-price burn-address))
           (map-insert! preorder-map
                        (tuple (name-hash name-hash))
                        (tuple (paid name-price)
                               (buyer tx-sender))))
      (ok 0)
      (err 1)))

(define-public (register
                (recipient-principal principal)
                (name int)
                (salt int))
  (let ((preorder-entry
          (map-get preorder-map
                   (tuple (name-hash (hash160 name salt)))))
        (name-entry
          (map-get name-map (tuple (name name)))))
    (if (and
         ;; must be preordered
         (not (is-none? preorder-entry))
         ;; name shouldn't *already* exist
         (is-none? name-entry)
         ;; preorder must have paid enough
         (<= (price-function name)
             (default-to 0 (get paid preorder-entry)))
         ;; preorder must have been made by the current principal
         (eq? tx-sender
              (expects! (get buyer preorder-entry) (err 1)))
         ;; record the new owner
         (map-insert! name-map
                      (tuple (name name))
                      (tuple (buyer recipient-principal))))
        (ok 0)
        (err 1))))
```
Note that Blockstack PBC intends to supply a full BNS (Blockstack
Naming System) smart contract, as well as formal proofs that certain
desirable properties hold (e.g. "names are globally unique", "a
revoked name cannot be updated or transferred", "names cost stacks
based on their namespace price function", "only the principal can
reveal a name on registration", etc.).
This SIP is now located in the [stacksgov/sips repository](https://github.com/stacksgov/sips/blob/main/sips/sip-002/sip-002-smart-contract-language.md) as part of the [Stacks Community Governance organization](https://github.com/stacksgov).

# SIP-004 Cryptographic Commitment to Materialized Views
## Preamble
This document formerly contained SIP-004 before the Stacks 2.0 mainnet launched.
Title: Cryptographic Commitment to Materialized Views
Author: Jude Nelson <jude@blockstack.com>
Status: Draft
Type: Standard
Created: 7/15/2019
License: BSD 2-Clause
## Abstract
Blockchain peers are replicated state machines, and as such, must maintain a
materialized view of all of the state the transaction log represents in order to
validate a subsequent transaction. The Stacks blockchain in particular not only
maintains a materialized view of the state of every fork, but also requires
miners to cryptographically commit to that view whenever they mine a block.
This document describes a **Merklized Adaptive Radix Forest** (MARF), an
authenticated index data structure for efficiently encoding a
cryptographic commitment to blockchain state.
The MARF's structure is part of the consensus logic in the Stacks blockchain --
every Stacks peer must process the MARF the same way. Stacks miners announce
a cryptographic hash of their chain tip's MARF in the blocks they produce, and in
doing so, demonstrate to each peer and each light client that they have
applied the block's transactions to the peer's state correctly.
The MARF represents blockchain state as an authenticated directory. State is
represented as key/value pairs. The MARF structure gives a peer the ability to
prove to a light client that a particular key has a particular value, given the
MARF's cryptographic hash. The proof has _O(log B)_ space for _B_ blocks, and
takes _O(log B)_ time complexity to produce and verify. In addition, it offers
_O(1)_ expected time and space complexity for inserts and queries.
The MARF proof allows a light client to determine:
* What the value of a particular key is,
* How much cumulative energy has been spent to produce the key/value pair,
* How many confirmations the key/value pair has.
## Rationale
In order to generate a valid transaction, a blockchain client needs to be able
to query the current state of the blockchain. For example, in Bitcoin, a client
needs to query its unspent transaction outputs (UTXOs) in order to satisfy their
spending conditions in a new transaction. As another example, in Ethereum, a
client needs to query its accounts' current nonces in order to generate a valid
transaction to spend their tokens.
Whether or not a blockchain's peers are required to commit to the current state
in the blocks themselves (i.e. as part of the consensus logic) is a
philosophical decision. We argue that it is highly desirable in Blockstack's
case, since it affords light clients more security when querying the blockchain
state than they would otherwise have. This is because a client often queries
state that was last updated several blocks in the past (i.e., is
"confirmed"). If a blockchain peer can prove to
a client that a particular key in the state has a particular value, and was last
updated a certain number of blocks in the past, then the client can determine
whether or not to trust the peer's proof based on factors beyond simply trusting
the remote peer to be honest. In particular, the client can determine how
difficult it would be to generate a dishonest proof, in terms of the number of
blocks that would need to be maliciously crafted and accepted by the network.
This offers clients some protection against peers that would lie to them -- a
lying peer would need to spend a large amount of energy (and money) in order to
do so.
Specific to Blockstack, we envision that many applications will run
their own Stacks-based blockchain peer networks that operate "on top" of the
Stacks blockchain through proof-of-burn. This means that the Blockstack
application ecosystem will have many parallel "app chains" that users may wish
to interact with. While a cautious power user may run validator nodes for each
app chain they are interested in, we expect that most users will not do so,
especially if they are just trying out the application or are casual users. In
order to afford these users better security than simply telling them to find a
trusted validating peer, it is essential that each Stacks peer commits to its
materialized view in each block.
On top of providing better security to light clients, committing to the materialized
state view in each block has the additional benefit of helping the peer network
detect malfunctioning miners early on. A malfunctioning miner will calculate a
different materialized view using the same transactions, and with overwhelmingly
high probability, will also calculate a different state view hash. This makes
it easy for a blockchain's peers to reject a block produced in this manner
outright, without having to replay its transactions.
### Design Considerations
Committing to the materialized view in each block has a non-zero cost in terms
of time and space complexity. Given that Stacks miners use PoW to increase
their chances of winning a block race, the time required to calculate
the materialized view necessarily cuts into the time
required to solve the PoW puzzle -- it is part of the block validation logic.
While this is a cost borne by each miner, the fact that PoW mining is a zero-sum game
means that miners that are able to calculate the materialized view the fastest will have a
better chance of winning a block race than those who do not. This means that it
is of paramount importance to keep the materialized view digest calculation as
fast as possible, just as it is of paramount importance to make block
validation as fast and cheap as possible.
The following considerations have a non-trivial impact on the design of the
MARF:
**A transaction can read or write any prior state in the same fork.** This
means that the index must support fast random-access reads and fast
random writes.
**The Stacks blockchain can fork, and a miner can produce a fork at any block
height in the past.** As argued in SIP 001, a Stacks blockchain peer must process
all forks and keep their blocks around. This also means that a peer needs to
calculate and validate the materialized view of each fork, no matter where it
occurs. This is also necessary because a client may request a proof for some
state in any fork -- in order to service such requests, the peer must calculate
the materialized view for all forks.
**Forks can occur in any order, and blocks can arrive in any order.** As such,
the runtime cost of calculating the materialized view must be _independent_ of the
order in which forks are produced, as well as the order in which their blocks
arrive. This is required in order to avoid denial-of-service vulnerabilities,
whereby a network attacker can control the schedules of both
forks and block arrivals in a bid to force each peer to expend resources
validating the fork. It must be impossible for an attacker to
significantly slow down the peer network by maliciously varying either schedule.
This has non-trivial consequences for the design of the data structures for
encoding materialized views.
## Specification
The Stacks peer's materialized view is realized as a flat key/value store.
Transactions encode zero or more creates, inserts, updates, and deletes on this
key/value store. As a consequence of needing to support forks from any prior block,
no data is ever removed; instead, a "delete" on a particular key is encoded
by replacing the value with a tombstone record. The materialized view is the
subset of key/value pairs that belong to a particular fork in the blockchain.
The Stacks blockchain separates the concern of maintaining _authenticated
index_ over data from storing a copy of the data itself. The blockchain peers
commit to the digest of the authenticated index, but can store the data however
they want. The authenticated index is realized as a _Merklized Adaptive Radix
Forest_ (MARF). The MARF gives Stacks peers the ability to prove that a
particular key in the materialized view maps to a particular value in a
particular fork.
A MARF has two principal data structures: a _merklized adaptive radix trie_
for each block and a _merklized skip-list_ that
cryptographically links merklized adaptive radix tries in prior blocks to the
current block.
### Merklized Adaptive Radix Tries (ARTs)
An _adaptive radix trie_ (ART) is a prefix tree where each node's branching
factor varies with the number of children. In particular, a node's branching
factor increases according to a schedule (0, 4, 16, 48, 256) as more and more
children are added. This behavior, combined with the usual sparse trie
optimizations of _lazy expansion_ and _path compression_, produce a tree-like
index over a set of key/value pairs that is _shallower_ than a perfectly-balanced
binary search tree over the same values. Details on the analysis of ARTs can
be found in [1].
To produce an _index_ over new state introduced in this block, the Stacks peer
will produce an adaptive radix trie that describes each key/value pair modified.
In particular, for each key affected by the block, the Stacks peer will:
* Calculate the hash of the key to get a fixed-length trie path,
* Store the new value and this hash into its data store,
* Insert or update the associated value hash in the block's ART at the trie path,
* Calculate the new Merkle root of the ART by hashing all modified intermediate
nodes along the path.
In doing so, the Stacks peer produces an authenticated index for all key/value
pairs affected by a block. The leaves of the ART are the hashes of the values,
and the hashes produced in each intermediate node and root give the peer a
way to cryptographically prove that a particular value is present in the ART
(given the root hash and the key).
The Stacks blockchain employs _path compression_ and _lazy expansion_
to efficiently represent all key/value pairs while minimizing the number of trie
nodes. That is, if two children share a common prefix, the prefix bytes are
stored in a single intermediate node instead of being spread across multiple
intermediate nodes (path compression). In the special case where a path suffix
uniquely identifies the leaf, the path suffix will be stored alongside the leaf
instead as a sequence of intermediate nodes (lazy expansion). As more and more
key/value pairs are inserted, intermediate nodes and leaves with multi-byte
paths will be split into more nodes.
**Trie Structure**
A trie is made up of nodes with radix 4, 16, 48, or 256, as well as leaves. In
the documentation below, these are called `node4`, `node16`, `node48`,
`node256`, and `leaf` nodes. An empty trie has a single `node256` as its root.
Child pointers occupy one byte.
**Notation**
The notation `(ab)node256` means "a `node256` who descends from its parent via
byte 0xab".
The notation `node256[path=abcd]` means "a `node256` that has a shared prefix
with its children `abcd`".
**Lazy Expansion**
If a leaf has a non-zero-byte path suffix, and another leaf is inserted that
shares part of the suffix, the common bytes will be split off of the existing
leaf to form a `node4`, whose two immediate children are the two leaves. Each
of the two leaves will store the path bytes that are unique to them. For
example, consider this trie with a root `node256` and a single leaf, located at
path `aabbccddeeff00112233` and having value hash `123456`:
```
node256
\
(aa)leaf[path=bbccddeeff00112233]=123456
```
If the peer inserts the value hash `98765` at path `aabbccddeeff99887766`, the
single leaf's path will be split into a shared prefix and two distinct suffixes,
as follows:
```
insert (aabbccddeeff99887766, 98765)
node256 (00)leaf[path=112233]=123456
\ /
(aa)node4[path=bbccddeeff]
\
(99)leaf[path=887766]=98765
```
Now, the trie encodes both `aabbccddeeff00112233=123456` and
`aabbccddeeff99887766=98765`.
**Node Promotion**
As a node with a small radix gains children, it will eventually need to be
promoted to a node with a higher radix. A `node4` will become a `node16` when
it receives its 5th child; a `node16` will become a `node48` when it receives
its 17th child, and a `node48` will become a `node256` when it receives its 49th
child. A `node256` will never need to be promoted, because it has slots for
child pointers with all possible byte values.
For example, consider this trie with a `node4` and 4 children:
```
node256 (00)leaf[path=112233]=123456
\ /
\ / (01)leaf[path=445566]=67890
\ / /
(aa)node4[path=bbccddeeff]---
\ \
\ (02)leaf[path=778899]=abcdef
\
(99)leaf[path=887766]=98765
```
This trie encodes the following:
* `aabbccddeeff00112233=123456`
* `aabbccddeeff01445566=67890`
* `aabbccddeeff02778899=abcdef`
* `aabbccddeeff99887766=98765`
Inserting one more node with a prefix `aabbccddeeff` will promote the
intermediate `node4` to a `node16`:
```
insert (aabbccddeeff03aabbcc, 314159)
node256 (00)leaf[path=112233]=123456
\ /
\ / (01)leaf[path=445566]=67890
\ / /
(aa)node16[path=bbccddeeff]-----(03)leaf[path=aabbcc]=314159
\ \
\ (02)leaf[path=778899]=abcdef
\
(99)leaf[path=887766]=98765
```
The trie now encodes the following:
* `aabbccddeeff00112233=123456`
* `aabbccddeeff01445566=67890`
* `aabbccddeeff03aabbcc=314159`
* `aabbccddeeff02778899=abcdef`
* `aabbccddeeff99887766=98765`
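The promotion schedule above maps a child count to the smallest node class that can hold it. As a one-function sketch:

```python
# Sketch of the node-promotion schedule described above: node4 holds up
# to 4 children, node16 up to 16, node48 up to 48, node256 up to 256.
def node_class(num_children):
    for radix in (4, 16, 48, 256):
        if num_children <= radix:
            return "node%d" % radix
    raise ValueError("a byte-keyed node cannot have more than 256 children")
```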
**Path Compression**
Intermediate nodes, such as the `node16` in the previous example, store path
prefixes shared by all of their children. If a node is inserted that shares
some of this prefix, but not all of it, the path is "decompressed" -- a new
leaf is "spliced" into the compressed path, and attached to a `node4` whose two
children are the leaf and the existing node (i.e. the `node16` in this case)
whose shared path now contains the suffix unique to its children, but distinct
from the newly-spliced leaf.
For example, consider this trie with the intermediate `node16` sharing a path
prefix `bbccddeeff` with its 5 children:
```
node256 (00)leaf[path=112233]=123456
\ /
\ / (01)leaf[path=445566]=67890
\ / /
(aa)node16[path=bbccddeeff]-----(03)leaf[path=aabbcc]=314159
\ \
\ (02)leaf[path=778899]=abcdef
\
(99)leaf[path=887766]=98765
```
This trie encodes the following:
* `aabbccddeeff00112233=123456`
* `aabbccddeeff01445566=67890`
* `aabbccddeeff03aabbcc=314159`
* `aabbccddeeff02778899=abcdef`
* `aabbccddeeff99887766=98765`
If we inserted `(aabbcc00112233445566, 21878)`, the `node16`'s path would be
decompressed to `eeff`, a leaf carrying the remaining suffix `112233445566` would
be spliced in via a `node4`, and the `node4` would have the shared path prefix
`bbcc` with its now-child `node16` and leaf.
```
insert (aabbcc00112233445566, 21878)
(00)leaf[path=112233445566]=21878
/
node256 / (00)leaf[path=112233]=123456
\ / /
(aa)node4[path=bbcc] / (01)leaf[path=445566]=67890
\ / /
(dd)node16[path=eeff]-----(03)leaf[path=aabbcc]=314159
\ \
\ (02)leaf[path=778899]=abcdef
\
(99)leaf[path=887766]=98765
```
The resulting trie now encodes the following:
* `aabbcc00112233445566=21878`
* `aabbccddeeff00112233=123456`
* `aabbccddeeff01445566=67890`
* `aabbccddeeff03aabbcc=314159`
* `aabbccddeeff02778899=abcdef`
* `aabbccddeeff99887766=98765`
### Back-pointers
The materialized view of a fork will hold key/value pairs for data produced by
applying _all transactions_ in that fork, not just the ones in the last block. As such,
the index over all key/value pairs in a fork is encoded in the sequence of
its block's merklized ARTs.
To ensure that random reads and writes on a fork's materialized view remain
fast no matter which block added them, a child pointer in an ART can point to
either a node in the same ART, or a node with the same path in a prior ART. For
example, if the ART at block _N_ has a `node16` whose path is `aabbccddeeff`, and 10
blocks ago a leaf was inserted at path `aabbccddeeff99887766`, it will
contain a child pointer to the intermediate node from 10 blocks ago whose path is
`aabbccddeeff` and who has a child node in slot `0x99`. This information is encoded
as a _back-pointer_. To see it visually:
```
At block N
node256 (00)leaf[path=112233]=123456
\ /
\ / (01)leaf[path=445566]=67890
\ / /
(aa)node16[path=bbccddeeff]-----(03)leaf[path=aabbcc]=314159
\ \
\ (02)leaf[path=778899]=abcdef
\
|
|
|
At block N-10 - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - -
|
node256 | /* back-pointer to block N-10 */
\ |
\ |
\ |
(aa)node4[path=bbccddeeff] |
\ |
\ |
\ |
(99)leaf[path=887766]=98765
```
By maintaining trie child pointers this way, the act of looking up a path to a value in
a previous block is a matter of following back-pointers to previous tries.
This back-pointer uses the _block-hash_ of the previous block to uniquely identify
the block. In order to keep the in-memory and on-disk representations of trie nodes succinct,
the MARF structure uses a locally defined unsigned 32-bit integer to identify the previous
block, along with a local mapping of such integers to the respective block header hash.
Back-pointers are calculated in a copy-on-write fashion when calculating the ART
for the next block. When the root node for the ART at block N+1 is created, all
of its children are set to back-pointers that point to the immediate children of
the root of block N's ART. Then, when inserting a key/value pair, the peer
walks the current ART to the insertion point, but whenever a
back-pointer is encountered, it copies the node it points to into the current
ART, and sets all of its non-empty child pointers to back-pointers. The peer
then continues traversing the ART until the insertion point is found (i.e. a
node has an unallocated child pointer where the leaf should go), copying
over intermediate nodes lazily.
For example, consider the act of inserting `aabbccddeeff00112233=123456` into an
ART where a previous ART contains the key/value pair
`aabbccddeeff99887766=98765`:
```
At block N
node256 (00)leaf[path=112233]=123456
^ \ /
| \ /
| \ /
| (aa)node4[path=bbccddeeff]
| ^ \
| | \
| /* 1. @root. */ | /* 2. @node4. */ \ /* 3. 00 is empty, so insert */
| /* copy up, &*/ | /* copy up, & */ |
| /* make back-*/ | /* make back- */ |
| /* ptr to aa */ | /* ptr to 99 */ |
| | |
|- At block N-10 -|- - - - - - - - - - | - - - - - - - - - - - - - - - - - -
| | |
node256 | |
\ | |
\ | |
\ | |
(aa)node4[path=bbccddeeff] |
\ |
\ |
\|
(99)leaf[path=887766]=98765
```
In step 1, the `node256` in block _N_ would have a back-pointer to the `node4` in
block _N - 10_ in child slot `0xaa`. While walking path `aabbccddeeff00112233`,
the peer would follow slot `0xaa` to the `node4` in block _N - 10_ and copy it
into block _N_, and would set its child pointer at `0x99` to be a back-pointer
to the `leaf` in block _N - 10_. It would then step to the `node4` it copied,
and walk path bytes `bbccddeeff`. When it reaches child slot `0x00`, the peer
sees that it is unallocated, and attaches the leaf with the unexpanded path
suffix `112233`. The back-pointer to `aabbccddeeff99887766=98765` is thus
preserved in block _N_'s ART.
**Calculating the Root Hash with Back-pointers**
For reasons that will be explained in a moment, the hash of a child node that is a
back-pointer is not calculated the usual way when calculating the root hash of
the Merklized ART. Instead of taking the hash of the child node (as would be
done for a child in the same ART), the hash of the _block header_ is used
instead. In the above example, the hash of the `leaf` node whose path is
`aabbccddeeff99887766` would be the hash of block _N - 10_'s header, whereas the
hash of the `leaf` node whose path is `aabbccddeeff00112233` would be the hash
of the value hash `123456`.
The main reason for doing this is to keep block validation time down by a
significant constant factor. The block header hash is always kept in RAM,
but at least one disk seek is required to read the hash of a child in a separate
ART (and it often takes more than one seek). This does not sacrifice the security
of a Merkle proof of `aabbccddeeff99887766=98765`, but it does alter the mechanics
of calculating and verifying it.
### Merklized Skip-list
The second principal data structure in a MARF is a Merklized skip-list encoded
from the block header hashes and ART root hashes in each block. The hash of the
root node in the ART for block _N_ is derived not only from the hash of the
root's children, but also from the hashes of the block headers from blocks
`N - 1`, `N - 2`, `N - 4`, `N - 8`, `N - 16`, and so on. This constitutes
a _Merklized skip-list_ over the sequence of ARTs.
The reason for encoding the root node's hash this way is to make it possible for
peers to create a cryptographic proof that a particular key maps to a particular
value when the value lives in a prior block, and can only be accessed by
following one or more back-pointers. In addition, the Merkle skip-list affords
a client _two_ ways to verify key-value pairs: the client only needs either (1)
a known-good root hash, or (2) the sequence of block headers for the Stacks
chain and its underlying burn chain. Having (2) allows the client to determine
(1), but calculating (2) is expensive for a client doing a small number of
queries. For this reason, both options are supported.
#### Resolving Block Height Queries
For a variety of reasons, the MARF structure must be able to resolve
queries mapping from block heights (or relative block heights) to
block header hashes and vice-versa --- for example, the Clarity VM
allows contracts to inspect this information. Most applicable to the
MARF, though, is that in order to find the ancestor hashes to include
in the Merklized Skip-list, the data structure must be able to find
the block headers which are 1, 2, 4, 8, 16, ... blocks prior in the
same fork. This could be discovered by walking backwards from the
current block, using the previous block header to step back through
the fork's history. However, such a process would require _O(N)_ steps
(where _N_ is the current block height). But, if a mapping exists for
discovering the block at a given block height, this process would instead
be _O(1)_ (because a node will have at most 32 such ancestors).
But correctly implementing such a mapping is not trivial: a given
height could resolve to different blocks in different forks. However,
the MARF itself is designed to handle exactly these kinds of
queries. As such, at the beginning of each new block, the MARF inserts
into the block's trie two entries:
1. This block's block header hash -> this block's height.
2. This block's height -> this block's block header hash.
This mapping allows the ancestor hash calculation to proceed.
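A minimal sketch of the two insertions, modeling the block's trie as a dictionary (the key-prefix scheme here is illustrative, not the on-disk encoding):

```python
def insert_height_mappings(trie, block_header_hash, height):
    """At the start of each block, record both directions of the
    height <-> header-hash mapping so ancestor lookups are O(1)."""
    trie["hash:" + block_header_hash] = height        # hash -> height
    trie["height:%d" % height] = block_header_hash    # height -> hash
```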
### MARF Merkle Proofs
A Merkle proof for a MARF is constructed using a combination of two types of
sub-proofs: _segment proofs_, and _shunt proofs_. A _segment proof_ is a proof
that a node belongs to a particular Merklized ART. It is simply a Merkle tree
proof. A _shunt proof_ is a proof that the ART for block _N_ is exactly _K_
blocks away from the ART at block _N - K_. It is generated as a Merkle proof
from the Merkle skip-list.
Calculating a MARF Merkle proof is done by first calculating a segment proof for a
sequence of path prefixes, such that all the nodes in a single prefix are in the
same ART. To do so, the node walks from the current block's ART's root node
down to the leaf in question, and each time it encounters a back-pointer, it
generates a segment proof from the _currently-visited_ ART to the intermediate
node whose child is the back-pointer to follow. If a path contains _i_
back-pointers, then there will be _i+1_ segment proofs.
Once the peer has calculated each segment proof, it calculates a shunt proof
that shows that the _i+1_th segment was reached by walking back a given number
of blocks from the _i_th segment by following the _i_th segment's back-pointer.
The final shunt proof for the ART that contains the leaf node includes all of
the prior block header hashes that went into producing its root node's hash.
Each shunt proof is a sequence of sequences of block header hashes and ART root
hashes, such that the hash of the next ART root node can be calculated from the
previous sequence.
For example, consider the following ARTs:
```
At block N
node256 (00)leaf[path=112233]=123456
\ /
\ / (01)leaf[path=445566]=67890
\ / /
(aa)node16[path=bbccddeeff]-----(03)leaf[path=aabbcc]=314159
\ \
\ (02)leaf[path=778899]=abcdef
\
|
|
|
At block N-10 - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - -
|
node256 | /* back-pointer to N - 10 */
\ |
\ |
\ |
(aa)node4[path=bbccddeeff] |
\ |
\ |
\ |
(99)leaf[path=887766]=98765
```
To generate a MARF Merkle proof, the client queries a Stacks peer for a
particular value hash, and then requests the peer generate a proof that the key
and value must have been included in the calculation of the current block's ART
root hash (i.e. the digest of the materialized view of this fork).
For example, given the key/value pair `aabbccddeeff99887766=98765` and the hash
of the ART at block _N_, the peer would generate two segment proofs for the
following paths: `aabbccddeeff` in block _N_, and `aabbccddeeff99887766` in
block `N - 10`.
```
At block N
node256
\ /* this segment proof would contain the hashes of all other */
\ /* children of the root, except for the one at 0xaa. */
\
(aa)node16[path=bbccddeeff]
At block N-10 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
node256 /* this segment proof would contain two sequences of hashes: */
\ /* the hashes for all children of the root besides 0xaa, and */
\ /* the hashes of all children of the node4, except 0x99. */
\
(aa)node4[path=bbccddeeff]
\
\
\
(99)leaf[path=887766]=98765
```
Then, it would calculate two shunt proofs. The first proof, called the "head shunt proof,"
supplies the sequence of block hashes for blocks _N - 11, N - 12, N - 14, N - 18, N - 26, ..._ and the
hash of the children of the root node of the ART for block _N - 10_. This lets the
client calculate the hash of the root of the ART at block _N - 10_. The second
shunt proof (and all subsequent shunt proofs, if there are more back-pointers to
follow) is comprised of the hashes that "went into" calculating the hashes on the
skip-list from the next segment proof's root hash.
In detail, the second shunt proof would have two parts:
* the block header hashes for blocks _N - 9_, _N - 12_, _N - 16_, _N - 24_, ...
* the block header hashes for _N - 1_, _N - 2_, _N - 4_, _N - 16_, _N - 32_, ...
The reason there are two sequences in this shunt proof is because "walking back"
from block _N_ to block _N - 10_ requires walking first to block _N - 8_ (i.e.
following the skip-list column for 2 ** 3), and then walking to block _N - 10_
from _N - 8_ (i.e. following its skip-list column for 2 ** 1). The first segment
proof (i.e. with the leaf) lets the client calculate the hash of the children of
the ART root node in block _N - 10_, which when combined with the first part of
this shunt proof yields the ART root hash for _N - 8_. Then, the client
uses the hash of the children of the root node in the ART of block _N_ (calculated from the second segment
proof), combined with the root hash from block _N - 8_ and with the hashes
in the second piece of this shunt proof, to calculate the ART root hash for
block _N_. The proof is valid if this calculated root hash matches the root
hash for which it requested the proof.
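The hop sequence used in such a walk-back (e.g. 10 = 8 + 2) can be computed greedily; a sketch:

```python
def walk_back_hops(k):
    """Decompose a walk of k blocks into skip-list hops,
    taking the largest power-of-two hop first."""
    hops = []
    while k > 0:
        hop = 1 << (k.bit_length() - 1)   # largest power of two <= k
        hops.append(hop)
        k -= hop
    return hops
```

Walking from block _N_ to block _N - 10_ thus takes a hop of 8 followed by a hop of 2, matching the two parts of the shunt proof above.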
In order to fully verify the MARF Merkle proof, the client would verify that:
* The first segment proof's path's bytes are equal to the hash of the key for
which the proof was requested.
* The first segment proof ends in a leaf node, and the leaf node contains the
hash of the value for which the proof was requested.
* Each segment proof is valid -- the root hash could only be calculated from the
deepest intermediate node in the segment,
* Each subsequent segment proof was generated from a prefix of the path
represented by the current segment proof,
* Each back-pointer at the tail of each segment (except the one that terminates
in the leaf -- i.e. the first one) was a number of blocks back that is equal
to the number of blocks skipped over in the shunt proof linking it to the next
segment.
* Each block header was included in the fork the client is querying,
* Each block header was generated from its associated ART root hash,
* (Optional, but encouraged): The burn chain block headers demonstrate that the
correct difficulty rules were followed. This step can be skipped if the
client somehow already knows that the hash of block _N_ is valid.
Note that to verify the proof, the client would need to substitute the
_block header hash_ for each intermediate node at the tail of each segment
proof. The block header hash can either be obtained by fetching the block
headers for both the Stacks chain and burn chain _a priori_ and verifying that
they are valid, or by fetching them on-the-fly. The second strategy should only
be used if the client's root hash it submits to the peer is known out-of-band to
be the correct hash.
The security of the proof is similar to SPV proofs in Bitcoin -- the proof is
valid assuming the client is able to either verify that the final header hash
represents the true state of the network, or the client is able to fetch the
true burn chain block header sequence. The client has some assurance that a
_given_ header sequence is the _true_ header sequence, because the header
sequence encodes the proof-of-work that went into producing it. A header
sequence with a large amount of proof-of-work is assumed to be infeasible for an
attacker to produce -- i.e. only the majority of the burn chain's network hash
power could have produced the header chain. Regardless of which data the client
has, the usual security assumptions about confirmation depth apply -- a proof
that a key maps to a given value is valid only if the transaction that set
it is unlikely to be reversed by a chain reorg.
### Performance
The time and space complexity of a MARF is as follows:
* **Reads are _O(1)_.** While reads may traverse multiple tries, they are always
  descending the radix trie, and resolving back-pointers is constant time.
* **Inserts and updates are _O(1)._** Inserts have the same complexity
as reads, though they require more work by constant factors (in
particular, hash recalculations).
* **Creating a new block is _O(log B)_.** Inserting a block requires
including the Merkle skip-list hash in the root node of the new
ART. This is _log B_ work, where _B_ is chain length.
* **Creating a new fork is _O(log B)_.** Forks do not incur any overhead relative
to appending a block to a prior chain-tip.
* **Generating a proof is _O(log B)_ for B blocks**. This is the cost of
reading a fixed number of nodes, combined with walking the Merkle skip-list.
* **Verifying a proof is _O(log B)_**. This is the cost of verifying a fixed
number of fixed-length segments, and verifying a fixed number of _O(log B)_
shunt proof hashes.
* **Proof size is _O(log B)_**. A proof has a fixed number of segment proofs,
where each node has a constant size. It has _O(log B)_ hashes across all of
its shunt proofs.
### Consensus Details
The hash function used to generate a path from a key, as well as the hash
function used to generate a node hash, is SHA2-512/256. This was chosen because
it is extremely fast on 64-bit architectures, and is immune to length extension
attacks.
The hash of an intermediate node is the hash over the following data:
* a 1-byte node ID,
* the sequence of child pointer data (dependent on the type of node),
* the 1-byte length of the path prefix this node contains,
* the 0-to-32-byte path prefix
A single child pointer contains:
* a 1-byte node ID,
* a 1-byte path character,
* the 32-byte block header hash of the pointed-to block
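The preimage layout above can be sketched as follows. SHA2-512/256 is available through OpenSSL-backed `hashlib` builds; the tuple representation of a child pointer is an assumption of this sketch:

```python
import hashlib

def node_hash(node_id, child_pointers, path_prefix):
    """Hash of an intermediate node: 1-byte node ID, the child pointer
    data, the 1-byte path-prefix length, then the path prefix itself."""
    preimage = bytes([node_id])
    for child_id, path_char, block_header_hash in child_pointers:
        # each child pointer: 1-byte node ID, 1-byte path character,
        # 32-byte block header hash of the pointed-to block
        assert len(block_header_hash) == 32
        preimage += bytes([child_id, path_char]) + block_header_hash
    preimage += bytes([len(path_prefix)]) + path_prefix
    return hashlib.new("sha512_256", preimage).digest()
```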
A `node4`, `node16`, `node48`, and `node256` have arrays of 4,
16, 48, and 256 child pointers, respectively.
Children are listed in a `node4`, `node16`, and `node48`'s child pointer arrays in the
order in which they are inserted. While searching for a child in a `node4` or
`node16` requires a linear scan of the child pointer array, searching a `node48` is done
by looking up the child's index in its child pointer array using the
path character byte as an index into the `node48`'s 256-byte child pointer
index, and then using _that_ index to look up the child pointer. Children are
inserted into the child pointer array of a `node256` by using the 1-byte
path character as the index.
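The `node48` lookup described above can be sketched as follows (the empty-slot sentinel value is an assumption of this sketch, since at most 48 slots are ever used):

```python
class Node48:
    EMPTY = 0xFF                        # sentinel for "no child" (assumed)

    def __init__(self):
        self.child_index = bytearray([Node48.EMPTY] * 256)
        self.children = []              # child pointers in insertion order

    def insert(self, path_char, child):
        self.child_index[path_char] = len(self.children)
        self.children.append(child)

    def lookup(self, path_char):
        # the path character indexes the 256-byte child pointer index,
        # which in turn indexes the child pointer array
        slot = self.child_index[path_char]
        return None if slot == Node48.EMPTY else self.children[slot]
```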
The disk pointer stored in a child pointer, as well as the storage mechanism for
mapping hashes of values (leaves in the MARF) to the values themselves, are both
unspecified by the consensus rules. Any mechanism or representation is
permitted.
## Implementation
The implementation is in Rust, and is about 4,400 lines of code. It stores each
ART in a separate file, where each ART file contains the previous block's ART
root hash and the locally-defined block identifiers.
The implementation is crash-consistent. It builds up the ART for block _N_ in
RAM, dumps it to disk, and then `rename(2)`s it into place.
The implementation uses a Sqlite3 database to map values to their hashes. A
read on a given key will first pass through the ART to find hash(value), and
then query the Sqlite3 database for the value. Similarly, a write will first
insert hash(value) and value into the Sqlite3 database, and then insert
hash(key) to hash(value) in the MARF.
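This two-step read/write path can be sketched with an in-memory SQLite table standing in for the value store and a dictionary standing in for the MARF (the schema and names are illustrative, not the implementation's):

```python
import hashlib
import sqlite3

def h(data):
    # SIP-004's hash function, SHA2-512/256 (OpenSSL-backed hashlib)
    return hashlib.new("sha512_256", data).digest()

def marf_put(marf, db, key, value):
    # write: hash(value) -> value into SQLite, then
    # hash(key) -> hash(value) into the MARF
    db.execute("INSERT OR REPLACE INTO data VALUES (?, ?)", (h(value), value))
    marf[h(key)] = h(value)

def marf_get(marf, db, key):
    # read: the MARF resolves hash(key) to hash(value),
    # then SQLite yields the value itself
    value_hash = marf.get(h(key))
    if value_hash is None:
        return None
    row = db.execute("SELECT value FROM data WHERE value_hash = ?",
                     (value_hash,)).fetchone()
    return row[0] if row else None
```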
## References
[1] https://db.in.tum.de/~leis/papers/ART.pdf
This SIP is now located in the [stacksgov/sips repository](https://github.com/stacksgov/sips/blob/main/sips/sip-004/sip-004-materialized-view.md) as part of the [Stacks Community Governance organization](https://github.com/stacksgov).

# SIP-007: Stacking Consensus
# Preamble
This document formerly contained SIP-007 before the Stacks 2.0 mainnet launched.
Title: Stacking Consensus
Authors:
Muneeb Ali <muneeb@blockstack.com>,
Aaron Blankstein <aaron@blockstack.com>,
Michael J. Freedman <mfreed@cs.princeton.edu>,
Diwaker Gupta <diwaker@blockstack.com>,
Jude Nelson <jude@blockstack.com>,
Jesse Soslow <jesse@blockstack.com>,
Patrick Stanley <patrick@blockstack.com>
Status: Draft
Type: Standard
Created: 01/14/2020
# Abstract
This SIP proposes a new consensus algorithm, called Stacking, that
uses the proof-of-work cryptocurrency of an established blockchain to
secure a new blockchain. An economic benefit of the Stacking consensus
algorithm is that the holders of the new cryptocurrency can earn a
reward in a base cryptocurrency by actively participating in the
consensus algorithm.
This SIP proposes to change the mining mechanism of the Stacks
blockchain. [SIP-001](./sip-001-burn-election.md) introduced
proof-of-burn (PoB) where a base cryptocurrency is destroyed to
participate in mining of a new cryptocurrency. This proposal argues
that a new mining mechanism called proof-of-transfer (PoX) will be an
improvement over proof-of-burn.
With proof-of-transfer, instead of destroying the base cryptocurrency,
miners are required to distribute the base cryptocurrency to existing
holders of the new cryptocurrency who participate in the consensus
algorithm. Therefore, existing holders of the new cryptocurrency have
an economic incentive to participate, do useful work for the network,
and receive rewards.
Proof-of-transfer avoids burning of the base cryptocurrency which
destroys some supply of the base cryptocurrency. Stacking in general
can be viewed as a more "efficient" algorithm where instead of
destroying a valuable resource (like electricity or base
cryptocurrency), the valuable resource is distributed to holders of
the new cryptocurrency.
The SIP describes one potential implementation of the Stacking
consensus algorithm for the Stacks blockchain using Bitcoin as the
base cryptocurrency.
# Introduction
Consensus algorithms for public blockchains require computational or
financial resources to secure the blockchain state. Mining mechanisms
used by these algorithms are broadly divided into proof-of-work (PoW),
in which nodes dedicate computational resources, and proof-of-stake
(PoS), in which nodes dedicate financial resources. The intention
behind both proof-of-work and proof-of-stake is to make it practically
infeasible for any single malicious actor to have enough computational
power or ownership stake to attack the network.
With proof-of-work, a miner does some "work" that consumes electricity
and is rewarded with digital currency. The miner is, theoretically,
converting electricity and computing power into the newly minted
digital currency. Bitcoin is an example of this and is by far the
largest and most secure PoW blockchain.
With proof-of-stake, miners stake their holdings of a new digital
currency to participate in the consensus algorithm and bad behavior
can be penalized by "slashing" the funds of the miner. PoS requires
less energy/electricity to be consumed and can give holders of the new
cryptocurrency who participate in staking a reward on their holdings
in the new cryptocurrency.
In this SIP we introduce a new consensus algorithm called
Stacking. The Stacking consensus algorithm uses a new type of mining
mechanism called *proof-of-transfer* (PoX). With PoX, miners are not
converting electricity and computing power to newly minted tokens, nor
are they staking their cryptocurrency. Rather they use an existing PoW
cryptocurrency to secure a new, separate blockchain.
This SIP is currently a draft and proposes to change the mining
mechanism of the Stacks blockchain from proof-of-burn (SIP-001) to
proof-of-transfer.
The PoX mining mechanism is a modification of proof-of-burn (PoB)
mining (See
the [Blockstack Technical Whitepaper](https://blockstack.org/papers)
and [SIP-001](./sip-001-burn-election.md)). In
proof-of-burn mining, miners burn a base cryptocurrency to participate
in mining — effectively destroying the base cryptocurrency to mint
units of a new cryptocurrency. **In proof-of-transfer, rather than
destroying the base cryptocurrency, miners transfer the base
cryptocurrency as a reward to owners of the new cryptocurrency**. In
the case of the Stacks blockchain, miners would transfer Bitcoin to
owners of Stacks tokens in order for miners to receive newly-minted
Stacks tokens. The security properties of proof-of-transfer are
comparable to proof-of-burn.
# Stacking with Bitcoin
In the Stacking consensus protocol, we require the base cryptocurrency
to be a proof-of-work blockchain. In this proposed implementation of
Stacking we assume that the PoW crypto-currency is Bitcoin, given it
is by far the most secure PoW blockchain. Theoretically, other PoW
blockchains can be used, but the security properties of Bitcoin are
currently superior to other PoW blockchains.
As with PoB, in PoX, the protocol selects the winning miner (*i.e.*,
the leader) of a round using a verifiable random function (VRF). The
leader writes the new block of the Stacks blockchain and mints the
rewards (newly minted Stacks). However, instead of bitcoins being sent
to burn addresses, the bitcoins are sent to a set of specific
addresses corresponding to Stacks (STX) token holders that are adding
value to the network. Thus, rather than being destroyed, the bitcoins
consumed in the mining process go to productive Stacks holders as a
reward based on their holdings of Stacks and participation in the
Stacking algorithm.
# Stacking Consensus Algorithm
In addition to the normal tasks of PoB mining
(see [SIP-001](./sip-001-burn-election.md)), the Stacking consensus
algorithm *must* determine the set of addresses that miners may
validly transfer funds to. PoB mining does not need to perform these
steps, because the address is always the same — the burn
address. However, with Stacking, network participants must be able to
validate the addresses that are sent to.
Progression in Stacking consensus happens over *reward cycles*. In
each reward cycle, a set of Bitcoin addresses are iterated over, such
that each Bitcoin address in the set of reward addresses has exactly
one Bitcoin block in which miners will transfer funds to the reward
address.
To qualify for a reward cycle, an STX holder must:
* Control a Stacks wallet with >= 0.02% of the total share of unlocked
Stacks tokens (currently, there are ~470m unlocked Stacks tokens,
meaning this would require ~94k Stacks). This threshold level
adjusts based on the participation levels in the Stacking protocol.
* Broadcast a signed message before the reward cycle begins that:
* Locks the associated Stacks tokens for a protocol-specified
lockup period.
* Specifies a Bitcoin address to receive the funds.
* Votes on a Stacks chain tip.
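The threshold figure above works out as follows (the supply figure is the text's approximation):

```python
UNLOCKED_STX = 470_000_000            # ~470m unlocked tokens, per the text

# 0.02% = 2 / 10,000; integer math keeps the result exact
min_stacking_balance = UNLOCKED_STX * 2 // 10_000
# ~94,000 STX required to qualify at current supply
```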
Miners participating in the Stacks blockchain compete to lead blocks
by transferring Bitcoin. Leaders for particular Stacks blocks are
chosen by sortition, weighted by the amount of Bitcoin sent (see
SIP-001). Before a reward cycle begins, the Stacks network must reach
consensus on which addresses are valid recipients. Reaching consensus
on this is non-trivial: the Stacks blockchain itself has many
properties independent from the Bitcoin blockchain, and may experience
forks, missing block data, etc., all of which make reaching consensus
difficult. As an extreme example, consider a miner that forks the
Stacks chain with a block that claims to hold a large fraction (e.g.,
100%) of all Stacks holdings, and proceeds to issue block commitments
that pay all of the fees to themselves. How can other nodes on the
network detect that this miner's commitment transfers are invalid?
The Stacking algorithm addresses this with a two-phase cycle. Before
each reward cycle, Stacks nodes engage in a *prepare* phase, in which
two items are decided:
1. An **anchor block** — the anchor block is a Stacks chain block. For
the duration of the reward cycle, mining any descendant forks of
the anchor block requires transferring mining funds to the
appropriate reward addresses.
2. The **reward set** -- the reward set is the set of Bitcoin
addresses which will receive funds in the reward cycle. This set is
determined using Stacks chain state from the anchor block.
During the reward cycle, miners contend with one another to become the
leader of the next Stacks block by broadcasting *block commitments* on
the Bitcoin chain. These block commitments send Bitcoin funds to
either a burn address or a PoX reward address.
Address validity is determined according to two different rules:
1. If a miner is building off of any chain tip *which is not a
descendant of the anchor block*, all of the miner's commitment
funds must be burnt.
2. If a miner is building off a descendant of the anchor block, the
miner must send commitment funds to 2 addresses from the reward
set, chosen as follows:
* Use the verifiable random function (also used by sortition) to
choose 2 addresses from the reward set. These 2 addresses are
the reward addresses for this block.
* Once addresses have been chosen for a block, these addresses are
removed from the reward set, so that future blocks in the reward
cycle do not repeat the addresses.
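A sketch of this selection, with the derivation of indices from the VRF output left as an illustrative placeholder (the consensus rule's exact derivation is not specified here):

```python
import hashlib

def pick_reward_addresses(vrf_seed, reward_set):
    """Choose 2 reward addresses for a block and remove them so later
    blocks in the cycle cannot repeat them. Deterministic in the seed,
    so every miner derives the same pair."""
    chosen = []
    for i in range(2):
        digest = hashlib.sha256(vrf_seed + bytes([i])).digest()
        idx = int.from_bytes(digest, "big") % len(reward_set)
        chosen.append(reward_set.pop(idx))
    return chosen
```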
Note that the verifiable random function (VRF) used for address
selection ensures that the same addresses are chosen by each miner
selecting reward addresses. If a miner submits a burn commitment which
*does not* send funds to a valid address, those commitments are
ignored by the rest of the network (because other network participants
can deduce that the transfer addresses are invalid).
To reduce the complexity of the consensus algorithm, Stacking reward
cycles are fixed length --- if fewer addresses participate in the
Stacking rewards than there are slots in the cycle, then the remaining
slots are filled with *burn* addresses. Burn addresses are included
in miner commitments at fixed intervals (e.g., if there are 1000 burn
addresses for a reward cycle, then each miner commitment would have
1 burn address as an output).
## Adjusting Reward Threshold Based on Participation
Each reward cycle may transfer miner funds to up to 4000 Bitcoin
addresses (2 addresses in each of the 2000 burn blocks in a cycle). To ensure that
this number of addresses is sufficient to cover the pool of
participants (given 100% participation of liquid STX), the threshold
for participation must be 0.025% (1/4000th) of the liquid supply of
STX. However, if participation is _lower_ than 100%, the reward pool
could admit holders with smaller STX balances. The Stacking protocol specifies **2
operating levels**:
* **25%** If fewer than `0.25 * STX_LIQUID_SUPPLY` STX participate in
a reward cycle, participant wallets controlling `x` STX may include
`floor(x / (0.0000625*STX_LIQUID_SUPPLY))` addresses in the reward set.
That is, the minimum participation threshold is 1/16,000th of the liquid
supply.
* **25%-100%** If between `0.25 * STX_LIQUID_SUPPLY` and `1.0 *
STX_LIQUID_SUPPLY` STX participate in a reward cycle, the reward
threshold is optimized in order to maximize the number of slots that
are filled. That is, the minimum threshold `T` for participation will be
roughly 1/4,000th of the participating STX (adjusted in increments
of 10,000 STX). Participant wallets controlling `x` STX may
include `floor(x / T)` addresses in the
reward set.
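The two operating levels can be sketched as follows; the direction of rounding when adjusting `T` to 10,000-STX increments is an assumption of this sketch:

```python
import math

def reward_slots(wallet_stx, participating_stx, liquid_supply):
    """Number of reward-set slots a wallet controlling `wallet_stx` may
    claim under the two operating levels."""
    if participating_stx < liquid_supply // 4:
        # below 25% participation: threshold fixed at 1/16,000th of supply
        threshold = liquid_supply // 16_000
    else:
        # otherwise: roughly 1/4,000th of the participating STX,
        # adjusted in 10,000-STX increments (rounding up assumed)
        threshold = math.ceil(participating_stx / 4_000 / 10_000) * 10_000
    return wallet_stx // threshold
```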
In the event that a Stacker signals and locks up enough STX to submit
multiple reward addresses, but only submits one reward address, that
reward address will be included in the reward set multiple times.
## Submitting Reward Address and Chain Tip Signaling
Stacking participants must broadcast signed messages for three purposes:
1. Indicating to the network how many STX should be locked up, and for
how many reward cycles.
2. Indicating support for a particular chain tip.
3. Specifying the Bitcoin address for receiving Stacking rewards.
These messages may be broadcast either on the Stacks chain or the
Bitcoin chain. If broadcast on the Stacks chain, these messages must
be confirmed on the Stacks chain _before_ the anchor block for the
reward period. If broadcast on the Bitcoin chain, they may be
broadcast during the prepare phase, but must be included before
the prepare phase finishes.
These signed messages are valid for at most 12 reward cycles (25200 Bitcoin
blocks or ~7 months). If the signed message specifies a lockup period `x` less
than 25200 blocks, then the signed message is only valid for Stacking
participation for `floor(x / 2100)` reward cycles (the minimum participation
length is one cycle: 2100 blocks).
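The validity rule above reduces to a small calculation:

```python
def stacking_message_cycles(lockup_blocks):
    """Reward cycles covered by a signed Stacking message: one per
    2,100-block cycle locked, capped at 12 (25,200 Bitcoin blocks)."""
    CYCLE_LENGTH = 2_100
    MAX_CYCLES = 12
    return min(MAX_CYCLES, lockup_blocks // CYCLE_LENGTH)
```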
# Anchor Blocks and Reward Consensus
In the **prepare** phase of the Stacking algorithm, miners and network
participants determine the anchor block and the reward set. The
prepare phase is a window `w` of Bitcoin blocks *before* the reward
cycle begins (e.g., the window may be 100 Bitcoin blocks).
At a high-level, nodes determine whether any block was confirmed by
`F*w` blocks during the phase, where `F` is a large fraction (e.g.,
`0.8`). Once the window `w` closes at time `cur`, Stacks nodes find
the potential anchor block as described in the following pseudocode:
```python
def find_anchor_block(cur):
    blocks_worked_on = get_all_stacks_blocks_between(cur - w, cur)
    # get the highest/latest ancestor before the PREPARE phase for each
    # block worked on during the PREPARE phase.
    candidate_anchors = {}
    for block in blocks_worked_on:
        pre_window_ancestor = last_ancestor_of_block_before(block, cur - w)
        if pre_window_ancestor is None:
            continue
        if pre_window_ancestor in candidate_anchors:
            candidate_anchors[pre_window_ancestor] += 1
        else:
            candidate_anchors[pre_window_ancestor] = 1
    # if any block is confirmed by at least F*w, then it is the anchor block.
    for candidate, confirmed_by_count in candidate_anchors.items():
        if confirmed_by_count >= F * w:
            return candidate
    return None
```
Note that there can be at most one anchor block (so long as `F >
0.5`), because:
* Each of the `w` blocks in the prepare phase has at most one
candidate ancestor.
* The total possible number of confirmations for anchor blocks is `w`.
* If any block is confirmed by `>= 0.5*w`, then any other block must
have been confirmed by `< 0.5*w`.
The prepare phase, and the high threshold for `F`, are necessary to
protect the Stacking consensus protocol from damage due to natural
forks, missing block data, and potentially malicious participants. As
proposed, PoX and the Stacking protocol require that Stacks nodes are
able to use the anchor block to determine the *reward set*. If, by
accident or malice, the data associated with the anchor block is
unavailable to nodes, then the Stacking protocol cannot operate
normally — nodes cannot know whether or not a miner is submitting
valid block commitments. A high threshold for `F` ensures that a large
fraction of the Stacks mining power has confirmed the receipt of the
data associated with the anchor block.
## Recovery from Missing Data
In the extreme event that a malicious miner *is* able to get a hidden
or invalid block accepted as an anchor block, Stacks nodes must be
able to continue operation. To do so, Stacks nodes treat missing
anchor block data as if no anchor block was chosen for the reward
cycle — the only valid election commitments will therefore be *burns*
(this is essentially a fallback to PoB). If anchor block data which
was previously missing is revealed to the Stacks node, it must
reprocess all of the leader elections for that anchor block's
associated reward cycle, because there may now be many commitments
which were previously invalid that are now valid.
Reprocessing leader elections is computationally expensive, and
would likely result in a large reorganization of the Stacks
chain. However, such an election reprocessing may only occur once per
reward window (only one valid anchor block may exist for a reward
cycle, whether it was hidden or not). Crucially, intentionally
performing such an attack would require collusion amongst a large
fraction `F` of the Stacks mining power — because such a hidden block
must have been confirmed by `w*F` subsequent blocks. If collusion
amongst such a large fraction of the Stacks mining power is possible,
we contend that the security of the Stacks chain would be compromised
through other means beyond attacking anchor blocks.
## Anchoring with Stacker Support
The security of anchor block selection is further increased through
Stacker support transactions. In this protocol, when Stacking
participants broadcast their signed participation messages, they
signal support of anchor blocks. This is specified by the chain tip's
hash, and the support signal is valid as long as the message itself is
valid.
This places an additional requirement on anchor block selection. In
addition to an anchor block needing to reach a certain number of miner
confirmations, it must also pass some threshold `t` of valid Stacker
support message signals. This places an additional burden on an anchor
block attack --- not only must the attacker collude amongst a large
fraction of mining power, but they must also collude amongst a
majority of the Stacking participants in their block.
# Stacker Delegation
The process of delegation allows a Stacks wallet address (the
represented address) to designate another address (the delegate
address) for participating in the Stacking protocol. This delegate
address, for as long as the delegation is valid, is able to sign and
broadcast Stacking messages (i.e., messages which lock up Stacks,
designate the Bitcoin reward address, and signal support for chain
tips) on behalf of the represented address. This allows the owner of
the represented address to contribute to the security of the network
by having the delegate address signal support for chain tips. This
combats potential attacks on the blockchain stability by miners that
may attempt to mine hidden forks, hide eventually invalid forks, and
other forms of miner misbehavior.
Supporting delegation adds two new transaction types to the Stacks
blockchain:
* **Delegate Funds.** This transaction initiates a
represented-delegate relationship. It carries the following data:
* Delegate address
* End Block: the Bitcoin block height at which this relationship
terminates, unless a subsequent delegate funds transaction updates
the relationship.
* Delegated Amount: the total amount of STX from this address
that the delegate address will be able to issue Stacking messages
on behalf of.
* Reward Address (_optional_): a Bitcoin address that must be
designated as the funds recipient in the delegate's Stacking
messages. If unspecified, the delegate can choose the address.
* **Terminate Delegation.** This transaction terminates a
represented-delegate relationship. It carries the following data:
* Delegate Address
_Note_: There is only ever one active represented-delegate
relationship between a given represented address and delegate address
(i.e., the pair _(represented-address, delegate-address)_ uniquely
identifies a relationship). If a represented-delegate relationship is
still active and the represented address signs and broadcasts a new
"delegate funds" transaction, the information from the new transaction
replaces the prior relationship.
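The bookkeeping implied by this note can be sketched as a map keyed by the _(represented-address, delegate-address)_ pair; the field names below are illustrative and do not reflect the native contract's actual storage layout:

```python
# Minimal sketch of represented-delegate bookkeeping: the pair
# (represented, delegate) uniquely keys a relationship, and a new
# "delegate funds" transaction replaces the prior one.
delegations = {}

def delegate_funds(represented, delegate, end_block, amount, reward_addr=None):
    """Record (or replace) the active relationship for this pair."""
    delegations[(represented, delegate)] = {
        "end_block": end_block,      # Bitcoin height at which this lapses
        "amount": amount,            # uSTX the delegate may issue Stacking messages for
        "reward_addr": reward_addr,  # None => delegate chooses the BTC reward address
    }

def terminate_delegation(represented, delegate):
    """End the relationship for this pair, if one exists."""
    delegations.pop((represented, delegate), None)
```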
Both types of delegation transactions must be signed by the
represented address. These are transactions on the Stacks blockchain,
and will be implemented via a native smart contract, loaded into the
blockchain during the Stacks 2.0 genesis block. These transactions,
therefore, are `contract-call` invocations. The invoked methods are
guarded by:
```
(asserts! (is-eq contract-caller tx-sender) (err u0))
```
This ensures that the methods can only be invoked by direct
transaction execution.
**Evaluating Stacking messages in the context of delegation.** In
order to determine which addresses STX should be locked by a given
Stacking message, the message must include the represented address in
the Stacking message. Therefore, if a single Stacks address is the
delegate for many represented Stacks addresses, the delegate address
must broadcast a Stacking message for each of the represented
addresses.
# Addressing Miner Consolidation in Stacking
PoX, when used for Stacking rewards, could lead to miner
consolidation. Because miners that _also_ participate as Stackers
could gain an advantage over miners who do not, miners would be
strongly incentivized to buy STX and use
it to crowd out other miners. In the extreme case, this consolidation
could lead to centralization of mining, which would undermine the
decentralization goals of the Stacks blockchain. While we are actively
investigating additional mechanisms to address this potential
consolidation, we propose a time-bounded PoX mechanism and a Stacker-
driven mechanism here.
**Time-Bounded PoX.** Stacking rewards incentivize miner consolidation
if miners obtain _permanent_ advantages for obtaining the new
cryptocurrency. However, by limiting the time period of PoX, this
advantage declines over time. To do this, we define two time periods for PoX:
1. **Initial Phase.** In this phase, Stacking rewards proceed as
described above -- commitment funds are sent to Stacking rewards
addresses, except if a miner is not mining a descendant of the
anchor block, or if the registered reward addresses for a given
reward cycle have all been exhausted. This phase will last for
approximately 2 years (100,000 Bitcoin blocks).
2. **Sunset Phase.** After the initial phase, a _sunset_ block is
determined. This sunset block will be ~8 years (400,000 Bitcoin
blocks) after the sunset phase begins. After the sunset block,
_all_ miner commitments must be burned, rather than transferred to
reward addresses. During the sunset phase, the fraction of committed
funds that is transferred to reward addresses (versus burned)
linearly decreases by `0.25%` (1/400) on each reward cycle, such
that in the 200th reward cycle, half of a miner's commitment must
be burned. For example, if a miner commits 10 BTC at that point, the
miner must send 5 BTC to reward addresses and 5 BTC to the burn address.
By time-bounding the PoX mechanism, we allow the Stacking protocol to
use PoX to help bootstrap support for the new blockchain, providing
miners and holders with incentives for participating in the network
early on. Then, as natural use cases for the blockchain develop and
gain steam, the PoX system could gradually scale down.
**Stacker-driven PoX.** To further discourage miners from consolidating,
holders of liquid (i.e. non-Stacked) STX tokens may vote to disable PoX in the next upcoming
reward cycle. This can be done with any amount of STX, and the act of voting
to disable PoX does not lock the tokens.
This allows a community of vigilant
users to guard the chain from bad miner behavior arising from consolidation
on a case-by-case basis. Specifically, if a fraction _R_ of liquid STX
tokens vote to disable PoX, it is disabled
only for the next reward cycle. To continuously deactivate PoX, the STX
holders must continuously vote to disable it.
Due to the costs of remaining vigilant, this proposal recommends _R = 0.25_.
At the time of this writing, this is higher than any single STX allocation, but
not so high that large-scale cooperation is needed to stop a mining cartel.
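The veto rule can be sketched as a simple threshold check (a toy model for illustration; the actual vote tallying happens on-chain):

```python
def pox_disabled_next_cycle(disable_votes_ustx: int, liquid_ustx: int,
                            r: float = 0.25) -> bool:
    """Return True if PoX is disabled for the next reward cycle.

    Votes from any amount of liquid (non-Stacked) STX count, and voting
    does not lock the tokens; R = 0.25 is the recommended threshold.
    The vote only covers the next cycle, so holders must re-vote each
    cycle to keep PoX disabled.
    """
    return disable_votes_ustx >= r * liquid_ustx
```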
# Bitcoin Wire Formats
Supporting PoX in the Stacks blockchain requires modifications to the
wire format for leader block commitments, and the introduction of new
wire formats for burnchain PoX participation (e.g., performing the STX
lockup on the burnchain).
## Leader Block Commits
For PoX, leader block commitments are similar to PoB block commits: the constraints on the
BTC transaction's inputs are the same, and the `OP_RETURN` output is identical. However,
the _burn output_ is no longer the same. For PoX, the following constraints are applied to
the second through nth outputs:
1. If the block commitment is in a reward cycle, with a chosen anchor block, and this block
commitment builds off a descendant of the PoX anchor block (or the anchor block itself),
then the commitment must use the chosen PoX recipients for the current block.
a. PoX recipients are chosen as described in "Stacking Consensus Algorithm": addresses
are chosen without replacement, by using the previous burn block's sortition hash,
mixed with the previous burn block's burn header hash as the seed for the ChaCha12
pseudorandom function to select M addresses.
b. The leader block commit transaction must use the selected M addresses as outputs [1, M].
That is, the second through (M+1)th outputs correspond to the selected PoX addresses.
The order of these addresses does not matter. Each of these outputs must receive the
same amount of BTC.
c. If the number of remaining addresses in the reward set N is less than M, then the leader
block commit transaction must burn BTC by including (M-N) burn outputs.
2. Otherwise, the second through (M+1)th outputs must be burn addresses, and the amount burned by
these outputs will be counted as the amount committed to by the block commit.
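The recipient-selection rule in steps (a) through (c) above can be sketched as follows. Note that this substitutes SHA-256 seeding plus Python's built-in PRNG for the ChaCha12 construction the text specifies, purely for illustration:

```python
import hashlib
import random

def select_pox_recipients(reward_set, m, sortition_hash, burn_header_hash):
    """Pick up to m reward addresses without replacement; return the
    chosen addresses and the number of burn outputs needed to make up
    any shortfall (step c). Seeding sketch only: the real node uses the
    previous burn block's sortition hash mixed with its burn header
    hash to seed ChaCha12."""
    seed = hashlib.sha256(sortition_hash + burn_header_hash).digest()
    rng = random.Random(seed)
    chosen = []
    remaining = list(reward_set)
    while remaining and len(chosen) < m:
        # Without replacement: remove each selected address from the pool.
        chosen.append(remaining.pop(rng.randrange(len(remaining))))
    burn_outputs = m - len(chosen)
    return chosen, burn_outputs
```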
In addition, during the sunset phase (i.e., between the 100,000th and 500,000th burn block in the chain),
the miner must include a _sunset burn_ output. This is the output at index M+1; it must send the burn
amount required to fulfill the sunset burn ratio to the burn address:
```
sunset_burn_amount = (total_block_commit_amount) * (reward_cycle_start_height - 100,000) / (400,000)
```
Where `total_block_commit_amount` is equal to the sum of outputs [1, M+1].
After the sunset phase _ends_ (i.e., blocks >= 500,000th burn block), block commits are _only_ burns, with
a single burn output at index 1.
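The sunset burn computation can be sketched from the formula above, with the initial-phase and post-sunset cases handled explicitly (integer division stands in for whatever rounding the node actually applies, an assumption here):

```python
def sunset_burn_amount(total_block_commit_amount: int,
                       reward_cycle_start_height: int,
                       sunset_start: int = 100_000,
                       sunset_len: int = 400_000) -> int:
    """Required sunset burn for a block commit, in satoshis."""
    if reward_cycle_start_height < sunset_start:
        return 0  # initial phase: no sunset burn required
    if reward_cycle_start_height >= sunset_start + sunset_len:
        return total_block_commit_amount  # after the sunset block: burn everything
    # Linear ramp over the sunset phase.
    return (total_block_commit_amount
            * (reward_cycle_start_height - sunset_start)) // sunset_len
```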
## STX Operations on Bitcoin
As described above, PoX allows stackers to submit `stack-stx`
operations on Bitcoin as well as on the Stacks blockchain. The Stacks
chain also allows addresses to submit STX transfers on the Bitcoin
chain. Such operations are only evaluated by the miner of an anchor block
elected in the burn block that immediately follows the burn block that included the
operations. For example, if a `TransferStxOp` occurs in burnchain block 100, then the
Stacks block elected by burnchain block 101 will process that transfer.
In order to submit one of these operations on the Bitcoin chain, a participant must submit two Bitcoin
transactions: a `PreStxOp`, followed by the operation itself.
* `PreStxOp`: this operation prepares the Stacks blockchain node to validate the subsequent
`StackStxOp` or `TransferStxOp`.
* `StackStxOp`: this operation executes the `stack-stx` operation.
* `TransferStxOp`: this operation transfers STX from a sender to a recipient.
The wire formats for the above operations are as follows:
### PreStxOp
This operation includes an `OP_RETURN` output for the first Bitcoin output that looks as follows:
```
0 2 3
|------|--|
magic op
```
Where `op = p` (ascii encoded).
Then, the second Bitcoin output _must_ be the Stacker address that will be used in a `StackStxOp`. This
address must be a standard address type parseable by the stacks-blockchain node.
### StackStxOp
The first input to the Bitcoin operation _must_ consume a UTXO that is
the second output of a `PreStxOp`. This validates that the `StackStxOp` was signed
by the appropriate Stacker address.
This operation includes an `OP_RETURN` output for the first Bitcoin output:
```
0 2 3 19 20
|------|--|-----------------------------|---------|
magic op uSTX to lock (u128) cycles (u8)
```
Where `op = x` (ascii encoded) and the unsigned integer is big-endian encoded.
The second Bitcoin output will be used as the reward address for any stacking rewards.
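The 20-byte `OP_RETURN` payload above can be assembled as follows; the 2-byte `magic` value is network-dependent, and the `b"X2"` placeholder here is an assumption:

```python
MAGIC = b"X2"  # placeholder 2-byte network magic; an assumption, not normative

def stack_stx_payload(ustx_to_lock: int, num_cycles: int) -> bytes:
    """Build the 20-byte StackStxOp OP_RETURN payload laid out above."""
    assert 0 <= ustx_to_lock < 2**128, "uSTX amount must fit a u128"
    assert 0 <= num_cycles < 2**8, "cycle count must fit a u8"
    return (MAGIC                              # bytes [0, 2): magic
            + b"x"                             # byte  [2, 3): op code, ascii 'x'
            + ustx_to_lock.to_bytes(16, "big") # bytes [3, 19): uSTX, u128 big-endian
            + num_cycles.to_bytes(1, "big"))   # byte  [19, 20): reward cycles, u8
```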
### TransferStxOp
The first input to the Bitcoin operation _must_ consume a UTXO that is
the second output of a `PreStxOp`. This validates that the `TransferStxOp` was signed
by the appropriate STX address.
This operation includes an `OP_RETURN` output for the first Bitcoin output:
```
0 2 3 19 80
|------|--|-----------------------------|---------|
magic op uSTX to transfer (u128) memo (up to 61 bytes)
```
Where `op = $` (ascii encoded) and the unsigned integer is big-endian encoded.
The second Bitcoin output is either a `p2pkh` or `p2sh` output such
that the recipient Stacks address can be derived from the
corresponding 20-byte hash (hash160).
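A sketch decoder for the `TransferStxOp` payload layout above; it slices fields by their stated offsets and does not validate the network magic:

```python
def parse_transfer_stx_payload(payload: bytes):
    """Split a TransferStxOp OP_RETURN payload into (uSTX, memo)."""
    # Offsets per the layout: magic [0, 3), op [2, 3), u128 [3, 19), memo [19, 80)
    assert 19 <= len(payload) <= 80, "full u128 required; memo is optional"
    assert payload[2:3] == b"$", "TransferStxOp uses op code '$'"
    ustx_to_transfer = int.from_bytes(payload[3:19], "big")
    memo = payload[19:]  # up to 61 bytes
    return ustx_to_transfer, memo
```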
This SIP is now located in the [stacksgov/sips repository](https://github.com/stacksgov/sips/blob/main/sips/sip-007/sip-007-stacking-consensus.md) as part of the [Stacks Community Governance organization](https://github.com/stacksgov).

# SIP-008 Clarity Parsing and Analysis Cost Assessment
## Preamble
Title: Clarity Parsing and Analysis Cost Assessment
Author: Aaron Blankstein <aaron@blockstack.com>
Status: Draft
Type: Standard
Created: 03/05/2020
License: BSD 2-Clause
# Abstract
This document describes the measured costs and asymptotic costs
assessed for parsing Clarity code into an abstract syntax tree (AST)
and the static analysis of that Clarity code (type-checking and
read-only enforcement). This will not specify the _constants_
associated with those asymptotic cost functions. Those constants will
necessarily be measured via benchmark harnesses and regression
analyses.
# Measurements for Execution Cost
The cost of analyzing Clarity code is measured using the same 5 categories
described in SIP-006 for the measurement of execution costs:
1. Runtime cost: captures the number of cycles that a single
processor would require to process the Clarity block. This is a
_unitless_ metric, so it will not correspond directly to cycles,
but rather is meant to provide a basis for comparison between
different Clarity code blocks.
2. Data write count: captures the number of independent writes
performed on the underlying data store (see SIP-004).
3. Data read count: captures the number of independent reads
performed on the underlying data store.
4. Data write length: the number of bytes written to the underlying
data store.
5. Data read length: the number of bytes read from the underlying
data store.
Importantly, these costs are used to set a _block limit_ for each
block. When it comes to selecting transactions for inclusion in a
block, miners are free to make their own choices based on transaction
fees; however, blocks may not exceed the _block limit_. If they do so,
the block is considered invalid by the network --- none of the block's
transactions will be materialized and the leader forfeits all rewards
from the block.
Costs for static analysis are assessed during the _type check_ pass.
The read-only and trait-checking passes perform work which is strictly
less than the work performed during type checking, and therefore, the
cost assessment can safely fold any costs that would be incurred during
those passes into the type checking pass.
# Common Analysis Metrics and Costs
## AST Parsing
The Clarity parser has a runtime that is linear with respect to the Clarity
program length.
```
a*X+b
```
where a and b are constants, and
X := the program length in bytes
## Dependency cycle detection
Clarity performs cycle detection for intra-contract dependencies (e.g.,
functions that depend on one another). This detection is linear in the
number of dependency edges in the smart contract:
```
a*X+b
```
where a and b are constants, and
X := the total number of dependency edges in the smart contract
Dependency edges are created anytime a top-level definition refers
to another top-level definition.
## Type signature size
Types in Clarity may be described using type signatures. For example,
`(tuple (a int) (b int))` describes a tuple with two keys `a` and `b`
of type `int`. These type descriptions are used by the Clarity analysis
passes to check the type correctness of Clarity code. Clarity type signatures
have varying size, e.g., the signature `int` is smaller than the signature for a
list of integers.
The signature size of a Clarity type is defined as follows:
```
type_signature_size(x) :=
if x =
int => 1
uint => 1
bool => 1
principal => 1
buffer => 2
optional => 1 + type_signature_size(entry_type)
response => 1 + type_signature_size(ok_type) + type_signature_size(err_type)
list => 2 + type_signature_size(entry_type)
tuple => 1 + 2*(count(entries))
+ sum(type_signature_size for each entry)
+ sum(len(key_name) for each entry)
```
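The rules above translate directly into code. The encoding of types below (strings for atomic types, tagged tuples for compound ones) is illustrative rather than the node's internal representation:

```python
def type_signature_size(t) -> int:
    """Compute the signature size of a Clarity type per the rules above."""
    if t in ("int", "uint", "bool", "principal"):
        return 1
    if t == "buffer":
        return 2
    tag = t[0]
    if tag == "optional":
        return 1 + type_signature_size(t[1])
    if tag == "response":
        return 1 + type_signature_size(t[1]) + type_signature_size(t[2])
    if tag == "list":
        return 2 + type_signature_size(t[1])
    if tag == "tuple":
        entries = t[1]  # list of (key_name, entry_type) pairs
        return (1 + 2 * len(entries)
                + sum(type_signature_size(et) for _, et in entries)
                + sum(len(k) for k, _ in entries))
    raise ValueError(f"unknown type: {t!r}")

# e.g. (tuple (a int) (b int)) => 1 + 2*2 + (1 + 1) + (1 + 1) = 9
```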
## Type annotation
Each node in a Clarity contract's AST is annotated with the type value
for that node during the type checking analysis pass.
The runtime cost of type annotation is:
```
a + b*X
```
where a and b are constants, and X is the type signature size of the
type being annotated.
## Variable lookup
Looking up variables during static analysis incurs a non-constant cost -- the stack
depth _and_ the length of the variable name affect this cost. However,
variable names in Clarity have bounded length -- 128 characters. Therefore,
the cost assessed for variable lookups may safely be constant with respect
to name length.
The stack depth affects the lookup cost because the variable must be
checked for in each context on the stack.
Cost Function:
```
a*X+b*Y+c
```
where a, b, and c are constants,
X := stack depth
Y := the type size of the looked up variable
## Function Lookup
Looking up a function incurs a constant cost with respect
to name length (for the same reason as variable lookup). However,
because functions may only be defined in the top-level contract
context, stack depth does not affect function lookup.
Cost Function:
```
a*X + b
```
where a and b are constants,
X := the sum of the type sizes for the function signature (each argument's type size, as well
as the function's return type)
## Name Binding
The cost of binding a name in Clarity -- in either a local or the contract
context is _constant_ with respect to the length of the name, but linear in
the size of the type signature.
```
binding_cost = a + b*X
```
where a and b are constants, and
X := the size of the bound type signature
## Type check cost
The cost of a static type check is _linear_ in the size of the type signature:
```
type_check_cost(expected, actual) :=
a + b*X
```
where a and b are constants, and
X := `max(type_signature_size(expected), type_signature_size(actual))`
## Function Application
Static analysis of a function application in Clarity requires
type checking the function's expected arguments against the
supplied types.
The cost of applying a function is:
```
a + sum(type_check_cost(expected, actual) for each argument)
```
where a is a constant.
This is also the _entire_ cost of type analysis for most function calls
(e.g., intra-contract function calls, most simple native functions).
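Putting the last two cost formulas together, with placeholder values standing in for the constants a and b:

```python
def type_check_cost(expected_size: int, actual_size: int,
                    a: int = 1, b: int = 1) -> int:
    # a + b * max(signature sizes); a and b are placeholder constants here,
    # to be fixed by benchmarking as the document notes.
    return a + b * max(expected_size, actual_size)

def function_application_cost(arg_signature_sizes, a: int = 1) -> int:
    """a + sum(type_check_cost(expected, actual) for each argument)."""
    return a + sum(type_check_cost(e, x) for e, x in arg_signature_sizes)
```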
## Iterating the AST
Static analysis iterates over the entire program's AST in the type checker,
the trait checker, and in the read-only checker. This cost is assessed
as a constant cost for each node visited in the AST during the type
checking pass.
# Special Function Costs
Some functions require additional work from the static analysis system.
## Functions on sequences (e.g., map, filter, fold)
Functions on sequences need to perform an additional check that the
supplied type is a list or buffer before performing the normal
argument type checking. This cost is assessed as:
```
a
```
where a is a constant.
## Functions on options/responses
Similarly to the functions on sequences, option/response functions
must perform a simple check to see if the supplied input is an option or
response before performing additional argument type checking. This cost is
assessed as:
```
a
```
## Data functions (ft balance checks, nft lookups, map-get?, ...)
Static checks on intra-contract data functions do not require database lookups
(unlike the runtime costs of these functions). Rather, these functions
incur normal type lookup (i.e., fetching the type of an NFT, data map, or data var)
and type checking costs.
## get
Checking a tuple _get_ requires accessing the tuple's signature
for the specific field. This has runtime cost:
```
a*log(N) + b
```
where a and b are constants, and
N := the number of fields in the tuple type
## tuple
Constructing a tuple requires building the tuple's BTree for
accessing fields. This has runtime cost:
```
a*N*log(N) + b
```
where a and b are constants, and
N := the number of fields in the tuple type
## use-trait
Importing a trait imposes two kinds of costs on the analysis.
First, the import requires a database read. Second, the imported
trait is included in the static analysis output -- this increases
the total storage usage and write length of the static analysis.
The costs are defined as:
```
read_count = 1
write_count = 0
runtime = a*X+b
write_length = c*X+d
read_length = c*X+d
```
where a, b, c, and d are constants, and
X := the total type size of the trait (i.e., the sum of the
type sizes of each function signature).
## contract-call?
Checking a contract call requires a database lookup to inspect
the function signature of a prior smart contract.
The costs are defined as:
```
read_count = 1
read_length = a*X+b
runtime = c*X+d
```
where a, b, c, and d are constants, and
X := the total type size of the function signature
## let
Let bindings require the static analysis system to iterate over
each let binding and ensure that they are syntactically correct.
This imposes a runtime cost:
```
a*X + b
```
where a and b are constants, and
X := the number of entries in the let binding.
# SIP-008 Clarity Parsing and Analysis Cost Assessment
This document formerly contained SIP-008 before the Stacks 2.0 mainnet launched.
This SIP is now located in the [stacksgov/sips repository](https://github.com/stacksgov/sips/blob/main/sips/sip-008/sip-008-analysis-cost-assessment.md) as part of the [Stacks Community Governance organization](https://github.com/stacksgov).