From b99400daf3d032d3a4457943f89e5eae9711f3dd Mon Sep 17 00:00:00 2001 From: Aaron Blankstein Date: Thu, 29 Nov 2018 11:36:55 -0600 Subject: [PATCH] add initial draft of smart contract language sip --- sip/sip-002-smart-contract-language.md | 309 +++++++++++++++++++++++++ 1 file changed, 309 insertions(+) create mode 100644 sip/sip-002-smart-contract-language.md diff --git a/sip/sip-002-smart-contract-language.md b/sip/sip-002-smart-contract-language.md new file mode 100644 index 000000000..ea71b97b2 --- /dev/null +++ b/sip/sip-002-smart-contract-language.md @@ -0,0 +1,309 @@ +# Abstract + +In order to support applications which require validation of some +pieces of their logic, we present a smart contracting language for use +with the Stacks blockchain. This smart contracting language can be +used on the Stacks blockchain to support programatic control over +digital assets within the Stacks blockchain (e.g., BNS names, Stacks +tokens, etc.) + +While application-chains may use any smart-contract language that they +like, this smart contracting language's VM will be a part of +blockstack-core, and, as such, any blockstack-core node will be able to +validate application chains using this smart contracting language with +a simple configuration change. + +This smart contracting language permits static analysis of any legal +smart contract to determine runtime costs. This smart contracting +language is not only Turing-incomplete (a requirement for such static +analysis to be guaranteed successful), but readily permits other kinds +of proofs to be made about the code as well. + +# Design + +A smart contract is composed of two parts: + +1. A data-space, which is a set of tables of data which only the + smart contract may modify +2. A set of transactions which operate within the data-space of the + smart contract, though they may call transactions from other smart + contracts. + +This smart contracting language differs from most other smart +contracting languages in two important ways: + +1. The language _is not_ intended to be compiled. The LISP language + described in this document is the specification for correctness. +2. The language _is not_ Turing complete. This allows us to guarantee + that static analysis of programs to determine properties like + runtime cost and data usage can complete successfully. + +## Specifying Transactions + +A smart contract definition is specified in a LISP language with the +following limitations: + +1. Recursion is illegal and there is no `lambda` function. +2. Looping may only be performed via `map`, `filter`, or `fold` +3. The only atomic types are booleans, integers, fixed length + buffers, and principals +4. There is additional support for lists of the atomic types, + however the only variable length lists in the language appear as + transaction inputs (i.e., there is no support for list operations + like append or join). +5. Variables may only be created via `let` binding and there + is no support for mutating functions like `set`. +6. Defining of constants and functions are allowed for simplifying + code. However, these are purely syntactic. If a definition cannot be + inlined, the contract will be rejected as illegal. +7. Transactions are specified via `define` statement with function + names beginning with `tx-`. Arguments to the function must specify + their types. + +If a transaction returns `true`, then it is considered valid, and any +changes made to the blockchain state will be materialized. If a +transaction returns `false`, the transaction will be considered +invalid, and the transaction will have _no effect_ on the smart +contract's state, except for a transaction fee debit (in the case of +on-chain transactions). + +## Verifying Signing Principles + +The language provides a primitive for checking whether or not the +smart contract transaction was signed by a particular +_principle_. Principles are a representation of a signing entity +(roughly equivalent to a Stacks address). The signature itself is +not checked by the smart contract, but by the VM. To check whether a +given principle has signed the transaction, a transaction may call + +```scheme +(signed-by? principle) +``` + +This returns `true` or `false`. Importantly, to support inter-contract +calls, this function returns `true` if the _outermost_ transaction was +signed by the given principle. A key benefit of the static +analyzability of this smart contracting language is that _all_ +transactions that can possibly be called from the outermost +transaction can be known _a priori_ so that a user can be warned about +all side effects before signing a transaction. + +## Stacks Transaction Primitives + +To interact with Stacks balances, smart contracts may call the +`(stacks-transfer!)` function. This function will attempt to transfer +from a given principle to another principle. This function itself +_requires_ that the operation have been signed by the transfering +principle. The `integer` type in our smart contracting language is +8-bytes, which allows it to specify the maximum amount of microstacks +spendable in a single Stacks transfer. + +Like any other smart contract transaction, this function call returns +true if the transfer was successful, and false otherwise. + +## Data-Space Primitives + +Data within a smart contract's data-space is stored within +`maps`. These stores relate a typed-tuple to another typed-tuple +(almost like a typed key-value store). As opposed to a table data +structure, a map will only associate a given key with exactly one +value. Values in a given mapping are set or fetched using: + +1. `(fetch-entry map-name key-tuple)` - This fetches the value + associated with a given key in the map, or returns `'null` if there + is no such value. +2. `(set-entry! map-name key-tuple value-tuple)` - This will set the + value of `key-tuple` in the data map +3. `(insert-entry! map-name key-tuple value-tuple)` - This will set + the value of `key-tuple` in the data map if and only if an entry + does not already exist. +4. `(delete-entry! map-name key-tuple)` - This will delete `key-tuple` + from the data map + +We chose to use data maps as opposed to other data structures for two +reasons: + +1. The simplicity of data maps allows for both a simple +implementation within the VM, and easier reasoning about +transactions. By inspecting a given transaction definition, it is +clear which maps will be modified and even within those maps, which +keys are affected by a given transaction. +2. The interface of data maps ensures that the return types of map +operations are _fixed length_, which is a requirement for static +analysis of smart contracts' runtime, costs, and other properties. + +A smart contract defines the data schema of a data map with the +`defmap` call. The `defmap` function may only be called in the +top-level of the smart-contract (similar to `define`). This +function accepts a name for the map, and a definition of the structure +of the key and value types. Each of these is a list of `(name, type)` +pairs, and they specify the input and output type of `fetch-entry`. +Types are either the values `'principal`, `'integer`, `'bool` or +the output of a call to `(buffer n)`, which defines an n-byte +fixed-length buffer. + +### Record Type Syntax + +To support the use of _named_ fields in keys and values, our language +allows the construction of named tuples using a function `(tuple ...)`, +e.g., + +``` +(define imaginary-number-a (tuple #real 1 #i 2)) +(define imaginary-number-b (tuple #real 2 #i 3)) + +``` + +This allows for creating named tuples on the fly, which is useful for +data maps where the keys and values are themselves named tuples. To +access a named value of a given tuple, the function `(get #name +tuple)` will return that item from the tuple. +\subsection{Principals for Signature Validation} + +We provide an additional primitive function \texttt{signed-by?} for +performing signature validation. This function returns true if the +given \textit{principal} signed the top-level transaction. A principal +is a string representation of an entity capable of producing a +verifiable signature. For example, in Bitcoin, principals are Bitcoin +addresses, and they could be a RIPEMD-160 hash of a public-key, or a +script-hash. We use the notion of principals to enable support at the +protocol level for many different kinds of signatories. For example, +if we want to continue to support multi-signature principals, the +protocol could support the standard Bitcoin p2sh multi-sig addresses, +and support for this would seamlessly be included in any contract. + +# Static Analysis + +One of the design goals of our smart contracting language was the +ability to statically analyze smart contract transactions to get +accurate upper-bound estimates of transaction costs (i.e., runtime and +storage requirements) as a function of input lengths. By limiting the +types supported, the ability to recurse, and the ability to iterate, +we believe that the language as presented is amenable to such static +analysis based on initial investigations. + +The essential step in demonstrating the possibility of +accurate and useful analysis of our smart contract definitions is +demonstrating that any function within the language specification has +an output length bounded by a constant factor of the input length. If +we can demonstrate this, then statically computing runtime or space +requirements involves merely associating each function in the language +specification with a way to statically determine cost as a function of +input length. + +## Bounding Function Output Length + +Importantly, our smart contracting language does not allow the +creation of variable length lists: there are no `list` or +`cons` constructors, and buffer lengths must be statically +defined. Under such requirements (and given that recursion is +illegal), determining the output lengths of functions is rather +directly achievable. To see this, we'll examine trying to compute the +output lengths for the only functions allowed to iterate in the +language: + +``` +outputLen(map f list) := Len(list) * outputLen(f t) +outputLen(filter f list) := Len(list) +outputLen(fold f list s) := Len(s) +``` + +Many functions within the language will output values larger than the +function's input, _however_, these outputs will be bound by +statically inferable constants. For example, the data function +_fetch-entry_ will always return an object whose size is equal +to the specified value type of the map. + +A complete proof for the static runtime analysis of smart contracts +will be included with the implementation of the language. + +# Deploying the Smart Contract + +Smart contracts on the Stacks blockchain will be deployed directly as +source code. The goal of the smart contracting language is that the +code of the contract defines the _ground truth_ about the intended +functionality of the contract. While seemingly banal, many systems +chose instead to use a a compiler to translate from a a friendly +high-level language to a lower-level language deployed on the +blockchain. Such an architecture is needlessly dangerous. A bug in +such a compiler could lead to a bug in a deployed smart contract when +no such bug exists in the original source. This is problematic for +recovery --- a hard fork to "undo" any should-have-been invalid +transactions would be contentious and potentially create a rift in the +community, especially as it will not be easy to deduce which contracts +exactly were affected and for how long. In contrast, bugs in the VM +itself present a more clear case for a hard fork: the smart contract +was defined correctly, as everyone can see directly on the chain, but +illegal transactions were incorrectly marked as valid. + +# Example: Simple Naming System + +To demonstrate the expressiveness of this smart contracting language, +let's look at an example smart contract which implements a simple +naming system with just two kinds of transactions: _preorder_ +and _register_. The requirements of the system are as follows: + +1. Names may only be owned by one principal +2. A register is only allowed if there is a corresponding preorder + with a matching hash +3. A register transaction must be signed by the same principal who + paid for the preorder +4. A preorder must have paid at least the price of the name. Names + are represented as integers, and any name less than 100000 costs + 1000 microstacks, while all other names cost 100 microstacks. + +In this simple scheme, names are represented by integers, but in +practice, a buffer would probably be used. + +```scheme +(define burn-address '1111111111111111111114oLvT2) +(define (price-function name) + (if (< name 1e5) 1000 100)) + +(defmap name-map + ((name integer)) ((buyer principal))) +(defmap preorder-map + ((name-hash (buffer 160)) (buyer principal)) + ((paid integer))) + +(define (tx-preorder + (buyer-principal principle) + (name-hash (buffer 20)) + (name-price integer)) + (if (stacks-transfer! + buyer-principal name-price burn-address name-price) + (insert-entry! preorder-map + (tuple #name-hash name-hash + #buyer buyer-principal) + (tuple #paid name-price)) + false)) + +(define (tx-register + (buyer-principal principle) + (recipient-principal principle) + (name integer) + (salt integer) + (if (signed-by? buyer-principal) + (let ((preorder-lookup + (tuple #name-hash (hash160 name salt) + #buyer buyer-principal)) + (preorder-entry + (fetch-entry preorder-map preorder-lookup)) + (name-entry + (fetch-entry name-map (tuple #name name)))) + (if (and + ;; must be preordered + (not (eq? preorder-entry) 'null) + ;; name shouldn't *already* exist + (eq? name-entry 'null) + ;; preorder must have paid enough + (<= (price-funcion name) + (get #paid preorder-entry))) + (begin + (delete-entry! preorder-lookup) + (insert-entry! name-table + (tuple #name name) + (tuple #owner recipient))) + false)) + false))) +```