From 4121eee2a3733ba407c5b460016bfb1fca38114f Mon Sep 17 00:00:00 2001 From: Jude Nelson Date: Tue, 30 May 2017 19:19:36 -0400 Subject: [PATCH] elaborate on how blockstack storage drivers are expected to behave regarding loading data from other users' storage --- blockstack_client/backend/drivers/_skel.py | 73 ++++++++++++++++++++++ 1 file changed, 73 insertions(+) diff --git a/blockstack_client/backend/drivers/_skel.py b/blockstack_client/backend/drivers/_skel.py index defe768f3..c0e872753 100644 --- a/blockstack_client/backend/drivers/_skel.py +++ b/blockstack_client/backend/drivers/_skel.py @@ -22,9 +22,38 @@ """ """ +Overview +======== + This is a skeleton no-op driver, meant for tutorial purposes. It will be dynamically imported by blockstack. +At the driver level, Blockstack expects a key/value store. If a user +does a `put`, then the data stored should be readable by any other +user that does a `get` on the same key. Blockstack itself chooses what +the keys are; they are *not* derived from the data. + +To see what is expected, consider the following example: + + Suppose Alice and Bob both use a Blockstack-powered blogging application. + When Alice writes a new blog post, the blogging application asks + Blockstack to save it. The app gives the blogpost the application-chosen name + "alice_2017-05-30-15:05:30", and passes both the name and data into + Blockstack. Blockstack calls into its storage drivers and saves the + data to each underlying storage service. + + When Bob goes to read Alice's blog, his client discovers that the + new blog post is called "alice_2017-05-30-15:05:30". His client + then asks Blockstack to load up the blogpost's contents. Bob's + storage drivers use the name "alice_2017-05-30-15:05:30" to look + up and fetch the blog data from each service. + + Later, Alice decides to delete "alice_2017-05-30-15:05:30". When + Bob goes to read Alice's blog, his client again fetches the blogpost + titled "alice_2017-05-30-15:05:30". Since Alice has removed the + data from her storage providers, none of Bob's drivers return the + blogpost data. + Background ========== @@ -86,6 +115,44 @@ is because more sophisticated strategies can be implemented as "meta drivers" th existing drivers as modules, and forward `get` and `put` requests to them according to the desired strategy. +Access Strategy +=============== + +It is up to the storage drivers to not only store the data given +to them, but also to store any metadata required to later translate +the app-given name back into the data that was previously stored. +Moreover, once data is stored in Blockstack, *any* user with the +data's name should be able to read it. + +Some storage systems make this easy. For example, the `disk` and `s3` +drivers achieve this simply by storing the data under the name given +by the application. Using the example in the Overview section, the +blogpost data for "alice_2017-05-30-15:05:30" can simply be stored as +a file or object with the name "alice_2017-05-30-15:05:30". + +This is less easy for storage systems like Dropbox, where the storage +system creates its own URI for each piece of data stored. In these cases, +the driver must build and maintain an index over all of the data stored, +so it can later translate the app-given name (i.e. "alice_2017-05-30-15:05:30") +back into the service-given URI (i.e. "https://www.dropbox.com/s/pa4lugfa8yiuoio/profile.json?dl=1") +on `get`. + +For indexing, driver developers are encouraged to use the following methods +from `common.py` to build a co-located index: + + * `get_indexed_data()`: loads data from the storage by translating an + app-given name into a service-specific URI. + * `put_indexed_data()`: stores data with a given name into the storage + system, and inserts an entry for it in an index alongside the data. + * `delete_indexed_data()`: removes data with a given name from the storage + system, and updates the co-located index to remove its name-to-URI link. + * `index_setup()`: instantiates an index (callable from the driver's + `storage_init()` method). + * `driver_config()`: sets up callbacks to be used by the indexer code + for loading and storing both data and pages of the index. + +Please see the docstrings for each of these methods in the `common.py` file. + Responsibilities ================ @@ -134,6 +201,12 @@ are divided as follows: Blockstack sends and receives, and deduce things like the user's network location and the application being used. If behavior confidentiality is required, then the driver must take additional steps to implement it. + + * Optimizations. Things like write-batching, caching, write-deferrals, and + so on are handled by Blockstack. The driver should operate synchronously on + both gets and puts. Specifically, the driver should NOT attempt to cache + reads, and the driver should NOT return from a put until the data is guaranteed + to be durable. """