Data Contexts

Overview

Disdat uses a simple model for sharing versioned bundles among users. It organizes bundles into collections called data contexts. Users create local contexts that contain their own bundles. Within a context, users can browse bundle history, add new bundles, and remove bundles. The Disdat pipelining system searches the local context for existing bundles to avoid re-processing.

Users may also share data by creating remote contexts. A user can associate a local context with the name of a remote context that many individuals contribute to, and bind that name to a URL that holds the context’s data. A remote context is therefore described by the tuple (<context name>, <s3 path>). Users may create multiple local contexts that each attach to the same remote context.
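The binding described above can be pictured with a small, self-contained Python sketch. It is not Disdat's implementation or API: the names RemoteContext, LocalContext, and bind, and the example S3 URL, are hypothetical and exist only to illustrate the (<context name>, <s3 path>) tuple and the idea of several local contexts attaching to one remote.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RemoteContext:
    """Hypothetical stand-in for the (<context name>, <s3 path>) tuple."""
    name: str
    s3_path: str


@dataclass
class LocalContext:
    """Hypothetical local context; it counts as 'bound' once it has a remote."""
    name: str
    bundles: List[str] = field(default_factory=list)
    remote: Optional[RemoteContext] = None

    def bind(self, remote_name: str, s3_path: str) -> None:
        # Binding ties the remote context name to the URL holding its data.
        if not s3_path.startswith("s3://"):
            raise ValueError("expected an s3:// URL")
        self.remote = RemoteContext(remote_name, s3_path)

    @property
    def is_bound(self) -> bool:
        return self.remote is not None


# Two local contexts attached to the same (made-up) remote context,
# plus one local context that stays unbound.
dev = LocalContext("dev")
scratch = LocalContext("scratch")
dev.bind("team", "s3://example-bucket/contexts")
scratch.bind("team", "s3://example-bucket/contexts")
private = LocalContext("private")
```

In this sketch, dev and scratch compare equal on their remote field because both hold the same tuple, while private has no remote and is therefore unbound.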

Commands

Three commands take advantage of bound contexts: commit, push, and pull. The list below describes how Disdat treats bundles in bound contexts (local contexts with a remote) and unbound contexts (local contexts without a remote).

  • commit <bundle name> -- Tags the bundle to indicate that it should be retained. This tag must be present before you push to a bound context.

  • push <bundle name> -- Publishes the latest committed version of this bundle. Moves all data, including linked files, to the remote S3 path.

  • pull -- Synchronizes the local context with all bundle metadata available at its bound remote context.

    • Un-localized -- By default, Disdat pulls only the metadata, not your linked files. If you run dsdt cat <bundle name>, you will see s3://<> paths.

    • Localize -- If you run dsdt pull --localize, Disdat also localizes the files in your bundle, pulling them down to your local context.

Note: A push guarantees only that the bundle is now visible, not that it will be the “latest” version seen by any user’s later pull request.
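The rules for commit, push, and pull can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not Disdat's code: the classes Bundle, Remote, and Local, the three functions, the /local/ path prefix, and the bucket name are all hypothetical, and bundle versioning is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Bundle:
    name: str
    links: List[str]         # local file paths or s3:// URLs
    committed: bool = False  # set by commit()


@dataclass
class Remote:
    s3_path: str
    bundles: Dict[str, Bundle] = field(default_factory=dict)


@dataclass
class Local:
    remote: Optional[Remote] = None  # None means the context is unbound
    bundles: Dict[str, Bundle] = field(default_factory=dict)


def commit(ctx: Local, name: str) -> None:
    # Tag the bundle as one that should be retained.
    ctx.bundles[name].committed = True


def push(ctx: Local, name: str) -> None:
    if ctx.remote is None:
        raise RuntimeError("context is unbound; nothing to push to")
    bundle = ctx.bundles[name]
    if not bundle.committed:
        raise RuntimeError("commit the bundle before pushing")
    # The remote copy references its linked files by S3 URL.
    remote_links = [
        f"{ctx.remote.s3_path}/{name}/{link.rsplit('/', 1)[-1]}"
        for link in bundle.links
    ]
    ctx.remote.bundles[name] = Bundle(name, remote_links, committed=True)


def pull(ctx: Local, localize: bool = False) -> None:
    if ctx.remote is None:
        raise RuntimeError("context is unbound; nothing to pull from")
    for name, bundle in ctx.remote.bundles.items():
        links = list(bundle.links)  # metadata only: links stay s3:// URLs
        if localize:
            # Hypothetical local layout standing in for downloaded files.
            links = [f"/local/{name}/{link.rsplit('/', 1)[-1]}" for link in links]
        ctx.bundles[name] = Bundle(name, links, committed=True)


# Two users with local contexts bound to the same remote.
shared = Remote("s3://example-bucket/team")
alice, bob = Local(shared), Local(shared)
alice.bundles["sales"] = Bundle("sales", ["/tmp/sales.csv"])

try:
    push(alice, "sales")  # rejected: the bundle is not committed yet
    push_error = None
except RuntimeError as err:
    push_error = str(err)

commit(alice, "sales")
push(alice, "sales")

pull(bob)
metadata_only = bob.bundles["sales"].links
pull(bob, localize=True)
localized = bob.bundles["sales"].links
```

After the plain pull, the links in bob's copy are still S3 URLs, which mirrors the un-localized case in the list above. Only the localizing pull rewrites them to local paths.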
