Tutorial
From bundles, to remote S3, to pipelines, to execution
This first step of the tutorial teaches you how to:
Create a local data context to store bundles
Use the CLI to create a simple bundle that contains a single file
Make multiple versions of the bundle and inspect them using the CLI
Create a bundle using the CLI
Here we create a bundle and inspect it. I assume you followed the instructions in the overview and you have Disdat installed and initialized!
Create a new local data context.
$ dsdt context examples
Switch into that local data context.
$ dsdt switch examples
The commands dsdt context and dsdt switch are kind of like git branch and git checkout. However, we use different terms because contexts don't behave like code repositories. Run with no arguments, as in the last command of the session below, dsdt context shows you all the local contexts you have on your machine.
$ dsdt context examples
Disdat created data context None/examples at object dir /Users/kyocum/.disdat/context/examples/objects.
$ dsdt switch examples
Switched to context examples
$ dsdt context
* examples [None@None]
Now let's add some data. Disdat wraps up collections of literals and files into a bundle. Using the CLI, you can make bundles from files or directories. We'll use README.md, but you can choose any file you wish.
Create a bundle called my.bundle that will contain the file README.md and add it to the local context.
List all the bundles in our local context.
cat the bundle to show its contents.
$ dsdt add my.bundle README.md
$ dsdt ls -v
NAME PROC_NAME OWNER DATE COMMITTED UUID TAGS
my.bundle BundleWrapperTask_my_bundle____e1908ea6e8 kyocum 01-12-20 21:30:45 False 386f13cf-5b51-4237-b649-8549eff30004
$ dsdt cat my.bundle
/Users/kyocum/.disdat/context/examples/objects/386f13cf-5b51-4237-b649-8549eff30004/README.md
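Judging from the output above, the path printed by dsdt cat appears to have the form object directory / bundle UUID / file name. The Python sketch below takes that path apart to make the three pieces visible. The helper name split_bundle_path is hypothetical, and the layout is an observation from this one example, not a documented guarantee.

```python
from pathlib import PurePosixPath
from uuid import UUID


def split_bundle_path(path):
    """Split a path printed by `dsdt cat` into (object_dir, uuid, file_name).

    Assumes the layout <object dir>/<bundle UUID>/<file name> seen in the
    tutorial output; this is an observation, not a documented contract.
    """
    p = PurePosixPath(path)
    file_name = p.name
    bundle_uuid = p.parent.name
    object_dir = str(p.parent.parent)
    UUID(bundle_uuid)  # raises ValueError if this component is not a UUID
    return object_dir, bundle_uuid, file_name


object_dir, bundle_uuid, file_name = split_bundle_path(
    "/Users/kyocum/.disdat/context/examples/objects/"
    "386f13cf-5b51-4237-b649-8549eff30004/README.md"
)
print(object_dir)
print(bundle_uuid)
print(file_name)
```

The middle component matches the UUID column that dsdt ls -v reports for the bundle, which suggests that each bundle version keeps its files in its own directory.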
Great! You've created a bundle that contains just one file, README.md. Now let's make another version with the same name:
$ dsdt add my.bundle README.md
$ dsdt ls -v
NAME PROC_NAME OWNER DATE COMMITTED UUID TAGS
my.bundle BundleWrapperTask_my_bundle____e1908ea6e8 kyocum 01-12-20 21:36:42 False c1f9085f-8bb5-4417-8b65-804a1ae7e451
my.bundle BundleWrapperTask_my_bundle____e1908ea6e8 kyocum 01-12-20 21:30:45 False 386f13cf-5b51-4237-b649-8549eff30004
Now you have two versions of the same data. They share the same NAME, so any time you ask Disdat for my.bundle you will get the most recent version, unless you ask for a specific one by PROC_NAME or UUID, for example dsdt cat -u c1f9085f-8bb5-4417-8b65-804a1ae7e451.
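The lookup rule described above (newest version by NAME, or an exact version by UUID) can be modeled in a few lines of Python. This is only an illustrative sketch that works on the text printed by dsdt ls -v; it does not call Disdat and is not how Disdat itself is implemented. The names BundleRow, parse_row, and resolve are hypothetical, and the date format is an assumption.

```python
from datetime import datetime
from typing import NamedTuple


class BundleRow(NamedTuple):
    name: str
    proc_name: str
    owner: str
    date: datetime
    committed: bool
    uuid: str


def parse_row(line):
    """Parse one data row of `dsdt ls -v` (whitespace-separated columns)."""
    name, proc_name, owner, day, clock, committed, uuid = line.split()[:7]
    # Assumption: the DATE column is month-day-year followed by a 24-hour time.
    when = datetime.strptime(f"{day} {clock}", "%m-%d-%y %H:%M:%S")
    return BundleRow(name, proc_name, owner, when, committed == "True", uuid)


def resolve(rows, name, uuid=None):
    """Return the newest row called `name`, or the exact row if `uuid` is given."""
    matches = [r for r in rows if r.name == name]
    if uuid is not None:
        matches = [r for r in matches if r.uuid == uuid]
    return max(matches, key=lambda r: r.date, default=None)


# The two data rows from the listing above.
listing = [
    "my.bundle BundleWrapperTask_my_bundle____e1908ea6e8 kyocum "
    "01-12-20 21:36:42 False c1f9085f-8bb5-4417-8b65-804a1ae7e451",
    "my.bundle BundleWrapperTask_my_bundle____e1908ea6e8 kyocum "
    "01-12-20 21:30:45 False 386f13cf-5b51-4237-b649-8549eff30004",
]
rows = [parse_row(line) for line in listing]

latest = resolve(rows, "my.bundle")
older = resolve(rows, "my.bundle", uuid="386f13cf-5b51-4237-b649-8549eff30004")
print(latest.uuid)
print(older.uuid)
```

Asking by name alone returns the 21:36:42 version, while passing the UUID pins the earlier one.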
Congrats! You've created your first data context and bundle. In the rest of the tutorial we'll look at how you can push and pull your bundles to and from AWS S3 to share data with colleagues, and how to use them as inputs to and outputs from pipelines.