Naming

Last updated 5 years ago

The Three Names

Bundles are given three names on creation. They are:

  • Human name: This is the human-readable name. It stays constant as you make new versions of the data. E.g., you might make new versions of the same logical model, but perhaps with updated training data or model architecture tweaks. In the example, it is "Model_Ranks".

  • Processing name: This is the machine-readable name. It is the human-readable name + hash(task parameters). It is how a downstream Disdat task knows whether a parameterized upstream task has already run. In the example, the processing name changes only when the parameters change (the third invocation of SelectionTask).

  • UUID: This is a globally unique id that changes every time you make a new bundle.

In the example, we run SelectionTask three times and show how the names change.

  1. First run: the bundle has a human name, a processing name (derived from the date parameter), and a UUID.

  2. Second run: the user has modified their code. Only the UUID changes.

  3. Third run: only the task's parameter changes. Here both the processing name and the UUID change.
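
The three runs can be illustrated with a minimal sketch. This is not Disdat's actual implementation: the helpers `processing_name` and `make_names`, the choice of SHA-256 truncated to 10 characters, and the date value used for the third run are all assumptions made for illustration. Only the rule "human name + hash(task parameters)" comes from the text above.

```python
import hashlib
import json
import uuid


def processing_name(human_name, params):
    """Hypothetical helper: human name plus a hash of the task parameters."""
    # Serialize with sorted keys so identical parameters always hash identically.
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()[:10]  # truncation is an assumption
    return f"{human_name}_{digest}"


def make_names(human_name, params):
    """Hypothetical helper: return the three names of a newly created bundle."""
    return {
        "human_name": human_name,
        "processing_name": processing_name(human_name, params),
        "uuid": str(uuid.uuid4()),  # fresh for every bundle
    }


# First run.
first = make_names("Model_Ranks", {"date": "2019-4-1"})
# Second run: the code changed but the parameters did not.
second = make_names("Model_Ranks", {"date": "2019-4-1"})
# Third run: the parameter changed (this date value is made up).
third = make_names("Model_Ranks", {"date": "2019-4-2"})

for run in (first, second, third):
    print(run["human_name"], run["processing_name"], run["uuid"])
```

In this sketch the first two runs share a processing name and differ only in UUID, while the third run differs in both.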

Human query:

If a user asks for the latest bundle by human name, they get the last bundle made, regardless of its parameters.

Pipeline task query:

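But if a pipeline task asks for the latest bundle with parameter 2019-4-1, it gets the most recent bundle whose processing name matches that parameter, which is not necessarily the last bundle made.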
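
The difference between the two queries can be sketched as follows. This is a toy in-memory model rather than the Disdat API: the `Bundle` class, the two lookup functions, the hashing scheme, and the assumption that the first two runs used the date 2019-4-1 are all hypothetical.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """Hypothetical stand-in for a bundle; not a Disdat class."""
    human_name: str
    params: dict
    uuid: str = field(default_factory=lambda: str(uuid.uuid4()))

    @property
    def processing_name(self):
        # Human name + hash(task parameters); the hash scheme is an assumption.
        encoded = json.dumps(self.params, sort_keys=True).encode("utf-8")
        return f"{self.human_name}_{hashlib.sha256(encoded).hexdigest()[:10]}"


def latest_by_human_name(bundles, human_name):
    """Human query: the last bundle made with this human name."""
    matches = [b for b in bundles if b.human_name == human_name]
    return matches[-1] if matches else None


def latest_by_processing_name(bundles, human_name, params):
    """Pipeline task query: the last bundle made with these parameters."""
    wanted = Bundle(human_name, params).processing_name
    matches = [b for b in bundles if b.processing_name == wanted]
    return matches[-1] if matches else None


# Bundles listed in creation order, mirroring the three runs.
runs = [
    Bundle("Model_Ranks", {"date": "2019-4-1"}),  # first run
    Bundle("Model_Ranks", {"date": "2019-4-1"}),  # second run, code changed
    Bundle("Model_Ranks", {"date": "2019-4-2"}),  # third run, parameter changed
]

human_hit = latest_by_human_name(runs, "Model_Ranks")
task_hit = latest_by_processing_name(runs, "Model_Ranks", {"date": "2019-4-1"})
print(human_hit is runs[2], task_hit is runs[1])
```

Under these assumptions the human query returns the third bundle, while the pipeline task query returns the second, because it is the newest bundle produced with the requested parameter.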

Example of running the SelectionTask three times.