Run the Pipeline Container (locally)

Deploy to the cloud; debug locally!

This section will teach you:

  • How to run a container locally (for debugging!)

  • How to run your container on AWS Batch (for fun!)

  • How Disdat re-uses prior results, and how to force it to recompute tasks.

Local container execution in your local context

The Disdat CLI provides a run command that almost exactly mirrors apply.

$dsdt run . pipelines.dependent_tasks.B

Note: dsdt run requires, as its first argument, the directory that contains your setup.py.

dsdt run <directory of setup.py> <pipeline> <optional pipeline args>

By default the run command:

  • Runs your container via the local Docker client

  • Runs in your current context (placing output bundles in that context)

  • Waits until the container exits before returning the result (in the current implementation)

Did your output contain:

===== Luigi Execution Summary =====

Scheduled 1 tasks of which:
* 1 complete ones were encountered:
    - 1 DriverTask(...)

Did not run any tasks

That's weird! It didn't seem to run our tasks (A or B). That's because Disdat noticed that you had already run this job, found the prior result bundle, and re-used it instead of re-running the tasks.

Pro Tip: If you want to see what your container is doing, you can follow along with this command in another terminal:

$docker logs -f <container name>
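
If you don't know the container name, one way to find it is to ask Docker for the most recently created container. The helper below is only a sketch: follow_latest_container is a hypothetical name (not part of Disdat), and it assumes the container started by dsdt run is the newest one on your machine.

```shell
# Hypothetical helper: follow the logs of the most recently created container.
# Assumption: the container started by `dsdt run` is the newest one on this host.
follow_latest_container() {
    # `docker ps --latest` selects the latest created container (any state);
    # the format string prints only its name.
    name="$(docker ps --latest --format '{{.Names}}')"
    if [ -z "$name" ]; then
        echo "No container found" >&2
        return 1
    fi
    # Stream the container's output until it exits or you press Ctrl-C.
    docker logs -f "$name"
}
```

Call follow_latest_container in a second terminal right after starting dsdt run. If other containers are being created at the same time, check the name with docker ps first.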

Forcing Disdat to recompute tasks

Sometimes you want to recompute a particular task, or a task and all of its upstream dependencies. To do so, use -f to recompute only the last task, or --force-all to recompute the whole pipeline (careful with that). Both flags work with run and with apply.

$dsdt run -f . pipelines.dependent_tasks.B

That will ensure we make another version of the output bundle.
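
For comparison, the two recompute modes can be sketched side by side. This is a minimal sketch under two assumptions: that --force-all goes in the same position as -f in the command above, and that you want a dry run first. The DSDT variable is a hypothetical convenience (not part of Disdat) that makes the snippet print the commands instead of executing them, unless you set DSDT=dsdt.

```shell
# Dry run by default: prints each command instead of executing it.
# Set DSDT=dsdt in your environment to actually run the pipeline.
DSDT="${DSDT:-echo dsdt}"

# Recompute only the last task (B); upstream results are re-used.
$DSDT run -f . pipelines.dependent_tasks.B

# Recompute B and all of its upstream dependencies (assumed flag position).
$DSDT run --force-all . pipelines.dependent_tasks.B
```

Printing the commands first is a cheap safeguard, since --force-all re-runs every task in the pipeline.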
