Run the Pipeline Container (locally)

Deploy to the cloud; debug locally!

This section will teach you:

  • How to run a container locally (for debugging!)

  • How to run your container on AWS Batch (for fun!)

  • How Disdat re-uses prior results, and how to force it to recompute tasks

Local container execution in your current context

The Disdat CLI provides a run command that almost exactly mirrors apply.

$dsdt run . pipelines.dependent_tasks.B

Note: dsdt run requires the directory containing your setup.py as its first argument.

dsdt run <directory of setup.py> <pipeline> <optional pipeline args>
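
For reference, here is a minimal sketch of what pipelines/dependent_tasks.py might look like. The module and class names come from the command above; the bundle names and return values are illustrative, following Disdat's PipeTask pattern rather than reproducing the tutorial's exact code:

import disdat.pipe as pipe

class A(pipe.PipeTask):
    # Upstream task: its return value becomes A's output bundle
    def pipe_requires(self):
        self.set_bundle_name('a_out')   # illustrative bundle name

    def pipe_run(self):
        return 2

class B(pipe.PipeTask):
    # Root task: declares A as an upstream dependency
    def pipe_requires(self):
        self.set_bundle_name('b_out')   # illustrative bundle name
        # A's result arrives in pipe_run as the keyword argument 'a'
        self.add_dependency('a', A, params={})

    def pipe_run(self, a=None):
        return a * 10   # B's output bundle, computed from A's result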

By default the run command:

  • Runs your container via the local Docker client

  • Runs in your current context (placing output bundles in that context)

  • Waits until the container exits before returning the result (in the current implementation)
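
Note that run assumes the pipeline image already exists. If it doesn't, build it first with dsdt dockerize; the argument form here is assumed to mirror run:

$dsdt dockerize . pipelines.dependent_tasks.B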

Did your output contain:

===== Luigi Execution Summary =====

Scheduled 1 tasks of which:
* 1 complete ones were encountered:
    - 1 DriverTask(...)

Did not run any tasks

That's weird! It didn't seem to run our tasks (A or B). That's because Disdat noticed that you had already run this job, found the prior result bundle, and re-used it instead of recomputing.
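
You can confirm the cached results are there by listing the bundles in your current context (the bundle names you see will depend on your pipeline):

$dsdt ls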

Pro Tip: If you want to see what your container is doing, you can follow along with this command in another terminal:

$docker logs -f <container name>
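
If you don't know the container's name, docker ps lists running containers along with their auto-generated names:

$docker ps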

Forcing Disdat to recompute tasks

Sometimes you want to re-compute a particular task, or a task and all of its upstream dependencies. Use -f to recompute just the final task, or --force-all to recompute the whole pipeline (careful with that). Both flags work with run and apply.

$dsdt run -f . pipelines.dependent_tasks.B

That will ensure Disdat makes another version of the output bundle.
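
To rebuild everything, including A's bundle, reach for the heavier hammer:

$dsdt run --force-all . pipelines.dependent_tasks.B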
