Run the Pipeline Container (AWS)

This section will teach you:

  • How to configure Disdat with an ECR prefix and the name of your AWS Batch job queue

  • How to execute your container on Batch

  • How to look at your resulting bundles

Have you set up your AWS credentials? You will also need to stand up AWS Batch. That means creating 1.) a compute environment and 2.) a batch job submission queue.
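
If you'd rather script that setup than click through the console, the AWS CLI can create both pieces. This is only a sketch: the compute environment name, subnet, security group, and roles below are placeholders for illustration, not values created anywhere in this tutorial.

# Create a managed EC2 compute environment (subnet, security group, and roles are placeholders)
$aws batch create-compute-environment \
    --compute-environment-name disdat-compute-env \
    --type MANAGED \
    --compute-resources type=EC2,minvCpus=0,maxvCpus=16,desiredvCpus=0,instanceTypes=optimal,subnets=subnet-0123456789abcdef0,securityGroupIds=sg-0123456789abcdef0,instanceRole=ecsInstanceRole \
    --service-role AWSBatchServiceRole

# Create the job queue Disdat will submit jobs to, backed by that compute environment
$aws batch create-job-queue \
    --job-queue-name disdat-batch-queue \
    --state ENABLED \
    --priority 1 \
    --compute-environment-order order=1,computeEnvironment=disdat-compute-env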

You then need to add your AWS Batch job queue name to the Disdat configuration file.
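
For reference, here is a rough sketch of what those settings can look like. The file path, section names, and key names are assumptions, so check the configuration file Disdat generated on your machine for the exact spelling; the registry and prefix values are lifted from the example job definition shown further down this page.

$cat ~/.config/disdat/disdat.cfg
[docker]
# assumed keys: the ECR registry and prefix that dsdt dockerize --push targets
registry = 41235552.dkr.ecr.us-west-2.amazonaws.com
repository_prefix = kyocum/test

[run]
# assumed key: the AWS Batch job queue that dsdt run submits to
aws_batch_queue = disdat-batch-queue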

We're still assuming you're running in (switched into) your examples context that has a remote attached:

$dsdt context
*	examples	[examples@s3://disdat-prod/context]

Push container to ECR

Let's push our container up to AWS ECR. We use the same dockerize command, passing --no-build because you already built the image in the prior step.

$dsdt dockerize --no-build --push .

You should see a bunch of transfer status updates as Docker moves the container to ECR.
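
To double-check that the image actually landed in ECR, you can list what the repository now holds. The repository name below comes from the example job definition further down; yours will reflect your own registry prefix.

# List the image tags in the example repository (the name will differ in your account)
$aws ecr list-images --repository-name kyocum/test/disdat-examples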

Submit the job to Batch

$dsdt run --backend AWSBatch  . pipelines.dependent_tasks.B
Re-using prior AWS Batch run job definition : {'jobDefinitionName': 'kyocum-disdat-examples-job-definition', 'jobDefinitionArn': 'arn:aws:batch:us-west-2:48135127292:job-definition/kyocum-disdat-examples-job-definition:1', 'revision': 1, 'status': 'ACTIVE', 'type': 'container', 'parameters': {}, 'containerProperties': {'image': '41235552.dkr.ecr.us-west-2.amazonaws.com/kyocum/test/disdat-examples', 'vcpus': 2, 'memory': 4000, 'command': [], 'volumes': [], 'environment': [], 'mountPoints': [], 'ulimits': [], 'resourceRequirements': []}}
Job disdat-examples-1579040780 (ID 08c213a6-9fb2-468c-99c1-a72f52534dcd) with definition kyocum-disdat-examples-job-definition:1 submitted to AWS Batch queue disdat-batch-queue

If you log in to your AWS account, you should see something like this: there's now a job in the RUNNABLE state.

Once the job moves to the SUCCEEDED state, the container has run successfully: that's a successful run of our example task on AWS Batch. If it didn't, you can click on the job in the console and click through to its CloudWatch logs to see what went wrong.
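
You can also keep an eye on the job without the console. A minimal sketch using the job ID and queue that dsdt run printed above; AWS Batch writes container logs to the /aws/batch/job log group, and the log stream name appears in the describe-jobs output:

# Show the job's current state (SUBMITTED, RUNNABLE, RUNNING, SUCCEEDED, FAILED)
$aws batch describe-jobs --jobs 08c213a6-9fb2-468c-99c1-a72f52534dcd

# Read the container's CloudWatch logs once it has started; take the stream name
# from container.logStreamName in the describe-jobs output
$aws logs get-log-events --log-group-name /aws/batch/job --log-stream-name <your-log-stream>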

Let's grab our results from the remote context

$dsdt pull
Fast Pull synchronizing with remote context test@s3://disdat-prod-111461292-us-west-2/beta/context
Fast pull fetching 4 objects...
Fast pull complete -- thread pool closed and joined.
$dsdt ls -v
NAME                	PROC_NAME           	OWNER   	DATE              	COMMITTED	UUID                                    	TAGS
b                   	B__99914b932b       	root    	01-14-20 14:33:29 	True    	a0ae2be4-d496-4b0f-94ac-6a53d612c0ad
a                   	A__99914b932b       	root    	01-14-20 14:33:29 	True    	4d5a1f57-30fa-4c1a-b2fe-8b23fceb7103
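
If you want to peek at the data inside the bundles you just pulled, dsdt cat prints a bundle's contents. Treat the argument form here as a sketch and check dsdt cat --help for the exact usage:

# Print the contents of the 'b' bundle in the current context (exact arguments may differ)
$dsdt cat b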

Yay! We pushed a container, ran it up on AWS, and then grabbed all of our results!

Continuing on!

There's a lot more to know about how Disdat manages bundles when it runs pipelines remotely, and about how to control the size of the container instance for your job.

  • I need to build containers with unix*, R, or other dependencies

  • I need to know more details about building pipelines (using dependencies, return types, etc.)

  • I need more information on running in AWS
