Run the Pipeline Container (AWS)
This section will teach you how to:
Configure Disdat with an ECR prefix and the name of your AWS Batch job queue
Execute your container on AWS Batch
Look at your resulting bundles
Have you set up your AWS credentials? You will also need to stand up AWS Batch, which means creating (1) a compute environment and (2) a Batch job submission queue.
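If you haven't done that yet, here is a minimal sketch using the AWS CLI. The compute environment name, subnet, security group, and IAM roles are placeholders to replace with your own; the queue name matches the one used later on this page:
$aws batch create-compute-environment \
    --compute-environment-name disdat-compute-env \
    --type MANAGED \
    --compute-resources type=EC2,minvCpus=0,maxvCpus=16,instanceTypes=optimal,subnets=subnet-XXXXXXXX,securityGroupIds=sg-XXXXXXXX,instanceRole=ecsInstanceRole \
    --service-role AWSBatchServiceRole
$aws batch create-job-queue \
    --job-queue-name disdat-batch-queue \
    --priority 1 \
    --compute-environment-order order=1,computeEnvironment=disdat-compute-env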
You then need to add your AWS Batch job queue name to the Disdat configuration file.
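The configuration file typically lives at ~/.config/disdat/disdat.cfg (created by dsdt init). Below is a minimal sketch of the relevant entries, using the ECR registry, repository prefix, and queue that appear in this example's output further down; the exact option names are an assumption, so verify them against the template your Disdat version installs:
[docker]
registry = 41235552.dkr.ecr.us-west-2.amazonaws.com
repository_prefix = kyocum/test
[run]
aws_batch_queue = disdat-batch-queue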
We're still assuming you're running in (switched into) your examples context that has a remote attached:
$dsdt context
* examples [examples@s3://disdat-prod/context]
Push container to ECR
Let's push our container up to AWS ECR. We use the same dockerize command, adding --no-build because we already built the container in the prior step.
$dsdt dockerize --no-build --push .
You should see a bunch of transfer status updates as Docker moves the container to ECR.
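To double-check that the push landed, you can list the image with the AWS CLI; the repository name here is inferred from the image URI shown in the job definition below:
$aws ecr describe-images --repository-name kyocum/test/disdat-examples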
Submit the job to Batch
$dsdt run --backend AWSBatch . pipelines.dependent_tasks.B
Re-using prior AWS Batch run job definition : {'jobDefinitionName': 'kyocum-disdat-examples-job-definition', 'jobDefinitionArn': 'arn:aws:batch:us-west-2:48135127292:job-definition/kyocum-disdat-examples-job-definition:1', 'revision': 1, 'status': 'ACTIVE', 'type': 'container', 'parameters': {}, 'containerProperties': {'image': '41235552.dkr.ecr.us-west-2.amazonaws.com/kyocum/test/disdat-examples', 'vcpus': 2, 'memory': 4000, 'command': [], 'volumes': [], 'environment': [], 'mountPoints': [], 'ulimits': [], 'resourceRequirements': []}}
Job disdat-examples-1579040780 (ID 08c213a6-9fb2-468c-99c1-a72f52534dcd) with definition kyocum-disdat-examples-job-definition:1 submitted to AWS Batch queue disdat-batch-queue
If you log in to your AWS account, the AWS Batch dashboard should show the submitted job and its current state.
Once the job moves to the SUCCEEDED state, the container has run successfully. If it didn't, you can click on the job in the console and follow the links through to its CloudWatch logs.
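You can also watch the job without the console. Using the job ID echoed by dsdt run above, the AWS CLI reports the job's state and its CloudWatch log stream; Batch containers log to the /aws/batch/job group by default:
$aws batch describe-jobs --jobs 08c213a6-9fb2-468c-99c1-a72f52534dcd
$aws logs get-log-events --log-group-name /aws/batch/job --log-stream-name <logStreamName from describe-jobs>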
Let's grab our results from the remote context:
$dsdt pull
Fast Pull synchronizing with remote context test@s3://disdat-prod-111461292-us-west-2/beta/context
Fast pull fetching 4 objects...
Fast pull complete -- thread pool closed and joined.
$dsdt ls -v
NAME  PROC_NAME      OWNER  DATE               COMMITTED  UUID                                  TAGS
b     B__99914b932b  root   01-14-20 14:33:29  True       a0ae2be4-d496-4b0f-94ac-6a53d612c0ad
a     A__99914b932b  root   01-14-20 14:33:29  True       4d5a1f57-30fa-4c1a-b2fe-8b23fceb7103
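If you want to peek inside a bundle rather than just list it, the Disdat CLI also has a cat command; treat the exact invocation as an assumption and check dsdt --help on your install:
$dsdt cat b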
Yay! We pushed a container, ran it on AWS Batch, and then grabbed all of our results!
Continuing on!
There's a lot more to know about how Disdat manages bundles when it runs pipelines remotely, and about how to control the size of the container instance for your job.
I need to know more details about building pipelines (using dependencies, return types, etc.)
I need more information on running in AWS
I need to build containers with unix*, R, or other dependencies