Dockerize a Pipeline

Dockerfiles are amazing, and you shouldn't write them.


Last updated 5 years ago


This section will teach you:

  • How to build Docker containers for your pipeline

  • How to specify Python package dependencies (just use setup.py!)

  • How to send the container to AWS ECR (if you've set up your AWS credentials)

The Disdat dockerizer will build a container based on your project's setup.py. It will install any Python dependencies it finds in that setup.py file. If your project can create a source distribution via python setup.py sdist, then you can use Disdat to dockerize your pipeline.

Pro Tip: Does your project depend on packages in your organization's own PyPI server? If so, you'll want to create a pip.conf file and then refer to it in the Disdat configuration file like this.
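
For reference, here is a minimal sketch of generating such a pip.conf from Python. The index URL is a hypothetical placeholder, and the `write_pip_conf` helper is illustrative only and not part of Disdat. The only assumption about pip is that it reads `extra-index-url` from the `[global]` section.

```python
import configparser
from pathlib import Path

# Hypothetical private index URL; replace with your organization's server.
PRIVATE_INDEX_URL = "https://pypi.example.com/simple"


def write_pip_conf(path, index_url=PRIVATE_INDEX_URL):
    """Write a minimal pip.conf that adds one extra package index."""
    config = configparser.ConfigParser()
    # pip reads index settings from the [global] section.
    config["global"] = {"extra-index-url": index_url}
    path = Path(path).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)
    # Note: this overwrites any existing file at `path`.
    with path.open("w") as handle:
        config.write(handle)
    return path
```

Calling `write_pip_conf("~/.pip/pip.conf")` would produce a file in the location the dockerizer output on this page copies from. Back up any existing pip.conf first, since the sketch overwrites it.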

Build the container

  1. Have you installed Docker on your dev box? Do that first!

  2. Change into your project's directory.

  3. Run dsdt dockerize <your project's directory>

$ dsdt dockerize .
Copying dot file /Users/kyocum/.pip/pip.conf into /var/folders/qv/3rgd_4_569s_8x9xftx96m40zcg1xn/T/tmpx8j7e2hxdockerize
---------- Building base operating system environment
docker build \
		--build-arg KICKSTART_ROOT=/opt/kickstart \
		--build-arg CONDA_VERSION=NO_CONDA \
		--build-arg VIRTUAL_ENV=/opt/python-virtualenv \
		--file /var/folders/qv/3rgd_4_569s_8x9xftx96m40zcg1xn/T/tmpx8j7e2hxdockerize/Dockerfiles/00-disdat-python-3.6.8-slim.dockerfile \
		--tag disdat-python-3.6.8-slim \
		/var/folders/qv/3rgd_4_569s_8x9xftx96m40zcg1xn/T/tmpx8j7e2hxdockerize
Sending build context to Docker daemon  11.94MB
Step 1/11 : FROM python:3.6.8-slim

[ . . . a lot of other output . . . ]

Step 27/28 : ENTRYPOINT [ "/opt/bin/entrypoint.py" ]
 ---> Running in ca2eb595db0f
Removing intermediate container ca2eb595db0f
 ---> 28ffbf8232dd
Step 28/28 : CMD [ "--help" ]
 ---> Running in 3d268a339452
Removing intermediate container 3d268a339452
 ---> 64a7eb830096
Successfully built 64a7eb830096
Successfully tagged disdat-examples:latest
----- Built Docker image for the disdat-examples pipeline on python-3.6.8-slim
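
If you would rather script the three steps, the sketch below wraps them in Python. It assumes only what this page shows: that `dsdt dockerize <dir>` is on your PATH and that the project has a setup.py. The helper names `dockerize_command` and `dockerize` are hypothetical and not part of the Disdat API.

```python
import shutil
import subprocess
from pathlib import Path


def dockerize_command(project_dir="."):
    """Return the argv for `dsdt dockerize <project_dir>`."""
    return ["dsdt", "dockerize", str(project_dir)]


def dockerize(project_dir="."):
    """Check the prerequisites from the steps above, then run the dockerizer."""
    project_dir = Path(project_dir)
    # Step 1: Docker must be installed and on PATH.
    if shutil.which("docker") is None:
        raise RuntimeError("Docker not found on PATH; install it first.")
    # The dockerizer builds from the project's setup.py.
    if not (project_dir / "setup.py").is_file():
        raise FileNotFoundError(f"No setup.py in {project_dir}")
    # Steps 2 and 3: run `dsdt dockerize .` from inside the project directory.
    subprocess.run(dockerize_command("."), cwd=project_dir, check=True)
```

Failing early on a missing Docker install or a missing setup.py gives a clearer error than a build that breaks partway through.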

Disdat builds one container per git repository. Thus one container can be used to run all the pipelines you define in that repository.

Check to see if your container is now registered with Docker

$ docker images
REPOSITORY                        TAG                 IMAGE ID            CREATED             SIZE
disdat-examples                   latest              64a7eb830096        2 minutes ago       763MB
disdat-python-3.6.8-slim-python   latest              79e5b4a09c12        3 minutes ago       687MB
disdat-python-3.6.8-slim          latest              810c82af94c8        3 minutes ago       412MB
python                            3.6.8-slim          73ba0dc9fc6c        7 months ago        138MB
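
The same check can be done programmatically. The following is a rough sketch that parses the plain-text table printed by `docker images`. It assumes the default column layout shown above, and `image_repositories` is a hypothetical helper name.

```python
def image_repositories(docker_images_output):
    """Return the set of REPOSITORY:TAG names found in `docker images` text."""
    names = set()
    lines = docker_images_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            # The first two columns are REPOSITORY and TAG.
            names.add(f"{fields[0]}:{fields[1]}")
    return names


SAMPLE = """\
REPOSITORY                        TAG                 IMAGE ID            CREATED             SIZE
disdat-examples                   latest              64a7eb830096        2 minutes ago       763MB
python                            3.6.8-slim          73ba0dc9fc6c        7 months ago        138MB
"""

print("disdat-examples:latest" in image_repositories(SAMPLE))  # prints True
```

To run it against your own machine, feed it `subprocess.run(["docker", "images"], capture_output=True, text=True, check=True).stdout` in place of `SAMPLE`.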

Note: Disdat names your container based on the name field in your setup.py file.

from setuptools import setup, find_packages


setup(
    name='disdat-examples',
    version='0.0.1rc0',

    packages=find_packages(),
    include_package_data=True,

    install_requires=[
        'disdat>=0.8.16',
        'jupyter',
        'pandas'
    ]
)
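
To preview which name your container would get, you can read the name field without executing setup.py. This is only an illustrative sketch, not how Disdat itself resolves the name. The `setup_name` helper is hypothetical, handles only a literal string passed as `name=`, and assumes Python 3.8 or newer (for `ast.Constant`).

```python
import ast


def setup_name(setup_py_source):
    """Return the literal `name=` argument passed to setup(), or None."""
    tree = ast.parse(setup_py_source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `setup(...)` and `setuptools.setup(...)`.
        called = getattr(func, "id", None) or getattr(func, "attr", None)
        if called != "setup":
            continue
        for keyword in node.keywords:
            if keyword.arg == "name" and isinstance(keyword.value, ast.Constant):
                return keyword.value.value
    return None
```

For the setup.py above, `setup_name(open("setup.py").read())` would return the string given as `name`, which matches the `disdat-examples` tag in the build output.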

Pro Tip: By default, we build containers based on Python 3.6.8-slim. If you're in desperate need of Python 2.7, something has gone wrong with your dev process. If you're in desperate need of a Python version newer than 3.6.8, then you'll need to make a PR to the project for another version of slim that looks like this file.
