
Kidnapping Machine Learning in Docker Containers


Docker has become an essential tool for developers, especially in the field of machine learning. It allows you to create isolated environments, ensuring that your projects run consistently across different systems. In this blog, we'll walk through the process of installing Docker on Linux, creating a Docker container using a Dockerfile, and setting up a machine learning project with pyenv, venv, and docker-compose.

Table of Contents

  1. Installing Docker on Linux

  2. Creating a Dockerfile for Machine Learning

  3. Building and Running the Docker Container

  4. Using Docker Compose for Orchestration

  5. Conclusion


Installing Docker on Linux

Before we start, ensure that your Linux system is up-to-date:

sudo apt update && sudo apt upgrade -y

Step 1: Install Docker

  1. Install Required Packages:

     sudo apt install apt-transport-https ca-certificates curl software-properties-common
    
  2. Add Docker’s Official GPG Key:

     curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
    
  3. Add Docker Repository:

     echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    
  4. Install Docker Engine:

     sudo apt update
     sudo apt install docker-ce docker-ce-cli containerd.io
    
  5. Verify Docker Installation:

     sudo docker --version
    
  6. Manage Docker as a Non-root User:

     sudo usermod -aG docker $USER
     newgrp docker
    

    Now you can run Docker commands without sudo. (newgrp applies the group change to the current shell only; log out and back in for it to take effect in new sessions.)


Creating a Dockerfile for Machine Learning

A Dockerfile is a text file containing the instructions Docker uses to build an image. For a machine learning project, we’ll install pyenv for Python version management and venv for virtual environments.

Step 2: Create a Dockerfile

  1. Create a Project Directory:

     mkdir ml-project
     cd ml-project
    
  2. Create a Dockerfile:

     touch Dockerfile
    
  3. Edit the Dockerfile:

     # Use an official Python runtime as a parent image
     FROM python:3.9-slim
    
     # Set environment variables
     ENV PYTHONUNBUFFERED=1 \
         PYENV_ROOT=/root/.pyenv \
         PATH="/root/.pyenv/shims:/root/.pyenv/bin:$PATH"
    
     # Install system dependencies
     RUN apt-get update && apt-get install -y \
         build-essential \
         curl \
         git \
         libssl-dev \
         zlib1g-dev \
         libbz2-dev \
         libreadline-dev \
         libsqlite3-dev \
         libffi-dev \
         liblzma-dev \
         wget \
         && rm -rf /var/lib/apt/lists/*
    
     # Install pyenv
     RUN curl -fsSL https://pyenv.run | bash
    
     # Install a specific Python version using pyenv
     RUN pyenv install 3.9.7 && pyenv global 3.9.7
    
     # Create a virtual environment
     RUN python -m venv /opt/venv
     ENV PATH="/opt/venv/bin:$PATH"
    
     # Set the working directory
     WORKDIR /app

     # Install Python dependencies first, so this layer is cached
     # unless requirements.txt changes
     COPY requirements.txt .
     RUN pip install --upgrade pip && pip install -r requirements.txt

     # Copy the current directory contents into the container at /app
     COPY . .
    
     # Command to run on container start
     CMD ["python", "your_script.py"]
    
  4. Create a requirements.txt File:

     touch requirements.txt
    

    Add your Python dependencies to this file, e.g.:

     numpy
     pandas
     scikit-learn
     tensorflow
    

Building and Running the Docker Container

Step 3: Build the Docker Image

  1. Build the Image:

     docker build -t ml-project .
    
  2. Run the Container:

     docker run -it --rm ml-project
    

    This will start the container and run the script specified in the CMD instruction.


Using Docker Compose for Orchestration

Docker Compose is a tool for defining and running multi-container Docker applications. It’s particularly useful for machine learning projects where you might need to run multiple services (e.g., a Jupyter Notebook server, a database, etc.).

Step 4: Create a docker-compose.yml File

  1. Create a docker-compose.yml File:

     touch docker-compose.yml
    
  2. Edit the docker-compose.yml File:

     services:
       ml-service:
         image: ml-project
         build: .
         volumes:
           - .:/app
         ports:
           - "8888:8888"
         command: jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root
    

    This configuration will:

    • Build the Docker image using the Dockerfile.

    • Mount the current directory to /app inside the container.

    • Expose port 8888 for Jupyter Notebook.

  3. Run Docker Compose:

     docker-compose up
    

    You can now open the Jupyter Notebook by navigating to http://localhost:8888 in your browser, using the access token printed in the container logs.
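Compose really pays off once you add more services. For example, extending the file above with a database could look like the following sketch (the postgres image, the db service name, and the credentials are illustrative assumptions, not part of this project):

```yaml
services:
  ml-service:
    build: .
    volumes:
      - .:/app
    ports:
      - "8888:8888"
    command: jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder credential
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:
```

With depends_on, Compose starts the database before the notebook service, and the named volume keeps the database’s data across container restarts.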


Conclusion

In this blog, we walked through the process of setting up Docker for a machine learning project on Linux. We installed Docker, created a Dockerfile with pyenv and venv, built and ran the container, and used Docker Compose for orchestration. This setup ensures that your machine learning projects are reproducible and can be easily shared with others.

Docker is a powerful tool that can significantly streamline your development workflow, especially in the field of machine learning. By containerizing your projects, you can avoid the common "it works on my machine" problem and focus on building great models.
