Kidnapping Machine Learning in Docker Containers

Docker has become an essential tool for developers, especially in the field of machine learning. It allows you to create isolated environments, ensuring that your projects run consistently across different systems. In this blog, we'll walk through the process of installing Docker on Linux, creating a Docker container using a Dockerfile, and setting up a machine learning project with pyenv, venv, and docker-compose.
Table of Contents

- Installing Docker on Linux
- Creating a Dockerfile for Machine Learning
- Building and Running the Docker Container
- Using Docker Compose for Orchestration
Installing Docker on Linux
Before we start, ensure that your Linux system is up to date:

```bash
sudo apt update && sudo apt upgrade -y
```
Step 1: Install Docker
1. Install the required packages:

```bash
sudo apt install apt-transport-https ca-certificates curl software-properties-common
```

2. Add Docker's official GPG key:

```bash
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
```

3. Add the Docker repository:

```bash
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
```

4. Install Docker Engine:

```bash
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io
```

5. Verify the installation:

```bash
sudo docker --version
```

6. Manage Docker as a non-root user:

```bash
sudo usermod -aG docker $USER
newgrp docker
```

Now you can run Docker commands without `sudo`.
Creating a Dockerfile for Machine Learning
A Dockerfile is a script that contains instructions on how to build a Docker image. For a machine learning project, we’ll install pyenv for Python version management and venv for virtual environments.
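Because the image will layer a venv on top of a pyenv-managed interpreter, it helps to be able to confirm at runtime which Python the container is actually using. A minimal sketch, using only the standard library (the output values shown depend on your environment):

```python
import sys


def interpreter_info():
    """Report which Python interpreter is active and whether a venv is in use."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        # Inside a venv, sys.prefix points at the venv directory while
        # sys.base_prefix points at the interpreter the venv was built from.
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key}: {value}")
```

Running this inside the container should show an executable under `/opt/venv` and `in_venv: True` once the Dockerfile below is in place.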
Step 2: Create a Dockerfile
1. Create a project directory:

```bash
mkdir ml-project
cd ml-project
```

2. Create a `Dockerfile`:

```bash
touch Dockerfile
```

3. Edit the `Dockerfile`:

```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYENV_ROOT=/root/.pyenv \
    PATH="/root/.pyenv/shims:/root/.pyenv/bin:$PATH"

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    git \
    libssl-dev \
    zlib1g-dev \
    libbz2-dev \
    libreadline-dev \
    libsqlite3-dev \
    wget \
    && rm -rf /var/lib/apt/lists/*

# Install pyenv
RUN curl https://pyenv.run | bash

# Install a specific Python version using pyenv
RUN pyenv install 3.9.7 && pyenv global 3.9.7

# Create a virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --upgrade pip && pip install -r requirements.txt

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . .

# Command to run on container start
CMD ["python", "your_script.py"]
```

4. Create a `requirements.txt` file:

```bash
touch requirements.txt
```

Add your Python dependencies to this file, e.g.:

```text
numpy
pandas
scikit-learn
tensorflow
```
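The Dockerfile's `CMD` refers to a `your_script.py` entry point. As a hypothetical placeholder, here is a minimal training sketch using NumPy (one of the dependencies listed above); swap in your real entry point:

```python
import numpy as np


def train_linear_model(X, y):
    """Fit y ≈ X @ w + b by least squares; returns (weights, bias)."""
    # Append a bias column of ones so the intercept is learned jointly.
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return coef[:-1], coef[-1]


if __name__ == "__main__":
    # Synthetic data with known weights, so the fit can be eyeballed.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.01, size=200)

    w, b = train_linear_model(X, y)
    print("weights:", np.round(w, 2), "bias:", round(float(b), 2))
```

Once this file sits next to the Dockerfile, `docker build` will bake it into the image and `docker run` will execute it.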
Building and Running the Docker Container
Step 3: Build the Docker Image
1. Build the image:

```bash
docker build -t ml-project .
```

2. Run the container:

```bash
docker run -it --rm ml-project
```

This starts the container and runs the script specified in the `CMD` instruction.
Using Docker Compose for Orchestration
Docker Compose is a tool for defining and running multi-container Docker applications. It’s particularly useful for machine learning projects where you might need to run multiple services (e.g., a Jupyter Notebook server, a database, etc.).
Step 4: Create a docker-compose.yml File
1. Create a `docker-compose.yml` file:

```bash
touch docker-compose.yml
```

2. Edit the `docker-compose.yml` file:

```yaml
services:
  ml-service:
    image: ml-project
    build: .
    volumes:
      - .:/app
    ports:
      - "8888:8888"
    command: jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root
```

This configuration will:

- Build the Docker image using the `Dockerfile`.
- Mount the current directory to `/app` inside the container.
- Expose port 8888 for Jupyter Notebook.

3. Run Docker Compose:

```bash
docker-compose up
```

You can now access the Jupyter Notebook by navigating to `http://localhost:8888` in your browser.
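Once the notebook is up, a quick first cell can confirm that the dependencies from `requirements.txt` actually resolve inside the container. A small sketch (the package list mirrors the example `requirements.txt` above; note that scikit-learn imports as `sklearn`):

```python
from importlib.util import find_spec


def check_packages(names):
    """Map each importable name to whether it can be found in this environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for pkg, ok in check_packages(["numpy", "pandas", "sklearn", "tensorflow"]).items():
        print(f"{pkg}: {'ok' if ok else 'MISSING'}")
```

If anything prints `MISSING`, the image was likely built before that entry was added to `requirements.txt`; rebuild with `docker-compose up --build`.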
Conclusion
In this blog, we walked through the process of setting up Docker for a machine learning project on Linux. We installed Docker, created a Dockerfile with pyenv and venv, built and ran the container, and used Docker Compose for orchestration. This setup ensures that your machine learning projects are reproducible and can be easily shared with others.
Docker is a powerful tool that can significantly streamline your development workflow, especially in the field of machine learning. By containerizing your projects, you can avoid the common "it works on my machine" problem and focus on building great models.