Steps to set up a Docker environment

This post records the steps I took to set up the environment in which I successfully trained and ran a deep learning model inside a Docker container. Docker uses container technology to provide an isolated environment for software and its dependencies, so that an application runs quickly and reliably from one computing environment to another. For the difference between a Docker container and a virtual machine, see https://www.docker.com/resources/what-container/. With Docker installed on the host computer, build the environment by following the steps below.

Create a container and set up environment

Open a terminal on the host computer and pull a Docker image, e.g. ubuntu 18.04.

Create & enter a container

$ docker pull ubuntu:18.04
$ docker images # list all the images
$ docker run -it --shm-size 6G --gpus all --name [container name] -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all [image id]  # create a container with access to the NVIDIA GPU; refer to https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit if unable to locate
$ docker cp [file path] [container id]:/home  # pass files from the host to container
$ docker exec -it [container id] bash # enter the container
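
Once inside the container, it can be worth confirming that the GPU is actually visible before installing anything. The following is a minimal sketch, assuming the NVIDIA Container Toolkit exposes nvidia-smi inside the container (which the utility capability passed above is meant to provide); check_gpu is a hypothetical helper name, not part of Docker.

```shell
# check_gpu is a hypothetical helper; run it inside the container.
check_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # -L lists each GPU the container can see, one per line
        nvidia-smi -L
    else
        echo "nvidia-smi not found"
        return 1
    fi
}

check_gpu || echo "GPU is not visible; re-check the --gpus flag and the NVIDIA Container Toolkit on the host"
```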

Install base utilities

Now you are in a docker container. Time to install software!

$ apt-get update 
$ apt-get install -y build-essential
$ apt-get install -y wget
$ apt-get clean

Install miniconda

If you need to run projects under different Python versions, create a conda environment.

$ cd /home
$ mkdir download && cd download
$ wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
$ export PATH=$PATH:/root/miniconda3/bin  # default install prefix when run as root; adjust if you chose another location
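
With conda on the PATH, a per-project environment can be created roughly as follows. This is only a sketch: the environment name myproject and Python 3.8 are placeholders, and create_env is a hypothetical wrapper around the standard conda create command.

```shell
# create_env NAME PYTHON_VERSION  (hypothetical helper)
create_env() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda is not on PATH"
        return 1
    fi
    # -y answers the confirmation prompt, -n names the environment
    conda create -y -n "$1" "python=$2"
}

# Example call (placeholder name and version):
# create_env myproject 3.8
```

Afterwards, conda env list should show the new environment. Activating it in a fresh shell may first require running conda init bash.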

Install cuda

Install CUDA if you use an NVIDIA card. Choose the CUDA version that your project requires.

$ cd /home/download
$ wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda_11.3.0_465.19.01_linux.run
$ sh cuda_11.3.0_465.19.01_linux.run # install the dependencies if needed; selecting only cuda-11.3 is enough (do not install the GPU driver)

Export the paths to your current environment. Here I use vim to edit the ~/.bashrc file: add three lines, save the changes, and exit the editor.

$ apt-get install -y vim
$ vi ~/.bashrc # add the following 3 lines to .bashrc
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.3/lib64
$ export PATH=$PATH:/usr/local/cuda-11.3/bin
$ export CUDA_HOME=/usr/local/cuda-11.3
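
If you would rather not edit the file by hand, the same three lines can be appended from the shell. The sketch below assumes CUDA 11.3 lives under /usr/local/cuda-11.3 as in this post; add_cuda_paths is a hypothetical helper that skips any line already present, so it is safe to run more than once. It sets CUDA_HOME to the single install directory rather than a colon-separated list.

```shell
# add_cuda_paths FILE: append the three export lines to FILE unless already present.
# Assumes CUDA 11.3 was installed under /usr/local/cuda-11.3.
add_cuda_paths() {
    file="$1"
    touch "$file"
    for line in \
        'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.3/lib64' \
        'export PATH=$PATH:/usr/local/cuda-11.3/bin' \
        'export CUDA_HOME=/usr/local/cuda-11.3'
    do
        # -x matches the whole line, -F treats the pattern as a fixed string
        grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
    done
}

# Example call:
# add_cuda_paths ~/.bashrc
```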

Don’t forget to source ~/.bashrc and then check the CUDA installation.

$ source  ~/.bashrc
$ nvcc -V #check the cuda version
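
To make the check less manual, the release number can be pulled out of the nvcc output and compared with the version you meant to install. This is a sketch under the assumption that nvcc -V prints a line of the form "Cuda compilation tools, release 11.3, V11.3.x"; parse_release and EXPECTED are hypothetical names.

```shell
EXPECTED=11.3   # the release installed in this post

# Extract the release number from `nvcc -V` output read on stdin.
# Assumes a line of the form "Cuda compilation tools, release 11.3, V11.3.x".
parse_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    found="$(nvcc -V | parse_release)"
    if [ "$found" = "$EXPECTED" ]; then
        echo "CUDA $found is active"
    else
        echo "expected CUDA $EXPECTED but nvcc reports: $found"
    fi
else
    echo "nvcc is not on PATH; check the export lines in ~/.bashrc"
fi
```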

Now you can further configure the container according to your project requirements.

Troubleshooting

Docker images and containers are saved in /var/lib/docker by default, and the space there is often not enough.

One way to change the files’ location:

Check the size of each image and container

$ docker system df -v
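
Before moving anything, it may help to compare the free space on the current and the target partitions. A minimal sketch using standard df and awk; avail_kb is a hypothetical helper name.

```shell
# avail_kb DIR: free space, in KiB, on the filesystem holding DIR.
avail_kb() {
    # -P forces the POSIX one-line-per-filesystem format, -k reports 1024-byte blocks;
    # the fourth column of the data row is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

echo "free on /: $(avail_kb /) KiB"
```

On the host you would compare avail_kb /var/lib with avail_kb /home (or whichever directory will hold the new location).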

Stop docker service and copy docker files to the new directory (e.g. /home/docker/lib/)

$ systemctl stop docker 
$ rsync -avz /var/lib/docker /home/docker/lib/  # -a already implies -r
$ mv /var/lib/docker /var/lib/docker-old
$ ln -s /home/docker/lib/docker /var/lib/
$ systemctl start docker  # check images and containers, delete /var/lib/docker-old if everything is OK
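
Before deleting the old copy, a quick check that /var/lib/docker really is a symlink to the new location can save some grief. A minimal sketch, assuming GNU readlink (for the -f option) and the paths used above; check_link is a hypothetical helper.

```shell
# check_link LINK TARGET: report whether LINK is a symlink resolving to TARGET.
check_link() {
    if [ -L "$1" ] && [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]; then
        echo "ok"
    else
        echo "not linked"
    fi
}

check_link /var/lib/docker /home/docker/lib/docker
```

Only when this prints ok, and docker images and docker ps -a show what you expect, is it reasonable to remove /var/lib/docker-old.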

Runtime error: dataloader’s workers are out of shared memory.

$ df -h | grep shm  # check the shared memory
$ service docker stop
$ cd /var/lib/docker/containers/[container id]
$ vim hostconfig.json # increase the "ShmSize" value (given in bytes)
$ service docker start
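
The size in hostconfig.json is not written as "6G". Assuming the field is named ShmSize and holds a byte count (which is how docker inspect reports it), the following sketch converts GiB to bytes and shows the current entry; gib_to_bytes and current_shm are hypothetical helpers.

```shell
# gib_to_bytes N: convert N GiB to bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# current_shm FILE: print the ShmSize entry found in a hostconfig.json-style file.
current_shm() {
    grep -o '"ShmSize": *[0-9]*' "$1"
}

gib_to_bytes 6   # prints 6442450944
```

Run current_shm hostconfig.json in the container's directory to see the old value, then replace the number with the output of gib_to_bytes.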

Containers not found

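$ # try to link the directories again: remove the stale /var/lib/docker first (only once the data is safely in /home/docker/lib/docker), then recreate the link
$ sudo rm -rf /var/lib/docker
$ ln -s /home/docker/lib/docker /var/lib/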

Summary

This is the basic pipeline for getting things running in a Docker container. Look into more advanced operations, such as Dockerfile usage and root permission settings, on your own if needed. Things should be easier now that you have the initial momentum. Cheers!