Deep Learning Dev Box How-To


Overview

The procedure below sets up an Ubuntu Linux box with an NVIDIA GeForce GTX 1080 for machine (deep) learning development using Jupyter Notebooks.

Hardware

The following hardware is used:
Processor: Intel i5 3.2GHz Quad Core
Memory: 16GB DDR3
Hard Drive: 500GB Samsung EVO SSD
GPU: MSI NVIDIA GeForce GTX 1080 DUKE 8GB
Power Supply: Thermaltake 650W
Case: Thermaltake Versa J24

Software

The following software is used:
Ubuntu 16.04
NVIDIA GPU Driver 410
NVIDIA CUDA Version 9.0
NVIDIA cuDNN Version 7.4
Python Version 3.5.6
Anaconda3 Version 5.3.1 x86_64 (including Jupyter Notebook)
Python packages: tensorflow-gpu (V1.12.0), scikit-image, kaggle, keras, pandas, scikit-learn

Installation Procedures

The installation procedures below assume your download directory is ~/Downloads, you store your Bash aliases in ~/.bash_aliases, and ~/.bashrc loads that file.

1. Install NVIDIA GPU Driver 410.

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-410
sudo reboot

2. Install NVIDIA CUDA Version 9.0.

download: https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=deblocal
cd ~/Downloads
sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-9-0
add to ~/.bash_aliases: export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
source ~/.bash_aliases
nvcc --version
nvidia-smi
sudo reboot
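
The nvcc and nvidia-smi checks above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names are hypothetical, the sample strings in the comments only illustrate the usual output shape, and it assumes nvidia-smi accepts the --query-gpu and --format=csv,noheader options.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return the CUDA release (e.g. '9.0') found in `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def parse_driver_line(line):
    """Split one 'driver_version, name' CSV line into (major_version, gpu_name)."""
    version, name = [field.strip() for field in line.split(",", 1)]
    return int(version.split(".")[0]), name


def run_tool(command):
    """Run a command and return its stdout, or None if the tool is missing or fails."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL, universal_newlines=True)
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    # nvcc prints a line such as "Cuda compilation tools, release 9.0, V9.0.176".
    nvcc_out = run_tool(["nvcc", "--version"])
    if nvcc_out:
        print("CUDA release:", parse_nvcc_release(nvcc_out))
    else:
        print("nvcc not available; check the PATH entry in ~/.bash_aliases")

    # One CSV line per GPU, e.g. "410.79, GeForce GTX 1080" (illustrative values).
    smi_out = run_tool(["nvidia-smi", "--query-gpu=driver_version,name",
                        "--format=csv,noheader"])
    if smi_out and smi_out.strip():
        print("driver major version, GPU:", parse_driver_line(smi_out.strip().splitlines()[0]))
    else:
        print("nvidia-smi not available; the driver may not be loaded")
```

For this how-to, the expected result would be CUDA release 9.0 and driver major version 410.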

3. Install NVIDIA CUDA Version 9.0 Patches 1-4.

download: https://developer.nvidia.com/compute/cuda/9.0/Prod/patches/1/cuda-repo-ubuntu1604-9-0-local-cublas-performance-update_1.0-1_amd64-deb
download: https://developer.nvidia.com/compute/cuda/9.0/Prod/patches/2/cuda-repo-ubuntu1604-9-0-local-cublas-performance-update-2_1.0-1_amd64-deb
download: https://developer.nvidia.com/compute/cuda/9.0/Prod/patches/3/cuda-repo-ubuntu1604-9-0-local-cublas-performance-update-3_1.0-1_amd64-deb
download: https://developer.nvidia.com/compute/cuda/9.0/Prod/patches/4/cuda-repo-ubuntu1604-9-0-176-local-patch-4_1.0-1_amd64-deb
perform the following for each patch, in order, substituting that patch's file name in the dpkg command:
  sudo dpkg -i cuda-repo-ubuntu1604-9-0-local-cublas-performance-update_1.0-1_amd64.deb
  sudo apt-get update
  sudo apt-get upgrade
note: you may only need to do this for the first patch; that may also apply the subsequent patches.

4. Install NVIDIA cuDNN.

download (NVIDIA Developer account sign-in required): https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.4.1.5/prod/9.0_20181108/Ubuntu16_04-x64/libcudnn7_7.4.1.5-1%2Bcuda9.0_amd64.deb
sudo dpkg -i libcudnn7_7.4.1.5-1+cuda9.0_amd64.deb
add to ~/.bash_aliases:
  export CUDA_HOME=/usr/local/cuda-9.0
  export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda-9.0/lib64
  export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
sudo reboot
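
After the reboot, it can be worth confirming that the library directories actually reached the environment. A variable is only visible to child processes such as Python if the shell exports it. The sketch below is an illustration under that assumption; missing_entries and REQUIRED_LIB_DIRS are hypothetical names, and the directory list simply mirrors the paths used in this step.

```python
import os

# Directories this step adds to LD_LIBRARY_PATH.
REQUIRED_LIB_DIRS = [
    "/usr/local/cuda-9.0/lib64",
    "/usr/local/cuda/extras/CUPTI/lib64",
]


def missing_entries(path_value, required):
    """Return the required directories absent from a colon-separated path string."""
    # Ignore empty entries and trailing slashes so "/a/" and "/a" compare equal.
    present = {entry.rstrip("/") for entry in path_value.split(":") if entry}
    return [d for d in required if d.rstrip("/") not in present]


if __name__ == "__main__":
    missing = missing_entries(os.environ.get("LD_LIBRARY_PATH", ""), REQUIRED_LIB_DIRS)
    if missing:
        print("not on LD_LIBRARY_PATH:", ", ".join(missing))
    else:
        print("LD_LIBRARY_PATH contains all expected CUDA directories")
    print("CUDA_HOME =", os.environ.get("CUDA_HOME", "(not set)"))
```

Run it from a new terminal so that ~/.bashrc and ~/.bash_aliases have been re-read.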

5. Install Python, Pip, and Virtualenv.

sudo apt update
sudo apt install python3-dev python3-pip
sudo pip3 install -U virtualenv

6. Create virtual environment. Python virtual environments are used to isolate package installation from the system. 

virtualenv --system-site-packages -p python3 ./venv
source ./venv/bin/activate
pip install --upgrade pip
pip list
deactivate # exits the virtual environment without deleting it; run "source ./venv/bin/activate" again before step 7.
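
To see whether a shell is currently inside the virtual environment, the interpreter itself can be asked. This is a small sketch with a hypothetical helper name; it relies on older virtualenv releases setting sys.real_prefix and on venv and newer virtualenv releases making sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

Running it before and after "source ./venv/bin/activate" should show the interpreter path switching to ./venv/bin/python.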

7. Install tensorflow-gpu Python package.

pip install --upgrade tensorflow-gpu==1.12.0
python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
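
The one-liner above confirms that TensorFlow runs, but not that it runs on the GPU. The sketch below lists the GPU devices TensorFlow reports. It assumes the tensorflow.python.client.device_lib module, which is widely used with TensorFlow 1.x but is not a documented public API, so treat it as an illustration; gpu_names and list_tensorflow_gpus are hypothetical helper names.

```python
def gpu_names(devices):
    """Return the names of devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_tensorflow_gpus():
    """Return GPU device names seen by TensorFlow, or None if it cannot be imported."""
    try:
        # Imported lazily so this script still runs where TensorFlow is absent.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return gpu_names(device_lib.list_local_devices())


if __name__ == "__main__":
    gpus = list_tensorflow_gpus()
    if gpus is None:
        print("tensorflow is not importable in this environment")
    elif gpus:
        print("GPU devices:", ", ".join(gpus))
    else:
        print("TensorFlow imported but found no GPU; check the CUDA and cuDNN steps")
```

An empty list with a working import usually points at the driver, CUDA, or cuDNN installation rather than at the Python package.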

8. Install Anaconda3. This step is optional but worthwhile if you want to use Jupyter Notebooks. From this point on, the virtual environment above is no longer used.

deactivate # leave the virtual environment; it is not used with Anaconda.
download: https://repo.anaconda.com/archive/Anaconda3-5.3.1-Linux-x86_64.sh
bash ~/Downloads/Anaconda3-5.3.1-Linux-x86_64.sh
source ~/.bashrc
conda install python=3.5
now rerun the upgrade and install commands from steps 6 and 7 to verify tensorflow-gpu still works under Anaconda:
pip install --upgrade pip
pip install --upgrade tensorflow-gpu==1.12.0
python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
anaconda-navigator

9. Install Additional Python packages.

pip install scikit-image
pip install kaggle
pip install keras
pip install pandas
pip install scikit-learn
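
A quick way to confirm the installs is to check that each module can be located. Two of the packages are imported under a different name than the one pip uses: scikit-image is imported as skimage, and scikit-learn as sklearn. The sketch below is an illustration only; PACKAGES and missing_packages are hypothetical names. It uses importlib.util.find_spec, which locates a module without importing it.

```python
import importlib.util

# pip distribution name -> importable module name
PACKAGES = {
    "tensorflow-gpu": "tensorflow",
    "scikit-image": "skimage",
    "kaggle": "kaggle",
    "keras": "keras",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
}


def missing_packages(packages):
    """Return the pip names whose module cannot be located, without importing any."""
    return sorted(pip_name for pip_name, module in packages.items()
                  if importlib.util.find_spec(module) is None)


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("all packages found")
```

Locating rather than importing keeps the check fast and avoids side effects that some packages trigger at import time.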

10. All of the required software has been installed.

You can now run a Jupyter Notebook by opening a terminal, typing anaconda-navigator, and clicking "Jupyter Notebook".
Now go to kaggle.com, set up an account, create an API token (key), save the downloaded kaggle.json file in ~/.kaggle, and join a competition!
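
Before using the kaggle command, the token file can be sanity-checked. The sketch below assumes the token is stored as ~/.kaggle/kaggle.json with "username" and "key" fields, and that it should be readable only by its owner (chmod 600). token_problems is a hypothetical helper, not part of the kaggle package.

```python
import json
import os
import stat


def token_problems(path):
    """Return a list of human-readable problems found with a Kaggle API token file."""
    if not os.path.isfile(path):
        return ["missing file: " + path]
    problems = []
    # Any group or other permission bit means the key is exposed to other users.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("file is accessible to other users; consider chmod 600")
    try:
        with open(path) as handle:
            token = json.load(handle)
    except ValueError:
        return problems + ["file is not valid JSON"]
    if not isinstance(token, dict):
        return problems + ["file does not contain a JSON object"]
    for field in ("username", "key"):
        if field not in token:
            problems.append("missing field: " + field)
    return problems


if __name__ == "__main__":
    found = token_problems(os.path.expanduser("~/.kaggle/kaggle.json"))
    print("\n".join(found) if found else "kaggle.json looks usable")
```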