I had a few issues getting my system set up to run TensorFlow (e.g., via Keras) whilst utilising the discrete GPU. Not all the instructions I found (see links below) got me all the way to a working machine, so I am summarising the process here.

To start with, a brief overview of the machine I am working on, as there may be subtleties affecting any individual configuration. As this is my work laptop, I couldn’t use Ubuntu as intended (accessing the GPU through virtualisation tools such as VirtualBox is not worth the effort); instead it runs 64-bit Windows 7 Enterprise (yay…). Nevertheless, as it is a Lenovo P50 workstation with an Intel i7-6820HQ CPU (quad-core @ 2.7GHz), 32 GB RAM and, most importantly, a Quadro M2000M GPU from NVIDIA with 4 GB dedicated memory, it has plenty of power.

To get everything running we need to install MS Visual Studio, the CUDA libraries from NVIDIA and, of course, Python and the corresponding neural network libraries.

MS Visual Studio

For all the relevant bindings to work, we need to install Visual Studio. The best solution is to install the free 2013 community edition (as CUDA 8.0 will not work with 2015; I haven’t checked the new releases, but as we are not working in VS anyway, it’s not really relevant). Go to https://www.visualstudio.com/en-us/news/releasenotes/vs2013-update5-vs and sign up for the free Dev Essentials subscription to download VS.

Once that’s done, we should add the path to VS’ bin folder to the system PATH, e.g. C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin.
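A quick way to confirm the PATH change took effect is to check it from Python in a freshly opened console. This is only a sketch: VS_BIN is the example path from above (adjust it if your installation differs), and on_path is a helper name I made up for illustration.

```python
import os
import shutil

# Assumed install location, copied from the example path above; adjust if yours differs.
VS_BIN = r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"


def path_entries():
    """Return the directories currently on PATH, normalised for comparison."""
    raw = os.environ.get("PATH", "")
    return [os.path.normcase(os.path.normpath(p)) for p in raw.split(os.pathsep) if p]


def on_path(directory):
    """True if `directory` is listed on PATH (string comparison only)."""
    return os.path.normcase(os.path.normpath(directory)) in path_entries()


if __name__ == "__main__":
    print("VS bin folder on PATH:", on_path(VS_BIN))
    # shutil.which returns the full path to cl.exe, or None if it cannot be found.
    print("cl.exe resolved to:", shutil.which("cl"))
```

If the first line prints False, open a new console first: changes to the system PATH are not picked up by consoles that were already running.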

CUDA

Next, we can install the CUDA Toolkit from https://developer.nvidia.com/cuda-downloads. During the installation it will look for VS. Once finished we can make sure the installation was successful by running some of the samples from C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\Samples_vs2013.sln. To do this, right-click on a sample in the Solution Explorer and select Debug > Start new instance.
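Besides running the samples, a lighter check is to ask the CUDA compiler for its version. The sketch below assumes the installer has put the toolkit's bin folder (and therefore nvcc) on the PATH; nvcc_version is just an illustrative helper name.

```python
import shutil
import subprocess


def nvcc_version():
    """Return the output of `nvcc --version`, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    output = nvcc_version()
    print(output if output else "nvcc not found on PATH")
```

The printed text should mention release 8.0 if the toolkit described here is the one being picked up.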

The next (optional) step is to install the NVIDIA CUDA® Deep Neural Network library (cuDNN) from https://developer.nvidia.com/cudnn. Set up an account to gain access and download the version 5.1 files. TensorFlow doesn’t yet support the newer 6.0 version.

These files can be extracted anywhere on your system (no installation). Just add the location to the system PATH variable.
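To see whether the cuDNN files can actually be found via the PATH, something like the following sketch can help. It assumes the cuDNN 5.1 download for Windows ships its library as cudnn64_5.dll (check the extracted folder if the name differs); find_on_path is a hypothetical helper, not part of any library.

```python
import os

# Assumption: cuDNN 5.1 for Windows provides its library under this file name.
CUDNN_DLL = "cudnn64_5.dll"


def find_on_path(filename):
    """Return every PATH directory that contains `filename`."""
    hits = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    locations = find_on_path(CUDNN_DLL)
    if locations:
        print("Found", CUDNN_DLL, "in:", locations)
    else:
        print(CUDNN_DLL, "not found on PATH")
```

An empty result usually means the PATH entry points at the wrong level of the extracted folder, i.e. not at the folder that directly contains the DLL.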

TensorFlow

Finally, we can get TensorFlow up and running (either via Keras or directly). For this I used Anaconda, as dealing with some of the Windows-specific quirks of a regular Python installation was time-consuming. Create a virtual environment with Python 3.5 (again, TensorFlow needs this version), e.g.

conda create --name cuda python=3.5

Then activate the environment and install some additional dependencies:

activate cuda
conda install mingw libpython

Next, install the GPU version of TensorFlow (as per the instructions, although a regular pip install tensorflow-gpu worked too):

pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-win_amd64.whl
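Once the install has finished, it is worth checking that TensorFlow really sees the GPU before moving on. The sketch below uses device_lib.list_local_devices() from TensorFlow's Python client module; list_gpus is only an illustrative wrapper, and it returns None when TensorFlow cannot be imported at all.

```python
def list_gpus():
    """Return the names of the GPU devices TensorFlow can see.

    Returns None if TensorFlow cannot be imported, and an empty list
    if it imports fine but finds no GPU.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    devices = device_lib.list_local_devices()
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("TensorFlow could not be imported")
    elif gpus:
        print("GPU devices:", gpus)
    else:
        print("TensorFlow is installed but sees no GPU")
```

An empty list with a working import would suggest that the CPU-only package is installed, or that the CUDA or cuDNN libraries are not being found on the PATH.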

We could install Keras at this point, but the main work is done, so that step is trivial. Try out some of the examples from either the TensorFlow or Keras websites.

Troubleshooting

During testing I encountered a few problems that took a while to fix properly, so I list them here with references on how to repair (or ignore) them.

“The ‘Nvidia Quadro M2000M’ device is not removable and cannot be ejected or unplugged”

When trying to use the GPU (e.g. when testing the CUDA samples or running a program with TensorFlow), a system popup appears with a message like “The ‘Nvidia Quadro M2000M’ device is not removable and cannot be ejected or unplugged”, and the system becomes unresponsive. I was able to fix this by installing the newest NVIDIA drivers from http://www.nvidia.com/Download/index.aspx and restarting. The CUDA installer installs an older version of the driver, which seems to cause the issue.
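To see which driver version is active before and after the update, the nvidia-smi tool that comes with the driver can be queried. This is a sketch under the assumption that nvidia-smi is reachable via the PATH, which may not be the case on a default Windows installation; driver_version is a made-up helper name.

```python
import shutil
import subprocess


def driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return None
    result = subprocess.run(
        [smi, "--query-gpu=driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # One line is printed per GPU; the first one is enough here.
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print("Driver version:", driver_version() or "unknown (nvidia-smi not available)")
```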

The TensorFlow library wasn’t compiled to use XXX instructions, but these are available on your machine and could speed up CPU computations

I chose to ignore this by following the suggestion on GitHub and simply deactivating the warnings:

import os
# Must be set before TensorFlow is imported; '2' hides INFO and WARNING messages.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf

Resources