Installing TensorFlow on Pi Supercomputer

Building GPU-TensorFlow on CentOS 6 Platform

Log in to one of the CentOS 6 login nodes at 202.120.58.229 or 202.120.58.230, then load the necessary modules.

$ module purge
$ module load gcc/5.4 python/3.5 bazel cuda/8.0 cudnn/5.1
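
Before going further it can help to confirm that the loaded modules actually put the build tools on your PATH. The following is a minimal sketch; the helper name check_tools is hypothetical, and the tool names (gcc, python3, bazel, nvcc) are assumptions about what these modules provide.

```shell
# Report whether each named tool is visible on PATH; returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tool names are assumed, not taken from the module definitions.
check_tools gcc python3 bazel nvcc || echo "Some tools are missing; re-check the module list."
```

If anything is reported missing, `module list` shows what is currently loaded.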

Set up a Python virtual environment for TensorFlow.

$ wget --no-check-certificate "https://github.com/pypa/virtualenv/archive/15.0.2.zip" -O virtualenv.zip
$ unzip virtualenv.zip
$ cd virtualenv-*
$ rm -rf ~/.pip ~/python35-gcc49
$ python3 virtualenv.py ~/python35-gcc49
$ source ~/python35-gcc49/bin/activate
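
The later pip and configure steps assume the virtual environment is active in the current shell. Activation sets the VIRTUAL_ENV variable, so a quick check could look like this sketch (the helper name in_virtualenv is hypothetical):

```shell
# Succeeds only when a virtualenv has been activated in this shell.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "active virtualenv: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: source ~/python35-gcc49/bin/activate"
fi
```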

Set up the HTTP proxy:

$ export http_proxy=http://proxy.pi.sjtu.edu.cn:3004/; export https_proxy=http://proxy.pi.sjtu.edu.cn:3004/
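
If you switch the proxy on and off often, the two exports can be wrapped in small shell functions. This is only a convenience sketch; the function names proxy_on and proxy_off are hypothetical, and the URL is the one given above.

```shell
# Export both proxy variables for the given URL.
proxy_on() {
    export http_proxy="$1"
    export https_proxy="$1"
}

# Remove the proxy settings again.
proxy_off() {
    unset http_proxy https_proxy
}

proxy_on "http://proxy.pi.sjtu.edu.cn:3004/"
echo "http_proxy=$http_proxy"
```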

Install numpy dependency:

$ pip3 install numpy

Download TensorFlow v1.0.1 to your home directory.

$ cd ~
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ git checkout v1.0.1

Configure TensorFlow for a GPU build.

$ ./configure
The prompts and the answers to enter are shown below; (Default) means press Enter to accept the default.
Please specify the location of python. [Default is $HOME/python35-gcc49/bin/python]: (Default)
Please specify optimization flags to use during compilation [Default is -march=native]: (Default)
Do you wish to use jemalloc as the malloc implementation? (Linux only) [Y/n] Y
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] N
No XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
  $HOME/python35-gcc49/lib/python3.5/site-packages
Please input the desired Python library path to use.  Default is [$HOME/python35-gcc49/lib/python3.5/site-packages] (Default)
Using python library path: /lustre/home/acct-hpc/hpcstephen/python35-gcc49/lib/python3.5/site-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] N
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /lustre/usr/gcc/5.4.0/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /lustre/usr/cuda/8.0
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 6
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /lustre/usr/cuda/8.0]: /lustre/usr/cudnn/6.0-cuda8.0
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.5,6.0
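
To avoid retyping these answers on every rebuild, the configure script of this TensorFlow generation can, as far as I know, pick up many of them from environment variables. The sketch below mirrors the transcript above. Treat the variable names as assumptions to verify against the configure script in your checkout; any prompt that is not pre-answered will still be asked interactively.

```shell
# Answers taken from the transcript above; the variable names are assumed to be
# the ones read by the configure script of this TensorFlow version.
export TF_NEED_JEMALLOC=1
export TF_NEED_GCP=0
export TF_NEED_HDFS=0
export TF_ENABLE_XLA=0
export TF_NEED_OPENCL=0
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/lustre/usr/cuda/8.0
export TF_CUDNN_VERSION=6
export CUDNN_INSTALL_PATH=/lustre/usr/cudnn/6.0-cuda8.0
export TF_CUDA_COMPUTE_CAPABILITIES=3.5,6.0

# Then run the script as before; unanswered prompts remain interactive.
# ./configure
```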

Build the TensorFlow pip package and install it into your own Python environment.

$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/`whoami`/tensorflow_pkg
$ pip install /tmp/`whoami`/tensorflow_pkg/tensorflow-1.0.1-cp35-cp35m-linux_x86_64.whl
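
The wheel file name depends on the TensorFlow and Python versions, so the hard-coded name above will not match if you build a different tag. One way around this is to look the file up by pattern. This is a sketch; the helper name find_wheel is hypothetical, and it only prints the path instead of installing.

```shell
# Print the first tensorflow wheel found in directory $1; fail if there is none.
find_wheel() {
    for whl in "$1"/tensorflow-*.whl; do
        if [ -f "$whl" ]; then
            echo "$whl"
            return 0
        fi
    done
    return 1
}

pkg_dir="/tmp/$(whoami)/tensorflow_pkg"
if whl=$(find_wheel "$pkg_dir"); then
    echo "would install: $whl"   # replace this echo with: pip install "$whl"
else
    echo "no wheel found in $pkg_dir"
fi
```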

Installing Pre-built GPU-TensorFlow on CentOS 7 Platform

Log in to the CentOS 7 node via the SSH server 202.120.58.231, then install pre-built TensorFlow with pip inside an Anaconda environment.

$ module load anaconda/3 cuda/8.0 cudnn/5.1
$ conda create --name deeplearning numpy
$ source activate deeplearning
$ pip install tensorflow-gpu==1.1.0 keras numpy

For GPU-TensorFlow jobs, the following lines are required in the job script:

source /usr/share/Modules/init/bash
unset MODULEPATH
module use /lustre/usr/modulefiles/pi
module purge
module load anaconda/3 cuda/8.0 cudnn/5.1
source activate deeplearning

SLURM partitions centos7gpu, centos7k40, centos7k80 and centos7p100 are ready for these jobs.
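
Putting the pieces together, a complete job script might look like the sketch below, which writes the script to a file named tf_job.slurm. The required environment lines and the partition name come from the text above. Everything else is an assumption to adapt: the job name, the `--gres=gpu:1` request, the output file pattern, and train.py, which is a hypothetical placeholder for your own program.

```shell
# Write a sample job script; the heredoc is quoted so nothing is expanded here.
cat > tf_job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=tf-test
#SBATCH --partition=centos7gpu
#SBATCH --gres=gpu:1
#SBATCH --ntasks=1
#SBATCH --output=%j.out

# Environment setup required for GPU-TensorFlow jobs.
source /usr/share/Modules/init/bash
unset MODULEPATH
module use /lustre/usr/modulefiles/pi
module purge
module load anaconda/3 cuda/8.0 cudnn/5.1
source activate deeplearning

# train.py is a hypothetical placeholder for your own program.
python train.py
EOF

echo "wrote tf_job.slurm; submit it with: sbatch tf_job.slurm"
```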

To use the tensorflow module in IPython or Jupyter, you first need to add the environment's site-packages directory to the library path. Replace HOME and ENVNAME with your home directory and Anaconda environment name, respectively.

$ ipython
In [1]: import sys, pprint
In [2]: pprint.pprint(sys.path)
['',
...libraries from pip are missing...
]
In [3]: sys.path.append('/HOME/.conda/envs/ENVNAME/lib/python3.6/site-packages')
In [4]: import tensorflow
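
As an alternative to typing the path in every session, the same directory can be added through the PYTHONPATH environment variable before starting IPython or Jupyter. This is a sketch with a similar effect, not an exact equivalent: PYTHONPATH entries come earlier in sys.path than an appended entry. The helper name env_site_packages is hypothetical, and the python3.6 default is taken from the session above.

```shell
# Print the site-packages directory of a conda environment under ~/.conda/envs.
# $1 = environment name, $2 = Python version (defaults to 3.6).
env_site_packages() {
    echo "$HOME/.conda/envs/$1/lib/python${2:-3.6}/site-packages"
}

sp=$(env_site_packages deeplearning)
if [ -d "$sp" ]; then
    # Prepend to any existing PYTHONPATH without leaving a trailing colon.
    export PYTHONPATH="$sp${PYTHONPATH:+:$PYTHONPATH}"
    echo "PYTHONPATH=$PYTHONPATH"
else
    echo "not found: $sp"
fi
```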
