Running TensorFlow

Running TensorFlow on CPU

Create a conda environment with Miniconda, then install the pre-built TensorFlow package into it with pip.

$ module load miniconda3/4.5
$ conda config --add channels https://mirrors.sjtug.sjtu.edu.cn/anaconda/pkgs/free/
$ conda config --add channels https://mirrors.sjtug.sjtu.edu.cn/anaconda/pkgs/main/
$ conda config --set show_channel_urls yes
$ conda create --name tf-py3-cpu pip python=3.6
$ source activate tf-py3-cpu
$ pip install tensorflow

Verify the installation by importing the TensorFlow module and printing its version. DO NOT run TensorFlow jobs on login nodes.

$ python -c 'import tensorflow as tf; print(tf.__version__)'
1.8.0
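
The one-line check above ends in a traceback when TensorFlow is missing from the active environment. As a minimal sketch, the same check could be written as a short Python script that reports a missing package instead. The function name tensorflow_version is hypothetical and is not part of TensorFlow.

```python
import importlib.util


def tensorflow_version():
    """Return the installed TensorFlow version string, or None if it is missing."""
    # find_spec only looks the package up; it does not import it.
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf
    return tf.__version__


if __name__ == "__main__":
    version = tensorflow_version()
    if version is None:
        print("TensorFlow is not installed in this environment")
    else:
        print("TensorFlow", version)
```

Run it with python inside the activated tf-py3-cpu environment, in the same way as the one-line check.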

Log out and log in again, then request an interactive SLURM job to run TensorFlow.

$ module purge
$ srun -p k80 -N 1 --exclusive --pty /bin/bash
hostname
module load miniconda3/4.5
source activate tf-py3-cpu
python -c 'import tensorflow as tf; print(tf.__version__)'
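
Inside a job, a script may want to know how many CPU cores it was given. The sketch below reads the SLURM_CPUS_ON_NODE environment variable, which SLURM normally sets inside a job, and falls back to the local CPU count when it is absent. The function name threads_for_job is hypothetical, and whether this value is the right thread count for a given workload is an assumption to verify.

```python
import os


def threads_for_job(env=None):
    """Pick a thread count from SLURM_CPUS_ON_NODE, else from the local CPU count."""
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_ON_NODE", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    # Not inside a SLURM job (or the variable is unusable): use the local count.
    return os.cpu_count() or 1


if __name__ == "__main__":
    n = threads_for_job()
    print("threads:", n)
    # With TensorFlow 1.x this value could be passed to a session, e.g.
    #   config = tf.ConfigProto(intra_op_parallelism_threads=n)
    #   sess = tf.Session(config=config)
```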

Prepare a job script named tfcpu.slurm with the following contents.

#!/bin/bash

#SBATCH -J tensorflow-cpu
#SBATCH -p k80
#SBATCH --mail-type=end
#SBATCH --mail-user=YOU@EMAIL.COM
#SBATCH -o %j.out
#SBATCH -e %j.err
#SBATCH -n 1
#SBATCH --exclusive

source /usr/share/Modules/init/bash
module purge
module load miniconda3/4.5

source activate tf-py3-cpu
python -c 'import tensorflow as tf; print(tf.__version__)'

Then submit this job to SLURM. Please refer to https://pi.sjtu.edu.cn/doc/slurm for SLURM usage.

$ sbatch tfcpu.slurm
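
The job script above writes its output to %j.out and %j.err, where %j is the job ID. As a rough sketch, assuming sbatch prints its usual "Submitted batch job <id>" message, the log file names can be derived from that message. The helper names parse_job_id and log_files are hypothetical.

```python
import re


def parse_job_id(sbatch_output):
    """Extract the job ID from sbatch's message, or return None if it is absent."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return int(match.group(1)) if match else None


def log_files(job_id):
    """Return the stdout and stderr file names produced by '-o %j.out -e %j.err'."""
    return "{}.out".format(job_id), "{}.err".format(job_id)


if __name__ == "__main__":
    # The job ID 12345 is a made-up example value.
    job_id = parse_job_id("Submitted batch job 12345\n")
    print(log_files(job_id))
```

The files appear in the directory from which sbatch was run, unless the script sets a different working directory.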

Running TensorFlow on GPU

Create a conda environment with Miniconda, then install the pre-built GPU build of TensorFlow into it with pip.

$ module load miniconda3/4.5
$ conda config --add channels https://mirrors.sjtug.sjtu.edu.cn/anaconda/pkgs/free/
$ conda config --add channels https://mirrors.sjtug.sjtu.edu.cn/anaconda/pkgs/main/
$ conda config --set show_channel_urls yes
$ conda create --name tf-py3-gpu pip python=3.6
$ source activate tf-py3-gpu
$ pip install tensorflow-gpu

DO NOT run GPU TensorFlow jobs on login nodes, since there are no GPU cards there. Log out and log in again, then request an interactive SLURM job to run TensorFlow.

$ module purge
$ srun -p k80 -N 1 --exclusive --gres=gpu:2 --pty /bin/bash
hostname
module load miniconda3/4.5 cuda/9.0 cudnn/7.0
source activate tf-py3-gpu
python -c 'import tensorflow as tf; print(tf.__version__)'
1.8.0
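
Printing the TensorFlow version does not show whether the job actually received GPUs. A minimal sketch of one way to check, assuming SLURM exports CUDA_VISIBLE_DEVICES as a comma-separated list of numeric device indices for jobs that request --gres=gpu (this depends on the cluster configuration). The function name visible_gpus is hypothetical.

```python
import os


def visible_gpus(env=None):
    """Return the numeric GPU indices listed in CUDA_VISIBLE_DEVICES."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    # Keep numeric entries only; placeholder values and empty fields are ignored.
    return [d.strip() for d in raw.split(",") if d.strip().isdigit()]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to this job:", len(gpus), gpus)
```

With --gres=gpu:2 as requested above, the list would be expected to hold two entries. An empty list suggests that the job was started without a GPU request or that the variable is not set on this cluster.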

Prepare a job script named tfgpu.slurm with the following contents.

#!/bin/bash

#SBATCH -J tensorflow-gpu
#SBATCH -p k80
#SBATCH --mail-type=end
#SBATCH --mail-user=YOU@EMAIL.COM
#SBATCH -o %j.out
#SBATCH -e %j.err
#SBATCH -N 1
#SBATCH --exclusive
#SBATCH --gres=gpu:2

source /usr/share/Modules/init/bash
module purge
module load miniconda3/4.5 cuda/9.0 cudnn/7.0

source activate tf-py3-gpu
python -c 'import tensorflow as tf; print(tf.__version__)'

Then submit this job to SLURM. Please refer to https://pi.sjtu.edu.cn/doc/slurm for SLURM usage.

$ sbatch tfgpu.slurm