Using a GPU
In this tutorial we're going to: - Start an instance - Install NVIDIA drivers - Run a Tensorflow becnhmark
Start an Instance
We are going to create an Ubuntu 20.04 based Virtual Machine instance. You can start an instance either using the OpenStack CLI, or using the Horizon web interface.
On the CLI:
nova boot --flavor eg1.a30x1.V8-32 --key-name my_key --nic net-name=external my-gpu-instance --block-device source=image,dest=volume,size=30,id=0e98efbf-de1e-4e58-bf77-1ca80668e305,shutdown=preserve,bootindex=0
Replace my_key
with the name of a key pair from your account.
This will start a new Ubuntu 20.04 VM, with 8 CPU cores, 32GB of RAM and a single NVIDIA A30 Tensor Core GPU. We have flavors with up to 8 of these GPUs.
You can obtain your IP address by running:
openstack server show my-gpu-instance
We SSH into our VM:
ssh ubuntu@<ip_address>
Replace <ip_address>
with the ip address of your VM.
Install drivers
The next step is to install the NVIDIA drivers. To make sure we install the correct version of the drivers, we will check the current Tensorflow CUDA and cuDNN compatibility: https://www.tensorflow.org/install/source#gpu We will use CUDA 11.2 and cuDNN 8.1
Then we will install the GPU drivers and CUDA on the VM:
sudo apt update
sudo apt install gcc make
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
sudo sh cuda_11.2.0_460.27.04_linux.run
Only install the cuda toolkit
Install the nvidia driver:
sudo apt install nvidia-driver-460-server
Add the libraries to environment variables.
At the bottom of the ~/.bashrc
file add
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2/lib64
export PATH=$PATH:/usr/local/cuda-11.2/bin
nano ~/.bashrc
source ~/.bashrc
We will download cuDNN 8.1 runtime library from https://developer.nvidia.com/rdp/cudnn-archive Then we SCP it to our server:
scp ./libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb ubuntu@45.135.56.94:libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
There we will install it:
sudo apt install ./libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
and reboot the VM
sudo reboot
Check that the drivers are correctly installed and active:
nvidia-smi
Run a Tensorflow benchmark
First we'll create a virtual env
sudo apt install python3-venv
python3 -m venv venv
source venv/bin/activate
Then we'll install the tensorflow package.
pip install wheel
pip install tensorflow
clone the becnhmarks repo:
git clone https://github.com/tensorflow/benchmarks.git
cd benchmarks/scripts/tf_cnn_benchmarks
run the benchmark:
python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet50 --variable_update=parameter_server