Installing Tensorflow GPU on Fedora Linux
Following on from my previous notes on building Tensorflow for a GPU on Fedora, I find myself back at it again. I recently upgraded my GPU at home, and time has moved on too, so this is my current set of notes on what I'm doing with Tensorflow on Fedora. This method differs from my previous notes inasmuch as I'm using the pre-built Tensorflow rather than building my own. I've found that Tensorflow is so brittle during the build process that it's much easier to work with pre-built binaries and set up my system to match their build.
In my previous blog post I benchmarked the CPU versus the GPU using the Keras MNIST CNN example, so I thought it would be interesting to offer the same for this new install on my home machine. The results are:
- 12 minutes and 14 seconds on my CPU
- 1 minute and 14 seconds on my GPU
That's just over 9.9 times as fast on my GPU as on my CPU!
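For anyone wanting to check the arithmetic, the ratio comes straight from the two wall-clock times. A minimal sketch (the helper name is mine and is not part of the Keras example):

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds wall-clock reading into seconds."""
    return minutes * 60 + seconds

cpu_time = to_seconds(12, 14)  # 734 seconds on the CPU
gpu_time = to_seconds(1, 14)   # 74 seconds on the GPU

# Ratio of CPU time to GPU time, i.e. how many times faster the GPU run was.
speedup = cpu_time / gpu_time
print(f"GPU speed-up: {speedup:.2f}x")  # prints "GPU speed-up: 9.92x"
```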
Some info on my machine and config:
- Custom Built Home PC
- Intel Core i5-3570K CPU @ 3.40GHz (4 cores)
- 16GB RAM
- NVidia GeForce GTX 1660 (CUDA Compute Capability 7.5)
- Fedora 30 Workstation running kernel 5.2.9-200.fc30.x86_64
Previously, I've always used the Negativo17 repository for all my NVidia driver and CUDA needs. However, the software versions available there are too new to allow Tensorflow GPU to be installed in a way that works: the repository provides CUDA 10.1, whereas Tensorflow, currently at version 1.14, only supports CUDA 10.0. So we must use another source for the NVidia software, one that provides back-level versions. Fortunately, there is an official NVidia repository providing drivers and CUDA for Linux, so let's use that, since it also works quite nicely with the RPM Fusion repositories. Hence, this method relies purely on RPM Fusion and the official NVidia repository and does not require or use the Negativo17 repository (although it would be possible to do so).
Install Required NVidia Driver
The RPM Fusion NVidia instructions can be used here for more detail, but in brief simply install the display drivers:
- dnf install xorg-x11-drv-nvidia akmod-nvidia xorg-x11-drv-nvidia-cuda
- dnf install vdpauinfo libva-vdpau-driver libva-utils nvidia-modprobe
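Once the driver is installed and the kernel module is loaded, one way to confirm that the card is visible is to ask nvidia-smi for the GPU name and driver version. The sketch below is only an illustration: the function name is mine, and it assumes nvidia-smi ended up on the PATH (I believe it ships with xorg-x11-drv-nvidia-cuda).

```python
import shutil
import subprocess

def nvidia_driver_info():
    """Return 'GPU name, driver version' text from nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver utilities not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # e.g. the kernel module is not loaded yet
    return result.stdout.strip()

info = nvidia_driver_info()
print(info if info else "nvidia-smi did not report a GPU")
```

If this prints nothing useful straight after installation, it may simply be that the akmod has not finished building the module, or that a reboot is still needed.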
Install Required NVidia CUDA and Machine Learning Libraries
This step relies on the official NVidia repositories; a little more information is available in the RPM Fusion CUDA instructions.
First of all, add a new yum configuration file. Copy the following to /etc/yum.repos.d/nvidia.repo:
[nvidia-cuda]
name=nvidia-cuda
baseurl=http://developer.download.nvidia.com/compute/cuda/repos/fedora27/x86_64/
enabled=1
gpgcheck=1
gpgkey=http://developer.download.nvidia.com/compute/cuda/repos/fedora27/x86_64/7fa2af80.pub
exclude=akmod-nvidia*,kmod-nvidia*,*nvidia*,nvidia-*,cuda-nvidia-kmod-common,dkms-nvidia,nvidia-libXNVCtrl
[nvidia-machine-learning]
name=nvidia-machine-learning
baseurl=http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/
enabled=1
gpgcheck=1
gpgkey=http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/7fa2af80.pub
exclude=libcudnn7*.cuda10.1,libnccl*.cuda10.1
Note that the configuration above deliberately targets the fedora27 repository from NVidia. This is because it is the location at which we can find CUDA 10.0 compatible libraries rather than CUDA 10.1 libraries that will be found in later repositories. So the configuration above is likely to need to change over time but essentially the message here is that we can match the version of CUDA required by targeting the appropriate repository from NVidia. These libraries will be binary compatible with future versions of Fedora so this action should be safe to do for some time yet.
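Before running dnf, it can be worth sanity-checking that each stanza in the repo file has the keys it needs. Here is a rough sketch using Python's configparser. The function name and the list of "required" keys are my own choices rather than anything dnf defines, and the second stanza in the sample is a made-up, deliberately incomplete one included purely to show what a failure looks like.

```python
import configparser

def missing_repo_keys(repo_text, required=("name", "baseurl", "gpgkey")):
    """Map each repo section to the required keys it lacks (empty dict if none)."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(repo_text)
    problems = {}
    for section in parser.sections():
        missing = [key for key in required if key not in parser[section]]
        if missing:
            problems[section] = missing
    return problems

# First stanza copied from the repo file; second stanza is hypothetical.
sample = """
[nvidia-machine-learning]
name=nvidia-machine-learning
baseurl=http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/
enabled=1
gpgcheck=1
gpgkey=http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/7fa2af80.pub

[example-incomplete]
name=example-incomplete
enabled=1
"""

print(missing_repo_keys(sample))
# prints {'example-incomplete': ['baseurl', 'gpgkey']}
```

To check the real file, pass in the contents of /etc/yum.repos.d/nvidia.repo instead of the sample string; an empty result means every stanza has all three keys.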
With the above configuration in place, we can now install CUDA 10.0 and the machine learning libraries required for Tensorflow GPU support, and all of the libraries get installed in the places that Tensorflow expects.
To install, run:
- dnf install cuda libcudnn7 libnccl
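To see whether the dynamic loader can actually find the libraries after the install, a small check with ctypes can help. This is a sketch under assumptions: the list of sonames is my understanding of what a CUDA 10.0 / cuDNN 7 build looks for, not something taken from the Tensorflow documentation, and the function name is mine.

```python
import ctypes

# Sonames I would expect a CUDA 10.0 / cuDNN 7 build to load (assumption).
EXPECTED_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcudnn.so.7",
]

def check_libraries(names):
    """Try to load each shared library by name; return {name: True/False}."""
    status = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            status[name] = True
        except OSError:
            status[name] = False  # not found on the loader's search path
    return status

for lib, ok in check_libraries(EXPECTED_LIBS).items():
    print(f"{lib}: {'found' if ok else 'NOT found'}")
```

A "NOT found" here does not necessarily mean the package is missing; it may only mean the library directory is not on the loader's search path.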
Install Tensorflow GPU
The final piece of the puzzle is to install Tensorflow GPU which is now as easy as:
- pip3 install tensorflow-gpu
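To confirm that the installed Tensorflow can actually see the GPU, something along these lines should do. It is a minimal sketch: the wrapper function is my own, and it relies on tf.test.is_gpu_available(), which exists in the 1.x releases discussed here.

```python
def tensorflow_gpu_status():
    """Return (version, gpu_available), or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this Python environment
    return tf.__version__, tf.test.is_gpu_available()

status = tensorflow_gpu_status()
if status is None:
    print("TensorFlow could not be imported")
else:
    version, has_gpu = status
    print(f"TensorFlow {version}, GPU available: {has_gpu}")
```

If the GPU is reported as unavailable, the log output Tensorflow prints while probing the device is usually the quickest pointer to which library or driver piece it could not load.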