This article is a detailed, step-by-step guide to setting up a deep learning workstation with Ubuntu 20.04. It documents the process I followed on my own computer. I have repeated this process a number of times, and every time I thought I should have written it down. Proper documentation makes the next setup quick and error-free.
I have noted the most common mistakes and errors that occur during the process and how to avoid or troubleshoot them. Bookmark this page so you can refer to it quickly whenever you get stuck on a step.
I have done this complete setup a few times on both my old and new laptops, which have completely different configurations, so I hope the problems I faced are the most common ones. It took me considerable time to fix all those issues, mainly by visiting discussion groups like StackOverflow, the Ubuntu discussion forum, and many other threads and blogs.
I have compiled the solutions in one place here so that you don’t have to visit multiple sites and can complete the whole installation by referring to this post alone. This should save you a lot of valuable time.
Prerequisites to set up deep learning workstation
I assume that you already have Ubuntu on your computer. If not, please install the latest version of Ubuntu. It is one of the most popular open-source Linux distributions and is available for free download here. Although it is possible to run Keras deep learning models on Windows, it is not recommended.
Why should you use Ubuntu for deep learning? Refer to this article
Another prerequisite for running deep learning models is a good-quality GPU. I advise you to have an NVIDIA GPU in your computer for satisfactory performance. It is not strictly mandatory, but it is close to a necessity, because running sequence processing with recurrent neural networks and image processing with convolutional neural networks on a CPU is a difficult proposition.
Such models may take hours to give results when run on a CPU, whereas a modern NVIDIA GPU may complete them in merely 5-10 minutes. If you do not want to invest in a GPU, an alternative is to use a cloud computing service and pay an hourly rent.
However, in the long run, using such a service may cost you more than upgrading your local system. So my suggestion is that if you are serious about deep learning and wish to continue with even moderate use, go for a good workstation setup.
The main steps to set up a deep learning workstation
I now assume that you have completed all the prerequisites for setting up your deep learning experiments. The setup is a somewhat time-consuming process, and you will need a stable internet connection to download various files. Depending on the internet speed, the complete process may take 2-3 hours (with an internet speed of 1 Gbps, it took 2 hours in my case). The main steps to set up a deep learning workstation are as follows:
- Updating the Linux system packages
- Installation of the Python pip command, the basic command used to install the other components
- Installation of the Basic Linear Algebra Subprograms (BLAS) library required for mathematical operations
- Installation of HDF5 to store hierarchical data
- Installation of Graphviz to visualize Keras models
- Installation of CUDA and cuDNN for NVIDIA GPU support
- Installation of TensorFlow as the backend of Keras
- Keras installation
- Installation of Theano (optional)
So, we will now proceed with the step-by-step installation process.
Updating the Linux system packages
The following commands upgrade the Linux system packages. Type them in the Ubuntu terminal; the keyboard shortcut to open the terminal is “Ctrl+Alt+T”. Open the terminal and execute the following lines.
$ sudo apt-get update
$ sudo apt-get --assume-yes upgrade
Installing the Python-pip command
The pip command installs and manages Python packages. Whichever packages we install next, we will use this pip command. It is a replacement for the earlier easy_install command. Run the following command to install python-pip.
$ sudo apt-get install python-pip python-dev
This should install pip on your computer, but sometimes there are exceptions, as happened to me. My Ubuntu terminal reported “Unable to locate package python-pip”.
This was a big problem, as I was clueless about why it was happening. On my old computer, I had used the command a number of times without any issue. After scouring the internet for several hours, I found the solution: it has to do with the Python version installed on your computer.
If you face the same problem (most likely on a new computer), first check the Python version with this command.
$ ls /bin/python*
If it lists only Python 2 (for example python2.7), you need a pip built for Python 2. If it lists a higher version such as python3.8, use the python3-pip package to install pip. In that case the command is as below.
$ sudo apt-get install python3-pip
In Ubuntu package names, a plain python- prefix means Python 2; packages for Python 3 must be named explicitly with the python3- prefix. So, to install pip together with the Python 3 development headers, use the following command.
# Installing pip and development headers for Python 3
$ sudo apt-get install python3-pip python3-dev
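As a quick cross-check of the step above, the short Python sketch below lists which interpreter and pip commands are actually on your PATH. It uses only the standard library; the helper name find_python_tools is my own and purely illustrative.

```python
import shutil
import sys


def find_python_tools(names=("python", "python2", "python3", "pip", "pip3")):
    """Return a dict mapping each command name to its path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Version of the interpreter that is running this script
    print("Running interpreter:", sys.version.split()[0])
    for name, path in find_python_tools().items():
        print(f"{name:8s} -> {path or 'not found'}")
```

If pip3 shows up as "not found" here, the python3-pip installation did not succeed and is worth repeating before moving on.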
Installation steps for Python scientific suite in Ubuntu
The process discussed here is for the Windows and Linux operating systems. Mac users need to install the Python scientific suite via Anaconda, which they can get from the Anaconda repository. Its documentation is continuously updated and describes every step in detail.
Installation of the BLAS library
Installing the Basic Linear Algebra Subprograms (BLAS) library is the first step in setting up your deep learning workstation. Mac users should keep in mind that this installation does not include Graphviz and HDF5, so they have to install them separately.
Here we will install OpenBLAS using the following command.
$ sudo apt-get install build-essential cmake git unzip \
pkg-config libopenblas-dev liblapack-dev
Installation of Python basic libraries
In the next step, we install the basic Python libraries such as NumPy, pandas, Matplotlib, and SciPy. These are the core Python libraries required for any kind of mathematical operation. So, be it machine learning, deep learning, or any other computation-intensive task, we will need these libraries.
Use the following command in the Ubuntu terminal to install the whole scientific suite at once.
# Installation of Python basic libraries (python3- packages, matching the Python 3 pip used in this guide)
$ sudo apt-get install python3-pandas python3-numpy python3-scipy python3-matplotlib python3-yaml
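To confirm that the scientific libraries are visible to Python 3, a small check like the one below can help. It is only a sketch using the standard library; the function name missing_modules is my own, and the module list simply mirrors the libraries named in this section (note that the YAML package is imported as yaml).

```python
import importlib.util


def missing_modules(modules):
    """Return the subset of top-level module names that Python cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    wanted = ["numpy", "scipy", "pandas", "matplotlib", "yaml"]
    missing = missing_modules(wanted)
    if missing:
        print("Not installed for this interpreter:", ", ".join(missing))
    else:
        print("All scientific libraries were found.")
```

Run it with python3 so that it inspects the same interpreter the rest of this guide uses.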
Installation of HDF5
The Hierarchical Data Format (HDF) version 5 is an open-source file format that supports large, complex, and heterogeneous data. It was originally developed at the National Center for Supercomputing Applications (NCSA) to store large numeric data files in an efficient binary format, and it is widely used by organizations such as NASA. It is the successor to the earlier HDF4 format, and the NetCDF-4 format is itself built on top of HDF5.
The HDF5 format lets developers organize their machine learning/deep learning data in a file-directory structure very similar to what we use on any computer. This structure can be used to maintain the hierarchy of the data.
In terms of a computer's filing system, a “directory” or “folder” corresponds to a “group” in HDF5, and a “file” corresponds to a “dataset”. HDF5 matters in deep learning because Keras uses it to save models to disk and load them back.
Run the following command to install HDF5 on your machine.
# Install HDF5 data format to save the Keras models
$ sudo apt-get install libhdf5-serial-dev python3-h5py
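The folder/file analogy for HDF5 is easier to see in code. The sketch below assumes h5py and NumPy are installed; the group name experiment_1, the dataset name weights, and the helper hdf5_demo are hypothetical names chosen only for illustration.

```python
import os
import tempfile


def hdf5_demo(path):
    """Write one group containing one dataset, then return every name stored in the file."""
    import h5py  # assumed to be installed (h5py package)
    import numpy as np

    with h5py.File(path, "w") as f:
        group = f.create_group("experiment_1")  # a group behaves like a folder
        group.create_dataset("weights", data=np.arange(6).reshape(2, 3))  # a dataset behaves like a file

    names = []
    with h5py.File(path, "r") as f:
        f.visit(names.append)  # visits every group and dataset in the file
    return names


if __name__ == "__main__":
    try:
        demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
        for name in hdf5_demo(demo_path):
            print(name)
    except ImportError:
        print("h5py or NumPy is not installed for this interpreter.")
```

The printed names read like file paths, which is exactly the hierarchy Keras relies on when it stores a model in an HDF5 file.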
Installation of modules to visualize Keras model
In the next step, we install two packages, Graphviz and pydot-ng, which are needed to visualize Keras models. The commands for installing them are as follows:
# Install graphviz
$ sudo apt-get install graphviz
# Install pydot-ng
$ sudo pip3 install pydot-ng
These two packages will help you inspect the deep learning models you create. For the time being, though, you can skip their installation and proceed to the GPU configuration part; Keras can function without them.
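To get a feel for what Graphviz does, the sketch below builds a tiny graph description in Graphviz's DOT language and, if the dot program is installed, renders it to a PNG. This is not how Keras draws models internally (Keras goes through pydot); the function names and the layer names are my own illustrative choices.

```python
import shutil
import subprocess


def layers_to_dot(layer_names):
    """Build a DOT description that chains the given layer names top to bottom."""
    lines = ["digraph model {", "  rankdir=TB;", "  node [shape=box];"]
    for name in layer_names:
        lines.append(f'  "{name}";')
    for src, dst in zip(layer_names, layer_names[1:]):
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def render(dot_source, out_path="model.png"):
    """Render DOT text to a PNG with the dot binary; return False if Graphviz is absent."""
    dot = shutil.which("dot")
    if dot is None:
        return False
    subprocess.run([dot, "-Tpng", "-o", out_path], input=dot_source.encode(), check=True)
    return True


if __name__ == "__main__":
    source = layers_to_dot(["input", "dense_1", "dense_2", "output"])
    print(source)
    print("Rendered to model.png" if render(source) else "Graphviz 'dot' not found on PATH")
```

If the script reports that dot was not found, the Graphviz installation step needs to be repeated before Keras can plot models.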
Installation of OpenCV package
Use the following command to install the OpenCV package.
# Install opencv
$ sudo apt-get install python3-opencv
Setting up GPU for deep learning
Here comes the most important part. As you know, the GPU plays an important role in deep learning modelling. In this section, we set up GPU support by installing two components, CUDA and cuDNN. To function properly, they need an NVIDIA GPU.
Although you can run your Keras models on the CPU, training will take much longer than on a GPU. So my advice is that if you are serious about deep learning modelling, plan to procure an NVIDIA GPU (using a cloud service for an hourly rent is also an alternative).
Let's concentrate on setting up the GPU, assuming that your computer already has a recent one.
CUDA installation
To install CUDA, visit the NVIDIA download page at https://developer.nvidia.com/cuda-downloads. The page asks you to select the operating system you are using. As we are using Ubuntu here (to know why Ubuntu is the preferred OS, read this article), start by selecting Linux.
It then asks for the other specifications of your workstation environment. Select them according to your system. Here I selected Linux as the OS. I am using a Dell Latitude 3400 laptop, which is a 64-bit computer, so for the architecture I selected x86_64; the Linux distribution is Ubuntu, version 20.04.
Finally, you have to select the installer type. I selected the network installer, mainly because it has a comparatively small download size; I was using mobile internet at the time, so it was the best option for me. You can choose one of the local installer options if internet bandwidth is not a constraint. The plus point of a local installer is that you have to download it only once.
Once all the specifications are selected, NVIDIA shows the installation commands. Copy them from the page and run them in the Ubuntu terminal. They use Ubuntu’s apt to install the packages, which is the easiest way to install CUDA. The commands I received were as follows; use the ones shown on the page for your own selection.
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
$ sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
$ sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
$ sudo apt-get update
$ sudo apt-get -y install cuda
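After the CUDA packages are installed (a reboot may be needed before the driver loads), you can check from Python whether the NVIDIA tools respond. The sketch below is a minimal check under two assumptions: that nvidia-smi is on your PATH, and that the CUDA compiler nvcc, if present, is either on PATH or under the usual /usr/local/cuda/bin location. The function names are my own.

```python
import os
import shutil
import subprocess


def query_gpu():
    """Return one 'name, driver version' line per GPU, or None if nvidia-smi is unavailable."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return None
    result = subprocess.run(
        [smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip().splitlines()


def find_nvcc():
    """Return the path of the CUDA compiler if found, otherwise None."""
    candidate = "/usr/local/cuda/bin/nvcc"  # assumed default install location
    return shutil.which("nvcc") or (candidate if os.path.exists(candidate) else None)


if __name__ == "__main__":
    gpus = query_gpu()
    print("GPUs:", gpus if gpus is not None else "nvidia-smi not available")
    print("nvcc:", find_nvcc() or "not found")
```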
Install cuDNN
“cuDNN is a powerful library for Machine Learning. It has been developed to help developers like yourself to accelerate the next generation of world changing applications.”
NVIDIA.com
To download the cuDNN file for your operating system and Linux distribution, visit the NVIDIA download page.
To download the library, you have to create an account with NVIDIA. This step is compulsory.
Fill in the necessary fields.
Once you finish registration, a window with some optional settings appears. You can skip them and proceed to the next step.
A short survey by NVIDIA is the next step. Although it asks about your experience as a developer, you can pick any of the options just to get to the download page.
A page with several download options now appears, and you have to choose according to your specifications. I selected the Debian file for my workstation.
Download the file (around 300 MB in my case). To install the library, first change into the download folder and then execute the install command.
Once you are in the directory where the library was downloaded (by default, the Downloads folder of your computer), run the command below, using the actual filename in place of ****** in the command.
$ sudo dpkg -i ******.deb
You can follow the installation process from this page. With this, the cuDNN installation is complete.
Installation of TensorFlow
The next step is the installation of TensorFlow, which is very simple. Just execute the command below to install it with pip. Since TensorFlow 2.1, the tensorflow pip package includes GPU support as well as CPU support, so no separate package is needed.
# Installing TensorFlow using pip3 command for Python3
$ sudo pip3 install tensorflow
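To see whether TensorFlow can actually use the GPU configured earlier, you can ask it to list the GPU devices it detects. The sketch below assumes TensorFlow 2.1 or later, where tf.config.list_physical_devices is available; the wrapper function tensorflow_gpus is my own.

```python
def tensorflow_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = tensorflow_gpus()
    if gpus is None:
        print("TensorFlow is not installed for this interpreter.")
    elif not gpus:
        print("TensorFlow is installed but sees no GPU; check the CUDA and cuDNN steps.")
    else:
        print("TensorFlow sees:", ", ".join(gpus))
```

An empty list usually points to a mismatch between the installed CUDA/cuDNN versions and the ones your TensorFlow release expects, so that is the first thing to compare.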
Installing Keras
This is the final step of setting up your deep learning workstation, after which you are good to go. You can simply run the command below.
$ sudo pip3 install keras
Alternatively, you can install it from GitHub. The benefit of installing Keras from GitHub is that you get lots of example code along with it. You can run those example scripts to test them on your machine; they are a very good source of learning.
$ git clone https://github.com/fchollet/keras
$ cd keras
$ sudo python3 setup.py install
Optional installation of Theano
Installing Theano is optional, as we have already installed TensorFlow. However, having Theano can prove advantageous when building Keras code and switching between the TensorFlow and Theano backends. Execute the command below to install Theano:
$ sudo pip3 install theano
Congratulations! You have finished all the installations and completed the setup of your deep learning workstation. You are now ready to run your first deep learning neural network code.
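As a first smoke test of the finished setup, the sketch below trains a one-neuron network to fit the line y = 2x + 1. It assumes TensorFlow 2.x with its bundled Keras API (tensorflow.keras) and NumPy; the data, the function name train_tiny_model, and the hyperparameters are arbitrary choices for illustration, not a recommended recipe.

```python
def train_tiny_model(epochs=5):
    """Fit a single Dense unit to y = 2x + 1; return the final loss, or None if TensorFlow is missing."""
    try:
        import numpy as np
        from tensorflow import keras
    except ImportError:
        return None

    # Synthetic training data: 200 points on the line y = 2x + 1
    x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1).astype("float32")
    y = 2.0 * x + 1.0

    model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")
    history = model.fit(x, y, epochs=epochs, batch_size=20, verbose=0)
    return float(history.history["loss"][-1])


if __name__ == "__main__":
    loss = train_tiny_model(epochs=20)
    if loss is None:
        print("TensorFlow or NumPy is not installed for this interpreter.")
    else:
        print("Final training loss:", loss)
```

If this runs without errors and the loss shrinks as you increase the number of epochs, the Python, TensorFlow, and Keras parts of the installation are working together.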
I hope this article proves helpful in setting up your deep learning workstation. It is indeed a lengthy article, but it covers the technical details you may need if you run into any difficulty during the process. A little knowledge about each component you install also helps you make further changes to the setup later.
Let me know how you find this article by commenting below. Please mention any information I have missed or any doubt you have regarding the process. I will try my best to provide the information.