Getting started with Python for Machine Learning: a beginner's guide

If you are reading this article, then you are without doubt a Machine Learning enthusiast. You have probably already gone through the theoretical basics and are getting impatient to try your hand at your first Machine Learning application. Python is widely regarded as the most popular programming language for machine learning, and if you want a career in data science, I would suggest that Python is the language to bet on.

Learn about the two main types of Machine Learning:
  • Supervised machine learning
  • Unsupervised machine learning

So, this article is for you. Here I will demonstrate how to set up Python and get started with your first simple program.

But first of all, the question is…

Why Python for machine learning?

Why have I chosen Python for Machine Learning? There are lots of tools available, and some of them are very popular too. For example, R is a well-regarded language that has been around for a long time.

People with a traditional statistical or mathematical background, in particular, have a strong inclination towards R. One of the reasons behind this popularity is that R came into existence as a successor to S, a purely statistical programming language that was hugely popular amongst statisticians.

Python Vs R

Development of R began in 1992, and the language has a specific edge for data analysis tasks. R code is typically written in a procedural style, breaking the total task down into a series of steps and procedures. Both R and Python are open source and freely available to use, and the online resources for both are huge.

R is mainly helpful for core statistical and data analytics purposes. The language was developed by statisticians, mainly with the needs of statisticians in mind. It has very powerful graphics packages such as ggplot2, ggvis and Shiny. If you want to create eye-catching plots from your data, R should be your best friend.

Python, on the other hand, came a little earlier: Guido van Rossum, a Dutch programmer, started developing it in 1989. Its growth was slow and steady until around 2010, but after that, with the start of the data explosion era, its popularity shot up quickly.

The main reason behind this rapid rise in popularity is its simplicity and versatility. Machine Learning and Artificial Intelligence rely on many complex algorithms to perform complex tasks. The beauty of Python is that it makes these tasks easier with its vast collection of simple-to-use libraries and functions.

Data science is just one of Python's capabilities. Being a general-purpose language, Python can be used for developing web applications, desktop software and mobile applications, and even for reading and modifying files or connecting to a database. This versatility has won the hearts of millions of people, irrespective of whether they are data scientists or computer science enthusiasts.
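
To give a feel for this versatility, here is a minimal sketch that reads and modifies a file and then talks to a database, using only modules that ship with Python. The file name note.txt, the table name words and the helper functions are made up for illustration; the database is a throwaway in-memory SQLite database, not a real server.

```python
import sqlite3
import tempfile
from pathlib import Path


def uppercase_file(path):
    """Read a text file, modify its contents, and write them back."""
    text = path.read_text(encoding="utf-8")
    path.write_text(text.upper(), encoding="utf-8")
    return path.read_text(encoding="utf-8")


def store_and_count(words):
    """Store words in an in-memory SQLite table and count the rows."""
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute("CREATE TABLE words (word TEXT)")
        connection.executemany(
            "INSERT INTO words (word) VALUES (?)", [(w,) for w in words]
        )
        (count,) = connection.execute("SELECT COUNT(*) FROM words").fetchone()
    finally:
        connection.close()
    return count


# Work inside a temporary folder so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as folder:
    note = Path(folder) / "note.txt"  # hypothetical file name
    note.write_text("python is versatile", encoding="utf-8")
    modified = uppercase_file(note)

row_count = store_and_count(modified.split())
print(modified)
print(row_count)
```

You do not need to understand every line yet. The point is only that file handling and database access need nothing beyond a standard Python installation.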

If you are a beginner in data science, you can jump-start learning and applying Python even with little or no background in programming languages. Python is also often considered a better performer than R when it comes to analyzing large data sets.

The following chart from Economist.com shows how popular Python has become recently, surpassing other big names like Java, R and C++.

Source: steelkiwi.com, economist.com

In the data science world, these two programming languages are close competitors. Both are very popular and have their own pluses and minuses, and ultimately which one you use is purely your choice.

Having said this, I think the popularity and simplicity of Python for machine learning will keep it slightly ahead of R. And if you are looking to build a career as a data scientist, in my opinion the future is brighter with Python skills.

Setting up Python on your computer

To start using Python, the first step is to install it on your computer. If your desktop or laptop is new, there is a chance that it has Python preinstalled. You can check your Start menu for it. If you find it there, skip this step.

Download Python

If it is not installed already, you have to download and install it from python.org.

From there, download the Python version that suits your computer. As of this writing, Python 3.8.2 is the latest version, so you can download that. If you have an old system running Windows XP, you have to download an older, compatible version, preferably lower than Python 3.5.

After you have downloaded the file, click it to start the installation. Just go with the recommended installation process. It is quick, and within minutes Python is installed on your system.

A window will appear to confirm that the Python installation is finished.

Now you can check your computer's Start menu; the Python folder with its associated applications will be there.

As I have installed it just now, it carries a “New” tag on all its applications. Now that Python is installed, you can launch it directly and start coding.

In the console window, you can see all the details of the installed Python version. I have also tried some basic commands like print and a simple calculation.

But to really get going with Python coding, we need a good IDE that helps us write Python syntax in an intuitive way.
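
If you want to try something similar yourself, the following is a small sketch of the kind of commands you could type into the console. The exact text and numbers are my own illustration and not necessarily what the original screenshot showed.

```python
import sys

# Show which Python version is running; the console prints similar
# details when it starts up.
print(sys.version)

# A basic print command.
print("Hello, Machine Learning!")

# Simple calculations, just as you would type them at the >>> prompt.
total = 7 + 5
average = (2 + 4 + 6) / 3
print(total)
print(average)

# True on any Python 3 installation, such as the one set up above.
is_python3 = sys.version_info >= (3, 0)
print(is_python3)
```

In the interactive console you can type each line on its own and see the result immediately, which makes it a handy place for quick experiments.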

Selecting a Python IDE

A simple IDE called IDLE gets installed automatically along with Python, but we prefer a more popular and advanced IDE called PyCharm. The reason is that getting familiar with an IDE for any programming language takes significant time, so we should choose a good IDE from the start and keep working in it.

PyCharm is currently one of the most popular IDEs for Python. See the following table, which compares some popular Python IDEs. PyCharm also comes in a paid version, but you get a full-featured integrated environment in both the free and the paid version.

[Table: comparison of popular Python IDEs]

Source: www.softwaretestinghelp.com

Apart from these IDEs, simple text editors like Notepad++ are also very popular amongst data scientists. The only issue with text editors is that you need additional plugins, or a separate console, to run the code written in them. This is where IDEs come in handy: you can do the complete task, from writing code to running it, in one place.

Having said this, the final selection is completely your choice. Python being such a popular programming language, users have the luxury of choosing from a vast collection of IDEs. And honestly speaking, it is difficult to judge any single IDE to be the best one. Every one of them has its own strengths and weaknesses.

So although I have selected PyCharm here, you can select any other. It will not take you much time to switch between IDEs once your basics are strong.

So, let’s start installing PyCharm

PyCharm is a product of JetBrains. Open the download page by following this link:

https://www.jetbrains.com/pycharm/download/#section=windows

Step 1: 

Click the Download button under Community; it is the free, open-source edition of PyCharm.

Step 2:

Launch the downloaded installer and click the Next button.

Step 3:

If you want to change the installation location, provide the path here; otherwise keep the default path. I am going with the default folder. Then click Next.

Step 4:

The next window lets you create a desktop icon for PyCharm and update the PATH variable. Click Next to proceed.

Step 5:

Here you can change the Start menu folder. Then click “Install”.

Step 6:

The installation will start and takes a few minutes. When it is done, click Next.

Step 7:

In the next window click “Finish” to complete the installation process.

Starting PyCharm for the first time

Now PyCharm is installed on your computer. Go to your computer's Start menu and launch the program. The first window that appears is the privacy policy. Click to agree with the terms & conditions and then click Continue.

Next is the data sharing window. It’s completely your choice. Choose any of the options and proceed.

In the next window, you get to choose the appearance of your IDE. Choose whichever you feel comfortable with and click “Skip Remaining and Set Defaults”. You can change all these options later, anytime you want.

The next window is important: it lets you choose the location where you want to create your Python project. I like to save all my important files in the cloud, so I have provided that particular path there. You can change it here or later.

So now you are all set to start your journey in Python programming with the PyCharm IDE.
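
As a first simple program to try in your new project, you could create a Python file and paste in something like the sketch below. The function name describe and the sample scores are hypothetical, invented purely for illustration; any small script would do just as well.

```python
from statistics import mean


def describe(scores):
    """Return the smallest, largest and average value of a list of numbers."""
    return min(scores), max(scores), mean(scores)


if __name__ == "__main__":
    # Hypothetical sample data, invented for illustration only.
    exam_scores = [62, 75, 88, 91]
    lowest, highest, average = describe(exam_scores)
    print("Lowest:", lowest)
    print("Highest:", highest)
    print("Average:", average)
```

Run the file from the IDE and the three values appear in the output panel at the bottom. Summarising a handful of numbers like this is a tiny first step towards the data handling that machine learning relies on.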

References:

  • https://www.python.org
  • https://towardsdatascience.com
  • https://www.geeksforgeeks.org
  • https://steelkiwi.com/blog