How to create your first machine learning project: a comprehensive guide

Your first machine learning project

This article will help you get started with your first machine learning project. Machine learning projects are very important if you are serious about a career as a data scientist. You need to build your profile with a number of such projects, because they are evidence of your proficiency and skill in this field.

The projects need not be complex problems. They can be very basic, with simple problems. What is important is to complete them. Ideally, in the beginning, you should take a small project and finish it. Completing it will boost your confidence, and you will learn many new things along the way.

So, to start with, I have also selected a very basic problem: the classification of the Iris data set. You can compare it with the “Hello world” program that every programmer writes as a beginner. The data set is small, so it is easy to load on your computer, and it has only a few features, so implementing any ML algorithm on it is easier.

I have used Google Colab here to execute the Python code. You can use any IDE you generally work with. Feel free to copy the code given here and execute it. The first step is to run the existing code without any error. Afterwards, make small changes to see how the output is affected or which errors appear. This is the most effective way to learn a new language as well as its application in Machine Learning.

The steps for first machine learning project

So, without much ado, let’s jump to the project. You first need to chalk out the steps of implementing the project.

  • Importing the python libraries
  • Importing and loading the data set
  • Exploring the data set to have a preliminary idea about the variables
  • Identifying the target and feature variables and the independent-dependent relationship between them
  • Creating training and testing data set
  • Model building and fitting
  • Testing the data set
  • Checking model performance with comparison metrics

This is the ideal sequence in which you should proceed with the project. As you gain experience you will not have to remember the steps. This being the first machine learning project, I felt it necessary to mention them for further reference.

Importing the required libraries

# Importing required libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import numpy as np

About the data

The Iris data set was created by Dr R. A. Fisher and is also hosted in the UCI machine learning repository. It contains three Iris species, viz. “Setosa”, “Versicolor” and “Virginica”, and four flower features, namely petal length, petal width, sepal length and sepal width, in cm. Each species represents a class and has 50 samples in the data set, so the Iris data has 150 samples in total.

This is among the most popular and basic data sets used in pattern recognition to date. The copy that ships with scikit-learn, which we load here, is the same as the Iris data set found in R; it differs from the copy in the UCI machine learning repository, which has two wrong data points.

The following line of code will load the data set in your working environment.

# Loading the data set
dataset = load_iris()

The following code will generate a detailed description of the data set.

# Printing some data features
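
A minimal sketch of this step, assuming `dataset` is the object returned by `load_iris()` as above. Its `DESCR` attribute holds the description text that is reproduced below.

```python
from sklearn.datasets import load_iris

# Load the data set and print the description text bundled with it
dataset = load_iris()
print(dataset.DESCR)
```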

Description of Iris data

.. _iris_dataset:

Iris plants dataset

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
    :Summary Statistics:

    ============== ==== ==== ======= ===== ====================
                    Min  Max   Mean    SD   Class Correlation
    ============== ==== ==== ======= ===== ====================
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
    ============== ==== ==== ======= ===== ====================

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (
    :Date: July, 1988

The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fisher's paper. Note that it's the same as in R, but not as in the UCI
Machine Learning Repository, which has two wrong data points.

This is perhaps the best known database to be found in the
pattern recognition literature.  Fisher's paper is a classic in the field and
is referenced frequently to this day.  (See Duda & Hart, for example.)  The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant.  One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.

.. topic:: References

   - Fisher, R.A. "The use of multiple measurements in taxonomic problems"
     Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
     Mathematical Statistics" (John Wiley, NY, 1950).
   - Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis.
     (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
     Structure and Classification Rule for Recognition in Partially Exposed
     Environments".  IEEE Transactions on Pattern Analysis and Machine
     Intelligence, Vol. PAMI-2, No. 1, 67-71.
   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE Transactions
     on Information Theory, May 1972, 431-433.
   - See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al"s AUTOCLASS II
     conceptual clustering system finds 3 classes in the data.
   - Many, many more ...

Checking the data type

We can check the data type before proceeding for analytical steps. Use the following code for checking the data type:

# Checking the data type 
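
A minimal sketch of the check, assuming the built-in `type()` is applied to the loaded `dataset` object:

```python
from sklearn.datasets import load_iris

dataset = load_iris()
# type() reports the container class returned by load_iris()
print(type(dataset))
```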

Now here is a problem with the data type. Check the output below: it reports a scikit-learn Bunch object, which is a dictionary-like container.

Data type

The data type we are most used to, however, is the Pandas dataframe. Also, the target and the features are stored separately here. You can print them separately using the following lines.

# Printing the components of Iris data
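
One way to print the components, as a sketch. The attribute names `target_names`, `target` and `feature_names` belong to the object returned by `load_iris()`; which of them the original listing printed is an assumption.

```python
from sklearn.datasets import load_iris

dataset = load_iris()
print("Target names:", dataset.target_names)    # species labels
print("Target:", dataset.target)                # species coded as 0, 1 and 2
print("Feature names:", dataset.feature_names)  # the four measurements
```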

See the print output below. The target variable holds the three Iris species “Setosa”, “Versicolor” and “Virginica”, which are coded as 0, 1 and 2 respectively. The features are stored separately.

Components of Iris data set

The feature values are stored separately as data. Here are the first few rows of the data.

# Printing the feature data
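
A short sketch for viewing the first rows. Slicing five rows is an assumption; any number works.

```python
from sklearn.datasets import load_iris

dataset = load_iris()
# dataset.data is a NumPy array with one row per flower and one column per feature
print(dataset.data[:5])
```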
First machine learning project data set view

Converting the data type

To ease the further modelling process, we need to convert the data from the sklearn container to the more common Pandas dataframe. We also need to concatenate the separate data and target arrays, using feature_names and target as column names. NumPy’s np.c_ object stacks the arrays column-wise.

# Converting scikit learn dataset to a pandas dataframe
import pandas as pd
df = pd.DataFrame(data= np.c_[dataset['data'], dataset['target']],columns= dataset['feature_names'] + ['target'])

See below the first few lines of the combined dataframe. With this new dataframe we are now ready to proceed to the next step.

The new Pandas dataframe

Check the shape of the newly created dataframe as I have done below. The output confirms that the dataframe is now complete with 150 samples and 5 columns.

# Printing the shape of the newly created dataframe
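
A self-contained sketch of the shape check. It rebuilds `df` exactly as in the conversion step above and then reads its `shape` attribute.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

dataset = load_iris()
df = pd.DataFrame(data=np.c_[dataset['data'], dataset['target']],
                  columns=dataset['feature_names'] + ['target'])
# shape is a (rows, columns) tuple
print(df.shape)
```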

Creating target and feature variables

Next, we need to create variables storing the dependent and independent variables. Here the target variable, the Iris species, depends on the feature variables, so the flower properties, i.e. petal width, petal length, sepal length and sepal width, are the independent variables.

In the data set printed above, you can see that the first four columns are the independent variables and the last one holds the dependent variable. So, in the lines of code below, the variable x stores the values of the first four columns and y stores the target variable.

# Creating target and feature variables
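
A minimal sketch of this step. The names `x` and `y` come from the text; selecting the columns by position with `iloc` is an assumption, and `df[dataset['feature_names']]` with `df['target']` would give the same result.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

dataset = load_iris()
df = pd.DataFrame(data=np.c_[dataset['data'], dataset['target']],
                  columns=dataset['feature_names'] + ['target'])

# Features: the first four columns; target: the last column
x = df.iloc[:, 0:4]
y = df.iloc[:, 4]
print(x.shape, y.shape)
```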

The shape of x and y is as below.

Shape of x and y

Splitting the data set

We need to split the data set before applying Machine Learning algorithms. The train_test_split() function of sklearn is used here to do the task. The test data size is set to 20% of the data.

# Splitting the data set into train and test set
x_train, x_test, y_train, y_test=train_test_split(x,y,test_size=0.2,random_state=0)

Accordingly, the train data set contains 120 samples whereas the test data set has 30 samples.
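
The sizes can be confirmed with a short check. This sketch splits the raw arrays from `load_iris()` rather than the dataframe, purely to stay self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

dataset = load_iris()
x, y = dataset.data, dataset.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

# 80% of the 150 rows go to training, 20% to testing
print(x_train.shape, x_test.shape)
```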

Application of Decision tree algorithm

So, we have finished the data processing steps and are ready to apply the Machine Learning algorithm. I have chosen a very popular classification algorithm, the Decision Tree, for the first machine learning project.

If this algorithm is new to you, you can refer to this article to learn the details about it and how it can be applied with Python. The speciality of this ML algorithm is that its logic is very simple and the process is not a black box like most other ML algorithms, which means that we can see and understand how the decision-making process works.

So let’s apply this ML model to the training set of the Iris data. DecisionTreeClassifier() of sklearn, which we imported in the beginning, is the class we use here.

# Application of Decision Tree classification algorithm
dt = DecisionTreeClassifier()
# Fitting the dt model
dt.fit(x_train, y_train)

The model is thus fitted on the training set. In the screenshot of my Colab notebook below, you can see that the classifier has several parameters specifying how the decision tree is formed. At this stage you don’t need to bother about all these specifications. We can discuss each of them and what they do in another article.

Fitting the Decision Tree Classification model

Prediction using the trained model

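To test the model, we will first create a single new observation by hand. Note that these particular measurements match a Setosa sample that is already in the Iris data, so treat this as a quick sanity check rather than an unbiased test.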

# Creating a new feature set i.e. a new flower properties
x_new = np.array([[4.9, 3.0, 1.4, 0.2]])
# Predicting for the new data using the trained model
prediction = dt.predict(x_new)

See the prediction result using the trained Decision Tree classifier. It gives the result 0, which represents the Iris species “Setosa”. As discussed before, the Iris species are represented in the data frame by the digits 0, 1 and 2.
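
To turn the numeric code back into a species name, the `target_names` array can be indexed with the prediction. The sketch below assumes a tree fitted on the same 80:20 split; `random_state=0` on the tree and the variable `species` are additions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

dataset = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.2, random_state=0)
dt = DecisionTreeClassifier(random_state=0).fit(x_train, y_train)

x_new = np.array([[4.9, 3.0, 1.4, 0.2]])
prediction = dt.predict(x_new)
# target_names maps the integer code back to the species name
species = dataset.target_names[int(prediction[0])]
print(prediction, species)
```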

Prediction for the new data

Let’s now predict using the test set, the 20% of the data kept aside during model training. We will also compute the accuracy of the model on the test set in two equivalent ways.

y_pred = dt.predict(x_test)
print("Predictions for the test set:",y_pred)
# Metrics for goodness of fit 
print("np.mean: ", np.mean(y_pred == y_test))
print("dt.score:", dt.score(x_test, y_test))

And the output of the above piece of code is as below.

Prediction using the test set

You can see that the testing accuracy score is 1.0! A perfect score like this should make us cautious. Decision Tree classifiers are very prone to overfitting, which means the model fits this particular data set too closely and may not generalise to new data. A single test set of only 30 samples cannot rule that out, so ideally we should validate more thoroughly and try other machine learning models to check their performance.
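
One hedged way to probe for overfitting is to compare the accuracy on the training set with held-out estimates from cross-validation. The sketch below assumes the same 80:20 split; the 5-fold setting and `random_state=0` on the tree are choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

dataset = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.2, random_state=0)

dt = DecisionTreeClassifier(random_state=0).fit(x_train, y_train)
train_acc = dt.score(x_train, y_train)
test_acc = dt.score(x_test, y_test)
# 5-fold cross-validation on the training set gives several held-out estimates
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                            x_train, y_train, cv=5)

print("Train accuracy:", train_acc)
print("Test accuracy:", test_acc)
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))
```

A training accuracy well above the cross-validation average is the usual sign that the tree has memorised the training data.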

So in the next section we will not take up a single ML algorithm; rather, we will take up a bunch of ML algorithms and test their performance side by side to choose the best performing one.

Application of more than one ML model simultaneously

In this section, we will fit multiple ML algorithms at a time to classify the Iris data and see which one of them is the most accurate. The ML algorithms we will use here are Linear Discriminant Analysis, Naive Bayes classifier, Logistic regression, Support Vector Machine, K-Nearest Neighbour classifier and also Decision tree classifier which we have already applied before. Here I am including it too just to compare it with the others.

Along with these ML models, I am going to introduce another family known as ensemble models. The speciality of this method is that an ensemble model uses more than one machine learning model at a time to achieve a more accurate estimate. See the figure below to understand the process.

A schematic diagram of an ensemble model

Now there are two kinds of ensemble models, Bagging and Boosting. I have incorporated both kinds here to compare them with the other machine learning algorithms. Here is a brief idea about the Bagging and Boosting ensemble techniques.

Bagging

The name is actually short for Bootstrap Aggregation. It is essentially a random sampling technique with replacement. That means once a sample unit is selected, it is put back and can be selected again. This method works best with algorithms which tend to have high variance, like the decision tree algorithm.

The bagging method trains each model separately on its own random sample and, for the final prediction, aggregates the estimates of all the models without favouring any one of them.
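
The two ingredients, sampling with replacement and unweighted aggregation, can be sketched with the standard library alone. The ten-row stand-in data and the helper `majority_vote` are hypothetical and only for illustration.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable

samples = list(range(10))  # stand-in for the row numbers of 10 training samples

# Draw three bootstrap samples: same size as the data, drawn with replacement
for i in range(3):
    bootstrap = random.choices(samples, k=len(samples))
    out_of_bag = sorted(set(samples) - set(bootstrap))
    print("bootstrap", i, sorted(bootstrap), "left out:", out_of_bag)

def majority_vote(predictions):
    # Every model gets one equal vote; the most frequent label wins
    return Counter(predictions).most_common(1)[0][0]

print(majority_vote(["setosa", "virginica", "setosa"]))
```

Because rows are put back after each draw, some rows repeat within a bootstrap sample while others are left out of it.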

The other ensemble modelling technique is:

Boosting

As an ensemble learning method, boosting also combines a number of models for prediction. It trains weak learners one after another and assigns weights so that each new learner concentrates on the samples the earlier ones got wrong. In this way the learners build on each other and a set of weak learners becomes a stronger model.
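
As a rough illustration, the sketch below compares one deliberately weak tree of depth 1 with scikit-learn’s `AdaBoostClassifier`, which boosts depth-1 trees by default. The 80:20 split, `n_estimators=50` and the fixed random states are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

dataset = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.2, random_state=0)

# A single depth-1 tree ("stump") is a deliberately weak learner
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(x_train, y_train)

# AdaBoost reweights the training samples after each round so that
# later stumps focus on the samples earlier stumps misclassified
boosted = AdaBoostClassifier(n_estimators=50, random_state=0).fit(x_train, y_train)

stump_acc = stump.score(x_test, y_test)
boosted_acc = boosted.score(x_test, y_test)
print("Single stump:", stump_acc)
print("Boosted stumps:", boosted_acc)
```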

The ensemble models we are going to use here are AdaBoostClassifier(), BaggingClassifier(), ExtraTreesClassifier(), GradientBoostingClassifier() and RandomForestClassifier(). All are from sklearn library.

Importing required libraries

# Importing libraries
from sklearn.model_selection import cross_val_score
from sklearn import ensemble
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
import seaborn as sns

Application of all the models

Use the following lines of code to build, train and evaluate all the models: the six classifiers plus the five ensemble models. The code also fills a dataframe named ml_compare, which stores all the comparison metrics calculated here.

# Application of all the ML algorithms at a time
ml = []
ml.append(('LDA', LinearDiscriminantAnalysis()))
ml.append(('DTC', DecisionTreeClassifier()))
ml.append(('GNB', GaussianNB()))
ml.append(('LR', LogisticRegression(solver='liblinear', multi_class='ovr')))
ml.append(('SVM', SVC(gamma='auto')))
ml.append(('KNN', KNeighborsClassifier()))
ml.append(("Ensemble_AdaBoost", ensemble.AdaBoostClassifier()))
ml.append(("Ensemble_Bagging", ensemble.BaggingClassifier()))
ml.append(("Ensemble_Extratree", ensemble.ExtraTreesClassifier()))
ml.append(("Ensemble_GradientBoosting", ensemble.GradientBoostingClassifier()))
ml.append(("Ensemble_RandomForest", ensemble.RandomForestClassifier()))

# Model evaluation
# ml_compare collects one row of metrics per model (initialisation assumed)
ml_compare = pd.DataFrame()
row_index = 0
for name, model in ml:
  # Fit on the training set so that train and test accuracy can be scored
  model.fit(x_train, y_train)
  kfold = StratifiedKFold(n_splits=10, random_state=1, shuffle=True)
  cv_results = cross_val_score(model, x_train, y_train, cv=kfold, scoring='accuracy')
  ml_compare.loc[row_index, 'Model used'] = name
  ml_compare.loc[row_index, "Cross Validation Score"] = round(cv_results.mean(), 4)
  ml_compare.loc[row_index, "Cross Value SD"] = round(cv_results.std(), 4)
  ml_compare.loc[row_index, 'Train Accuracy'] = round(model.score(x_train, y_train), 4)
  ml_compare.loc[row_index, "Test accuracy"] = round(model.score(x_test, y_test), 4)
  row_index += 1



As each model is trained on the train set, it is also tested on the test data. The accuracy statistics get stored in ml_compare. So, let’s see now what ml_compare tells us. The output is as below.

Comparative table of cross validation score of all the models
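
To read such a table ranked by score, here is a compact self-contained sketch. It uses only three of the models and collects the rows in a list of dictionaries, which is an alternative to filling the dataframe cell by cell; the names `models`, `rows` and `results` are hypothetical.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

dataset = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.2, random_state=0)

models = [("LDA", LinearDiscriminantAnalysis()),
          ("DTC", DecisionTreeClassifier(random_state=0)),
          ("GNB", GaussianNB())]

rows = []
kfold = StratifiedKFold(n_splits=10, random_state=1, shuffle=True)
for name, model in models:
    cv_results = cross_val_score(model, x_train, y_train, cv=kfold, scoring="accuracy")
    model.fit(x_train, y_train)
    rows.append({"Model used": name,
                 "Cross Validation Score": round(cv_results.mean(), 4),
                 "Test accuracy": round(model.score(x_test, y_test), 4)})

# Best cross-validation score first
results = pd.DataFrame(rows).sort_values("Cross Validation Score", ascending=False)
print(results.to_string(index=False))
```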

Visual comparison of the models

Although the models can be compared from the above table, it is always easier if there is a way to visualize the difference. So, let’s create a bar chart from the scores we have calculated above. Use the following lines of code to create the bar chart with the help of the matplotlib and seaborn libraries. The code below plots the train accuracy; passing y="Cross Validation Score" instead plots the cross-validation scores.

# Creating plot to show the train accuracy
sns.barplot(x="Model used", y="Train Accuracy",data=ml_compare,palette='hot',edgecolor=sns.color_palette('dark',7))
plt.title('Model Train Accuracy Comparison')

As the above code executes, the following bar chart is created, showing the train accuracy of all the ML algorithms.

The verdict

So, we have classified the Iris data using different types of Machine Learning and ensemble models. The result shows that all of them are more or less accurate in identifying the Iris species correctly. But if we still need to pick one of them as the best, we can do that based on the above comparative table as well as the graph.

For this instance, Linear Discriminant Analysis and Support Vector Machine perform slightly better than the others. But this can vary with the size of the data, and ML scores do change between executions. Check your own result as well, see which model you have found best, and let me know through the comments below.


So, congratulations, you have successfully completed your very first machine learning project with Python. You have used a popular and classic data set to apply several machine learning algorithms. The data, being a multiclass data set, is a good example of a real-world classification problem.

To find out the best performing model, we have applied six of the most popular Machine Learning algorithms along with several ensemble models. To start the model building process, the data set was first divided into training and testing sets.

The training set is to build and train the model. The test data set is an independent data set kept aside while building the model, to test the model’s performance. This is an empirical process of model validation when independent data collection is not possible. For this project, we have taken an 80:20 ratio for train and test data set.

And at the last, a number of comparison metrics were used to find the model with the highest accuracy. These are essentially the ideal steps of any machine learning project. As it is your first machine learning project, I have shown every step in full detail. As you advance in experience you may skip some of them as per your convenience.

So, please let me know your experience with the article. If you faced any problem while executing the code or have any other queries, post them in the comment section below; I would love to answer them.


Machine Learning: Some lesser known facts

Machine Learning

Machine Learning (ML) has become a buzzword in today’s world. Although references to it go back to the middle of the twentieth century, it has gained its popularity during the last few years, mainly because of its immense capability to explore large amounts of data without being explicitly programmed for each task, and hence its simplicity of use.

Machine learning is still a new concept, and there are several doubts and misconceptions about it. In this article, I will try to explore some of the lesser-known facts about Machine Learning, along with very basic ideas like what Machine Learning is and how it is making our lives better.

Let’s start with a famous conversation from an interview to hire a Machine Learning expert. You may have read this before, but I like it so much, and it gives a good start to this article.

So as the interview starts, the interviewer starts asking questions to the candidate:

Interviewer: What is your specialization?

Candidate: Machine Learning

Interviewer: What is 23+34?

Candidate: It’s 10

Interviewer: No, wrong answer, it’s 57

Candidate: It’s 35

Interviewer: No, wrong answer again, it’s 57

Candidate: It’s 50

Interviewer: No, the answer is still 57

Candidate: It’s 57

Interviewer: You are hired !!!

Although it is a joke, to some extent it reflects how Machine Learning works. Machine Learning is all about learning from the data it is fed with. Here is a famous quote from Thomas H. Davenport, Analytics thought-leader, from the Wall Street Journal, which reflects the power of Machine Learning:

“Human can create one or two good models a week; Machine Learning can create thousands of good models a week”

Thomas H. Davenport, Analytics thought-leader from the Wall Street Journal

Importance of Machine Learning in the present context

Today we have a huge amount of data, popularly known as big data. This can be a gold mine of knowledge if used and explored properly. Data mining and Bayesian analysis are getting popular precisely because they, too, help extract information from a big pile of data.

As the volume of data has increased, so has its complexity. The data comes from a variety of sources and consists of numerous fields. We need modelling techniques which can analyze this kind of data quickly and with improved accuracy. So here is Machine Learning for you.

So, what is Machine Learning ?

Machine learning, in simple terms, is converting information into knowledge. We have a huge amount of data in our custody, generated over a period of more than 50 years. If it is not used to generate knowledge, this huge volume of data is of no use, and we are just wasting a very valuable resource that can help solve many challenges of humanity.

It is a very vast field of data science and assimilates many concepts from associated fields like Artificial Intelligence.

The beauty of Machine Learning is that it does not need to be explicitly programmed by a human for every task; rather, as the name suggests, it learns from the data it is fed. In this sense, it is similar to a human, who also learns from past experience.

This learning comes through a rigorous process of observing the data and finding the pattern that minimizes the difference between the actual values and the estimates.
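
A tiny sketch of that idea, much like the interview candidate adjusting the answer after every correction. The toy data, the single weight `w` and the learning rate are all made up for illustration; the update rule is plain gradient descent on the mean squared error.

```python
# Toy data generated from y = 2 * x, so the "pattern" to find is w = 2
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # initial guess for the weight
learning_rate = 0.01

for step in range(200):
    # Gradient of the mean squared error between the estimate w*x and the actual y
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Move the weight a small step in the direction that reduces the error
    w -= learning_rate * grad

print(round(w, 3))
```

Each pass shrinks the gap between estimate and actual value, and the weight settles very close to 2.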

Machine Learning has three main categories, which are supervised learning, unsupervised learning and reinforcement learning.

Application of Machine Learning?


Recent advances in Machine Learning enable computers to perform some tasks which, until very recently, could only be handled by humans. In our daily life, we use applications built on this technique, and most of the time we don’t even know that it is Machine Learning which is making our lives easier.

In daily life

We can take the simple example of personalised Google news. This application learns which type of news you are interested in by keeping an account of the likes and dislikes you enter into Google’s database from time to time. The same technique is used by Facebook to suggest groups or pages that you may like. Ever wondered how your email service identifies spam emails for you and separates them from important mails? Thanks to ML.

Online video streaming services like Netflix, Amazon Prime, Hotstar etc., and music streaming applications like Spotify, all have a nice feature which automatically populates your account with content you prefer. Here the essence is Machine Learning; it analyzes your past choices and suggests content accordingly.

Image/speech recognition & medical research

Image recognition uses this technology to tell whether an animal is a cat or a dog, to identify persons crossing the road, to read your handwriting and convert it into text, and much more.

In a similar way, ML plays an important role in converting voice into text, which is in use on several platforms, like the speech-to-text tool in Google Docs.

In medical research, ML is a fast-growing technology. It helps in analyzing voluminous data and identifying trends and patterns, especially with the advent of wearable devices and sensors which keep track of the vital parameters of a patient’s health. The data generated by these devices is analyzed through ML, often in real time, to enable medical practitioners to detect any trend and red-flag any symptom for better diagnosis.

Oil and gas sector

In this sector, ML is used to identify natural resources like minerals under the ground, to point out risks in the performance of refinery sensors and their chance of failure, and to prepare an optimized oil distribution plan that is more cost-effective and efficient.

Thus, in almost every sector of our society, the use of Machine Learning is rapidly expanding. In the absence of Machine Learning, performing such resource-intensive and time-consuming processes in traditional ways would not be feasible at all.

Futuristic applications

A few applications of Machine Learning which are still in the testing phase have always been popular topics of science fiction stories. We now frequently hear and read about the self-driving cars of Google or Tesla. This is already a reality now, but go back 10 years and such a concept was a subject of science fiction only. The basic concept behind this revolutionary invention is Machine Learning.

Almost every industry that deals with a large amount of data has realized the importance of Machine Learning. Be it the banking and finance sector, automobiles, research or health care, ML enables them to work more efficiently and gain an edge over their competitors with the help of data insights, often in real time.

So, what is Artificial Intelligence (AI) then?

If you have read this far, then this question is most probably arising in your mind, and it is bound to. Although most of the time we use the terms AI and Machine Learning interchangeably, they are not the same. AI makes machines emulate human intelligence, whereas ML helps machines learn from data.


An Artificial Neural Network (ANN), as its name suggests, mimics the neural network of our brain; hence it is artificial. The human brain has a highly complicated network of nerve cells which carry sensations to their designated sections of the brain. The nerve cells, or neurons, form a network and pass the sensation from one to another. Similarly, in an ANN a number of inputs pass through several layers of neuron-like units and ultimately produce an estimate.
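
The flow of inputs through layers can be sketched in a few lines of plain Python. The layer sizes, weights and biases below are made-up numbers, and the sigmoid activation is one common choice among several; a real network would learn its weights from data.

```python
import math

def sigmoid(z):
    # Squashes any number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # Each neuron: weighted sum of all inputs plus a bias, then an activation
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

inputs = [0.5, -1.0, 2.0]                         # three input values

hidden_w = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]   # 2 hidden neurons x 3 inputs
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]                          # 1 output neuron x 2 hidden values
output_b = [0.0]

hidden = layer(inputs, hidden_w, hidden_b)        # input layer -> hidden layer
output = layer(hidden, output_w, output_b)        # hidden layer -> output estimate
print(output)
```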

Schematic diagram of Artificial Neural Network

Machine Learning is a way to implement Artificial Intelligence. Machine Learning has been applied for decades, but in recent days, as Artificial Intelligence came into action, Machine Learning, and more specifically Deep Learning, has become more popular.

ANN: a deep learning process

ANNs are at the heart of deep learning, the burning topic of data science. Deep learning is basically a subfield of Machine Learning. You may be familiar with the machine learning process; if not, you can refer to this article for quick working knowledge of it. Deep learning has in recent times found application in almost all ambitious projects. From basic pattern recognition and voice recognition to face recognition, self-driving cars, and high-end projects in robotics and artificial intelligence, deep learning is revolutionizing modern applied science.


ANN is a very efficient and popular process for pattern recognition. But the process involves complex computations and many iterations. The advent of high-end computing devices and machine learning technologies has made our task much easier than ever. Users and researchers can now focus on their research problem without taking the pain of implementing a complex ANN algorithm.

The concept of Artificial Intelligence is not very new. The idea dates back to the 1950s and was about using a computer to perform activities which could otherwise be done only by human beings.

So in that sense, AI is a much broader concept and ML can be considered a subset of it. AI as a whole mimics human intelligence, and ML plays a very important role in achieving it by extracting information from data without explicit programming.

Machine Learning Vs Deep Learning Vs Data Mining

Often these three concepts are a little confusing, and the main reason is that all these techniques have the same goal, which is to get an insight, relationship or trend from the data in hand. But they differ in their execution and abilities.

Machine Learning

As we discussed, Machine Learning functions much like statistical models. Statistical models, however, rest on a mathematically proven theory about the distribution of the data and assume that the data fulfil certain assumptions. The advantage of Machine Learning is that even if we do not have any theoretical idea about the distribution of the data, it can learn from the data through several iterations until the best pattern is found. Hence, the process of ML can easily be automated too.

Data Mining

It is a much broader concept with the same objective as ML, and it encompasses a variety of concepts to achieve that. It uses traditional statistical theories, text analytics, time series algorithms, data manipulation techniques and even Machine Learning in order to identify an underlying pattern in the data.

Deep Learning

It is a more advanced concept compared to the above two. Deep learning involves state-of-the-art technologies combining modern high-end computing and neural networks to identify complex patterns in large amounts of data. Advanced technologies like image recognition and recognizing words from sound, which are still in the testing stage, are all subjects of deep learning.

Some facts on Machine Learning

At the very beginning I mentioned that, ML being a new concept, some popular ideas about it are not completely true. Here I will try to discuss those lesser-known facts about ML.

Fact 1: It is not a completely automated process, and human intervention is required

There is a misconception that ML is a 100% automated process. That is not completely true: human intervention is necessary to create and improve algorithms. The system needs context and parameters to operate, which are again provided by human operators.

Fact 2: Having advanced knowledge of Mathematics is not a prerequisite for simple applications of Machine Learning

You can start applying ML to analyze your data with some practice and guidance. There is lots of content available on the internet; some of it is free, whereas some consists of premium courses.

To start practising ML you can choose any of the free courses. The main factor is that you have to practise a lot. I can suggest a free crash course on ML by Google Developers; it is developed by Google, so there is no question about the quality.

The MOOC on ML on Coursera is also a very good place to start your learning.

Fact 3: Machine Learning and Artificial Intelligence are not the same

Some people have the notion that these two are the same. Even I used to have the same idea until I came across this article published in Forbes. It is a very good, comprehensive discussion of the differences between the two; read it and many of your doubts about ML and AI will get cleared.

Fact 4: Even without a very sound knowledge of a programming language you can learn the application of ML

Oh… it certainly helps. Good knowledge of a few programming languages can help you jump-start your career in ML, but it is not at all essential. It’s just that you have to give a little more time when you are writing your code for Machine Learning for the first time. Be it R or Python or any other language, you learn it by making errors; this is the most effective way of learning any language.

So, in a nutshell, if you are interested in learning ML, just start now: take a small dataset and write a small piece of code. There will be errors in the beginning; don’t let them hold you back. Soon you will start enjoying its beauty, and it will get more and more interesting.