Deploy machine learning models: things you should know

Deploy machine learning models

To deploy machine learning (ML) models means to take a model from development to production. You have built an ML model and validated and tested its performance. But what use is it if it is not utilised to solve real-world problems? Deploying a model means making it production-ready. In this blog post, we will discuss the steps of this process.

The deployment process takes a model out of the laboratory or off the data scientist’s desk and puts it to appropriate use. There are lots of models across all sectors of industry, and most of them are never used. Every time I develop a model, the obvious question I face from my peers is “how do you make people use the models you develop?”

Why do we need to deploy a model?

Actually, I think this is the primary question that should appear in the mind of data scientists even before model building. Maurits Kaptein of Tilburg University has rightly pointed out in his article that “the majority of the models trained in … never make it to practice”. This is so true. In his article, he illustrated how brilliant models developed in medical research die silently in the notebook because they never reach the health care professionals who could benefit from them.

This is where the importance of model deployment lies. In agricultural terms, we sometimes call this the “Lab to Land” process. That means the technologies scientists develop in the lab should reach the land for practical implementation.

Now, in the case of ML model development, the life cycle is different from that of software development. In software development, analysing the client’s requirements is an integral part. But data scientists are often less concerned about the model’s implementation process and highly focused on building a satisfactory model with high precision.

Let’s first discuss a machine learning model’s life cycle to understand why the deployment process is so important.

Model development life cycle

A machine learning model development process has several steps, and the model needs to be kept up to date by training it with fresh and relevant data. See the diagram below of a machine learning model life cycle. Notice that the last stage of development of an ML model involves iterative steps of updating the model with new data.

Machine learning model development life cycle

A model that is developed once and then forgotten does not stay relevant in a rapidly changing scenario. Target-feature relationships change, features evolve, and new features get added. So it is a continuous and dynamic process. Ideally, the model development and production teams should remain in continuous touch to keep the model updated.
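One simple way to notice that a deployed model has gone stale is to compare its recent accuracy with the accuracy it had at release time. The sketch below is only an illustration of that idea: the function name, the 0.05 tolerance and the pair format are assumptions, not part of any particular library.

```python
def needs_retraining(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Return True when recent accuracy falls clearly below the baseline.

    recent_outcomes is a list of (predicted_label, actual_label) pairs
    collected after deployment. The tolerance is an assumed margin.
    """
    if not recent_outcomes:
        return False  # nothing observed yet, so no evidence of a change
    correct = sum(1 for predicted, actual in recent_outcomes if predicted == actual)
    recent_accuracy = correct / len(recent_outcomes)
    return recent_accuracy < baseline_accuracy - tolerance


# Hypothetical check: the model scored 0.90 at release time.
recent = [(1, 1), (0, 1), (1, 0), (0, 0)]
retrain = needs_retraining(0.90, recent)
```

A check like this could run on a schedule, with the actual labels filled in as they become known.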

This is also known as the end-to-end data science workflow. It includes data preparation, exploratory data analysis, data visualization, model training, testing, prediction and performance testing. Once a model’s performance is satisfactory, the model is ready for deployment.

What is model deployment?

Suppose a data scientist has developed a model on their computer using an interactive notebook. Now they want to encapsulate it in such a way that its prediction or analysis capability can be used directly by end-users. To do that, the data scientist can adopt a number of ways to deploy the project. Let’s discuss them one by one.

Creating libraries and packages

This refers to the process of encapsulating your code in a library. The programming language can be any one of your choice. Whether the ML model is created in R, Python, Scala, etc., a library encapsulates all of its functionality. It is then ready for other data scientists to use on their own data.

A library or package created to deploy a data science model needs to be updated at regular intervals. For this purpose, the repository it is published in keeps a record of its versions. This helps you keep track of versions and gives you the flexibility to use any particular library version.

Hosted or static notebook

Using a Jupyter notebook is one of the most popular ways of creating ML models. It is an interactive environment which allows you to write code, visualize data and write text all in one place.

Jupyter notebook

When you have finished the model development part, you can host the same notebook on GitHub, Jupyter nbviewer or Anaconda Cloud, either as a static notebook or as a rendered notebook service. You just need to take care of the basics of deployment. Other details like security, scalability and compatibility are taken care of by the hosting service itself.

You can give version numbers to your notebook so that, as it gets updated with new data, you can track the version changes. This form of deployment is very attractive for business analysts who need to share an updated business report.

Also, it enables end-users with no access to resources, either data or computing, to benefit from data exploration and visualization. On the other hand, the trade-off is that a static report limits interaction and gives a poor real-time experience.

Use of REST APIs

This is another way of deploying your machine learning models. In this case, a data scientist, once done with the model building task, deploys the model as a REST (Representational State Transfer) API. Other production engineers then provide the required layers of visualization, dashboards or web applications. The end-users make use of the machine learning model through the REST API endpoints.

Deploy machine learning models using REST API

A good example of the use of such APIs is ML models built in Python. Python has an extensive set of modules which can take care of all the steps, from data handling and model building to model deployment.

It has data handling libraries like Pandas and NumPy, model building libraries like Scikit-Learn, TensorFlow, Keras etc., and a range of web frameworks like Django, Flask etc. to deploy the finished model. So learning a single language can make you self-sufficient from building a model to deploying it.
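As a rough illustration of a prediction endpoint, the sketch below uses only Python’s standard library so that it runs without installing anything. In practice you would more likely use a framework such as Flask or Django. The `/predict` path, the JSON shape and the placeholder `predict` function are assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Placeholder model: replace with a call to your trained model."""
    return sum(features) / len(features)


def handle_predict(raw_body):
    """Turn a JSON request body into an (HTTP status, response dict) pair."""
    try:
        features = json.loads(raw_body)["features"]
        score = predict([float(x) for x in features])
    except (ValueError, KeyError, TypeError, ZeroDivisionError):
        return 400, {"error": "expected JSON like {\"features\": [1.0, 2.0]}"}
    return 200, {"prediction": score}


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST /predict with the model output as JSON."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the API; call this from your own entry point."""
    HTTPServer(("localhost", port), PredictHandler).serve_forever()
```

Keeping the request logic in `handle_predict`, separate from the HTTP handler, makes it easy to test the endpoint without starting a server.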

Interactive applications

This is also a popular way to deploy machine learning models. It provides end-users with an easy interactive interface to explore, analyze and try any of the ML algorithms, and to export the data for further analysis.

Interactive applications do have a server-side component. Thus they are not static like hosted notebooks. They allow users to interact with the data and give a real-time data handling experience.

For example, a Bokeh application is one such powerful interactive application. Users can play with the data through the functionality provided in the interface, like sliders, drop-down menus, text fields etc.


It is a very popular production technique where the user can perform exploratory analysis and understand the deployed project. A large number of users can explore the results at the same time.

Jupyter notebooks are also a popular way to deploy ML projects as dashboards. Like the interactive notebook, the dashboard has components for interactively designing the report layout. You can arrange it in grid-like or report-like formats.

Issues in deploying machine learning models

So a model needs to get deployed as an interactive application. But it has often been observed that the deployment part takes months to become fully functional. The problem is that after such a gap the ML model becomes obsolete. Both the data it was trained on and the training process itself need to be updated.

It becomes more of a problem when the data scientist hands over the model to the engineers involved in the deployment. Changes in the model then require involving the data scientists again, which is not always possible. Even once it has been deployed, the model needs to be updated from time to time. So the development team and the production team need to work in unison.

The gravity of the problem can be easily understood if we consider a practical application of machine learning models. Let’s take the example where credit card companies use predictive modeling techniques to detect fraudulent credit card transactions.

Suppose we have developed an ML model which predicts the probability that a credit card transaction is fraudulent. Now the model needs to deliver the result in real time, the moment the credit card transaction happens.

If the model takes longer than 5 minutes, then what is the use of such a model? The credit card company needs to make a decision the moment a fraudulent transaction is taking place and flag it. Prediction accuracy is also of utmost importance. If its fraud predictions are right less than 50% of the time, it is in no way better than tossing a coin.

Serverless deployment

So what is the solution? How can a model be kept always updated? A model which is based on old data and is not accurate has no industrial value. Serverless deployment can be a good solution to overcome the issues mentioned above. Serverless deployment is like the next level of cloud computing.

All the intricacies of the deployment process are taken care of by the deployment platform. In serverless deployment, the platform completely separates the server part from the application part. Thus, data scientists can pay full attention to the development of efficient machine learning models.

Here is a very good article on the serverless deployment of a data science model. To apply the process successfully you need to have some knowledge of cloud computing, cloud functions and obviously machine learning.
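To give a feel for what a cloud function looks like, here is a minimal sketch. The `handler(event, context)` shape follows a convention used by some serverless platforms, but the event format, the weights and the function names here are assumptions for illustration only. Check your platform’s documentation for the exact interface.

```python
import json

# Module-level state is created once when the platform starts a new instance
# and is reused by later calls, so the model is not reloaded on every request.
MODEL_WEIGHTS = {"amount": 0.002, "foreign": 0.5}  # hypothetical stand-in for a stored model


def fraud_score(features):
    """Weighted sum clipped to the range 0..1; a stand-in for a real model."""
    raw = sum(weight * float(features.get(name, 0)) for name, weight in MODEL_WEIGHTS.items())
    return max(0.0, min(1.0, raw))


def handler(event, context=None):
    """Entry point that the platform would call once per request."""
    try:
        features = json.loads(event["body"])
        result = {"fraud_probability": fraud_score(features)}
        status = 200
    except (KeyError, TypeError, ValueError, AttributeError):
        result = {"error": "request body must be a JSON object of numeric features"}
        status = 400
    return {"statusCode": status, "body": json.dumps(result)}


# Local call with a made-up transaction, just to try the function out.
response = handler({"body": json.dumps({"amount": 100, "foreign": 1})})
```

Notice that there is no server code at all: the data scientist writes only the function, and the platform decides when and where to run it.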

Types of machine learning model deployment

Suppose a product manager of a company has found a solution to a customer-centric problem in the business, and it involves the use of machine learning. So the product manager contacts data scientists to develop the machine learning part of the total production process.

But a machine learning model life cycle and a software development life cycle differ. In most cases the developers of the model have no clue how the model can ultimately be taken to the production stage. So the product manager needs to clearly state the requirements to the whole team to meet the end goal.

Now the deployment of a machine learning model largely depends on the end-user type: how quickly the client needs the prediction result, and through what interface. Depending on these criteria, the product manager needs to decide what the final product should look like. Let’s take a few examples of such real-world machine learning deployment cases.

Example 1: ML model result gets stored in database only

This is a situation where the client has some knowledge of SQL and can fetch the required data from the database. So here the product manager only needs to store the ML output in a designated database, and the task is complete.

A lead scoring model can be a good example of such a situation. Lead scoring is a technique generally followed by marketing and sales companies. They are interested in knowing the market interest in their products. There are different parameters which indicate the market readiness of their product.

A lead scoring model analyses parameters like the number of visits to the product page, lead generation, price checks, number of clicks etc. to predict the lead score. Finally, the lead score gets stored in a database and is revised daily or as per the client’s requirement.
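The storage step can be sketched with Python’s built-in `sqlite3` module. The table name, the columns and the lead IDs below are made up for illustration, and a real deployment would write to the client’s shared database rather than an in-memory one.

```python
import sqlite3
from datetime import date


def store_lead_scores(conn, scores, run_date):
    """Insert or refresh one score per lead for the given day."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS lead_scores (
               lead_id TEXT,
               score REAL,
               scored_on TEXT,
               PRIMARY KEY (lead_id, scored_on)
           )"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO lead_scores (lead_id, score, scored_on) VALUES (?, ?, ?)",
        [(lead_id, score, run_date.isoformat()) for lead_id, score in scores.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
store_lead_scores(conn, {"lead-001": 0.82, "lead-002": 0.35}, date(2024, 1, 15))

# The client can then fetch the results with plain SQL.
rows = conn.execute(
    "SELECT lead_id, score FROM lead_scores WHERE score > 0.5 ORDER BY score DESC"
).fetchall()
```

Because of the primary key on lead and date, re-running the job on the same day replaces that day’s scores instead of duplicating them.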

Example 2: the data needs to be displayed on the interface

Here the situation is that the marketing executive does not know SQL and is unable to fetch the required data from the database. In this case, the product manager needs to instruct the engineers to go one step further than in the earlier example. They now need to display the data through a Customer Relationship Management (CRM) platform, which needs Extract-Transform-Load (ETL) operations to integrate the data from different sources.

Example 3: interactive interface

In this case the user interface is interactive. The ML model operates on the end-user’s input and returns the required result. This can be a web application or a mobile app. For example, there are several decision support systems where users input their particular condition and the application guides them with proper recommendations.

Mobile apps like Plantix (see the below clip from the developers) help users identify plant diseases accurately. The user takes pictures of the affected part of the plant, and the image recognition algorithm of the app determines the disease from its already stored image libraries. Additionally, the app gives the user proper guidance to get rid of the problem.


Any good machine learning model, if not deployed, has no practical use. And this is the most common case across the industry. Efficient models are developed, but they never see the daylight and remain forever in the notebook. This is mainly because of a lack of harmony between the development phase and the production phase, and some technical constraints like:

Portability of the model

Data science models developed on a local computer work fine until the place of execution changes. The local computer environment is ideally set up for the model execution and the data. So to make the model deployable, either the new data has to reach the model or the model has to reach the data. From a portability point of view, the latter option is more feasible.
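One simple way to let the model travel to the data is to write it to a file together with a note about the environment it came from. The sketch below uses Python’s `pickle` module. The function names and the strict Python version check are assumptions made for illustration, and a pickle file should only ever be loaded from a source you trust.

```python
import pickle
import sys
import tempfile
from pathlib import Path


def export_model(model, path):
    """Write the model plus the environment detail needed to reload it."""
    bundle = {
        "model": model,
        "python_version": sys.version_info[:2],  # e.g. (3, 11)
    }
    Path(path).write_bytes(pickle.dumps(bundle))


def import_model(path):
    """Reload a bundle and refuse to run on a different Python version."""
    bundle = pickle.loads(Path(path).read_bytes())
    if tuple(bundle["python_version"]) != sys.version_info[:2]:
        raise RuntimeError("model was exported under a different Python version")
    return bundle["model"]


# Round trip through a temporary folder; a plain dict stands in for a model.
with tempfile.TemporaryDirectory() as folder:
    model_path = Path(folder) / "model.pkl"
    export_model({"threshold": 5.0}, model_path)
    restored = import_model(model_path)
```

The file can then be copied to wherever the data lives, as long as the environment there matches the one the model was built in.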

Data refinement

During model development, the data scientists procure data from multiple sources and preprocess it. Thus the raw data takes good shape and is in an ideal form to feed the ML algorithms. But as soon as the model goes to the production phase, it has to deal with the client’s raw data without any filtering and processing. So the model’s robustness is a concern for deployment.
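A small defensive layer in front of the model can absorb much of this messiness. The sketch below assumes the model expects a fixed set of numeric features. The function name, the feature names and the default values are hypothetical.

```python
import math


def clean_record(raw, defaults):
    """Coerce one raw record into the numeric features the model expects.

    Unknown fields are dropped, and missing or unusable values fall back
    to the given default for that feature.
    """
    cleaned = {}
    for name, default in defaults.items():
        value = raw.get(name)
        try:
            number = float(value)
        except (TypeError, ValueError):
            number = default  # missing or non-numeric input
        if math.isnan(number) or math.isinf(number):
            number = default  # "nan" and "inf" parse as floats but are unusable
        cleaned[name] = number
    return cleaned


# A made-up raw record with stray spaces, a text placeholder and an extra field.
record = clean_record(
    {"amount": " 42 ", "age": "n/a", "extra": 1},
    {"amount": 0.0, "age": 30.0},
)
```

Whether to replace a bad value with a default or reject the whole record is a design choice that depends on the application.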

The latency period

While the model is in the development phase, it has to deal with huge data sets. Model training, validation, testing and finally prediction: quite obviously, this process takes a long time. But in production the prediction process may take only a few example cases and deliver the prediction. So the expected latency is far lower. Also, the industry’s requirement is real-time prediction most of the time.
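Before deployment it is worth measuring how long a single prediction actually takes and comparing it with the time the application can afford. Here is a minimal sketch using `time.perf_counter`. The function names, the repeat count and the budget are assumptions for this example.

```python
import time


def measure_latency(predict, example, repeats=100):
    """Return the worst single-call latency in seconds over several calls."""
    worst = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        predict(example)
        worst = max(worst, time.perf_counter() - start)
    return worst


def within_budget(predict, example, budget_seconds, repeats=100):
    """True when even the slowest observed call fits the latency budget."""
    return measure_latency(predict, example, repeats) <= budget_seconds


def slow_model(features):
    """Stand-in for a model that is too slow for real-time use."""
    time.sleep(0.02)
    return 0


# A trivial stand-in model checked against a one-second budget.
fast_enough = within_budget(lambda features: sum(features), [1, 2, 3], budget_seconds=1.0)
```

Looking at the worst case rather than the average is a deliberate choice here, because a real-time system is judged by its slowest responses.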

So, a data scientist needs to take all the above factors into account during model development. There are several approaches, like using containers, following good memory management practices and serverless deployment, which help to overcome the technical constraints to a great extent. However, ML model development and deployment is a continuous process, and the refinement and training of the model go on even after a successful deployment.

I have tried to present all the aspects of deploying machine learning models. It is a vast topic, and a single article is not enough to discuss it in detail. Still, I have covered most of the important factors briefly here so that you can get a basic idea in one place.

In coming articles I will try to cover the details in depth, taking one aspect at a time. Please let me know your opinion about the article and any queries regarding the topic by commenting below. Also mention if there is any particular topic you want to read about next.