AI on Production: Part I

MLOps Sep 3, 2021

Over the last decade, our world has filled up with smart devices and new technologies, and it would not be an exaggeration to say that most of them use AI in some capacity.

Many applications of AI are making our lives simpler and helping us with mundane tasks. Some are elevating our standard of living, and some are still evolving.

As customers, we sometimes worry about the privacy of our data, but deep down we want to adopt more and more of these technologies.

But this is from the customer's perspective. What about ML developers?

There are broadly two streams in the Machine Learning field -

  1. Research oriented: Developers focus on building a model that solves a problem, say identifying a security threat just by looking at a video feed, and achieving state-of-the-art accuracy in a Jupyter notebook.
  2. Production oriented: Developers focus on serving the model in production, improving its accuracy over time on production data, and building a complete end-to-end pipeline for monitoring, development, and deployment.

So now you must be thinking: how hard can it be to go from a Jupyter notebook to a production environment?
Let me be straight and honest with you: it's really hard.

Challenges

Let's try to understand various challenges faced in Production.

"The first step to solving a problem is to understand it well."

Shortage of Data

Data is the heart and soul of your AI solution, but gathering enough of it to solve an ML problem is often difficult. In some industries, like e-commerce, collecting data is easy; others struggle to collect the volume of data needed to build an AI solution.

For example:

An e-commerce website can easily predict and recommend items a user will be interested in, based on data collected over time from their profile.

But for a medical product, it's really difficult to gather enough data to predict whether a person is at risk of a heart attack.

Cost of Implementation

Money plays a vital role in solution development. Data gathering and storage can be expensive depending on the scale of data you are dealing with.

AI engines may require a lot of computing power at the training stage, and here we are talking about GPU compute.

Once you have deployed your ML model to production, end-users will make inference requests against it. That means more compute; in some cases GPUs are needed for inference as well.

So overall it's a costly affair.

Transition from POC to Production

According to a survey conducted by Accenture, roughly 80% to 85% of companies' AI projects are still in the proof-of-concept stage.

A proof of concept (POC) is when you show that your solution works in a Jupyter notebook or on your local system. And most of us start celebrating after a successful POC.

It's not bad to enjoy your small success. But we need to be aware of the effort required to make this POC production-ready. The transition here is very important. A typical production-ready AI solution is roughly 20% ML algorithms and 80% software engineering, and developers need to be aware of that.

Modified Data Distribution

Imagine you have put in all the effort needed to make your solution production-ready. Now a question arises:

Do you think that's enough?
Not quite.

It is often observed that a model built on a local system performs poorly when exposed to production. There are multiple reasons for this.

  1. Model was trained on a small subset of data: This is also described as model drift, where the initially trained model does not perform well once the actual production data is fed into it. It can be addressed by incrementally retraining the model on new data.
  2. Changes in data distribution: This scenario is also called data/concept drift, where the distribution of the production data has changed so much that a new model must be trained on the new distribution.
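The two cases above can be sketched in a few lines. This is a minimal illustration, not a production monitoring setup: it assumes SciPy and scikit-learn are available, uses synthetic data, and the drift threshold of 0.01 is purely illustrative. A two-sample Kolmogorov-Smirnov test flags a distribution shift (case 2), and `partial_fit` updates a model incrementally as fresh labelled data arrives (case 1).

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(42)

# Case 2: compare a feature's distribution at training time vs. in production.
train_feature = rng.normal(loc=0.0, scale=1.0, size=1000)
prod_feature = rng.normal(loc=0.5, scale=1.0, size=1000)  # mean has shifted

stat, p_value = ks_2samp(train_feature, prod_feature)
drift_detected = p_value < 0.01  # illustrative significance threshold

# Case 1: the model was trained on a small subset, so keep refining it
# incrementally as labelled production data arrives.
X_initial = rng.normal(size=(200, 3))
y_initial = (X_initial.sum(axis=1) > 0).astype(int)  # toy labelling rule

model = SGDClassifier(random_state=0)
model.partial_fit(X_initial, y_initial, classes=np.array([0, 1]))

# Later, a fresh batch of production data updates the same model in place.
X_batch = rng.normal(size=(200, 3))
y_batch = (X_batch.sum(axis=1) > 0).astype(int)
model.partial_fit(X_batch, y_batch)

print("drift detected:", drift_detected)
print("accuracy on new batch:", model.score(X_batch, y_batch))
```

In practice you would run a test like this per feature on a schedule, and trigger incremental retraining (or a full retrain, for concept drift) when the check fires.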

Conclusion

That was a lot of information, right? Deploying an AI solution in production and maintaining its accuracy is not a cakewalk. But with the right approach and the right set of tools, this process can be streamlined to an extent.

Try to ponder the topics discussed here. And don't you worry, we won't just leave you with a set of problems. In the next article, we will see how we can find a way out of these challenges and deploy a scalable AI product in production.

Stay tuned. :)


Arpit Jain

Machine Learning Engineer
