MLOps Building Blocks: Chapter 4 - MLflow

MLOps Sep 15, 2021

In our previous posts we explored how to package a trained model and serve it as a REST endpoint.

Well, one thing we all know is that there is no shortcut to success.

So before we even package a solution, don't you think it would be a good idea to learn how to train a model?

Well, that's an easy one. If you haven't read the previous parts, refer to the post below to create a custom image classification model, which we will be using throughout this series.

Custom Vision: Chapter 3 - Image Classification - Tensorflow

So now you know how to train. But the question still stands:

Do we get the best results on the first shot?

The answer is probably not. So how do you keep track of each training run and the configuration and parameters you used for it?

I know it sounds overwhelming, but thanks to the open source community, we have MLflow.

Before we answer the question of how to track, let's ask ourselves why we need to track.

Why do we need to track?

As developers, before we start training custom models, we have loads of parameters to configure for a neural model.

Keeping track of these configurations becomes even more challenging as the number of iterations grows.

Beyond tracking, storing the information, artifacts and metrics history becomes extremely tiring when a large group of researchers is involved.

Each of them tests and trains the model according to their own calculations and thought processes. How do we make sure the same configurations aren't reused, so that we take the most efficient route to state-of-the-art results?

To overcome these challenges, we need a framework that handles tracking, storage and visualisation of the metrics.

This is where MLflow comes to our aid.

What is MLflow?

MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. MLflow currently offers four components:

  1. MLflow Tracking
  2. MLflow Projects
  3. MLflow Models
  4. Model Registry

If the definition doesn't make you go gaga, wait till you see how easy it is to set up.

How to Track using MLflow?

Well, MLflow is just like any other Python package, so all you need is a code editor and a terminal to get started.

Step 1: Install necessary Packages

pip3 install mlflow

And voila, there you go: you have MLflow installed on your system.
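If you would like to confirm the install from Python rather than trusting the pip output, a minimal sketch along these lines may help. It only uses the standard library (Python 3.8 or newer is assumed for `importlib.metadata`), and the helper name `installed_version` is hypothetical, not part of MLflow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("mlflow")
    print("mlflow version:", version or "not installed")
```

If this prints "not installed", the package most likely went into a different Python environment than the one running the script.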

Step 2: Put it to use

As this is our first time with MLflow, let's not do anything fancy. Let's go ahead and create an mlflow_hello_world.py script in our editor.

Populate it with the code below and execute it.

import os
from random import random, randint
from mlflow import log_metric, log_param, log_artifacts

if __name__ == "__main__":
    # Log a parameter (key-value pair)
    log_param("param1", randint(0, 100))

    # Log a metric; metrics can be updated throughout the run
    log_metric("foo", random())
    log_metric("foo", random() + 1)
    log_metric("foo", random() + 2)

    # Log an artifact (output file)
    os.makedirs("outputs", exist_ok=True)
    with open("outputs/test.txt", "w") as f:
        f.write("hello world!")
    log_artifacts("outputs")

The script should run without errors and, by default, record the run in an mlruns folder in the current directory.

Step 3: Visualise your results

As you saw in the previous step, we have logged metrics, parameters and artifacts.

Now let's visualise them. To start the MLflow server, run the following command from the same directory in which you ran the script, so that the server picks up the runs stored there:

mlflow server --host 0.0.0.0

Once the command is running in your terminal, open a browser of your choice and navigate to http://0.0.0.0:5000 (if your browser rejects that address, try http://localhost:5000).

You should see a UI listing all the experiments you have run with MLflow so far, including the current one.

Let's take a quick peek at the immense capabilities of MLflow.

Once you navigate to the experiment, the home screen shows plenty of details, such as:

  1. User - who initialised the experiment
  2. Source - the file from which the run was started
  3. Parameters
  4. Metrics
  5. Artifacts

Conclusion

Congratulations, you have successfully set up your MLflow environment on your local system. Feel free to experiment with the scripts and the UI to explore the features on offer.

For detailed documentation, check out the link below:

MLflow Documentation — MLflow 1.19.0 documentation

As of now, all the information shown on the home screen and the logs of your experiments are stored on your local machine.
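To see what "stored locally" looks like on disk, you could list the files MLflow has written. This is a rough sketch that assumes the default local file store, which keeps runs in an mlruns folder in the working directory; the exact layout may differ between MLflow versions, and the helper `list_tracked_files` is hypothetical.

```python
from pathlib import Path


def list_tracked_files(store_dir="mlruns"):
    """Return the relative paths of all files under the local tracking folder."""
    root = Path(store_dir)
    if not root.is_dir():
        # Nothing has been logged from this directory yet
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    for name in list_tracked_files():
        print(name)
```

After running the hello world script you would typically find entries for the logged parameter, the metric and the test.txt artifact, grouped under an experiment folder and a run folder.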

In our upcoming post, we will explore how to use MLflow in a real-world image classification use case and how to store its data in the cloud.

STAY TUNED 😁

Vaibhav Satpathy

AI Enthusiast and Explorer
