MLOps Building Blocks: Chapter 6 - MLflow on EC2

AWS Oct 13, 2021

In our previous post we saw how to leverage MLflow locally and use S3 as storage.

But we also need to make sure that such a powerful tool is scalable and accessible to the entire team or organisation if needed.

Tracking the metrics of heavy-duty neural network training runs, which engineers arrive at only after extensive development, is of utmost importance.

The advantage of using this framework is that it enables tracking across users, teams and organisations, giving them the leverage to learn from previous training runs and develop better-performing models over time.

Today we will go through the steps needed to set up an MLflow tracking server on AWS EC2 with minimal effort.

Step 0: Setup your AWS Account

If you do not have a user account on AWS, kindly create one, as you will need it to follow the steps. If you already have an account, kindly log in.

You can sign up or log in using the link below -

Cloud Services - Amazon Web Services (AWS)

Step 1: Setup your EC2 Instance

In this step we set up an EC2 instance, which is basically a virtual machine in the AWS cloud.

As we do not want to incur any costs, we will use an instance in the Free Tier, i.e. t2.micro.

In addition, for now we will select the default VPC and subnet. One thing to note: while configuring the Security Groups, we need to allow HTTP and SSH connections to the instance.

You can either select an existing Security Group that already allows inbound SSH and HTTP traffic, or create a new one.

As we want this instance to be accessible over the internet to everyone, don't forget to enable a Public IP for your instance.
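
If you prefer the AWS CLI over the console, the same setup can be sketched roughly as below; the security group ID, AMI ID and key pair name are placeholders you would replace with your own values.

# Allow inbound SSH (22) and HTTP (80) from anywhere on the security group
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 80 --cidr 0.0.0.0/0

# Launch a Free Tier t2.micro instance with a public IP in the default VPC/subnet
aws ec2 run-instances --image-id ami-xxxxxxxxxxxxxxxxx --instance-type t2.micro --key-name my-key-pair --security-group-ids sg-0123456789abcdef0 --associate-public-ip-address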

Step 2: SSH into your instance

There are multiple ways by which you can SSH into your Instance.

  1. Using the PEM file from your local machine
  2. Using EC2 Instance Connect from the AWS console
  3. Setting up a boot-up script in the User Data of your instance during startup

In our case, as we have already launched our instance, we shall SSH into it and install the necessary packages.

As we intend to use MLflow for model tracking and there could be a lot of people using it, there are a couple of packages we have to install to handle the traffic and keep the setup scalable.

  1. Python
  2. MLflow
  3. httpd-tools
  4. Nginx

sudo yum install python3.7
sudo yum install httpd-tools
sudo yum install nginx
sudo pip3 install mlflow

Once you have installed all of the above packages, all that's left is to configure Nginx and set up user credentials.

In order to SSH into the instance, follow the steps shown below -
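
For example, using the PEM file approach from your local machine; this sketch assumes an Amazon Linux instance (default user ec2-user) and a key pair named my-key-pair.pem, which you would replace with your own key and the public DNS of your instance.

chmod 400 my-key-pair.pem
ssh -i my-key-pair.pem ec2-user@PUBLIC_DNS_OF_YOUR_EC2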

Step 3: Configure Nginx

First we add a password for testuser, the user that will access the home page (or dashboard, as some might call it).

sudo htpasswd -c /etc/nginx/.htpasswd testuser

Next we configure Nginx to reverse proxy requests to port 5000, where the MLflow server will run.

sudo nano /etc/nginx/nginx.conf

As the final step, we need to alter the config file by adding the following block inside its server section, as shown below -

location / {
    proxy_pass http://localhost:5000/;
    auth_basic "Restricted Content";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
A sample of the complete Nginx config file after this change is shown below.
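
This is a minimal sketch of the relevant server section, assuming an otherwise stock /etc/nginx/nginx.conf; your default file may contain additional directives around it.

server {
    listen       80;
    server_name  _;

    location / {
        proxy_pass http://localhost:5000/;
        auth_basic "Restricted Content";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}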

Step 4: Let the Games Begin...

Now that everything is configured, all that's left is to start the Nginx and MLflow servers.

To do so, run the following commands in the terminal where you have SSHed into your instance.

sudo service nginx start
mlflow server --host 0.0.0.0
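
Note that mlflow server started this way runs in the foreground and stops when your SSH session ends. One simple workaround, sketched below, is to run it in the background and optionally point artifacts at the S3 bucket from our previous chapter; the bucket name is a placeholder and the --default-artifact-root flag is optional.

nohup mlflow server --host 0.0.0.0 --default-artifact-root s3://your-mlflow-bucket/artifacts > mlflow.log 2>&1 &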

To use the above setup in your Python training code, add the following to your training script -

tracking_uri = "User provides the DNS of the Configured EC2 instance in the step above"
# Sample tracking_uri = http://testuser:test@PUBLIC_DNS_OF_YOUR_EC2

# Set the Tracking URI
mlflow.set_tracking_uri(tracking_uri)
client = mlflow.tracking.MlflowClient(tracking_uri=tracking_uri)
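
For instance, a minimal run logged against this remote server might look like the sketch below; the experiment name, parameter and metric values are placeholders.

# Create/select an experiment and log a dummy run to the remote tracking server
mlflow.set_experiment("ec2-tracking-demo")

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.93)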

Now the rest of the steps to use MLflow in your script remain the same as described in our previous chapters. For a detailed view, follow the link below -

MLOps Building Blocks: Chapter 5 - Experimenting with MLflow and AWS

Once you are all set, run your training scripts and visualise your runs in the MLflow dashboard at your instance's public DNS.

Conclusion

Congratulations! If you have followed the steps above, you should now have a working, scalable MLflow tracking server that you can share with your teammates or even your entire organisation.

I hope this article serves you well and helps you streamline your AI model development.

You can find the complete code base in the link mentioned below -

AI-kosh/mlops/chp_6 at main · Chronicles-of-AI/AI-kosh
In our future articles we will be exploring more models and how to serve them.

STAY TUNED 😁


Vaibhav Satpathy

AI Enthusiast and Explorer
