MLOps Building Blocks: Chapter 6 - MLflow on EC2
In our previous post, we saw how to leverage MLflow locally and use S3 as storage.
But we also need to make sure that such a powerful tool is scalable and accessible to the entire team or organisation if needed.
Tracking the metrics of the heavy-duty training runs that engineers build through extensive development is of utmost importance.
The advantage of using this framework is that it enables tracking across users, teams and organisations, letting them learn from previous training runs and develop better-performing models over time.
Today we will go through the steps needed to set up an MLflow tracking server on AWS EC2 with minimal effort.
Step 0: Setup your AWS Account
If you do not have an AWS account, kindly create one, as you will need it to follow along. If you already have an account, simply log in.
You can sign up or log in using the link below -

Step 1: Setup your EC2 Instance
In this step we set up an EC2 instance, which is essentially a virtual machine in the AWS cloud.
As we do not want to incur any costs, we will use a Free Tier instance, i.e. t2.micro.
In addition, for now we will select the default VPC and subnet. One thing to note: while configuring the security group, we need to allow HTTP and SSH connections to the instance.
You can either select an existing security group that allows inbound SSH and HTTP traffic, or create a new one.
As we want this instance to be accessible over the internet, don't forget to enable a public IP for your instance.
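If you prefer the command line, here is a minimal sketch of the same setup using the AWS CLI; the security group name, AMI ID and key pair below are placeholders you would replace with your own.
# Create a security group that allows inbound SSH (22) and HTTP (80)
aws ec2 create-security-group --group-name mlflow-tracking-sg --description "MLflow tracking server access"
aws ec2 authorize-security-group-ingress --group-name mlflow-tracking-sg --protocol tcp --port 22 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name mlflow-tracking-sg --protocol tcp --port 80 --cidr 0.0.0.0/0
# Launch a Free Tier t2.micro in the default VPC/subnet
# (public IPs are auto-assigned in a default subnet)
aws ec2 run-instances --image-id <AMAZON_LINUX_AMI_ID> --instance-type t2.micro --count 1 --key-name <YOUR_KEY_PAIR> --security-groups mlflow-tracking-sg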

Step 2: SSH into your instance
There are multiple ways to SSH into your instance:
- Using the PEM file from your local machine
- Using EC2 Instance Connect from the AWS console
- Or adding a boot-up script to your instance's User Data so that it runs at launch
In our case, as we have already launched the instance, we will SSH into it and install the necessary packages.
As we intend to use MLflow for model tracking and many people could be using it, there are a couple of packages we have to install in order to handle the traffic and keep the setup scalable.
# Note: package names can vary by AMI; on Amazon Linux 2, for example,
# Python 3.7 ships as python3 and Nginx is installed via
# sudo amazon-linux-extras install nginx1
sudo yum install python3.7
sudo yum install httpd-tools
sudo yum install nginx
sudo pip3 install mlflow
Once you have installed the packages mentioned above, all that's left is to configure Nginx and set up user credentials.
To SSH into the instance, follow the steps shown below -
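If you are going with the PEM file from your local machine, the connection looks something like this (assuming an Amazon Linux AMI, whose default user is ec2-user; the key path and DNS are placeholders):
ssh -i /path/to/your-key.pem ec2-user@<PUBLIC_DNS_OF_YOUR_EC2>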

Step 3: Configure Nginx
First, we add a password for testuser to access the home page, or dashboard as some might call it.
sudo htpasswd -c /etc/nginx/.htpasswd testuser
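You will be prompted to enter the password. Note that the -c flag creates the password file from scratch; to add more users later, run the same command without -c (anotheruser below is just an illustrative name):
sudo htpasswd /etc/nginx/.htpasswd anotheruser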
Next, we configure Nginx as a reverse proxy to port 5000, where the MLflow server will listen.
sudo nano /etc/nginx/nginx.conf
As the final step, we alter the config file by adding the following location block inside the existing server block -
location / {
    proxy_pass http://localhost:5000/;
    auth_basic "Restricted Content";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
A sample of the complete Nginx config file is shown below.

Step 4: Let the Games Begin...
Now that everything is configured, all we have to do is start Nginx and the MLflow server.
To do so, run the following commands in the terminal where you have SSHed into your instance.
sudo service nginx start
mlflow server --host 0.0.0.0
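The command above starts MLflow with its default local file store. If you want to reuse the S3 storage from our previous chapter and keep run metadata in a database on the instance, a sketch of the fuller command looks like this (the bucket name and database path are placeholders):
mlflow server --host 0.0.0.0 --port 5000 --backend-store-uri sqlite:///mlflow.db --default-artifact-root s3://<YOUR_BUCKET>/mlflow-artifacts
You may also want to run the server under nohup or a process manager such as systemd so it keeps running after you close the SSH session.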
To use the above setup in your Python training code, add the following to your training script -
import mlflow

# Tracking URI is the public DNS of the EC2 instance configured above,
# together with the basic-auth credentials you created for Nginx
# e.g. tracking_uri = "http://testuser:test@PUBLIC_DNS_OF_YOUR_EC2"
tracking_uri = "http://testuser:test@PUBLIC_DNS_OF_YOUR_EC2"

# Set the tracking URI
mlflow.set_tracking_uri(tracking_uri)
client = mlflow.tracking.MlflowClient(tracking_uri=tracking_uri)
The rest of the steps to use MLflow in your script remain the same as mentioned in our previous chapters. For a detailed view, follow the link below -

Once you are all set, run your training scripts and visualise your tracking at your public DNS. It should look something like what is shown below -

Conclusion
Congratulations! If you have followed the steps above, you should have a working, scalable MLflow tracking server that you can share with your teammates or even your entire organisation.
I hope this article helps you streamline your AI model development.
You can find the complete code base in the link mentioned below -
In future articles we will explore more models and how to serve them.
STAY TUNED 😁