MLOps Building Blocks: Chapter 7 - Local Serving with FastAPI - Text Classification
In our previous articles we explored how to serve image classification models locally.
But as we are all aware, there are many types of models and data formats when it comes to interacting with the world.
So today's article is about serving a custom text classification model locally and running inference over it.
So what do we need?
As always, to run inference you need a trained model. Instead of re-inventing the wheel, let's follow one of our previous blogs on this very topic, where we create a custom text classification model.
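If you don't have that blog handy, here is a rough, self-contained sketch of what the training and export step could look like, so the serving code below has artifacts to load. The toy dataset, label names, and output paths (models_v2, labels.json) are assumptions for illustration only; your own model from the earlier blog will differ. It also assumes TF 2.x (Keras 2), where model.save to a plain directory produces the SavedModel format that load_model reads back, matching what this article does later.

# A minimal sketch of training and exporting a tiny text classification model.
# The toy data and label names here are illustrative assumptions only.
import json
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Toy dataset: two classes, a handful of sentences
texts = np.array([
    "i love this movie", "what a great film", "absolutely wonderful",
    "i hate this movie", "what a terrible film", "absolutely awful",
])
y = np.array([1, 1, 1, 0, 0, 0])

# Bake text vectorization into the model so it accepts raw strings at serving time
vectorizer = layers.TextVectorization(max_tokens=1000, output_sequence_length=16)
vectorizer.adapt(texts)

model = tf.keras.Sequential([
    vectorizer,
    layers.Embedding(1000, 16),
    layers.GlobalAveragePooling1D(),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(texts, y, epochs=10, verbose=0)

# Export the model directory (SavedModel format in TF 2.x / Keras 2)
model.save("models_v2")

# Export an index-to-label mapping alongside it
with open("labels.json", "w") as f:
    json.dump({"0": "negative", "1": "positive"}, f)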

Now that we are all set with our model, let's evaluate our status quo first.

Things we can upgrade
- In general, models are usually stored on cloud storage for easier access.
- In some cases the models are even packaged inside the Docker application that is shipped to the end user.
- There are many open-source tools that help serve models at production scale, but their learning curve is usually steep.
- It is very likely that a single application will serve multiple models behind different URL endpoints (see the sketch after these lists).
Our current situation
- We are running our application on our local system, so the models are also stored there and loaded from the local file system.
- We want to go step by step, so that things don't feel as overwhelming as they are usually portrayed to be, which is why we chose FastAPI and the local system.
- Since this is just a test script meant to build intuition, we have created only one endpoint for one model.
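Here is the multi-model sketch promised above: one FastAPI application exposing two models behind two endpoints. Everything in it (the model paths, the endpoint names, and the second model itself) is a hypothetical illustration, not part of the script we build in this article.

# Hypothetical sketch: one app, two models, two endpoints (not used below)
import numpy as np
from fastapi import FastAPI
from tensorflow.keras.models import load_model

app = FastAPI()

# Placeholder paths to two different exported models
sentiment_model = load_model("<path_to_your_sentiment_model>")
topic_model = load_model("<path_to_your_topic_model>")

@app.post("/predict/sentiment")
async def predict_sentiment(sample_text: str):
    probabilities = sentiment_model.predict(np.asarray([sample_text]))
    return {"Prediction": int(np.argmax(probabilities))}

@app.post("/predict/topic")
async def predict_topic(sample_text: str):
    probabilities = topic_model.predict(np.asarray([sample_text]))
    return {"Prediction": int(np.argmax(probabilities))}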
Now I think we are good to go. Let's grab a cup of coffee and begin our journey.
Step 0: Install all the necessary Packages
pip3 install tensorflow numpy uvicorn fastapi
(Note: json ships with Python's standard library, so there is nothing extra to install for it.)
Step 1: Import all the Packages
import uvicorn
import numpy as np
import json
from fastapi import FastAPI
from tensorflow.keras.models import load_model

Step 2: Read the Model Artifacts and Labels
# Give the local paths to your exported model and labels
model_path = "<path_to_your_exported_model_directory>/models_v2"
labels_path = "<path_to_your_exported_labels>/labels.json"

# Read the labels
with open(labels_path, "r") as f:
    labels = json.load(f)

# Load your model and create its instance
model = load_model(model_path)
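For reference, the labels file is assumed to be a plain index-to-label mapping, something along these lines (the class names here are just an example; yours depend on how you trained and exported your model):

{
    "0": "negative",
    "1": "positive"
}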
Step 3: Create a FastAPI Instance
# Declare a FastAPI instance
app = FastAPI()
Step 4: Create your API End Point
Since we are creating an API endpoint that accepts text as input, we will define a query parameter on our API.
You can choose any endpoint name and path you want to give to your API.
Below we have demonstrated a simple way of creating a POST API with text as input. If you want to explore further, you can take a look at the link mentioned below.
# Create a POST route with a URL endpoint name of your choice
# Your POST request carries the text in its query parameters
@app.post("/test_model")
async def test_function(sample_text: str):
    # Read the text and convert it into a numpy array
    test_input = np.asarray([sample_text])
    # Send your text for prediction and receive a numpy array of probabilities
    prediction = model.predict(test_input)
    # Read the corresponding label from your labels JSON
    index = np.argmax(prediction)
    predicted_label = labels.get(str(index))
    # Return the predicted label
    return {"Prediction": predicted_label}

# Run the script
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
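As an aside, FastAPI also lets you accept the text in a JSON body instead of a query parameter by declaring a Pydantic model. This variant is not used anywhere else in this tutorial, and the endpoint and class names below are our own choices; if you do add it, place it above the if __name__ == "__main__": block so the route is registered before the server starts.

# Optional variant: accept the text as a JSON body via a Pydantic model
from pydantic import BaseModel

class TextRequest(BaseModel):
    sample_text: str

@app.post("/test_model_json")
async def test_function_json(request: TextRequest):
    # Same prediction logic as above, but the text arrives in the request body
    test_input = np.asarray([request.sample_text])
    prediction = model.predict(test_input)
    predicted_label = labels.get(str(np.argmax(prediction)))
    return {"Prediction": predicted_label}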
Step 5: Run your Script
Once you are all set, all that's left is to run the script (for example, python3 main.py if that's what you named the file), open the browser of your choice, and navigate to http://0.0.0.0:8000/docs to bring up FastAPI's interactive documentation.
2021-10-08 15:12:22.250025: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-10-08 15:12:22.250451: W tensorflow/core/platform/profile_utils/cpu_utils.cc:126] Failed to get CPU frequency: 0 Hz
INFO: Started server process [25912]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Step 6: Enjoy Inferencing
Now all that's left is to test our API endpoint. In the interactive documentation at /docs, expand the /test_model route, click "Try it out", enter some text, and execute the request.
Feel free to try out your API with various inputs and validate that your model is performing as expected.
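If you prefer calling the endpoint from code rather than the Swagger UI, here is a small example using the requests library (an extra dependency not installed in Step 0); the sample sentence is just an illustration.

# Call the endpoint from Python (requires: pip3 install requests)
import requests

# The text is passed as a query parameter, matching the endpoint's signature
response = requests.post(
    "http://0.0.0.0:8000/test_model",
    params={"sample_text": "this is a sample sentence to classify"},
)
print(response.json())  # e.g. {"Prediction": "<your_label>"}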

Conclusion
Congratulations! If you were able to follow the steps above, you should now have a working API endpoint on your local system.
If you want to take things to the next level, follow one of the articles from our previous chapters -

Or if you prefer going through the article and want to check out the code, you can view the link below -
I hope you found this article helpful. In our future articles we will cover more end-to-end implementations of AI models for MLOps.
STAY TUNED!