Document AI - RESTful implementation using Python

Google Cloud Platform Oct 5, 2021

In our previous post, we looked at Document AI and its use cases. We did a test run with a sample document (in PDF format) from the Google Cloud Platform console.

Scenario (back to our story)

Leonard and the management at Payroll Processing LLC were happy with the test results from the sample documents. They have one explicit requirement: their existing systems must have zero downtime.

Solution - Micro-services architecture 🏛️

Our friends at have written extensively about Micro-services. Do check it out, link below.
Golang. Cloud. Backend. Right here!

This architecture keeps the application loosely coupled and lets the new service be developed as a separate project, without touching the existing systems.

Payroll Processing LLC's entire software stack is written in C# on the .NET Framework. At the time of writing this article, GCP's Document AI offers client libraries for Java, Node.js, and Python. This means that even if they wanted to, they can't integrate Document AI into their current application through an official client library.
Quickstart: Using client libraries | Cloud Document AI Documentation
Client Libraries that let you get started programmatically with Document AI in java,nodejs,python.

We will build a Python-based microservice for handling requests for calling Document AI services on GCP via RESTful API.

Why Python? 🐍

Python has been the de facto language of choice for data and machine learning problems. The Python ecosystem has some of the best tools for data manipulation, such as the PyData stack, notably Pandas for manipulating structured data (data that resembles an Excel spreadsheet).

Implementation 🛠️

Sync mode
Sync mode is essential for doing quick tests and demos for small files. The result will be available within the same call.
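
A sync call can be sketched roughly as follows, assuming the google-cloud-documentai client library is installed and a processor has already been created. The function names and the project, location, and processor values are placeholders for illustration, not code taken from the project repository.

```python
def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the fully qualified processor resource name used by Document AI."""
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def process_sync(
    content: bytes,
    mime_type: str,
    project_id: str,
    location: str,
    processor_id: str,
) -> str:
    """Send one small document to Document AI and return the extracted text."""
    # Imported lazily so processor_name() works without the client library.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    # Requests go to the regional endpoint that matches the processor location.
    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-documentai.googleapis.com"
        )
    )
    request = documentai.ProcessRequest(
        name=processor_name(project_id, location, processor_id),
        raw_document=documentai.RawDocument(content=content, mime_type=mime_type),
    )
    # The result comes back in the same call, which is what makes this "sync".
    result = client.process_document(request=request)
    return result.document.text
```

Calling `process_sync(pdf_bytes, "application/pdf", ...)` needs valid GCP credentials in the environment; the sketch leaves out error handling and retries.
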
Async mode
Async mode is rather elaborate as it requires multiple components such as Cloud Storage and Cloud Functions.

After the RESTful service hands off the job to Document AI, the request ends. However, the job continues asynchronously on GCP, and this is abstracted away from the developers.

Once the document is processed and the output is ready, it is saved to a Cloud Storage bucket (the bucket details must be configured beforehand).
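
Handing a job off in async mode might look like the sketch below. It assumes the google-cloud-documentai client library and that both the input document and the output location live in Cloud Storage; the function names and bucket paths are hypothetical.

```python
def gcs_uri(bucket: str, path: str) -> str:
    """Join a bucket name and an object path into a gs:// URI."""
    return f"gs://{bucket}/{path.lstrip('/')}"


def submit_batch(
    processor: str,
    input_uri: str,
    output_uri: str,
    mime_type: str = "application/pdf",
) -> str:
    """Start an asynchronous Document AI job and return its operation name."""
    # Imported lazily so gcs_uri() works without the client library.
    from google.cloud import documentai

    # Default endpoint; a non-US processor would need its regional endpoint.
    client = documentai.DocumentProcessorServiceClient()
    request = documentai.BatchProcessRequest(
        name=processor,
        input_documents=documentai.BatchDocumentsInputConfig(
            gcs_documents=documentai.GcsDocuments(
                documents=[
                    documentai.GcsDocument(gcs_uri=input_uri, mime_type=mime_type)
                ]
            )
        ),
        document_output_config=documentai.DocumentOutputConfig(
            gcs_output_config=documentai.DocumentOutputConfig.GcsOutputConfig(
                gcs_uri=output_uri
            )
        ),
    )
    # Returns immediately with a long-running operation; we do not wait on it.
    operation = client.batch_process_documents(request=request)
    return operation.operation.name
```

The service can return the operation name to the caller right away, since the output will land in the bucket later.
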

We configure a Cloud Function to listen on the Cloud Storage bucket. Whenever a new file is added, the Cloud Function sends a callback to our RESTful API (hosted on Cloud Run).
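
A minimal sketch of such a function is shown below, written as a background Cloud Function triggered when an object is finalized in the bucket. The callback URL, the payload fields, and the function names are assumptions made for illustration; authentication against Cloud Run is left out.

```python
import json
import os
import urllib.request

# Hypothetical callback endpoint of the RESTful service on Cloud Run.
CALLBACK_URL = os.environ.get("CALLBACK_URL", "https://example.invalid/callback")


def build_callback_payload(event: dict) -> dict:
    """Extract the bucket and object name from a Cloud Storage event."""
    bucket, name = event["bucket"], event["name"]
    return {"bucket": bucket, "name": name, "uri": f"gs://{bucket}/{name}"}


def on_output_written(event, context):
    """Entry point: notify the RESTful API that a new output file exists."""
    body = json.dumps(build_callback_payload(event)).encode("utf-8")
    request = urllib.request.Request(
        CALLBACK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # A non-2xx response raises HTTPError, which marks the invocation as failed.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the payload builder separate from the HTTP call makes the function easy to unit test without network access.
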

Finally, the RESTful API can save the response in a database such as PostgreSQL or MySQL (hosted on Cloud SQL) and then call a downstream system to let it know that the document has been processed.
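
The persistence step can be sketched as below. To keep the example self-contained it uses `sqlite3` from the standard library as a stand-in for PostgreSQL on Cloud SQL; the table and column names are hypothetical, and a real PostgreSQL version would use a driver with its own placeholder and upsert syntax.

```python
import sqlite3
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the table that tracks processed documents, if it is missing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_documents ("
        "output_uri TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )


def save_result(
    conn: sqlite3.Connection, output_uri: str, status: str = "processed"
) -> None:
    """Record a callback; repeated callbacks for the same file overwrite the row."""
    conn.execute(
        "INSERT OR REPLACE INTO processed_documents (output_uri, status) "
        "VALUES (?, ?)",
        (output_uri, status),
    )
    conn.commit()


def get_status(conn: sqlite3.Connection, output_uri: str) -> Optional[str]:
    """Return the stored status for an output file, or None if unknown."""
    row = conn.execute(
        "SELECT status FROM processed_documents WHERE output_uri = ?",
        (output_uri,),
    ).fetchone()
    return row[0] if row else None
```

Making the write idempotent matters here because a storage trigger may deliver the same event more than once.
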

Essential GCP Components - ☁️

  1. Cloud Run for hosting the RESTful API as a container on a managed, serverless platform.
  2. Cloud SQL for a managed and highly available PostgreSQL database on GCP.
  3. Cloud Storage for storing documents during upload to Document AI and for saving output responses from Document AI in async or large-file processing mode.
  4. Cloud Functions for implementing a callback mechanism to let our RESTful API service know that the processing is complete.
  5. Google Container Registry (GCR) for storing container images of our RESTful service. Cloud Run pulls the image from GCR.

Other salient requirements

  1. Define API Contracts - it is a best practice to define the API contracts beforehand so that you can cover all corner cases of your application.
  2. Packaging & Scalability - to deploy on Cloud Run or any container orchestration tool, it helps if the application is packaged and isolated from the get-go.
  3. Security - Google Container Registry comes with a vulnerability scanning tool that can detect and suggest solutions for potential vulnerabilities found in our application.
  4. Test-Driven Development (TDD) - TDD is essential for any project or product's success in the longer term. Good test cases help catch regressions and keep the application consistent when new features are added.
  5. Documentation - Good documentation etiquette goes a long way for the development team and for integrations. It also makes it easier to onboard new devs and helps reduce the time spent on KT (knowledge transfer) sessions.
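
To make the first and fourth points concrete, here is one way an API contract and a test for it could be sketched with plain dataclasses. The field names, the allowed MIME types, and the mode values are illustrative assumptions, not the contract used in the project repository.

```python
from dataclasses import asdict, dataclass

# Illustrative subset of input types; adjust to what your processor accepts.
ALLOWED_MIME_TYPES = frozenset(
    {"application/pdf", "image/png", "image/jpeg", "image/tiff"}
)
ALLOWED_MODES = ("sync", "async")


@dataclass(frozen=True)
class ProcessDocumentRequest:
    """Request contract: which document to process and how."""

    document_uri: str
    mime_type: str
    mode: str = "sync"

    def __post_init__(self) -> None:
        if self.mime_type not in ALLOWED_MIME_TYPES:
            raise ValueError(f"unsupported mime_type: {self.mime_type}")
        if self.mode not in ALLOWED_MODES:
            raise ValueError(f"mode must be one of {ALLOWED_MODES}")


@dataclass(frozen=True)
class ProcessDocumentResponse:
    """Response contract: a job handle and its current status."""

    job_id: str
    status: str


def to_payload(response: ProcessDocumentResponse) -> dict:
    """Serialize a response into a JSON-ready dictionary."""
    return asdict(response)


def test_contract_rejects_unknown_mode() -> None:
    """A corner case written as a test before the endpoint exists."""
    try:
        ProcessDocumentRequest("gs://bucket/doc.pdf", "application/pdf", mode="batch")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unknown mode")
```

Writing the contract first means invalid input is rejected at the boundary, and each corner case gets a test before any Document AI call is wired in.
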

Closing thoughts 🤔

This article tries to give you the overall picture and the main implementation details. However, if you want to dig deeper and wish to take a closer look, we highly recommend that you check out our repository for this project.

GitHub - Chronicles-of-AI/gcp-docai-pyservice: Python based RESTful API written in FastAPI for GCP Document AI

We strive to make the content better for you and with every article, there is a significant amount of R&D 💭 effort which goes in. Your feedback means a lot and keeps us on our toes to deliver better content. Let us know your thoughts, share some love 💓 - please like 👍 and share! 🤝

Until next time! 😎


Nikhil Akki

Full Stack AI Tinkerer
