Document AI: A gentle introduction

Google Cloud Platform Sep 24, 2021

Ever wondered how you could automate document processing? Want something more than OCR?


Leonard is an aspiring IT engineer who has just been hired by Payroll Processing LLC. The firm processes large volumes of payroll documents, most of which arrive in printed form. Extracting values from these documents is a manual effort: it takes an average of 7 minutes to complete one document batch. At this rate, a worker can process no more than 60-70 documents per day. Leonard has been assigned the task of automating this process and reducing the time each worker spends by at least 50 per cent, thereby improving the productivity of the business process.
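As a quick sanity check on those numbers, here is a small back-of-the-envelope calculation. It assumes an 8-hour working day and that one batch corresponds to one document; neither figure comes from Leonard's brief, so treat the result as a rough estimate.

```python
# Assumptions (not from the story): an 8-hour working day, and one
# "batch" corresponding to one document.
MINUTES_PER_DAY = 8 * 60


def documents_per_day(minutes_per_document: float) -> int:
    """Whole documents one worker can finish in a working day."""
    return int(MINUTES_PER_DAY // minutes_per_document)


current = documents_per_day(7)        # manual process today
target = documents_per_day(7 * 0.5)   # after a 50 per cent time reduction

print(f"current: {current} documents/day, target: {target} documents/day")
```

Under these assumptions the current figure lands inside the 60-70 range quoted above, and halving the time per document roughly doubles the daily throughput.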

Let's look at the use cases

  • Data digitization - Many businesses rely on printed documents to conduct day-to-day business. The data in such documents can give them an edge once machine learning and other AI techniques are applied to it. Doc AI facilitates this transition, converting data from printed format into meaningful digital records.
  • Document process automation - Purely manual document processing is slow and has high operational costs. Doc AI enables automation in document processing, with human intervention where needed.
  • Data validation - One of the main tasks of any document processing system is to apply rules and validate whether the data in a document meets business requirements. Doc AI supports many document types out of the box, so these documents can be pre-validated and processed with higher accuracy.
  • Decision making - Once the documents are digitized, the data points can be used to build decision-making models and other business logic for further automation.
  • Lower operational costs - It is apparent that computers are good at doing repetitive tasks with higher accuracy and consistency. Doc AI helps save time and resources, which in turn leads to lower costs.
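To make the validation and human-intervention use cases a little more concrete, here is a minimal sketch of a rule check over already-extracted fields. The field names, the rules and the confidence threshold are all hypothetical and chosen for a payroll example; this is not a Doc AI feature, just plain Python applied to whatever values an extraction step returns.

```python
from typing import Callable, Dict, List, Tuple

MIN_CONFIDENCE = 0.8  # assumed threshold, tune per use case


def is_positive_amount(value: str) -> bool:
    """True if value parses as a number greater than zero."""
    try:
        return float(value.replace(",", "")) > 0
    except ValueError:
        return False


# Hypothetical payroll rules: field name -> check applied to its text value.
RULES: Dict[str, Callable[[str], bool]] = {
    "employee_id": str.isdigit,
    "net_pay": is_positive_amount,
}


def fields_needing_review(fields: Dict[str, Tuple[str, float]]) -> List[str]:
    """Return the names of fields a human should look at.

    `fields` maps a field name to (extracted text, confidence in [0, 1]).
    """
    flagged = []
    for name, rule in RULES.items():
        if name not in fields:
            flagged.append(name)  # required field is missing
            continue
        value, confidence = fields[name]
        if confidence < MIN_CONFIDENCE or not rule(value):
            flagged.append(name)
    return flagged


sample = {"employee_id": ("10482", 0.97), "net_pay": ("2,150.00", 0.62)}
print(fields_needing_review(sample))  # net_pay is flagged: low confidence
```

Documents with an empty result could flow straight through, while anything flagged goes to a person for review.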


Document AI is built on decades of AI innovation done at Google.

Doc AI leverages Google’s industry-leading technologies: computer vision (including OCR) and natural language processing (NLP).

Doc AI has already processed tens of billions of pages of documents across lending, insurance, government and other industries, and it continues to help enterprises across the globe accelerate their document processing and take a huge leap into the world of digitization.

Doc AI can digitize your document and return the output in a structured format such as -

  1. Form Data
  2. Tabular Data
  3. Excel Data
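To give a feel for what the form data looks like, here is a minimal sketch that reads key-value pairs out of a Doc AI response in its JSON form. It assumes the response follows the `Document` layout, where `text` holds the full text and each form field points into it through `textAnchor.textSegments`; the sample document below is hand-written for illustration and is not real Doc AI output.

```python
from typing import Any, Dict, List, Tuple


def anchor_text(document: Dict[str, Any], layout: Dict[str, Any]) -> str:
    """Resolve a layout's textAnchor into the text it points at."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for seg in segments:
        # In the JSON form, indices are strings and a zero startIndex is omitted.
        start = int(seg.get("startIndex", 0))
        end = int(seg.get("endIndex", 0))
        parts.append(document["text"][start:end])
    return "".join(parts).strip()


def form_fields(document: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Collect (field name, field value) pairs from every page."""
    pairs = []
    for page in document.get("pages", []):
        for field in page.get("formFields", []):
            name = anchor_text(document, field.get("fieldName", {}))
            value = anchor_text(document, field.get("fieldValue", {}))
            pairs.append((name, value))
    return pairs


def _anchor(start: int, end: int) -> Dict[str, Any]:
    """Build a layout dict for the hand-written sample below."""
    return {"textAnchor": {"textSegments": [
        {"startIndex": str(start), "endIndex": str(end)}]}}


# Hand-written sample, not real output.
sample = {
    "text": "Employee: Leonard\nNet pay: 2150\n",
    "pages": [{"formFields": [
        {"fieldName": _anchor(0, 9), "fieldValue": _anchor(10, 17)},
        {"fieldName": _anchor(18, 26), "fieldValue": _anchor(27, 31)},
    ]}],
}

print(form_fields(sample))
```

Tabular data can be read in much the same way: each page also carries tables whose cells hold layouts with their own text anchors.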

Now to continue with our story -

Leonard has evaluated the benefits and the task at hand for his current assignment. He now wishes to test out Doc AI's capabilities. Let us see how he goes about doing the capabilities test.

Step 0: Set up your GCP Account

If you already have this set up, feel free to skip this step. If you are doing this for the first time, kindly follow the link below -

GCP - Vertex AI Setup for Devs

Step 1: Upload a Document and Enjoy the show

Follow the steps as demonstrated below -

  1. Create a Document Processor in DocumentAI
  2. Choose the Type of Model (Specialised Models) for Document Extraction
  3. Provide the name and Enjoy
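Once the processor exists, the same upload can also be done from code. Below is a minimal sketch using the `google-cloud-documentai` Python client library. It assumes the library is installed and that application default credentials are configured; the project ID, location and processor ID are placeholders you would copy from your own console.

```python
def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Full resource name of a processor created in the console."""
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def process_file(path: str, mime_type: str, project_id: str,
                 location: str, processor_id: str):
    """Send one local file to the processor and return the parsed Document."""
    # Imported lazily so this module loads without the client library installed.
    from google.cloud import documentai

    # Doc AI uses regional endpoints, so the client is pointed at the
    # processor's location.
    client = documentai.DocumentProcessorServiceClient(
        client_options={"api_endpoint": f"{location}-documentai.googleapis.com"}
    )
    with open(path, "rb") as handle:
        raw = documentai.RawDocument(content=handle.read(), mime_type=mime_type)
    request = documentai.ProcessRequest(
        name=processor_name(project_id, location, processor_id),
        raw_document=raw,
    )
    return client.process_document(request=request).document
```

A call such as `process_file("payslip.pdf", "application/pdf", "my-project", "us", "abc123")` (all placeholder values) would return a document whose `text` attribute holds the recognised text.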


Doc AI is a SaaS product available on the Google Cloud Platform that enables business stakeholders to automate, optimize and digitize data present in printed documents. Since it does more than just OCR (Optical Character Recognition), some define it as IOCR (Intelligent OCR). It helps reduce a lot of manual effort, thereby saving time and money. GCP also provides complementary services that work well with Doc AI -

Related Google Cloud products | Cloud Document AI Documentation

To learn more about how NLP methods can be used with text data, visit our section on NLP for everybody.

Leonard is happy with what he sees, and it looks like he has the right tool for the job. Now he has to figure out a way to enable existing systems to use Doc AI.

In the next post, we will build a Python-based RESTful API service to enable interaction with Doc AI for Document processing by other programs and applications.

So stay tuned!


Nikhil Akki

Along with Vaibhav Satpathy

Full Stack AI Tinkerer
