Google Document AI for Document Processing: An Info-rich Guide

Date: Sept. 29, 2021

The data you work with continues to grow every day. 

Most of this data is unstructured and scattered across documents, emails, files, messages, invoices, and so on. Since these don't come in machine-readable formats, businesses usually resort to time-consuming manual document processing to gain insights from the data. With 80% of enterprise data being unstructured, document processing has become an expensive affair for companies of all sizes. 

This has led to the birth of AI-enabled production-capable document processing systems that can extract, analyse, and store data from differently formatted documents. Such intelligent document processing solutions boost operational efficiency and enable faster decision-making while staying compliant with rules and regulations. 

Google Document AI (DocAI) is an intelligent document processing system that automates and validates documents and streamlines document workflows. If you want to know how to use Google DocAI for document processing, this blog is for you.

What is Google Document AI (DocAI)?

Google Document AI (DocAI) is Google’s flagship solution for automating data capture and scaling data processing. It combines natural language processing (NLP), computer vision, deep-learning neural networks, and optical character recognition (OCR) to accurately capture, extract, and store data. 

If you are looking to build data pipelines that can automatically upload, process, and store structured data, Google DocAI should be your go-to choice. You can also integrate it with existing systems and workflows, and deploy the entire solution in weeks. The best part is that Google Document AI recognises 200+ languages and 50+ hand-written languages. Unleashing the value of unstructured document data is much easier when you adopt an intelligent document processing system like Google DocAI.

Google Document AI comes with a number of processors for classifying, extracting, and enriching data. Some of the most important ones are: 

General Processors

  • Optical Character Recognition (OCR)
  • Form Parser 
  • Document Splitter

Lending DocAI Processors

  • W2 Parser
  • W9 Parser
  • Lending Processors (1003, 1040, 1040 Schedule C, 1099-MISC, 1099-DIV, 1099-INT, 1099-G, bank statement, payslip)
  • Lending Document Splitter and Classifier

Procurement DocAI Processors

  • Procurement Document Splitter
  • Invoice and Expense Processor

The Google Document AI API comes with a structure that makes it easy for you to classify content, extract entities, run advanced searching, and more. It converts unstructured data into digestible data that can be easily analysed. 

The key elements of the Document AI solution are: 

  • Document Repository: Stores documents (scanned, digital) in Google Cloud Storage (GCS) buckets.
     
  • Entity Extraction and OCR: Performs data extraction, document classification, and sentiment analysis with pre-trained and custom-trained machine learning (ML) or artificial intelligence (AI) models.
     
  • Data Warehouse: Stores structured data and derives insights from data with built-in data visualisation supported by BigQuery.

What Are the Benefits of the Google Document AI Platform?

The Google Document AI platform offers several benefits when it comes to improving customer experiences and unlocking business value with intelligent data processing. Here are some of the key ways in which your organisation can benefit from deploying Google Document AI.

  • Faster Decision-making: Efficient data extraction improves operational efficiency and offers data-driven insights for quick business decisions. 
     
  • Streamlined Workflows: Adopting intelligent data processing streamlines both document and compliance workflows while keeping data accurate.
     
  • Meet Customer Expectations: Google’s state-of-the-art AI makes it possible for you to have useful insights crucial for improving advocacy, lifetime value, and customer satisfaction (CSAT).
     
  • Unified Console: Google DocAI is a unified console that comes with the parsers and tools you need to process documents. Not only can you automate and validate documents, but you can also streamline the entire workflow from this console.
     
  • Access Enriched Data: Google DocAI enriches the extracted data with Google Knowledge Graph, meaning all the details on your documents are checked against information available on the internet.
     
  • Provision for Human Review: With human-in-the-loop (HITL) AI, reviewers can manually check the extracted data and incorporate their corrections using purpose-built tools. 
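
The human review idea above can be illustrated with a minimal sketch. It assumes each extracted field carries a confidence score between 0 and 1, and it sends anything below a threshold to a reviewer. The `ExtractedField` class, the `split_for_review` helper, and the 0.8 threshold are hypothetical names and values chosen for illustration; they are not part of Google DocAI.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to be a score between 0.0 and 1.0


def split_for_review(fields, threshold=0.8):
    """Return (accepted, needs_review) lists based on the threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Made-up sample values, for illustration only.
fields = [
    ExtractedField("invoice_id", "INV-001", 0.97),
    ExtractedField("total_amount", "1,250.00", 0.62),
]
accepted, needs_review = split_for_review(fields)
print("Accepted:", [f.name for f in accepted])
print("Needs review:", [f.name for f in needs_review])
```

In practice the threshold would be tuned per field, since a wrong total amount usually costs more than a wrong reference number.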

If you want to get a hands-on understanding of how Google DocAI works, visit this demo section to understand how it extracts structured data from different types of documents. The output will be available for download and analysis in JSON format.

Google Document AI Workflow: How Does It Work?

When handling unstructured documents, Google Document AI ingests, extracts, and stores structured data that would otherwise remain scattered across different business files. It relies heavily on artificial intelligence (AI) and machine learning (ML) throughout this process. Here’s a brief overview of what happens during the document extraction process:

  • Data Ingestion: Data ingestion involves collecting, transporting, and storing data from different sources to a storage system. Typically, the data is stored in a database or data warehouse. Documents may include spreadsheets, files attached to emails, invoices, and so on.
     
  • Data Classification: At the end of the data ingestion process, Google Document AI categorises this data using trained natural language processing (NLP) technology. There are three approaches that you can adopt for data classification:
    • Supervised: A set of tags is created to make it easier for machine learning (ML) models to make predictions. For example, these tags can be price, name, email, etc. A richer set of tags helps a model classify the data more accurately.

    • Unsupervised: In this method, a classifier is assigned to group similar words or phrases together. For example, technical details of an item in an invoice can be grouped in the same cluster. 

    • Rule-based: Rule-based data classification happens when models categorise data based on linguistic rules such as morphology, syntax, semantics, phonology, and lexis. 

  • Data Extraction: At this stage, pre-trained models extract data from unstructured documents using a combination of computer vision, optical character recognition (OCR), and natural language processing (NLP). 
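
Of the classification approaches listed above, the rule-based one is the easiest to sketch in a few lines. The example below is a deliberate simplification: it uses keyword patterns rather than full linguistic rules, and the `RULES` table and `classify` function are hypothetical names made up for illustration. It does not show how Google DocAI classifies documents internally.

```python
import re

# Hypothetical rules: each label maps to patterns that hint at that document type.
RULES = {
    "invoice": [r"\binvoice\b", r"\bamount due\b", r"\bbill to\b"],
    "payslip": [r"\bpay\s?slip\b", r"\bnet pay\b", r"\bgross pay\b"],
    "bank_statement": [r"\bstatement period\b", r"\bclosing balance\b"],
}


def classify(text):
    """Return the label with the most matching rules, or "unknown"."""
    lowered = text.lower()
    scores = {
        label: sum(1 for pattern in patterns if re.search(pattern, lowered))
        for label, patterns in RULES.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label if scores[best_label] > 0 else "unknown"


print(classify("INVOICE #42\nBill to: Acme Ltd\nAmount due: 100.00"))
print(classify("Meeting notes for Tuesday"))
```

Rules like these are transparent and need no training data, but they have to be maintained by hand as new document layouts appear.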

Once extracted, the structured data is usually stored in a data warehouse such as Google BigQuery. You can then derive insights from these structured datasets: both BigQuery’s built-in support and pre-trained ML/AI models help you discover insights from the data.
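
As a rough illustration of preparing extracted data for a warehouse, the sketch below flattens a Document-style JSON response into one row per entity. It assumes the response has an `entities` list whose items carry `type`, `mentionText`, and `confidence`. The `entities_to_rows` helper, the row column names, the sample entity values, and the `gs://example-bucket/...` path are all hypothetical.

```python
def entities_to_rows(document, source_uri):
    """Flatten a Document-style dict into one row per extracted entity."""
    rows = []
    for entity in document.get("entities", []):
        rows.append(
            {
                "source_uri": source_uri,
                "entity_type": entity.get("type", ""),
                "value": entity.get("mentionText", ""),
                "confidence": float(entity.get("confidence", 0.0)),
            }
        )
    return rows


# Made-up sample response, for illustration only.
sample_document = {
    "entities": [
        {"type": "invoice_id", "mentionText": "INV-001", "confidence": 0.98},
        {"type": "total_amount", "mentionText": "1,250.00", "confidence": 0.91},
    ]
}
rows = entities_to_rows(sample_document, "gs://example-bucket/invoice.pdf")
for row in rows:
    print(row)
```

Rows shaped like this could then be loaded into a BigQuery table, for example with the BigQuery Python client's `insert_rows_json` method, provided the table schema matches the row keys.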

How to Use Google Document AI to Extract Data?

If you are just getting started with Google Document AI, you will have plenty of questions: how to enable the API, how to authenticate API requests, how to install the client library for Node.js or Python, how to create a processor, how to extract form key/value pairs, and so on. In this section, we take you through a step-by-step guide to help you understand how to use Google Document AI for extracting data from unstructured documents. 

To get started with Google Document AI, you’ll need:

  • A Google Cloud Project
  • A Browser, such as Chrome or Firefox
  • Knowledge of Node.js

To enable the Cloud Document AI API, you’ll need to go through the self-paced environment setup. Here are the steps to follow:

  • Create a project after signing into Cloud Console and enable billing
  • Activate Cloud Shell and confirm authentication
  • Enable the Cloud Document AI API

The next step involves creating a document processor. You can choose either a general processor such as the Form Parser or a specialised one such as the W9 Parser. Once you do so, you can upload documents to the console. 

During the next stage, you will have to authenticate API requests using a service account. This service account can be created using the Cloud SDK. Once that’s done, you can use the Google Client library available in your preferred language to make API requests. 

Now is the time to make a synchronous process document request using the synchronous endpoint. You can also use the asynchronous API to process large numbers of documents. Once you run the code, you’ll see the extracted text in the console and can store it in a database. 
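
To make the synchronous request more concrete, here is a minimal sketch that only builds the REST URL and JSON body for a process call, without sending it. It assumes the v1 REST endpoint and a `rawDocument` payload with base64-encoded content. The `build_process_request` helper and the project, location, and processor values are placeholders. Actually sending the request would also require an access token for your service account, which is left out here.

```python
import base64
import json


def build_process_request(project_id, location, processor_id, content, mime_type):
    """Return (url, body) for a synchronous Document AI process call."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    body = {
        "rawDocument": {
            # The REST API expects the file bytes as a base64 string.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, body


# Placeholder identifiers and bytes, for illustration only.
url, body = build_process_request(
    "my-project", "us", "my-processor-id", b"%PDF-1.4 sample", "application/pdf"
)
print(url)
print(json.dumps(body, indent=2))
```

If you use a Google client library instead, it assembles an equivalent request for you and handles authentication from the service account credentials.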

The next step involves extracting key-value pairs from the form and corresponding confidence scores. You will find the form fields and their locations in the document response object. That’s how you can easily extract data from handwritten documents or even printed ones. 
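
The key-value extraction step can be sketched as follows. The example assumes a JSON response in which each page lists `formFields`, and each field's name and value point into the document's full `text` through `textAnchor.textSegments` offsets. The `anchor_text` and `extract_form_fields` helpers and the sample response are made up for illustration; a real response contains many more properties.

```python
def anchor_text(full_text, layout):
    """Resolve a layout's text anchor against the document's full text."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for segment in segments:
        # JSON responses may encode offsets as strings and omit a zero startIndex.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts).strip()


def extract_form_fields(document):
    """Return (name, value, confidence) tuples for every form field."""
    full_text = document.get("text", "")
    results = []
    for page in document.get("pages", []):
        for field in page.get("formFields", []):
            name = anchor_text(full_text, field.get("fieldName", {}))
            value = anchor_text(full_text, field.get("fieldValue", {}))
            confidence = field.get("fieldName", {}).get("confidence", 0.0)
            results.append((name, value, confidence))
    return results


# Made-up sample response, for illustration only.
sample = {
    "text": "Name: Jane Doe\nDate: 2021-09-29\n",
    "pages": [{"formFields": [
        {"fieldName": {"textAnchor": {"textSegments": [{"endIndex": "5"}]},
                       "confidence": 0.95},
         "fieldValue": {"textAnchor": {"textSegments": [
             {"startIndex": "6", "endIndex": "14"}]}}},
        {"fieldName": {"textAnchor": {"textSegments": [
            {"startIndex": "15", "endIndex": "20"}]}, "confidence": 0.9},
         "fieldValue": {"textAnchor": {"textSegments": [
             {"startIndex": "21", "endIndex": "31"}]}}},
    ]}],
}
form_fields = extract_form_fields(sample)
for name, value, confidence in form_fields:
    print(f"{name} {value} (confidence {confidence})")
```

Keeping the confidence score next to each pair makes it straightforward to flag uncertain fields before the data is stored.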

How to Combine Google DocAI With Other Google Cloud Products?

Several Google Cloud products are designed to perform specific text and analysis functions. Since they share functionality with Google Document AI, you may find them helpful depending on the functionality you are looking for. 

Image to Text Conversion

  • Cloud Vision API
  • AutoML Vision Object Detection

Document Classification

  • Natural Language API
  • AutoML Natural Language Classification
  • AutoML Vision Classification    

Entity Analysis and Extraction

  • Natural Language API
  • AutoML Natural Language Entity Extraction

Lending DocAI and Procurement DocAI: Specialised Document AI Solutions

Lending DocAI and Procurement DocAI are two specialised Google Document AI solutions designed for mortgage document processing and procurement data capture automation respectively.


Lending DocAI

Manual income and asset document processing is often time-consuming and lengthens the loan application process. Lending DocAI is designed to automate the processing of these documents using a set of specialised models. It can also automate some of the most complex routine document reviews, making it easier for you to make strategic decisions. 

To seamlessly automate mortgage workflows, Lending DocAI uses several parsers (1003 parser, 1040 parser, 1099 parser, bank statement parser, payslip parser, W2 parser, and W9 parser) along with the LDAI splitter and the Document AI API. 

Using Lending DocAI can help you automate data capture as well as speed up the entire home loan process. Here are some of the benefits that you can achieve with Lending DocAI:

  • Faster mortgage workflow
  • Enhanced compliance with data transparency 
  • Improved user experience across the mortgage life cycle
  • Increased operational efficiency throughout the loan process
  • Improved document accuracy for tax statements and asset documents

Procurement DocAI

The procurement cycle is one of the high-value business processes at many organisations. With Procurement DocAI, you can easily convert unstructured data into structured data and influence business decisions. 

To extract data from a variety of document formats, Procurement DocAI uses a number of parsers and APIs, including the invoice parser, expense parser, utility parser, PDAI splitter and classifier, and the Document AI API. 

With Procurement DocAI, you can:

  • Lower the cost of procure-to-pay processing by up to 60%
  • Enrich extracted data with the Google Knowledge Graph 
  • Boost customer experience with a smarter procurement process

Frequently Asked Google Document AI Questions & Answers

1) Is Google DocAI free?

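No. Google Document AI is a paid Google Cloud service with usage-based pricing that depends on the processor you use. It is also often used alongside other Google Cloud products, so you should review the pricing for Cloud Vision, Cloud Natural Language API, or AutoML Natural Language as well.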

2) What does Google DocAI do?

Google Document AI or Google DocAI helps you to scale data extraction from unstructured documents. You can also integrate it with existing systems to get better insights and automate the entire document workflow. 

3) Can you integrate existing systems with Google DocAI?

Yes, Google DocAI can be integrated with your existing system to automate the document processing workflow. If you feel it is getting too technical, you should definitely seek help from qualified IT consultants. NeeVista is a leading IT consulting firm that helps businesses to automate document processing with custom integrations.



Gaurav Sarin

Author

Gaurav is the Director and Principal Consultant at NeeVista. He helps enterprises leverage the power of data, digital technology, and automation.