NLP solutions — Eduardo Avelar

In this lesson you are introduced to NLP solutions, specifically Contact Center AI (CCAI) and Document AI (Doc AI).

Contact Center Artificial Intelligence (CCAI) is Google’s solution to apply AI in contact centers.

Its goal is to increase operational efficiency with AI while requiring minimal AI expertise.

This can be done by integrating Contact Center AI with the company’s existing infrastructure.

Let’s look at the key features of Contact Center AI:

As you would expect, it captures what a customer has said.

Speech is converted to text using automatic speech recognition (ASR), also commonly referred to as Speech-to-Text (STT), so that the software can further process that text.

It also understands what the customer has said.

The software uses natural language understanding (NLU) technology to interpret the user’s words and make sense of their intent.

Then it talks.

When an appropriate response has been determined (as a text response),

speech synthesis, also commonly referred to as Text-to-Speech (TTS), is used to convert the text into an audio stream or file,

which the software can play back to the customer.
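The capture, understand, and talk steps above can be pictured as a three-stage pipeline. The sketch below is a toy illustration only: the function names, the keyword-based intent matching, and the canned replies are all hypothetical stand-ins, not the Contact Center AI API. In a real system, each stage would be handled by a speech recognition, NLU, and speech synthesis service respectively.

```python
from dataclasses import dataclass

# Hypothetical intents and keywords, for illustration only.
INTENT_KEYWORDS = {
    "check_balance": ["balance", "how much"],
    "report_lost_card": ["lost", "stolen"],
}

# Hypothetical canned replies, one per intent.
REPLIES = {
    "check_balance": "I can help you check your balance.",
    "report_lost_card": "I'm sorry to hear that. Let's block your card.",
    "unknown": "Let me connect you with an agent.",
}


@dataclass
class Turn:
    transcript: str   # output of the speech-to-text stage
    intent: str       # output of the understanding stage
    reply_text: str   # text that the text-to-speech stage would voice


def speech_to_text(audio: bytes) -> str:
    """Stand-in for ASR: here the 'audio' is just UTF-8 encoded text."""
    return audio.decode("utf-8")


def detect_intent(transcript: str) -> str:
    """Stand-in for NLU: naive keyword matching, not a real language model."""
    lowered = transcript.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "unknown"


def text_to_speech(text: str) -> bytes:
    """Stand-in for TTS: returns encoded text instead of an audio stream."""
    return text.encode("utf-8")


def handle_call_turn(audio: bytes) -> tuple[Turn, bytes]:
    """Run one customer utterance through all three stages in order."""
    transcript = speech_to_text(audio)
    intent = detect_intent(transcript)
    reply_text = REPLIES[intent]
    return Turn(transcript, intent, reply_text), text_to_speech(reply_text)


turn, reply_audio = handle_call_turn(b"Hi, I lost my credit card yesterday")
print(turn.intent)
```

The point of the sketch is the ordering: text comes out of the first stage, an intent out of the second, and only then is a reply chosen and voiced.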

It personalizes a brand or region.

The software can generate different voices for the audio stream.

For example, you might feel that a female voice best represents the “voice” of your business.

You might also want to customize the voice to more closely reflect the accents in your customers’ regions.

And finally, it combines with other best-in-class technology.

The software provides one-click integration with the top telephony providers such as AT&T, so you can provide a seamless service to your customers.

This is just a taste of what Contact Center AI can offer and the list goes on.

To understand Contact Center AI more clearly, think of it as having three components:

The first is Dialogflow.

Dialogflow can be used to create a virtual agent that talks with your customers.

It uses natural language understanding (NLU) technology, which also supports Agent Assist.

The second is Agent Assist.

Agent Assist helps the human agent when a virtual agent passes the call on.

It is an AI assistant to support human agents.

It monitors the human-to-human conversation, determines what the customer needs, and provides suggestions to the human agent based on the knowledge database.

And the third one is Insights.

Contact Center AI Insights works like a data analyst who analyzes conversations to uncover meaning.

It provides the tools that Contact Center management needs to analyze historical conversations in order to see what’s working well and what needs improvement.

With the combination of these three components,

In addition to Contact Center AI, Google provides other horizontal AI solutions based on NLP technology such as Document AI, or DocAI.

In other words, Document AI is a document understanding solution.

It takes unstructured data, such as emails, images, docs, and PDF files

and provides structures to make the data easier to understand, analyze, and consume.

For example, Document AI can read a driver’s license, learn the fields such as state and name,

parse that information into a structured form, and save it in a JSON file for easy search, classification, and analytics.
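To make the idea of structured output concrete, here is a sketch of what the parsed driver’s license might look like once it is saved as JSON. The field names and values are invented for illustration; they are not Document AI’s actual output schema.

```python
import json

# Hypothetical fields extracted from a driver's license image.
# Names and values are made up; the real output schema will differ.
parsed_license = {
    "document_type": "driver_license",
    "fields": {
        "state": "California",
        "name": "Jane Doe",
        "date_of_birth": "1990-01-31",
    },
}

# Serialize to JSON text, as it might be saved to a file.
json_text = json.dumps(parsed_license, indent=2)

# Once the data is structured, lookups become simple key accesses.
restored = json.loads(json_text)
print(restored["fields"]["state"])
```

Compared with the original image, each value now has a name, which is what makes search, classification, and analytics straightforward.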

Try this demo yourself, and you’ll have a better idea.

What are the major technologies behind Document AI?

Document AI is built on decades of AI innovation at Google that bring powerful and useful solutions to common challenges.

and pre-trained ML models for high-value, high-volume documents.

How does Document AI work?

Let’s look at the general Document AI workflow.

First you send your documents for processing.

Document AI processes the documents and returns one or more document objects, which contain the extracted, structured information.

Second, you can either choose or create a processor to process the document.

Choosing the right processor is the key step in Document AI.

Third, you can add human reviews to evaluate the accuracy of the output, interpret the results, and solve issues that the processor overlooked.

And last, the results are saved in a database or a data warehouse, where analytics can draw further insights for your business.
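The review and storage steps of this workflow can be sketched as follows. This is a minimal illustration that assumes each extracted field comes with a confidence score; the `ExtractedField` type, the 0.8 threshold, and the sample values are all hypothetical, not part of Document AI itself.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.8  # hypothetical cutoff; tune for your own documents


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed score between 0.0 and 1.0


def split_for_review(fields, threshold=REVIEW_THRESHOLD):
    """Separate fields that can be stored directly from those needing review."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Made-up processor output for a single invoice.
fields = [
    ExtractedField("invoice_id", "INV-001", 0.98),
    ExtractedField("total_amount", "1,250.00", 0.95),
    ExtractedField("due_date", "2O24-03-01", 0.42),  # likely misread character
]

accepted, needs_review = split_for_review(fields)

# Rows ready to load into a database or data warehouse table.
rows = [{"field": f.name, "value": f.value} for f in accepted]
print([f.name for f in needs_review])
```

Low-confidence fields go to a human reviewer, and only the accepted rows move on to storage for analytics.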

You can either choose an existing processor that is suitable for your use case

or create a new processor using the Google Cloud Console.

When you choose an existing processor, you can choose either a general or

a specialized processor.

When you create a new processor, you can either create a processor on your own

or use Google’s help.

Document AI creates a prediction endpoint where you can send your documents.

Selecting a particular type of processor has multiple implications.

A general processor is a ready-to-use processor, and it serves as the starting point for many general document processing tasks.

A specialized processor is designed to handle one particular type of document.

For example, you can use it to parse W-2 forms or rental contracts.

If you want to parse a document that has a specific custom format, you can build your own custom processors.

Finally, you can ask Google to build a custom processor to meet your specific needs.

For document scope, general processors are designed to meet general document processing needs.

You can provide any document, and the general processor will do its best to extract information from the document.

The remaining three types of processors are designed to handle a specific type of document, such as invoices or W-2 forms.

Regarding the quality of the output, general processors are good enough for simple document processing tasks.

If you need better output quality, choose one of the other three processors.

The quality of a custom model depends on the training data and the model training.

Specialized processors and custom processors built with the help of Google provide the highest quality to process your document.

In terms of the work required, both general processors and specialized processors are ready to use.

You can parse your document and extract the information by making a simple REST API call.

If you build a custom processor with Google, you need to prepare your data, and then Google will manage the remaining work.

If you build a custom processor yourself, you take on the most work.
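For the ready-to-use processors, the REST call could look roughly like the sketch below, which builds (but does not send) an HTTP request using only the Python standard library. The project ID, location, processor ID, and access token are placeholders, and the URL pattern and request body are assumptions based on the v1 REST API as commonly documented; verify them against the current API reference before relying on them.

```python
import base64
import json
import urllib.request


def build_process_request(project_id, location, processor_id,
                          file_bytes, mime_type, access_token):
    """Build, without sending, a POST request for the processor's endpoint."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    body = {
        "rawDocument": {
            # The document bytes are sent base64-encoded inside the JSON body.
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them with your own before sending.
request = build_process_request(
    project_id="my-project",
    location="us",
    processor_id="my-processor-id",
    file_bytes=b"%PDF-1.4 placeholder bytes",
    mime_type="application/pdf",
    access_token="ACCESS_TOKEN",
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before any document leaves your machine.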

Regarding long-term intellectual property (IP) ownership, you own the IP when you choose the build-it-yourself option.

In all other cases, Google owns the IP.

For pricing, a general processor offers the most affordable choice.

Build-it-yourself processors are also price-friendly, although some costs might be associated with the model building.

Specialized processors provide the best possible quality, but they also cost the most.
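The trade-offs described so far can be captured in a small lookup structure. The sketch below encodes only three of the criteria from this lesson (document scope, readiness, and IP ownership); the `ProcessorOption` type and the `shortlist` helper are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessorOption:
    name: str
    handles_any_document: bool  # document scope
    ready_to_use: bool          # work required
    you_own_ip: bool            # long-term IP ownership


# One entry per processor type, filled in from the comparison above.
OPTIONS = [
    ProcessorOption("general", True, True, False),
    ProcessorOption("specialized", False, True, False),
    ProcessorOption("custom, built yourself", False, False, True),
    ProcessorOption("custom, built with Google", False, False, False),
]


def shortlist(need_ready_to_use=False, need_ip_ownership=False):
    """Return the names of the processor types that meet the given needs."""
    return [
        option.name
        for option in OPTIONS
        if (option.ready_to_use or not need_ready_to_use)
        and (option.you_own_ip or not need_ip_ownership)
    ]


print(shortlist(need_ready_to_use=True))
print(shortlist(need_ip_ownership=True))
```

Quality and pricing are left out on purpose, because they depend on your documents and training data rather than on a simple yes-or-no property.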

In sum, here are the key benefits for each processor type:

A general processor is ready to use for any general document processing task.

A specialized processor provides the highest performance in specific fields such as contracts and lending.

A custom processor that you build yourself gives users full control of creating a specific processor to solve a unique problem.

A custom processor built with Google can be a win-win for both users and Google.

It takes advantage of Google’s engineering expertise and the field knowledge of users to create a new, co-branded processor.