NLP end-to-end workflow

Vertex AI, Google’s AI platform, provides developers and data scientists with one unified environment to build custom ML models.

How so?

Let’s look step by step at how Vertex AI, specifically AutoML, is used to experiment with and build an NLP model.

There are three main stages: data preparation, model training, and model serving.

The first stage is data preparation.

During this stage, you must first upload data.

The data used in NLP models is text data, which can come from either Cloud Storage or your local machine.

The data can also be either labeled or unlabeled depending on the goal of the NLP project.

A label is a training target.

So, if you want an NLP model to identify the sentiment of a sentence, you must first provide sample sentences that are tagged, or labeled, as either positive or negative.

A label can be added manually, or it can be added by using Google’s paid labeling service through the Vertex console. These human labelers will manually generate accurate labels for you.
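As an illustration, here is a minimal sketch of uploading labeled text with the google-cloud-aiplatform Python SDK. The project ID, bucket, and file names are placeholders, and the import schema shown assumes a single-label classification task:

```python
# Minimal sketch with the google-cloud-aiplatform SDK.
# PROJECT_ID and the gs:// path are placeholders, not real resources.
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")

# Each CSV row pairs a sentence with its label, for example:
#   "I loved this film",positive
#   "The plot made no sense",negative
dataset = aiplatform.TextDataset.create(
    display_name="review-sentiment",
    gcs_source="gs://YOUR_BUCKET/reviews.csv",
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.single_label_classification,
)
print(dataset.resource_name)
```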

The text data can also be unlabeled.

For example, if you want to identify the underlying patterns in text and group similar documents into sets, you can use cluster analysis.
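To make this concrete, here is a quick prototype outside Vertex AI, using scikit-learn (an assumption, not part of the AutoML workflow) to cluster a handful of unlabeled documents:

```python
# Illustrative cluster analysis: group unlabeled documents with
# TF-IDF features and k-means clustering.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "The flight was delayed for three hours.",
    "My flight left on time and landed early.",
    "The hotel room was clean and spacious.",
    "Great hotel, friendly staff, would stay again.",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)  # sparse document-term matrix

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 1 1]: flight docs vs. hotel docs
```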

And if you want to recognize the co-occurrence of pairs of words or phrases, you can use Latent Semantic Indexing (LSI).
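Again purely as an illustration outside Vertex AI, the classic LSI recipe is a truncated SVD over a TF-IDF matrix, which surfaces latent dimensions where co-occurring terms load together; scikit-learn is assumed here:

```python
# Illustrative LSI: truncated SVD over a TF-IDF matrix.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "interest rates rose as the central bank tightened policy",
    "the central bank held interest rates steady",
    "the team won the championship game in overtime",
    "a dramatic goal decided the championship game",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)

lsi = TruncatedSVD(n_components=2, random_state=0)
lsi.fit(X)

# Top-loading terms per latent dimension reveal co-occurring word groups.
terms = tfidf.get_feature_names_out()
for i, comp in enumerate(lsi.components_):
    top = comp.argsort()[-3:][::-1]
    print(f"dimension {i}:", [terms[j] for j in top])
```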

After uploading data, you’ll then prepare it for model training with feature engineering.

The data normally needs to be processed before model training begins.
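AutoML handles most preprocessing for you when you use a managed dataset, but for a custom pipeline a sketch of typical text cleanup might look like this (the exact steps depend on the model and data):

```python
# Illustrative text preprocessing for a custom pipeline.
import re

def preprocess(text: str) -> str:
    """Lowercase, strip URLs and non-alphanumeric chars, collapse spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # drop punctuation/symbols
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

print(preprocess("Check https://example.com — GREAT deal!!!"))
# -> "check great deal"
```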

The second stage of the NLP workflow is model training.

Model training includes two steps: model training and model evaluation.

An NLP model, just like the other ML models, needs a tremendous amount of iterative training.

This is when training and evaluation form a cycle where the NLP model is trained, then evaluated, and trained again.
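As a hedged sketch with the google-cloud-aiplatform SDK, kicking off an AutoML text training job might look like this; the dataset ID, display names, and split fractions are placeholders:

```python
# Hedged sketch; PROJECT_ID and DATASET_ID are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")
dataset = aiplatform.TextDataset("DATASET_ID")

job = aiplatform.AutoMLTextTrainingJob(
    display_name="sentiment-classifier",
    prediction_type="classification",
    multi_label=False,
)

# run() blocks while AutoML performs its internal train/evaluate cycles.
model = job.run(
    dataset=dataset,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
)
print(model.resource_name)
```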

The third and final stage is model serving.

Model serving also includes two steps: model deployment and model monitoring.

An NLP model needs to be moved into production; otherwise, it has no use and remains only a theoretical model.

There are three options to deploy an NLP model.

The first option is to deploy to an endpoint.

This option is best when immediate results with low latency are needed, such as real-time translation.

A model must be deployed to an endpoint before it can be used to serve real-time predictions.
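Continuing the earlier sketch, deploying a trained model and requesting an online prediction might look like this; MODEL_ID is a placeholder, and the exact instance format is an assumption that varies by model type:

```python
# Hedged sketch; PROJECT_ID and MODEL_ID are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")
model = aiplatform.Model("MODEL_ID")

# AutoML text models manage their own serving resources on deploy.
endpoint = model.deploy()

# The instance shape below is illustrative, not a guaranteed schema.
response = endpoint.predict(instances=[{"content": "I loved this film"}])
print(response.predictions)
```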

The second option is to deploy using batch prediction.

This is the best option when no immediate response is required, and accumulated data should be processed with a single request.

For example, sending new ads every other week based on the user’s recent purchasing behavior and what’s currently popular on the market.
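A batch prediction job over accumulated input in Cloud Storage might be sketched like this; all names and paths are placeholders:

```python
# Hedged sketch; PROJECT_ID, MODEL_ID, and gs:// paths are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")
model = aiplatform.Model("MODEL_ID")

batch_job = model.batch_predict(
    job_display_name="biweekly-ads-scoring",
    gcs_source="gs://YOUR_BUCKET/new_texts.jsonl",
    gcs_destination_prefix="gs://YOUR_BUCKET/predictions/",
    sync=True,  # block until the accumulated data is processed
)
print(batch_job.output_info)
```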

And the final option is to deploy using offline prediction.

This is the best option when the model should be deployed in a specific environment outside the cloud.
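As a rough sketch, exporting a model’s artifacts for use outside the cloud might look like the following; whether export is available, and which format IDs are accepted, depends on the model type, so both are assumptions here:

```python
# Hedged sketch; PROJECT_ID, MODEL_ID, the format ID, and the gs://
# path are all placeholders or assumptions.
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")
model = aiplatform.Model("MODEL_ID")

model.export_model(
    export_format_id="tf-saved-model",  # illustrative format ID
    artifact_destination="gs://YOUR_BUCKET/exported-model/",
)
```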

Now it’s important to note that an NLP workflow isn’t linear; it’s iterative.

For example, during model training, you might need to return to investigate the raw data and generate more useful features to feed the model.

When monitoring the model during model serving, you might find data drift: the incoming data no longer matches the data the model was trained on, and the accuracy of your predictions might suddenly drop.

You might need to check the data sources and adjust the model parameters.
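As a simple illustration of what such a check can look like, outside Vertex AI’s own monitoring tools and with made-up numbers, you might compare a feature’s distribution between training time and serving time:

```python
# Illustrative drift check: compare the distribution of a feature
# (here, document word count) between training and serving data.
from scipy.stats import ks_2samp

train_lengths = [12, 15, 9, 14, 11, 13, 10, 16]   # word counts at training time
serve_lengths = [31, 28, 35, 30, 33, 29, 32, 34]  # word counts in production

stat, p_value = ks_2samp(train_lengths, serve_lengths)
print(f"KS statistic={stat:.2f}, p={p_value:.4f}")
if p_value < 0.01:
    print("Possible data drift: input length distribution has shifted.")
```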

Fortunately, these steps can be automated with machine learning operations, or MLOps.