NLP with AutoML
In this section, you’ll focus on AutoML and explore how it automates the ML workflow.
To understand AutoML, which is short for automated machine learning, let’s briefly look at how it was built.
If you’ve worked with ML models before, you know that training and deploying them can be time-consuming, because you need to repeatedly add new data and features, try different models, and tune parameters to achieve the best result.
To solve this problem, you have AutoML.
When AutoML was first announced in January of 2018, the goal was to automate machine learning pipelines to save data scientists from manual work, such as tuning hyperparameters and comparing multiple models.
But how does AutoML accomplish this?
Two technologies are vital behind the scenes.
The first is known as transfer learning.
With transfer learning, you build a knowledge base in the field.
You can think of this like gathering lots of books to create a library.
Transfer learning is a powerful technique that allows people with smaller datasets, or less computational power, to achieve state-of-the-art results by taking advantage of models that were pre-trained on similar, larger datasets.
Because the model learns through transfer learning, it doesn’t have to learn from scratch, so it can reach higher accuracy with much less data and computation time than models that don’t use transfer learning.
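To make the idea concrete, here is a minimal Python sketch of transfer learning, under stated assumptions. The `PRETRAINED` word vectors are made-up values standing in for a model trained on a large corpus, and `embed`, `train_head`, and `predict` are hypothetical helper names. This is not how AutoML is implemented; it only illustrates reusing frozen, pre-trained features and training a small model on top of them with very little data.

```python
import math

# Hypothetical "pre-trained" word vectors. In practice these would come from a
# model trained on a large corpus; the numbers here are invented for illustration.
PRETRAINED = {
    "great": (1.0, 0.1),
    "love": (0.9, 0.2),
    "excellent": (0.95, 0.0),
    "terrible": (-1.0, 0.1),
    "hate": (-0.9, 0.2),
    "awful": (-0.95, 0.0),
    "the": (0.0, 1.0),
    "service": (0.0, 0.8),
}


def embed(text):
    """Frozen feature extractor: average the pre-trained vectors of known words."""
    vectors = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def train_head(examples, epochs=200, lr=0.5):
    """Train only a small logistic-regression head; the embeddings stay fixed."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = embed(text)
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            err = p - label  # gradient of the log loss with respect to the logit
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b


def predict(head, text):
    """Return 1 (positive) or 0 (negative) for a piece of text."""
    w, b = head
    x = embed(text)
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


# Only two labeled examples: the "smaller dataset" from the discussion above.
small_dataset = [("love the service", 1), ("hate the service", 0)]
head = train_head(small_dataset)

print(predict(head, "excellent service"))
print(predict(head, "awful service"))
```

The words "excellent" and "awful" never appear in the two training examples. The head can still separate them because the frozen vectors already place them near "love" and "hate", which is the benefit transfer learning provides.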
In addition to transfer learning, the second technology is Vertex AI neural architecture search.
The goal of neural architecture search is to find the optimal model by comparing multiple candidate models.
Think of this like searching for the best book in the library to help you learn what you need.
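The search idea can be sketched in a few lines of Python. This is a simplified stand-in, not the Vertex AI implementation: it exhaustively tries every candidate in a tiny, hypothetical search space, whereas real neural architecture search explores far larger spaces with smarter strategies. The `validation_score` function is an invented formula that takes the place of "train this candidate and measure its validation accuracy".

```python
import itertools

# Hypothetical search space: each candidate "architecture" is a choice of
# depth and width. Real search spaces are far richer.
SEARCH_SPACE = {
    "num_layers": [1, 2, 4],
    "units_per_layer": [16, 64, 256],
}


def validation_score(arch):
    """Stand-in for training a candidate and measuring validation accuracy.

    The formula is invented for illustration: it rewards a total capacity
    close to 128 units and penalizes models that are too small or too large.
    """
    capacity = arch["num_layers"] * arch["units_per_layer"]
    return 1.0 - abs(capacity - 128) / 1024


def search(space, score_fn):
    """Score every combination in the space and keep the best one."""
    keys = list(space)
    best_arch, best_score = None, float("-inf")
    for values in itertools.product(*(space[k] for k in keys)):
        arch = dict(zip(keys, values))
        score = score_fn(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


best_arch, best_score = search(SEARCH_SPACE, validation_score)
print(best_arch, best_score)
```

Swapping `validation_score` for a function that really trains and evaluates each candidate would turn this loop into a basic, if expensive, model comparison.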
AutoML is powered by the latest machine-learning technology.
When a new model is trained, AutoML actually trains it on top of pre-trained models and compares multiple candidate models and their hyperparameters to find the optimal one.
All of this happens automatically behind the scenes.
One of the biggest benefits is that it’s a no-code solution.
That means you can train high-quality custom machine learning models with minimal effort and little machine learning expertise.
This allows data scientists to focus their time on tasks like defining business problems in NLP or evaluating and improving model results.
Others might find AutoML useful as a tool to quickly prototype models and explore new features of a dataset before investing in development.
So, what NLP problems does AutoML solve?
AutoML supports four types of data: image, tabular, text, and video.
Text is normally the data type used for NLP.
For each data type, AutoML solves different types of problems, called objectives.
For text data, AutoML solves three major problems: classification, entity extraction, and sentiment analysis.
You can use a classification model to analyze text data and return a list of categories that apply to the text found in the data.
For example, you can classify customer questions and comments into different categories and then redirect them to the corresponding departments.
An entity extraction model can be used to inspect text data for known entities referenced in the data and label those entities in the text.
For example, you can label a social media post in terms of predefined entities such as time, location, and topic.
This can help with online search.
It’s similar to the concept of a hashtag, but created by a machine.
And a sentiment analysis model can be used to inspect text data and identify the prevailing emotional opinion within it, especially to determine whether a writer’s comments are positive, negative, or neutral.
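To show what each of the three objectives returns, here is a small Python sketch. It is only an illustration of the input and output shapes. The functions `classify`, `extract_entities`, and `sentiment` are hypothetical, and their keyword lists and patterns are hand-written rules. An AutoML model would instead learn this behavior from labeled examples.

```python
import re


def classify(text):
    """Classification: return the list of categories that apply to the text."""
    keywords = {
        "billing": ["invoice", "refund", "charge"],
        "shipping": ["delivery", "package", "shipping"],
    }
    lowered = text.lower()
    return [cat for cat, words in keywords.items()
            if any(word in lowered for word in words)]


def extract_entities(text):
    """Entity extraction: label spans of the text with predefined entity types."""
    patterns = {
        "time": r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b",
        "location": r"\b(?:Paris|Tokyo|Berlin)\b",
    }
    entities = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            entities.append({"label": label, "text": match.group(),
                             "start": match.start(), "end": match.end()})
    return entities


def sentiment(text):
    """Sentiment analysis: return positive, negative, or neutral."""
    positive = {"great", "love", "happy"}
    negative = {"terrible", "hate", "angry"}
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("I was charged twice, please send a refund"))
print(extract_entities("Meet me in Paris at 5 pm"))
print(sentiment("I love this, it is great"))
```

In the customer-support example above, the categories returned by `classify` are what you would use to route each comment to the corresponding department.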
In reality, you might not be restricted to just one data type or one objective.
Instead, you might need to combine multiple data types and different objectives to solve a business problem.
For example, for a translation job, you may first need to upload either image or video data and turn it into text, and then translate the text into different languages.
AutoML is a powerful tool that can help across these different data types and objectives.