Vertex AI Vizier hyperparameter tuning
In this lesson, we focus on hyperparameter tuning using Vertex AI Vizier. As discussed earlier, although machine learning models automatically learn from data,
they still require user-defined knobs to guide the learning process.
These knobs, commonly known as hyperparameters, control the trade-off between training accuracy and generalizability.
Examples of hyperparameters include the choice of optimizer, the number of training epochs, regularization parameters, and the number and sizes of the hidden layers in a deep neural network.
Setting hyperparameters to their optimal values for a given data set can make a huge difference in model quality.
Previously, we identified two hyperparameter-tuning methods: grid search and random search.
Grid search is a very traditional technique for hyperparameter tuning.
In the grid search method, we set up a grid of specific hyperparameter values and then train and evaluate the model on every combination of those values.
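As a concrete illustration, here is a minimal grid-search sketch in plain Python. The hyperparameter names, their candidate values, and the train_and_evaluate function are hypothetical placeholders standing in for a real training run:

```python
import itertools

# Hypothetical grid of candidate hyperparameter values.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "num_hidden_layers": [1, 2, 3],
}

def train_and_evaluate(params):
    """Placeholder: train the model with `params` and return a validation score."""
    return -((params["learning_rate"] - 0.01) ** 2) - abs(params["num_hidden_layers"] - 2)

best_score, best_params = float("-inf"), None
# Grid search: evaluate every combination of the listed values.
for values in itertools.product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    score = train_and_evaluate(params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```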
In random search, we set up the same kind of grid of hyperparameter values as with grid search, but instead of trying every combination, we select combinations at random. A random search is therefore faster than a grid search, but a grid search can be more effective because a random search may miss some combinations entirely.
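For comparison, here is the random-search variant of the same hypothetical sketch, sampling only a handful of the possible combinations:

```python
import random

random.seed(0)

# Same hypothetical grid as before, but combinations are sampled at random.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "num_hidden_layers": [1, 2, 3],
}

def train_and_evaluate(params):
    """Placeholder: train the model with `params` and return a validation score."""
    return -((params["learning_rate"] - 0.01) ** 2) - abs(params["num_hidden_layers"] - 2)

best_score, best_params = float("-inf"), None
for _ in range(5):  # fewer trials than the 9 combinations a full grid would require
    params = {name: random.choice(values) for name, values in param_grid.items()}
    score = train_and_evaluate(params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```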
Both grid search and random search are time-consuming techniques because they roam the full space of available parameter values in isolation, without paying attention to past results: neither one uses information from previous experiments to select the next set of hyperparameter value combinations.
Bayesian optimization is another method of hyperparameter tuning that takes into account past evaluations when choosing which hyperparameter set to evaluate next.
This approach typically requires fewer iterations to get the optimal set of hyperparameter values, most notably because it disregards those areas of the parameter space that it believes won’t produce useful results.
This, in turn, limits the number of times a model needs to be trained for validation because only those settings that are expected to generate a higher validation score are passed through for evaluation.
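To make the idea concrete, here is a minimal sketch of Bayesian optimization using scikit-optimize's gp_minimize. The lesson does not prescribe a library, so this choice is an assumption, and the objective below is a toy stand-in for real model training:

```python
from skopt import gp_minimize
from skopt.space import Integer, Real

# Toy objective standing in for "train the model and return a loss to minimize".
def objective(params):
    learning_rate, num_hidden_layers = params
    return (learning_rate - 0.01) ** 2 + abs(num_hidden_layers - 2)

search_space = [
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate"),
    Integer(1, 5, name="num_hidden_layers"),
]

# gp_minimize fits a Gaussian-process surrogate to the trials seen so far and
# uses it to pick the next point to evaluate, so it typically needs fewer
# evaluations than an exhaustive sweep of the space.
result = gp_minimize(objective, search_space, n_calls=15, random_state=0)
print(result.x, result.fun)
```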
Vertex AI Vizier offers grid search, random search, and Bayesian optimization.
If you do not specify an algorithm, Vizier uses its default algorithm, which applies Bayesian optimization to search the parameter space more effectively and arrive at an optimal solution.
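As a sketch of how this looks with the google-cloud-aiplatform Python SDK, a hyperparameter tuning job backed by Vizier might be configured as follows. The project ID, region, container image, and metric name are hypothetical placeholders, and this is an illustrative outline rather than a complete production setup:

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Hypothetical project and region; replace with your own values.
aiplatform.init(project="my-project", location="us-central1")

# A custom training job whose container reports the metric being tuned.
custom_job = aiplatform.CustomJob(
    display_name="trainer",
    worker_pool_specs=[{
        "machine_spec": {"machine_type": "n1-standard-4"},
        "replica_count": 1,
        "container_spec": {"image_uri": "gcr.io/my-project/trainer:latest"},
    }],
)

tuning_job = aiplatform.HyperparameterTuningJob(
    display_name="vizier-tuning-job",
    custom_job=custom_job,
    metric_spec={"accuracy": "maximize"},  # metric the trainer reports
    parameter_spec={
        "learning_rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
        "num_hidden_layers": hpt.IntegerParameterSpec(min=1, max=5, scale="linear"),
    },
    max_trial_count=20,
    parallel_trial_count=3,
    # None selects Vizier's default Bayesian optimization;
    # "grid" or "random" select the other search strategies.
    search_algorithm=None,
)

tuning_job.run()
```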