
Ray.Tune Model Optimization

One of the most important questions in deep learning is which model is best. There are two well-known families of methods for finding good models:

  1. Reinforcement Learning Methods
  2. Search Methods

There are some good references for the RL-based methods, such as:

  • 2018 – Learning Transferable Architectures for Scalable Image Recognition
  • 2018 – Neural Architecture Optimization
  • 2018 – Progressive Neural Architecture Search

These works design a non-linear cost model (usually an RNN or LSTM) to predict a candidate model’s performance without fully training it. In this way, they save a huge amount of time and make the approach feasible.

On the other hand, the search methods, which are usually based on evolutionary algorithms, try to narrow the search space with heuristics and find an optimal design.
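To make the evolutionary idea concrete, here is a toy sketch of my own (not from any of the papers above): each generation keeps the best candidate configurations and mutates them, so the search narrows toward a good design. The "fitness" function and the two hyperparameters are entirely hypothetical stand-ins for real architecture evaluation.

```python
import random

def fitness(cfg):
    # Hypothetical score standing in for trained-model accuracy:
    # peaks at depth=8, width=64.
    return -((cfg["depth"] - 8) ** 2 + ((cfg["width"] - 64) / 8.0) ** 2)

def mutate(cfg):
    # Small random perturbation of a surviving candidate.
    return {
        "depth": max(1, cfg["depth"] + random.choice([-1, 0, 1])),
        "width": max(8, cfg["width"] + random.choice([-8, 0, 8])),
    }

def evolve(generations=30, population=20, survivors=5, seed=0):
    random.seed(seed)
    pop = [
        {"depth": random.randint(1, 16), "width": random.choice(range(8, 129, 8))}
        for _ in range(population)
    ]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:survivors]              # keep the best candidates
        pop = elite + [mutate(random.choice(elite))
                       for _ in range(population - survivors)]
    return max(pop, key=fitness)

print(evolve())
```

Because the elite candidates are carried over unchanged, the best fitness never decreases, and the population drifts toward the optimum without exhaustively scanning the space.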

Both approaches need some manual adjustment before they outperform human-designed models. It seems unlikely that a human could simply guess the winning architectures, because there is no clear interpretation of the connections between elements in the network.

But there is an important question: which matters more?

  1. A well-trained network with good performance
  2. A network with winner architecture and good performance

My opinion is that if we can train a network with many parameters and reach good performance, it does not matter whether the architecture is perfect. Even the notion of a “perfect” architecture is controversial. When an architecture is evaluated, in the best case it is retrained several times and the average performance is reported as the architecture’s performance, which is not entirely fair. When the search space is huge, finding the optimum requires a great deal of luck, so the performance assigned to a large neural network may understate its capabilities: trained under different conditions, it might give much better results.

Hence, a well-trained network with good performance matters more than a good architecture. Ray Tune can help find such a network by tuning its hyperparameters.

Here is a dummy example showing Ray Tune in operation, combined with TensorBoard:

https://github.com/Vahid-GitHub/weblog/blob/main/02_raytune_checkpoint.py

TensorBoard can then be launched with the following command and opened in the Chrome browser.

tensorboard --logdir ./log/

By default, TensorBoard serves at:

http://localhost:6006/

Published in Machine Learning, Tools
