How to Configure the Learning Rate When Training Deep Learning Neural Networks

Last Updated on August 6, 2019

The weights of a neural network cannot be calculated using an analytical method. Instead, the weights must be discovered via an empirical optimization procedure called stochastic gradient descent.

The optimization problem addressed by stochastic gradient descent for neural networks is challenging, and the space of solutions (sets of weights) may contain many good solutions (called global optima) as well as solutions that are easy to find but low in skill (called local optima).

The amount of change to the model during each step of this search process, or the step size, is called the “learning rate.” It is perhaps the most important hyperparameter to tune for your neural network in order to achieve good performance on your problem.
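
To make the idea concrete, here is a minimal sketch of a single stochastic gradient descent update in Python; the function name and the weight, gradient, and learning rate values are hypothetical, chosen only to illustrate how the learning rate scales the step.

def sgd_update(weights, gradients, learning_rate):
    # move each weight a small step against its gradient;
    # the learning rate scales the size of that step
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

# hypothetical values: a larger learning rate takes a bigger step
weights = [0.5, -0.3]
gradients = [0.2, -0.1]
print(sgd_update(weights, gradients, learning_rate=0.1))   # bigger step
print(sgd_update(weights, gradients, learning_rate=0.01))  # smaller step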

In this tutorial, you will discover the learning rate hyperparameter used when training deep learning neural networks.

After completing this tutorial, you will know:

  • Learning rate controls how quickly or slowly a neural network model learns a problem.
  • How to configure the learning rate with sensible defaults, diagnose behavior, and develop a sensitivity analysis.
  • How to further improve performance with learning rate schedules, momentum, and adaptive learning rates, as sketched below.
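
As a preview of those techniques, the sketch below shows how each might be configured, assuming the TensorFlow/Keras optimizer API; the specific values are illustrative defaults, not tuned recommendations.

from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.optimizers.schedules import ExponentialDecay

# a sensible fixed default: plain stochastic gradient descent
optimizer = SGD(learning_rate=0.01)

# a learning rate schedule: decay the rate as training progresses
schedule = ExponentialDecay(initial_learning_rate=0.01,
                            decay_steps=1000,
                            decay_rate=0.9)
optimizer = SGD(learning_rate=schedule)

# momentum: smooth the search by accumulating past gradients
optimizer = SGD(learning_rate=0.01, momentum=0.9)

# an adaptive method: Adam maintains a per-parameter learning rate
optimizer = Adam(learning_rate=0.001)

# whichever optimizer is chosen is then passed to model.compile()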
