A Gentle Introduction to Weight Constraints in Deep Learning

Last Updated on August 6, 2019

Weight regularization methods such as weight decay add a penalty to the loss function during training to encourage a neural network to use small weights.

Smaller weights in a neural network can result in a model that is more stable and less likely to overfit the training dataset, which in turn gives better performance when making predictions on new data.
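
As a rough illustration, assuming a Keras-style model, a weight penalty can be attached to a layer as shown below; the penalty coefficient of 0.01, the layer sizes, and the input shape are arbitrary values chosen for this sketch.

```python
# Sketch: attaching an L2 weight penalty (weight decay) to a layer.
# The 0.01 coefficient, layer sizes, and input shape are example values only.
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import l2

model = Sequential([
    Input(shape=(10,)),
    # Adds 0.01 * sum(w^2) for this layer's weights to the training loss,
    # encouraging (but not forcing) the weights to stay small.
    Dense(32, activation='relu', kernel_regularizer=l2(0.01)),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
```

The key point is that the penalty only nudges the loss; nothing stops an individual weight from growing large if doing so reduces the training error by more than the penalty costs.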

Unlike weight regularization, a weight constraint is a trigger that checks the size or magnitude of the weights and, whenever the magnitude exceeds a pre-defined threshold, rescales them so that they fall back below it. The constraint forces weights to be small and can be used instead of weight decay and in conjunction with more aggressive network configurations, such as very large learning rates.
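
The check-and-rescale behaviour can be sketched in a few lines of NumPy. The maximum norm rule below is one common form of weight constraint; the function name, the threshold of 3.0, and the example matrix are assumptions made for illustration rather than any library's API.

```python
# Sketch of the max-norm "trigger": check the L2 norm of each unit's incoming
# weight vector and rescale it only when the norm exceeds the threshold.
# The function name and the threshold of 3.0 are illustrative, not library API.
import numpy as np

def apply_max_norm(weights, max_value=3.0):
    # Each column holds the incoming weights of one hidden unit.
    norms = np.linalg.norm(weights, axis=0, keepdims=True)
    # Scale factor is 1 where the norm is already within the threshold,
    # and max_value / norm where it is not.
    scale = np.minimum(1.0, max_value / np.maximum(norms, 1e-7))
    return weights * scale

# One unit's weights (second column) exceed the threshold; the other is untouched.
w = np.array([[0.5, 4.0],
              [0.5, 4.0],
              [0.5, 4.0],
              [0.5, 4.0]])
print(np.linalg.norm(apply_max_norm(w), axis=0))  # -> [1. 3.]
```

Applying a rule like this after every weight update guarantees that no weight vector's norm ever exceeds the threshold, which is what distinguishes a constraint from a penalty.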

In this post, you will discover the use of weight constraint regularization as an alternative to weight penalties to reduce overfitting in deep neural networks.

After reading this post, you will know:

  • Weight penalties encourage but do not require neural networks to have small weights.
  • Weight constraints, such as the L2 norm and maximum norm, can be used to force neural networks to have small weights during training.
  • Weight constraints provide an alternative to weight penalties for reducing overfitting in deep neural networks (a minimal usage sketch follows this list).
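
As a minimal sketch of how the constraints named above are applied in practice, assuming the Keras constraints API, layer-level constraints might look like the following; the threshold of 3.0, the layer sizes, and the input shape are illustrative choices.

```python
# Sketch: applying weight constraints to layers via the Keras constraints API.
# The threshold of 3.0, layer sizes, and input shape are illustrative values.
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.constraints import max_norm, unit_norm

model = Sequential([
    Input(shape=(10,)),
    # Maximum norm: after each update, rescale any unit's incoming weight
    # vector whose L2 norm exceeds 3.0 back down to the threshold.
    Dense(32, activation='relu', kernel_constraint=max_norm(3.0)),
    # Unit norm: force each unit's incoming weight vector to have an L2 norm of 1.
    Dense(1, activation='sigmoid', kernel_constraint=unit_norm()),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
```

Here max_norm rescales a unit's incoming weights only when their norm grows too large, whereas unit_norm pins the norm to exactly 1; in both cases the rule is enforced after each gradient update rather than through the loss function.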