A Gentle Introduction to Activation Regularization in Deep Learning

Last Updated on August 6, 2019

Deep learning models are capable of automatically learning a rich internal representation from raw input data.

This is called feature or representation learning. Better learned representations, in turn, can lead to better insights into the domain, e.g. via visualization of learned features, and to better predictive models that make use of the learned features.

A problem with learned features is that they can become too specialized to the training data (that is, overfit) and fail to generalize to new examples. Large values in the learned representation can be a sign that the representation is overfit. Activity regularization, also called representation regularization, is a technique that encourages the learned representations (the outputs, or activations, of the hidden layer or layers of the network) to stay small and sparse.
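To make the idea concrete, here is a minimal sketch in plain Python. It assumes an L1 penalty on the activations, which is one common way to encourage small, sparse values. The function names, the coefficient `lam`, and the toy numbers are all hypothetical and chosen only for illustration.

```python
def relu(x):
    """Rectified linear activation."""
    return max(0.0, x)


def hidden_activations(inputs, weights, biases):
    """Output of one dense ReLU layer; weights holds one row per hidden unit."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def l1_activity_penalty(activations, lam):
    """Assumed penalty: lam times the sum of absolute activations."""
    return lam * sum(abs(a) for a in activations)


def regularized_loss(prediction_loss, activations, lam=0.01):
    """Prediction loss plus the activity penalty on the hidden layer."""
    return prediction_loss + l1_activity_penalty(activations, lam)


# Toy example: two inputs feeding three hidden units (made-up values).
inputs = [1.0, 2.0]
weights = [[0.5, -0.25], [1.0, 1.0], [-1.0, 0.0]]
biases = [0.0, 0.5, 0.0]

acts = hidden_activations(inputs, weights, biases)
total = regularized_loss(prediction_loss=0.2, activations=acts, lam=0.01)
print(acts, total)
```

Because the penalty grows with the size of the activations, minimizing the combined loss pushes the hidden layer toward smaller outputs, and an L1 penalty in particular tends to drive many of them to exactly zero.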

In this post, you will discover activation regularization as a technique to improve the generalization of learned features in neural networks.

After reading this post, you will know:

Neural networks learn features from data, and models such as autoencoders and encoder-decoder models explicitly seek effective learned representations.
Similar to weights, large values in learned features, e.g. large activations, may indicate an overfit model.
The addition of penalties
