Batch normalization is a technique for improving the performance and stability of artificial neural networks. It provides any layer in a neural network with inputs that have zero mean and unit variance, computed over each mini-batch. Batch normalization was introduced by Ioffe and Szegedy in a 2015 paper. Rather than normalizing only the input layer, it normalizes the activations flowing into a layer by adjusting and scaling them.
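The normalization described above can be sketched in a few lines of NumPy. This is a minimal forward-pass sketch, not the full method from the paper: the function name `batch_norm` and the learnable scale/shift parameters `gamma` and `beta` (standard in batch normalization, but not mentioned in the text above) are illustrative.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch to zero mean / unit variance, then scale and shift.

    x: array of shape (batch, features).
    gamma, beta: per-feature scale and shift parameters (illustrative names).
    eps: small constant to avoid division by zero.
    """
    mean = x.mean(axis=0)              # per-feature mean over the batch
    var = x.var(axis=0)                # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta        # restore expressive power via scale/shift

# Usage: activations with arbitrary mean/scale come out standardized.
x = 3.0 * np.random.randn(64, 10) + 5.0
out = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
```

With `gamma = 1` and `beta = 0`, each feature of `out` has mean approximately 0 and standard deviation approximately 1 over the batch; during training the network learns `gamma` and `beta` alongside the other weights.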
Stochastic Gradient Descent
Learning Rate Schedules
Adaptive Methods: AdaGrad, RMSProp, and Adam
Internal Covariate Shift