Question

3.8) Which of the following statements is/are true?

In a deep neural network, it is possible to observe changes in each layer's input distribution even though the input layer is normalized.

Batch normalization performs a calculation similar to standardization, i.e., subtracting the mean and dividing by the standard deviation, but batch normalization introduces two new non-trainable parameters, one for scaling and the other for shifting.

Since batch normalization introduces extra parameters, it increases the processing time per epoch and also the overall convergence time.

If we apply batch normalization, we can use a higher learning rate and observe faster convergence.
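
For reference, a minimal sketch of the batch-normalization computation the statements refer to (standardize over the mini-batch, then scale and shift). The function name `batch_norm_forward` and the shapes used in the example are illustrative assumptions, not part of the original question.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # Per-feature mean and variance computed over the mini-batch (axis 0).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    # Standardize: subtract the mean and divide by the standard deviation.
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Apply the scale (gamma) and shift (beta) parameters.
    return gamma * x_hat + beta

# Example: a mini-batch of 4 samples with 3 features each.
x = np.random.randn(4, 3)
gamma = np.ones(3)   # scale parameter
beta = np.zeros(3)   # shift parameter
y = batch_norm_forward(x, gamma, beta)
```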
