Question

3.3) Which of the following statements is/are true?

- When applying momentum optimization, setting a higher momentum coefficient will always lead to faster convergence.
- Unlike in gradient descent, in momentum optimization the gradient at a step is dependent on the previous step.
- Unlike in stochastic gradient descent, in momentum optimization the path to convergence is faster but with high variance.
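
For readers who want to experiment with the momentum update these statements refer to, here is a minimal sketch. It is not part of the original question. It assumes the classical heavy-ball form of momentum and a toy objective f(x) = 0.5 * x**2. The function name `steps_to_converge`, the learning rate, the tolerance, and the chosen coefficients are all hypothetical, picked only for illustration.

```python
def steps_to_converge(beta, lr=0.1, x0=1.0, tol=1e-6, max_steps=100_000):
    """Count heavy-ball momentum steps needed to minimize f(x) = 0.5 * x**2.

    beta is the momentum coefficient (beta = 0 reduces to plain gradient descent).
    All default values are illustrative assumptions, not values from the question.
    """
    x, v = x0, 0.0
    for step in range(1, max_steps + 1):
        grad = x             # gradient of 0.5 * x**2 at the current point
        v = beta * v + grad  # velocity carries over a fraction of the previous step
        x = x - lr * v       # the parameter update uses the accumulated velocity
        # Require both position and velocity to be small, so that an
        # oscillating iterate crossing zero is not counted as converged.
        if abs(x) < tol and abs(v) < tol:
            return step
    return max_steps  # did not converge within the step budget


if __name__ == "__main__":
    for beta in (0.0, 0.5, 0.99):
        print(f"beta={beta}: {steps_to_converge(beta)} steps")
```

On this toy problem a moderate coefficient needs fewer steps than plain gradient descent, while a coefficient close to 1 overshoots and oscillates for much longer. Results on other objectives or with other learning rates may differ.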

Fig: 1