It is an iterative algorithm: at every step it computes the gradient of the cost function with
respect to the parameters and updates the parameters in the direction opposite the gradient to reduce the cost.
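The update rule can be sketched as a minimal NumPy implementation for linear regression with a mean-squared-error cost. The function name, learning rate, and step count here are illustrative choices, not prescribed by the text:

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_steps=1000):
    """Minimize the MSE cost of linear regression by gradient descent.

    X : (m, n) design matrix, y : (m,) targets.
    lr and n_steps are example hyperparameters.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_steps):
        # Gradient of the MSE cost (1/m)·||X·theta - y||² w.r.t. theta.
        grad = (2.0 / m) * X.T @ (X @ theta - y)
        # Step in the direction opposite the gradient.
        theta -= lr * grad
    return theta
```

For example, fitting `y = 3 + 2x` with a column of ones prepended to `X` recovers parameters close to `[3, 2]`.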
For a purely convex, bowl-shaped cost function, it converges to the optimal solution,
provided the learning rate is not too large.
Since cost functions can be of any shape, if the cost function is not convex, gradient descent
is only guaranteed to reach a local minimum, which may not be the global minimum.
Normalizing the variables, so that all of them are on the same scale,
ensures faster convergence.
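The normalization step above can be sketched as simple standardization (zero mean, unit variance per feature); the function name and return values are illustrative assumptions:

```python
import numpy as np

def standardize(X):
    """Rescale each feature of X to zero mean and unit variance.

    Features on the same scale give a better-conditioned cost surface,
    so gradient descent converges in fewer steps.
    """
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma
```

The returned `mu` and `sigma` would be reused to apply the same scaling to any new data at prediction time.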
Fig: 1