The Rectified Linear Unit (ReLU) is computationally cheaper than both the sigmoid and tanh activation functions.
Using the ReLU activation function induces sparsity in the model, since negative pre-activations are mapped exactly to zero (illustrated in the sketch below).
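A minimal NumPy sketch of these two points, assuming standard definitions of the activations (the function names relu and sigmoid are illustrative, not taken from the text): ReLU is a single elementwise threshold with no exponentials, and its output contains exact zeros while sigmoid and tanh outputs do not.

```python
import numpy as np

def relu(x):
    # Single elementwise threshold: no exponentials, so it is cheap to compute.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Needs an exponential per element, which costs more than a simple threshold.
    return 1.0 / (1.0 + np.exp(-x))

x = np.random.randn(1_000_000)

relu_out = relu(x)
sigmoid_out = sigmoid(x)
tanh_out = np.tanh(x)

# ReLU gives exact zeros for every negative input, producing sparse activations;
# sigmoid and tanh outputs are almost never exactly zero.
print("fraction of zeros, ReLU:   ", np.mean(relu_out == 0.0))    # roughly 0.5 here
print("fraction of zeros, sigmoid:", np.mean(sigmoid_out == 0.0)) # ~0.0
print("fraction of zeros, tanh:   ", np.mean(tanh_out == 0.0))    # ~0.0
```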
Although ReLU is linear for positive input values, it outputs a constant zero (with zero gradient) for negative input values, which can lead to the "dying ReLU" issue.
The "dying ReLU" issue can be addressed by keeping the activation from being exactly zero along the negative axis (see the sketch below).
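One standard way to do this, given here as an assumed example rather than the method prescribed by the text, is a Leaky ReLU, which keeps a small non-zero slope on the negative axis:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # A small slope (alpha) on the negative axis keeps both the output and the
    # gradient non-zero for negative inputs, so the neuron can keep learning.
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.5, 0.2, 2.0])
print(leaky_relu(x))  # [-0.03  -0.005  0.2    2.   ]
```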
Fig. 1