Question

2.2) Which of the following statement(s) is/are true?

a. By adding one or more layers with activation functions to a perceptron network, non-linearly separable cases can be handled.
b. For a non-linearly separable dataset, if the activation function applied on the hidden layer(s) of a Multi-layer Perceptron network is a linear function, the model will converge to an optimal solution.
c. For a linearly separable dataset, applying a non-linear activation function such as sigmoid or tanh on the hidden layers of an MLP network can converge to a good solution.
d. All of the above
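To make the contrast behind these statements concrete, here is a minimal sketch (an illustration added here, not part of the original question; it assumes scikit-learn is installed) using the XOR dataset, the classic non-linearly separable case. With an identity (linear) hidden activation, the composition of layers is itself an affine map, so depth adds no expressive power; with tanh, the same architecture can fit the non-linear boundary.

```python
# Sketch: linear vs. non-linear hidden activations on XOR (assumes scikit-learn).
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR: no single line separates the two classes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

for act in ("identity", "tanh"):
    clf = MLPClassifier(hidden_layer_sizes=(8,), activation=act,
                        solver="lbfgs", max_iter=2000, random_state=0)
    clf.fit(X, y)
    # The identity network collapses to a linear model and cannot reach
    # 100% accuracy on XOR; the tanh network typically fits it perfectly.
    print(act, "training accuracy:", clf.score(X, y))
```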
