Question

H=\bigcup_{i=1}^{k} H_{i} Consider two alternative approaches: 1. Learn on the sample using the ERM rule. 2. Divide that sample into a training set of size (1 – a)m, and a

validation set of size amfor some a e (0,1). Then apply the approach of model selection using validation, i.e.: • First, train each class H¡ on the (1 – a)m training examples using the ERM rulew.r.t. H; and let h1,..,h be all the resulting hypotheses. • Second, apply the ERM rule w.r.t. to the finite class {h1,..,hg} on the am vali-dation examples. Under which conditions is the second approach better? Justify your answerformally (using maximum 2 pages). Let H1, … · ,Hk be hypothesis classes such that H1 C H2 C … c Hk and |H1 = 2', for every i E {1,..,k}. Suppose you are given a sample of size m (with each element chosen i.id.), and you want to learn the union of all these classes, that is you would like to learn the hypothesis class

Question image 1Question image 2Question image 3Question image 4Question image 5Question image 6Question image 7Question image 8