Question

A - Use MATLAB’s built-in cancer dataset and linear regression to create a simple discriminant function, similar to the following snippet:

[X,d] = cancer_dataset; % X: inputs, d: targets; type "help cancer_dataset" for more info

w = X'\d(2,:)'; % Training: least-squares (MSE) linear model

y = X'*w; % Activation/testing: model output for every sample

[fpr,tpr,T,AUC] = perfcurve(d(2,:),y',1); % fpr/tpr instead of X/Y so the inputs X are not overwritten

figure, plot(fpr,tpr) % Visualize the ROC curve

xlabel('False positive rate')

ylabel('True positive rate')

title(['2D ROC, AUC=' num2str(AUC)])

B - Keep the first half of the data for training the linear regressor and the second half for testing, then repeat the steps above. Summarize your observations.
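One way the split in part B might be set up is sketched below. It assumes "first half" means the first floor(n/2) columns in the dataset's stored order (no shuffling). The names nTrain, trainIdx, testIdx, fpr, tpr and yTest are illustrative and not part of the assignment. It needs the same toolboxes as the snippet in part A.

```matlab
[X, d] = cancer_dataset;              % inputs X and targets d, one sample per column
n = size(X, 2);                       % number of samples
nTrain = floor(n / 2);                % assumption: first half = first floor(n/2) samples
trainIdx = 1:nTrain;
testIdx  = nTrain+1:n;

% Fit the least-squares model on the training half only
w = X(:, trainIdx)' \ d(2, trainIdx)';

% Score the held-out half with the trained weights
yTest = X(:, testIdx)' * w;

% ROC curve and AUC on the test half (positive class = 1)
[fpr, tpr, ~, AUC] = perfcurve(d(2, testIdx), yTest', 1);

figure, plot(fpr, tpr)
xlabel('False positive rate')
ylabel('True positive rate')
title(['Test-half ROC, AUC=' num2str(AUC)])
```

Comparing this AUC with the one from part A, where the same data is used for training and testing, is one natural starting point for the observations.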

C - (Required for graduate students; optional for undergraduates, worth 20 bonus points, i.e., undergraduates can earn up to 120 points on this assignment.) Find a subset of the input variables for the linear regressor and check whether a reduced input space performs better. Test at least 5 subsets (including the full 9-dimensional input) and use ROC AUC as your measure of success. Summarize your observations.
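A possible skeleton for the subset comparison in part C is sketched below. The five subsets listed are hypothetical placeholders chosen only for illustration, not recommended choices. Reusing the half/half split from part B is also an assumption, since part C does not say which data to evaluate on.

```matlab
[X, d] = cancer_dataset;              % inputs X and targets d, one sample per column
n = size(X, 2);
trainIdx = 1:floor(n / 2);            % assumption: same split as in part B
testIdx  = floor(n / 2)+1:n;

% Hypothetical candidate subsets (row indices of X); the first is the full input
subsets = {1:9, 1:5, 6:9, [1 2 3], [2 3 6 7]};
aucs = zeros(1, numel(subsets));

for k = 1:numel(subsets)
    s = subsets{k};
    w = X(s, trainIdx)' \ d(2, trainIdx)';    % train on the selected variables only
    y = X(s, testIdx)' * w;                   % score the test half
    [~, ~, ~, aucs(k)] = perfcurve(d(2, testIdx), y', 1);
    fprintf('Subset [%s]: AUC = %.4f\n', num2str(s), aucs(k));
end

[bestAUC, bestK] = max(aucs);                 % subset with the highest test AUC
fprintf('Best subset: [%s] (AUC = %.4f)\n', num2str(subsets{bestK}), bestAUC);
```

Storing the AUC values in a vector makes it easy to tabulate or plot them when summarizing the observations.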