The gradient descent update for linear regression is:
$$w_{i+1} = w_i - \alpha_i \sum_j \left(w_i^\top x_j - y_j\right) x_j$$
Part 1 (20 points)
First, implement a function that computes the summand $(w^\top x - y)\,x$, and test this function on two examples. Use `Vectors` to create a dense vector $w$ and use `LabeledPoint` to create a training dataset with 3 features. You can also use Breeze to do the dot product.
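A minimal Scala sketch of such a function, assuming Spark's MLlib types and Breeze (the name `gradientSummand` and the test values are illustrative choices, not prescribed by the assignment):

```scala
import breeze.linalg.DenseVector
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Computes the summand (w^T x - y) x for a single observation.
// `gradientSummand` is an assumed name; the assignment only specifies the behavior.
def gradientSummand(weights: DenseVector[Double], lp: LabeledPoint): DenseVector[Double] = {
  // Convert the MLlib feature vector to a Breeze vector so we can use its dot product
  val features = DenseVector(lp.features.toArray)
  features * (weights.dot(features) - lp.label)
}

// Two small test cases, each with 3 features
val w = DenseVector(1.0, 1.0, 1.0)
val lp1 = LabeledPoint(2.0, Vectors.dense(3.0, 1.0, 4.0))
val lp2 = LabeledPoint(1.0, Vectors.dense(0.0, 0.0, 0.0))
println(gradientSummand(w, lp1)) // w.x - y = 8 - 2 = 6, so DenseVector(18.0, 6.0, 24.0)
println(gradientSummand(w, lp2)) // DenseVector(0.0, 0.0, 0.0)
```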
Part 2 (20 points)
Implement a function that takes in a vector $w$ and an observation's `LabeledPoint` and returns a (label, prediction) tuple. Note that we can predict by computing the dot product between the weights and an observation's features. Test this function on a `LabeledPoint` RDD.
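A sketch under the same assumptions (the helper name `getLabeledPrediction` and the existing SparkContext `sc` are mine):

```scala
import breeze.linalg.DenseVector
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Returns a (label, prediction) tuple; the prediction is the dot product
// of the weights and the observation's features.
def getLabeledPrediction(weights: DenseVector[Double], lp: LabeledPoint): (Double, Double) =
  (lp.label, weights.dot(DenseVector(lp.features.toArray)))

// Testing on a LabeledPoint RDD (assumes an existing SparkContext `sc`)
val w = DenseVector(1.0, 1.0, 1.0)
val data = sc.parallelize(Seq(
  LabeledPoint(2.0, Vectors.dense(1.0, 0.0, 1.0)),
  LabeledPoint(3.0, Vectors.dense(1.0, 1.0, 1.0))))
data.map(lp => getLabeledPrediction(w, lp)).collect().foreach(println)
// (2.0,2.0)
// (3.0,3.0)
```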
Part 3 (20 points)
Implement a function to compute RMSE given an RDD of (label, prediction) tuples:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$

Test this function on an example RDD.
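One possible sketch, assuming a function name `calcRMSE` and an existing SparkContext `sc`:

```scala
import org.apache.spark.rdd.RDD

// RMSE over an RDD of (label, prediction) tuples
def calcRMSE(labelsAndPreds: RDD[(Double, Double)]): Double = {
  val n = labelsAndPreds.count()
  math.sqrt(labelsAndPreds.map { case (y, yHat) => (y - yHat) * (y - yHat) }.sum() / n)
}

// Example RDD: squared errors are 4, 1, 0, so RMSE = sqrt(5/3) ≈ 1.29
val examplePreds = sc.parallelize(Seq((3.0, 1.0), (1.0, 2.0), (2.0, 2.0)))
println(calcRMSE(examplePreds))
```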
Part 4 (40 points)
Implement a gradient descent function for linear regression:

$$w_{i+1} = w_i - \alpha_i \sum_j \left(w_i^\top x_j - y_j\right) x_j$$

The function will take trainData (an RDD of `LabeledPoint`) as an argument and return a tuple of weights and training errors. Reuse the code that you have written in Parts 1 and 2. Initialize the elements of vector $w$ to 0 and $\alpha = 1$. Update the value of $\alpha$ in the $i$-th iteration using the formula:

$$\alpha_i = \frac{\alpha}{n\sqrt{i}}$$

Test the function on an example RDD. Run it for 5 iterations and print the results.
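A sketch of the whole loop, reusing the helpers sketched in Parts 1 through 3 and the example RDD `data` from Part 2; the function name `linregGradientDescent` and the choice of RMSE as the recorded training error are assumptions:

```scala
import breeze.linalg.DenseVector
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Gradient descent for linear regression, reusing gradientSummand (Part 1),
// getLabeledPrediction (Part 2), and calcRMSE (Part 3).
def linregGradientDescent(trainData: RDD[LabeledPoint],
                          numIters: Int): (DenseVector[Double], Array[Double]) = {
  val n = trainData.count()
  val d = trainData.first().features.size
  var w = DenseVector.zeros[Double](d)   // elements of w initialized to 0
  val alpha = 1.0                        // initial step size
  val trainingErrors = new Array[Double](numIters)

  for (i <- 1 to numIters) {
    // Training error (RMSE) under the current weights
    trainingErrors(i - 1) = calcRMSE(trainData.map(lp => getLabeledPrediction(w, lp)))
    // alpha_i = alpha / (n * sqrt(i))
    val alphaI = alpha / (n * math.sqrt(i))
    // Full gradient: the sum of the per-observation summands
    val gradient = trainData.map(lp => gradientSummand(w, lp)).reduce(_ + _)
    w = w - (gradient * alphaI)
  }
  (w, trainingErrors)
}

// Run for 5 iterations on an example RDD and print the results
val (weights, errors) = linregGradientDescent(data, 5)
println(weights)
errors.foreach(println)
```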
Bonus (20 points)
Implement the closed form solution:

$$w = \left(X^\top X\right)^{-1} X^\top y$$

You can assume $X$ is a `DenseMatrix`.
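A possible sketch, assuming the rows of `X` are observations and `y` holds the labels. Breeze's `\` solve operator could replace the explicit inverse for better numerical stability:

```scala
import breeze.linalg.{DenseMatrix, DenseVector, inv}

// Closed form solution: w = (X^T X)^{-1} X^T y
def closedFormSolution(x: DenseMatrix[Double], y: DenseVector[Double]): DenseVector[Double] =
  inv(x.t * x) * (x.t * y)

// Small check: y = 0 + 1 * x2 exactly, so the solution should be ≈ (0, 1)
val x = DenseMatrix((1.0, 2.0), (1.0, 3.0), (1.0, 4.0))
val y = DenseVector(2.0, 3.0, 4.0)
println(closedFormSolution(x, y)) // ≈ DenseVector(0.0, 1.0)
```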