Question 1 (For Assessment)
Given a vector of training data $y$ and a corresponding matrix of features $X$ (the $i$th row of $X$ containing the feature vector associated with observation $y_i$), write a short explanation of why the training error (i.e. the mean squared error on the training set) of the least squares prediction will never increase as more features are added. Will the test error (i.e. the mean squared error on an independent test set) also never increase as more features are added? Why or why not?
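A quick numerical sanity check of the first claim (not part of the assignment, and in Python/NumPy rather than the R used in the labs, purely so the sketch is self-contained): fitting least squares on nested sets of columns gives a training MSE that never goes up, because every smaller model is a special case of the larger one. The data here are arbitrary random draws.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.normal(size=(n, p))   # illustrative random features
y = rng.normal(size=n)        # illustrative random responses

def train_mse(k):
    # Least squares fit using only the first k feature columns.
    beta, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
    resid = y - X[:, :k] @ beta
    return np.mean(resid ** 2)

mses = [train_mse(k) for k in range(1, p + 1)]
# Training MSE is monotone non-increasing (up to floating-point noise).
assert all(mses[i + 1] <= mses[i] + 1e-12 for i in range(len(mses) - 1))
```

No such guarantee holds for the test error, which is exactly the point of the second half of the question.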
Work through labs 3.6.2–3.6.6 in the textbook. This should give you a feeling for how linear regression models work in R.
These are some linear algebra questions that should serve as reminders of the linear algebra techniques we will need in this course.
In this question, we will only consider the Euclidean norm $\|v\|_2 = \sqrt{\sum_{i=1}^n v_i^2}$.
Prove the Pythagorean identity: if $u^T v = 0$ then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.
If $\Phi \in \mathbb{R}^{n \times r}$ is a matrix with orthonormal columns, prove that $\|\Phi v\| = \|v\|$.
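A numerical illustration of these two facts (not a proof, and not part of the assignment; Python/NumPy is used here only to keep the sketch self-contained). A matrix with orthonormal columns is obtained from the QR factorisation of a random matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

# Orthonormal columns via QR factorisation of a random 6x3 matrix.
Phi, _ = np.linalg.qr(rng.normal(size=(6, 3)))
v = rng.normal(size=3)
# Matrices with orthonormal columns preserve the Euclidean norm.
assert np.isclose(np.linalg.norm(Phi @ v), np.linalg.norm(v))

# Pythagorean identity: two orthogonal vectors in R^6.
u = Phi[:, 0]
w = Phi[:, 1]
assert np.isclose(u @ w, 0.0)
assert np.isclose(np.linalg.norm(u + w) ** 2,
                  np.linalg.norm(u) ** 2 + np.linalg.norm(w) ** 2)
```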
Let $u, v \in \mathbb{R}^n$. What is the largest (right) eigenvalue of $uv^T$? What is the corresponding eigenvector?
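A numerical check you can use to verify your answer (not part of the assignment): since $(uv^T)u = (v^T u)\,u$, the vector $u$ is an eigenvector of the rank-one matrix $uv^T$ with eigenvalue $v^T u$, and the remaining eigenvalues are zero. The sketch compares magnitudes so as not to depend on the sign of $v^T u$ for this particular random draw.

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.normal(size=4)
v = rng.normal(size=4)
A = np.outer(u, v)  # the rank-one matrix u v^T

# (u v^T) u = (v^T u) u, so u is an eigenvector with eigenvalue v^T u.
assert np.allclose(A @ u, (v @ u) * u)

# The eigenvalue of largest magnitude is v^T u; the rest are zero.
vals = np.linalg.eigvals(A)
assert np.isclose(np.max(np.abs(vals)), abs(v @ u))
```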
Let $A$ be a symmetric matrix. Prove that $\max_{\|x\|=1} x^T A x$ is equal to the largest eigenvalue of $A$.
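Another numerical sanity check (not a proof, not part of the assignment): random unit vectors never exceed the largest eigenvalue in the quadratic form $x^T A x$, while the top eigenvector attains it.

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.normal(size=(5, 5))
A = (B + B.T) / 2  # symmetrise a random matrix

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
lam_max = np.linalg.eigvalsh(A)[-1]

# No random unit vector beats the largest eigenvalue...
for _ in range(1000):
    x = rng.normal(size=5)
    x /= np.linalg.norm(x)
    assert x @ A @ x <= lam_max + 1e-10

# ...and the top eigenvector attains it exactly.
top = np.linalg.eigh(A)[1][:, -1]
assert np.isclose(top @ A @ top, lam_max)
```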
What is a symmetric positive definite matrix? Is the matrix $A^T A$ always positive definite?
This is a probability question that covers a key step in our proof of the risk bound for linear regression.
Let $\epsilon \in \mathbb{R}^n$ be a vector such that each component has an iid zero mean normal distribution (i.e. $\epsilon_i \sim N(0, \sigma^2)$). Let $\Phi \in \mathbb{R}^{n \times r}$ be a matrix with orthonormal columns. Show that $\tilde{\epsilon} = \Phi^T \epsilon$ is an $r$-dimensional vector of iid zero mean normal random variables with variance $\sigma^2$.
(Hint: a linear combination of independent Gaussians is always Gaussian, so it is enough to check that each component has mean zero and variance $\sigma^2$, and that the pairwise covariances are zero.)
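The claim can also be checked by simulation (not a proof, and not part of the assignment; Python/NumPy is used only to keep the sketch self-contained). Drawing many copies of $\epsilon$ and forming $\Phi^T \epsilon$ gives an empirical covariance close to $\sigma^2 I_r$, consistent with iid $N(0, \sigma^2)$ components.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r, sigma = 8, 3, 2.0

# Orthonormal columns via QR factorisation of a random matrix.
Phi, _ = np.linalg.qr(rng.normal(size=(n, r)))

# 200,000 Monte Carlo draws of eps, mapped through Phi^T.
samples = Phi.T @ rng.normal(scale=sigma, size=(n, 200_000))

# Empirical covariance of eps_tilde; rows of `samples` are the r components.
cov = np.cov(samples)
# Close to sigma^2 * I_r: equal variances, zero covariances.
assert np.allclose(cov, sigma**2 * np.eye(r), atol=0.1)
```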