Name: Solved-Assignment 4 -Solution

Problem 1 [25%]

In this exercise, we will predict the number of applications received using the other variables in the College (ISLR::College) data set.

Fit a linear model using least squares on the training set, and report the test error obtained.

Use best subset selection with cross-validation. Report the test error obtained.

Fit a ridge regression model on the training set, with λ chosen by cross-validation. Report the test error obtained.

Fit a lasso model on the training set, with λ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

Briefly comment on the results obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these approaches?
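The workflow above (hold out a test set, choose λ by cross-validation on the training set only, then report test error) can be sketched as follows. This is a minimal illustration, not the assignment's solution: it uses synthetic data in place of the College data, covers only least squares and ridge regression, and the helper names `solve`, `ridge_fit`, `mse`, and `cv_mse` are hypothetical. In practice a library routine would replace these hand-written helpers.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_fit(X, y, lam):
    """Return (intercept, coefs) minimizing RSS + lam * sum(coef^2).
    The intercept is left unpenalized by centering X and y first."""
    n, p = len(X), len(X[0])
    xbar = [sum(row[j] for row in X) / n for j in range(p)]
    ybar = sum(y) / n
    Xc = [[row[j] - xbar[j] for j in range(p)] for row in X]
    yc = [v - ybar for v in y]
    A = [[sum(r[j] * r[k] for r in Xc) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(r[j] * t for r, t in zip(Xc, yc)) for j in range(p)]
    beta = solve(A, b)
    b0 = ybar - sum(bj * m for bj, m in zip(beta, xbar))
    return b0, beta

def mse(model, X, y):
    """Mean squared prediction error of a fitted (intercept, coefs) pair."""
    b0, beta = model
    return sum((t - b0 - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, t in zip(X, y)) / len(y)

def cv_mse(X, y, lam, k=5):
    """Average validation MSE over k interleaved folds of the training data."""
    n = len(y)
    total = 0.0
    for fold in range(k):
        hold = set(range(fold, n, k))
        Xtr = [X[i] for i in range(n) if i not in hold]
        ytr = [y[i] for i in range(n) if i not in hold]
        Xva = [X[i] for i in sorted(hold)]
        yva = [y[i] for i in sorted(hold)]
        total += mse(ridge_fit(Xtr, ytr, lam), Xva, yva)
    return total / k

# Synthetic stand-in for the real data (assumption: 4 predictors, Gaussian noise).
rng = random.Random(0)
true_beta = [3.0, -2.0, 0.0, 0.5]
X = [[rng.gauss(0, 1) for _ in true_beta] for _ in range(200)]
y = [1.0 + sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0, 1) for row in X]

# Hold out the last 50 rows as the test set; they play no part in choosing lambda.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

grid = [0.0, 0.1, 1.0, 10.0, 100.0]
cv_scores = {lam: cv_mse(X_train, y_train, lam) for lam in grid}
best_lam = min(cv_scores, key=cv_scores.get)

ols_test_mse = mse(ridge_fit(X_train, y_train, 0.0), X_test, y_test)
ridge_test_mse = mse(ridge_fit(X_train, y_train, best_lam), X_test, y_test)
print(best_lam, ols_test_mse, ridge_test_mse)
```

Setting λ = 0 in `ridge_fit` gives the least squares fit, so the same helper produces both test errors being compared.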

Problem 2 [25%]

We will try to predict per capita crime rate in the Boston dataset.

Try out best subset selection, the lasso, ridge regression, and PCR on this problem. Present and discuss results for the approaches that you consider.
Propose a model (or set of models) that seem to perform well on this data set, and justify your answer. Make sure that you are evaluating model performance using validation set error, cross-validation, or some other reasonable alternative, as opposed to using training error.

Problem 3 [25%]

Suppose we have a linear regression problem with P features. We estimate the coefficients in the linear regression model by minimizing the RSS for the first p features:

$$\sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2$$

where p ≤ P. For parts (1) through (5), indicate which of i. through v. is correct. Briefly justify your answer.

1. As we increase p from 1 to P, the training RSS will typically:

   i. Remain constant.
   ii. Steadily increase.
   iii. Steadily decrease.
   iv. Increase initially, and then eventually start decreasing in an inverted U shape.
   v. Decrease initially, and then eventually start increasing in a U shape.

2. Repeat (1) for test MSE.

3. Repeat (1) for squared bias.

4. Repeat (1) for variance.

5. Repeat (1) for the irreducible error (Bayes error).
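One way to sanity-check an answer to part (1) is to compute the training RSS of nested least squares fits on simulated data. The sketch below is only an illustration under made-up assumptions (P = 6 Gaussian features, of which only the first two affect the response); `solve` and `ols_rss` are hypothetical helper names. It covers the training RSS only and says nothing about test MSE, bias, or variance.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols_rss(X, y, p):
    """Training RSS of least squares using an intercept and the first p columns of X."""
    rows = [[1.0] + row[:p] for row in X]
    d = p + 1
    # Normal equations: (Z'Z) beta = Z'y, where Z holds the intercept and p features.
    A = [[sum(r[j] * r[k] for r in rows) for k in range(d)] for j in range(d)]
    b = [sum(r[j] * t for r, t in zip(rows, y)) for j in range(d)]
    beta = solve(A, b)
    return sum((t - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
               for r, t in zip(rows, y))

rng = random.Random(1)
P = 6
X = [[rng.gauss(0, 1) for _ in range(P)] for _ in range(60)]
# Assumption for this example: only the first two features carry signal.
y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 1) for row in X]

rss_path = [ols_rss(X, y, p) for p in range(1, P + 1)]
for p, rss in enumerate(rss_path, start=1):
    print(p, round(rss, 3))
```

Because the model with p features is a special case of the model with p + 1 features (set the extra coefficient to zero), each fit can do no worse on the training data than the one before it.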

Problem 4 [25%]

Suppose we estimate the regression coefficients in a linear regression model by minimizing

$$\sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le s$$

for a particular value of s. For parts (1) through (5), indicate which of i. through v. is correct. Justify your answer.

1. As we increase s from 0, the training RSS will typically:

   i. Remain constant.
   ii. Steadily increase.
   iii. Steadily decrease.
   iv. Increase initially, and then eventually start decreasing in an inverted U shape.
   v. Decrease initially, and then eventually start increasing in a U shape.

2. Repeat (1) for test RSS.

3. Repeat (1) for (squared) bias.

4. Repeat (1) for variance.

5. Repeat (1) for the irreducible error (Bayes error).
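The effect of the budget s on the training RSS is easiest to see in the special case of a single predictor (p = 1), where the constraint reduces to |β| ≤ s and the constrained minimizer is the least squares coefficient clipped to the interval [−s, s]. The sketch below assumes centered x and y, so the intercept drops out; the data values and the function name `constrained_fit` are made up for illustration, and the general case with several predictors needs a proper lasso solver.

```python
def constrained_fit(x, y, s):
    """Minimize sum((y_i - b * x_i)^2) subject to |b| <= s, for centered x and y.

    The objective is a convex quadratic in b, so the constrained minimizer
    is the unconstrained least squares coefficient clipped to [-s, s].
    """
    b_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    b = min(s, max(-s, b_ols))
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, rss

# Made-up centered data (both x and y sum to zero).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-3.9, -2.1, 0.2, 1.8, 4.0]

budgets = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
path = [constrained_fit(x, y, s) for s in budgets]
for s, (b, rss) in zip(budgets, path):
    print(s, b, round(rss, 3))
```

At s = 0 the coefficient is forced to zero; once s exceeds the magnitude of the least squares coefficient the constraint is inactive, and the fit no longer changes.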

Problem O4 [30%]

This problem can be substituted for Problem 4 above, for up to 5 points extra credit. The better score from problems 4 and O4 will be considered.

Solve Exercise 3.6 in [Bishop, C. M. (2006). Pattern Recognition and Machine Learning].
