Note: The assignment will be auto-graded. It is important that you do not use additional libraries, or change the provided functions' input and output.
Part 1: Setup
Remote connect to an EWS machine.
ssh (netid)@remlnx.ews.illinois.edu
Load the Python module; this will also load pip and virtualenv.
module load python/3.4.3
Activate the course virtual environment.
source ~/cs446sp_2018/bin/activate
cd ~/(netid)
svn cp https://subversion.ews.illinois.edu/svn/sp18-cs446/_shared/mp2 .
cd mp2
pip install -r requirements.txt
mkdir data
wget --user (netid) --ask-password \
    https://courses.engr.illinois.edu/cs446/sp2018/secure/assignment2_data.zip \
    -O data/assignment2_data.zip
unzip data/assignment2_data.zip -d data/
svn propset svn:ignore data .
Part 2: Exercise
In this exercise we will build a system to predict housing prices. We illustrate the overall
pipeline of the system in Fig. 1. We will implement each of the blocks.
In main.py, the overall program structure is provided for you.
Figure 1: High-level pipeline
Part 2.1 Numpy Implementation
There are three CSV files, train.csv, val.csv, and test.csv, each containing the examples in one of the dataset splits.
The format is comma-separated, and the first line contains the header of each column.
Id,BldgType,OverallQual,GrLivArea,GarageArea,SalePrice
1,1Fam,7,1710,548,208500
Everything before SalePrice may be used as input to our system, and SalePrice is the quantity we hope to predict. Categorical features such as BldgType are encoded as one-hot vectors:
1Fam = [1, 0, 0, 0, 0]
2FmCon = [0, 1, 0, 0, 0]
…etc.
More details are provided in the function docstring.
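The categorical-to-one-hot mapping above can be sketched as follows; note the last three category names are placeholders for illustration only (the real ones come from the dataset), and the function name is an assumption, not the provided interface:

```python
import numpy as np

# First two categories are from the handout; the rest are hypothetical placeholders.
BLDG_TYPES = ['1Fam', '2FmCon', 'TypeC', 'TypeD', 'TypeE']

def one_hot(category, categories=BLDG_TYPES):
    """Encode a categorical value as a one-hot numpy vector."""
    vec = np.zeros(len(categories))
    vec[categories.index(category)] = 1.0
    return vec

print(one_hot('1Fam'))    # [1. 0. 0. 0. 0.]
print(one_hot('2FmCon'))  # [0. 1. 0. 0. 0.]
```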
– Forward operation. The forward operation is the function which takes an input and outputs a score. In this case, for linear models, it is F = w^T x + b. For simplicity, we will redefine x = [x, 1] and w = [w, b], so that F = w^T x.
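A minimal sketch of the forward operation with the bias folded into the weight vector as described (the function name and numbers are illustrative, not the provided interface):

```python
import numpy as np

def forward(x, w):
    """Score F = w^T x, where x has a trailing 1 and w a trailing bias."""
    return np.dot(w, x)

x = np.array([7.0, 1710.0, 548.0])   # example numeric features
x = np.append(x, 1.0)                # redefine x = [x, 1]
w = np.array([0.5, 0.1, 0.2, 3.0])   # w = [w, b] with bias b = 3.0
score = forward(x, w)                # 0.5*7 + 0.1*1710 + 0.2*548 + 3.0
```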
– Loss function. The loss function takes in a score and a ground-truth label, and outputs a scalar. The loss function indicates how well the model's predicted score fits the ground-truth. We will use L to denote the loss.
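For example, with the squared loss, a common choice for regression (whether MP2 uses exactly this form and scaling is specified in the provided docstrings):

```python
def squared_loss(score, y):
    """L = 0.5 * (score - y)^2: a scalar measuring fit to the ground truth."""
    return 0.5 * (score - y) ** 2

# A prediction of 210000 against a true SalePrice of 208500:
loss = squared_loss(210000.0, 208500.0)
```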
– Backward operation. The backward operation computes the gradient of the loss function with respect to the model parameters. It is computed after the forward operation and is used to update the model.
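For the squared loss with F = w^T x, the gradient with respect to w follows from the chain rule: dL/dw = (w^T x - y) x. A sketch under that assumption (names illustrative):

```python
import numpy as np

def backward(x, w, y):
    """Gradient of L = 0.5*(w^T x - y)^2 w.r.t. w, i.e. (w^T x - y) * x."""
    score = np.dot(w, x)
    return (score - y) * x

x = np.array([2.0, 1.0])
w = np.array([1.0, 0.0])
grad = backward(x, w, y=3.0)   # score = 2, residual = -1, grad = [-2, -1]
```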
– Gradient descent. In models/train_eval_model.py, we will implement gradient descent. Gradient descent is an optimization algorithm where the model adjusts its parameters in the direction of the negative gradient of L.
Repeat until convergence:
w^(t) = w^(t-1) - η ∇L^(t-1)
The above equation is referred to as an update step, which consists of one pass of the forward and backward operations.
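Putting the pieces together, a full gradient-descent loop on the mean squared loss might look like the following sketch; the step size, stopping rule, and function name are illustrative, and the actual interface in models/train_eval_model.py may differ:

```python
import numpy as np

def gradient_descent(X, y, eta=0.1, steps=2000):
    """Fit w by repeating w <- w - eta * grad L (L = mean squared loss)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - y          # forward pass on all examples
        grad = X.T @ residual / n     # backward: average gradient of 0.5*r^2
        w = w - eta * grad            # the update step above
    return w

# Tiny example: recover y = 2*x + 1, with a constant-1 column as the bias.
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = gradient_descent(X, y)            # converges close to [2., 1.]
```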
– Linear regression also has an analytic solution, which we will implement as well.
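Assuming the squared loss, the analytic solution is the least-squares solve of the normal equations; a sketch (np.linalg.lstsq is used here because it handles rank-deficient X more robustly than inverting X^T X, though the assignment may prescribe a different formulation):

```python
import numpy as np

def linear_regression_analytic(X, y):
    """Solve min_w ||Xw - y||^2 in closed form via least squares."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w_exact = linear_regression_analytic(X, y)   # [2., 1.]
```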
To run main.py
python main.py
– How does the learning rate affect convergence?
– Which optimization is better, the analytic solution or gradient descent?
– Are squared features better? Why?
– Which of the column features are important?
Part 3: Writing Tests
In test.py we have provided basic test cases. Feel free to write more. To test the code,
run
nose2
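A new test case might look like the following; nose2 discovers unittest-style cases automatically. The forward stand-in here is an assumption for illustration; in your own tests you would import the real functions from the assignment modules:

```python
import unittest
import numpy as np

def forward(x, w):
    """Stand-in for the model's forward operation: F = w^T x."""
    return np.dot(w, x)

class TestForward(unittest.TestCase):
    def test_bias_only(self):
        # With all-zero features and a trailing 1, the score equals the bias.
        x = np.array([0.0, 0.0, 1.0])
        w = np.array([5.0, -2.0, 3.0])
        self.assertAlmostEqual(forward(x, w), 3.0)

unittest.main(argv=['tests'], exit=False)
```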
Part 4: Submit
Submitting the code is equivalent to committing the code. This can be done with the
following command:
svn commit -m "Some meaningful comment here."
Lastly, double check on your browser that you can see your code at
https://subversion.ews.illinois.edu/svn/sp18-cs446/(netid)/mp2/