Machine Problem 3


Note: The assignment will be autograded. It is important that you do not use additional libraries or change the provided functions' input and output. Also, you should include your best-trained 'trained_weights.np' under codefromscratch when you make your submission to SVN.

 

 

Part 1:  Setup

 

  • Remote connect to an EWS machine.

 

ssh (netid)@remlnx.ews.illinois.edu

 

 

  • Load the Python module; this will also load pip and virtualenv.

 

module load python/3.4.3

 

 

  • Reuse the virtual environment from mp0.

 

source ~/cs446sp_2018/bin/activate

 

 

  • Copy mp3 into your svn directory, and change directory  to mp3.

 

cd ~/(netid)

svn cp https://subversion.ews.illinois.edu/svn/sp18-cs446/_shared/mp3 .
cd mp3

 

 

  • Install the requirements  through  pip.

 

pip install -r requirements.txt

 

 

  • Create a data directory and download the data into it.

 

mkdir data

wget --user (netid) --ask-password \

http://courses.engr.illinois.edu/cs446/secure/assignment3_data.zip \

-O data/assignment3_data.zip

 

 

  • Unzip assignment3_data.zip.

 

unzip data/assignment3_data.zip -d data/

 

 

 


 

 

 

  • Prevent svn from checking in the data directory.

 

svn propset svn:ignore data .

 

Part 2:  Exercise

In this exercise, we will build a logistic-model-based binary classifier that can separate two-class audio datasets. Specifically, the datasets consist of features extracted from audio waveforms of letters of the alphabet. The two classes represent the ee-set (b,c,d,e,g,p,t,v,z) and the eh-set (f,l,m,n,s,x), respectively. The binary classifier will be based on the logistic regression model from lecture as well as the sigmoid function from the written homework.

The overall flow of the binary classifier is illustrated in Fig. 1, and we will implement each of the blocks in both the "code-from-scratch" and "TensorFlow" parts. In main.py, the overall program structure is provided for you.

 

 

 

 

 

Figure 1: High-level Flow. Training: Data (x, y) → Preprocessing → Model → Score → Loss, with a forward operation (compute score/loss) and a backward operation (gradient update). Testing: a forward operation (classification) produces the output.

 

 

 

Part 2.1  Code-From-Scratch Implementation

 

  • Reading and Preprocessing Data.

 

In io_tools.py, we will fill in one function for reading in the dataset as well as preprocessing it. The downloaded dataset has only a "trainset" because the real "testset" belongs to the autograder, which is used to evaluate the actual accuracy of your classifier. Both "trainset" and "testset" share the same format. Specifically, the "trainset" has an "indexing.txt" file and a "samples" folder: the former lists all samples line-by-line in the form "class_label sample_file.txt", and the latter contains all sample files in ASCII text format. For more details, check the "README.txt". After the dataset is read, it should be preprocessed to match the form specified in the function's return arguments. More details are provided in the function docstring.
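As a rough sketch only (the real signature and return layout are specified in the io_tools.py docstring; the name read_dataset, the bias-prepending, and the label handling here are assumptions), the reading function might look like:

```python
import os
import numpy as np

def read_dataset(path_to_dataset_folder, index_filename):
    """Hypothetical sketch: parse an indexing file and load ASCII sample files.

    Each index line is assumed to be "class_label sample_file.txt",
    with class_label in {+1, -1}.
    """
    A = []  # feature rows
    T = []  # labels
    with open(os.path.join(path_to_dataset_folder, index_filename)) as f:
        for line in f:
            label, sample_file = line.split()
            features = np.loadtxt(os.path.join(path_to_dataset_folder, sample_file))
            A.append(np.concatenate(([1.0], features)))  # prepend the bias term
            T.append(int(label))
    return np.asarray(A), np.asarray(T).reshape(-1, 1)
```

Whatever the final shape, prepending the constant 1 here is what lets the model treat the bias b as just another weight, matching the x = [1, x], w = [b, w] convention used later.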

 

  • Logistic Model

 

In logistic_model.py, we will implement a class for logistic models and use it for binary classification. The class will support the following operations:

 

–  Forward operation. The forward operation is the function which takes input features and generates a probability score. In this case, for logistic models, it is score = g(w^T x) = 1 / (1 + e^(−w^T x)), where the sigmoid function is the one from the written homework. Note that we redefine x = [1, x] and w = [b, w].

 

–  Backward operation. The backward operation computes the gradient of the loss function with respect to the model parameters. It is computed after the forward operation in order to update the model. Note that, in this part, the loss function should be Σ_i log(1 + exp(−y^(i) w^T φ(x^(i)))), as in the lecture slides.

–  Classify operation. The classify operation performs binary classification on the input dataset. Each input sample is scored with the current up-to-date model and predicted as one of the two classes {+1, −1}. Since all samples are taken as input, the final output will be a vector of predicted class labels.

 

–  Fit operation. This is the training function, which uses the above methods as necessary to fit the logistic model to the input dataset through gradient descent. As in lecture, gradient descent is an optimization algorithm in which the model adjusts its parameters in the direction of the negative gradient of the loss. The gradient-descent update should be repeated iteratively until convergence.

 

–  Save/Load operation. After training for hundreds or thousands of iterations, the final model should be saved into a binary file using np.tofile('trained_weights.np') and restored with np.fromfile('trained_weights.np'), so that, in the future, the well-trained model can be loaded immediately to perform classification tasks. Note that you should also include your best-trained 'trained_weights.np' under codefromscratch when submitting your code to SVN.
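Putting the operations above together, a minimal NumPy sketch might look as follows. The class name matches the handout, but the method signatures, the zero initialization, and the fixed iteration count are assumptions, so defer to the docstrings in logistic_model.py:

```python
import numpy as np

class LogisticModel:
    """Hypothetical sketch of the from-scratch model (not the graded API)."""

    def __init__(self, ndims, w_init=0.0):
        # x is redefined as [1, x], so the weight vector is [b, w].
        self.W = np.full((ndims + 1,), w_init)

    def forward(self, X):
        # score = g(w^T x) = 1 / (1 + exp(-w^T x))
        return 1.0 / (1.0 + np.exp(-X.dot(self.W)))

    def backward(self, Y_true, X):
        # Gradient of sum_i log(1 + exp(-y_i w^T x_i)) w.r.t. W,
        # which works out to sum_i (-y_i x_i) / (1 + exp(y_i w^T x_i)).
        z = Y_true * X.dot(self.W)
        coeff = -Y_true / (1.0 + np.exp(z))
        return X.T.dot(coeff)

    def classify(self, X):
        # Scores above 0.5 map to +1, otherwise -1.
        return np.where(self.forward(X) > 0.5, 1, -1)

    def fit(self, Y_true, X, learn_rate, max_iters=1000):
        # Plain gradient descent on the log-loss (a convergence
        # check could replace the fixed iteration count).
        for _ in range(max_iters):
            self.W -= learn_rate * self.backward(Y_true, X)

    def save_model(self, weight_file):
        self.W.tofile(weight_file)

    def load_model(self, weight_file):
        self.W = np.fromfile(weight_file)
```

Note that save/load via tofile/fromfile stores raw float64 bytes with no shape information, which is why the handout fixes a single file name and a known weight layout.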

 

  • Running Experiments. main.py shows the overall flow, tying together all of the previous methods. We will not grade main.py, so feel free to modify it.

 

 

Relevant Files: everything in  the folder codefromscratch

 

 

 

Part 2.2  TensorFlow Implementation

In this part, we will implement the model of the previous section using TensorFlow (TF). The TF part focuses more on the training method, using the quadratic loss function Σ_i (y_i − g(w^T x_i))^2 where y ∈ {0, 1}, instead of the previous loss function Σ_i log(1 + exp(−y^(i) w^T φ(x^(i)))) where y ∈ {−1, 1}. Also, for simplicity, fewer operations need to be implemented in the TF part.

 

  • Reading and Preprocessing Data. In general, everything  is the same except the class labels.

 

  • Logistic Model. The class in the TF part only needs to support these operations:

 

–  Build Graph operation. This operation builds the entire training and testing TF graph for our logistic classifier, including model creation, forward score/loss computation, optimizer definition, and training-accuracy computation. In our experience, it is best to put the entire TF graph in a single method.

–  Fit operation. The training operation in TF uses the quadratic loss mentioned earlier and performs iterative gradient-descent updates on the weights, computing the forward score and training accuracy during each iteration. After the model is trained, this operation also returns the score vector of the last iteration.
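To make the loss switch concrete, here is a plain-NumPy sketch of the single gradient step that the TF optimizer performs under the quadratic loss. The function name and signature are illustrative only; in the actual assignment this update is built into the TF graph rather than written by hand:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def quadratic_loss_step(W, X, Y, learn_rate):
    """One gradient-descent step on sum_i (y_i - g(w^T x_i))^2, y in {0, 1}.

    Mirrors what the TF optimizer does each iteration; name and
    signature are hypothetical, not part of the handout.
    """
    g = sigmoid(X.dot(W))
    # Chain rule: d/dW uses the sigmoid derivative g' = g * (1 - g).
    grad = X.T.dot(-2.0 * (Y - g) * g * (1.0 - g))
    return W - learn_rate * grad
```

The extra g(1 − g) factor (absent from the log-loss gradient) is what makes the quadratic loss flatten out when the sigmoid saturates, which is one reason the two losses train differently.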

 

  • Running Experiments. main.py shows the overall flow, tying together all of the previous methods. We will not grade main.py, so feel free to modify it.

 

 

Relevant Files: everything in  the folder codefromtf

 

 

 

Part 3:  Writing Tests

In test_scratch.py and test_tf.py, we have provided basic test cases. Feel free to write more. To test the code, run

 

python test_*.py -v

or

nose2

 

 

Part 4:  Submit

Submitting the code is equivalent to committing the code. This can be done with the following command:

 

svn commit -m "Some meaningful comment here."

 

 

Lastly, double check on your browser that  you can see your code at

 

https://subversion.ews.illinois.edu/svn/sp18-cs446/(netid)/mp3/