• Read Chapter 6 and Chapter 10 up to, but not including, Section 10.2.

Write a MATLAB program to perform the following handwritten digit recognition computations.

Step 01 Download the handwritten digit database from CANVAS and load this file into your MATLAB session.

(a) This file contains four arrays

  • train_patterns

  • test_patterns

of size 256 × 4649 and

  • train_labels

  • test_labels

of size 10 × 4649. You may find it helpful to think of these arrays as matrices. The arrays train_patterns and test_patterns contain a raster scan of the 16 × 16 gray level pixel intensities that have been normalized to lie within the range [−1, 1]. The arrays train_labels and test_labels contain the true information about the digit images. That is, if the jth handwritten digit image in train_patterns truly represents the digit i, then the (i + 1, j)th entry of train_labels is +1, and all the other entries of the jth column of train_labels are −1.
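The label encoding described above can be checked on a tiny example. The following is only a sketch: the five digits and the names true_digits, labels, and decoded are made up for illustration and are not part of the course database.

```matlab
% Hypothetical 10-by-5 label matrix for the digits 3, 0, 9, 3, 1 (NOT course data).
true_digits = [3 0 9 3 1];
n = numel(true_digits);

% Build the +1/-1 encoding: entry (i+1, j) is +1 when column j shows digit i.
labels = -ones(10, n);
labels(sub2ind([10 n], true_digits + 1, 1:n)) = 1;

% Decode: the row index of the +1 entry in each column, minus one, is the digit.
[~, row] = max(labels, [], 1);
decoded = row - 1;
disp(decoded);
```

The same decoding should apply to train_labels and test_labels once the file is loaded, assuming they follow the encoding exactly as stated.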

(b) Now, display the first 16 images in train_patterns using subplot(4,4,k) and imagesc functions in MATLAB.

Print out the figure and include it in your Programming Project LaTeX and PDF files.

[Hint: You need to reshape each column into a matrix of size 16 × 16 followed by transposing it in order to display it correctly.]
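One possible way to follow the hint is sketched below. It assumes train_patterns is already in the workspace; if it is not, the sketch substitutes random stand-in images so that it still runs on its own. The gray colormap and the axis settings are choices made here, not requirements of the assignment.

```matlab
% Use the loaded data if present; otherwise fall back to random stand-in
% images (an assumption made only so this sketch is self-contained).
if ~exist('train_patterns', 'var')
    train_patterns = 2*rand(256, 16) - 1;
end

figure;
colormap(gray);
for k = 1:16
    % Each column is a row-by-row raster scan, so reshape fills columns
    % first and the transpose restores the original orientation.
    img = reshape(train_patterns(:, k), 16, 16)';
    subplot(4, 4, k);
    imagesc(img);
    axis image off;
end
```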

Step 02 Read the description of this step in Chapter 10.01 of the textbook and/or Professor Saito’s Lecture 21. Compute the mean digits in train_patterns and put them in a matrix called train_aves of size 256 × 10, and display these 10 mean digit images using subplot(2,5,k) and imagesc. Print out the figure as a PDF file and include it in your LaTeX and PDF documents.

Hint: You can gather (or pool) all the images in train_patterns corresponding to digit k − 1 (1 ≤ k ≤ 10) using the following MATLAB command:

>> train_patterns(:, train_labels(k,:)==1);
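A minimal sketch of the averaging step, built around the pooling command above, might look as follows. The stand-in data (400 random images with cyclic labels) is an assumption used only to make the sketch runnable; with the real database the first block would be replaced by loading the file.

```matlab
% Hypothetical stand-in data (NOT the course database).
rng(0);
n = 400;
digits = mod(0:n-1, 10);                       % true digit of each column
train_labels = -ones(10, n);
train_labels(sub2ind([10 n], digits + 1, 1:n)) = 1;
train_patterns = 2*rand(256, n) - 1;

% Column k of train_aves is the mean image of digit k-1.
train_aves = zeros(256, 10);
for k = 1:10
    pool = train_patterns(:, train_labels(k, :) == 1);   % all images of digit k-1
    train_aves(:, k) = mean(pool, 2);                     % average across images
end
```

Each column of train_aves can then be reshaped and displayed in the same way as an individual image.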

Step 03 Read the description of this step in Chapter 10.01 of the textbook and/or Professor Saito’s Lecture 21. Now conduct the simplest classification computations as follows.

(a) First, prepare a matrix called test_classif of size 10 × 4649 and fill this matrix by computing the Euclidean distance (or its square) between each image in test_patterns and each mean digit image in train_aves.

  • Hint: the following line computes the squared Euclidean distances between all of the test digit images and the kth mean digit of the training dataset with one line of MATLAB code:

    • sum((test_patterns-repmat(train_aves(:,k),[1 4649])).^2);
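The one-line hint can be wrapped in a loop over k to fill the whole matrix. The sketch below uses random stand-in arrays (an assumption, not the course data) and takes the number of test images from the array instead of hard-coding 4649.

```matlab
% Hypothetical stand-in data (NOT the course database).
rng(0);
n = 50;
test_patterns = 2*rand(256, n) - 1;
train_aves = 2*rand(256, 10) - 1;

% Row k holds the squared Euclidean distance from every test image
% to the kth mean digit.
test_classif = zeros(10, n);
for k = 1:10
    d = test_patterns - repmat(train_aves(:, k), [1 n]);
    test_classif(k, :) = sum(d.^2);
end
```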


© 2017 Prof. N. Saito (Revised by Prof. E. G. Puckett) { 1 { Revision 2.00 Thu 19th May, 2016 at 13:38

(b) Compute the classification results by finding the position index of the minimum of each column of test_classif.

Put the results in a vector test_classif_res of size 1 × 4649.

  • Hint: You can find the position index giving the minimum of the jth column of test_classif by

    • [tmp, ind] = min(test_classif(:,j));

Then, the variable ind contains the position index, an integer between 1 and 10, of the smallest entry of test_classif(:,j).

(c) Finally, compute the confusion matrix test_confusion of size 10 × 10, print out this matrix, and submit your results in the PDF file containing your report.
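As an alternative to looping over j, min can be applied to all columns at once. The 3 × 3 matrix below is a made-up example chosen so the result is easy to verify by hand.

```matlab
% Tiny hypothetical distance matrix (3 classes, 3 test images).
test_classif = [5 1 9;
                2 7 0.5;
                8 3 4];

% min along dimension 1 returns, for every column, the row index of
% its smallest entry.
[~, test_classif_res] = min(test_classif, [], 1);
disp(test_classif_res);   % prints 2 1 2
```

Note that these are position indices; the recognized digit is the index minus one.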

[ Hint: First gather the classification results corresponding to the (k − 1)st digit by

>> tmp = test_classif_res(test_labels(k,:)==1);

This tmp array contains the results of your classification of the test digits whose true digit is k − 1 for 1 ≤ k ≤ 10. In other words, if your classification results were perfect, all the entries of tmp would be k. But in reality, this simplest classification algorithm makes mistakes, so tmp contains values other than k. You need to count how many entries have the value j in tmp, for j = 1 : 10. This will give you the kth row of the test_confusion matrix. ]
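The counting procedure in the hint can be sketched on a small example. The six true digits and predicted indices below are invented for illustration only.

```matlab
% Hypothetical example: six test images (NOT course data).
true_digits      = [0 0 1 1 1 9];     % true digit of each test image
test_classif_res = [1 2 2 2 1 10];    % predicted position index (digit + 1)

n = numel(true_digits);
test_labels = -ones(10, n);
test_labels(sub2ind([10 n], true_digits + 1, 1:n)) = 1;

% Entry (k, j) counts the images of true digit k-1 classified as digit j-1.
test_confusion = zeros(10, 10);
for k = 1:10
    tmp = test_classif_res(test_labels(k, :) == 1);
    for j = 1:10
        test_confusion(k, j) = sum(tmp == j);
    end
end
disp(test_confusion);
```

In this example the first row is 1 1 0 ... 0: one image of the digit 0 was classified correctly and one was mistaken for the digit 1.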


Step 04 Read the description of this step in Chapter 10.02 of the textbook and/or Professor Saito’s Lecture 21. Now conduct an SVD-based classification computation.

(a) Pool all of the images corresponding to the kth digit in train_patterns, compute the rank 17 SVD of that set of images, i.e., the first 17 singular values and vectors, and put the left singular vectors (or the matrix U) of the kth digit into the array train_u of size 256 × 17 × 10. For k = 1 : 10, you can do this with the following code:

>> [train_u(:,:,k),tmp,tmp2] = svds(train_patterns(:,train_labels(k,:)==1),17);

You do not need the singular values and right singular vectors in this computation.
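Placed inside a loop, the svds command above fills train_u one digit at a time. The sketch below runs on random stand-in data (an assumption, not the course database) and discards the unused outputs with ~.

```matlab
% Hypothetical stand-in data (NOT the course database).
rng(0);
n = 400;
digits = mod(0:n-1, 10);
train_labels = -ones(10, n);
train_labels(sub2ind([10 n], digits + 1, 1:n)) = 1;
train_patterns = 2*rand(256, n) - 1;

% Page k of train_u holds the 17 leading left singular vectors
% of the pooled images of digit k-1.
train_u = zeros(256, 17, 10);
for k = 1:10
    pool = train_patterns(:, train_labels(k, :) == 1);
    [train_u(:, :, k), ~, ~] = svds(pool, 17);
end
```

The columns of each train_u(:,:,k) are orthonormal, which is what makes the projection in the next part a simple matrix product.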

(b) Compute the expansion coefficients of each test digit image with respect to the 17 singular vectors of each train digit image set. In other words, you need to compute 17 × 10 numbers for each test digit image. Put the results in the 3D array test_svd17 of size 17 × 4649 × 10. This can be done with the commands

>> for k=1:10
     test_svd17(:,:,k) = train_u(:,:,k)' * test_patterns;
   end

(c) Next, compute the error between each original test digit image and its rank 17 approximation using the kth digit images in the training data set. The idea of this classification is that a test digit image should belong to the class of the kth digit if the corresponding rank 17 approximation is the best approximation (i.e., the smallest error) among 10 such approximations. Prepare a matrix test_svd17res of size 10 × 4649, and put those approximation errors into this matrix.

  • Hint: The rank 17 approximation of test digits using the 17 left singular vectors of the kth digit training images can be computed by train_u(:,:,k)*test_svd17(:,:,k);

(d) Finally, compute the confusion matrix using this SVD-based classification method by following the same strategy as in Step 03(b) and Step 03(c) above. Name this confusion matrix test_svd17_confusion. Include this matrix in your report and submit your results.
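The projection and error computation could be sketched as follows. Several things here are assumptions: the data is random, the orthonormal bases come from a QR factorization instead of svds (only to keep the sketch short), the error is measured in the Euclidean norm, and test_svd17_class is a hypothetical name for the resulting position indices.

```matlab
% Hypothetical stand-in data and bases (NOT the course database).
rng(0);
n = 20;
test_patterns = 2*rand(256, n) - 1;
train_u = zeros(256, 17, 10);
for k = 1:10
    [train_u(:, :, k), ~] = qr(randn(256, 17), 0);   % orthonormal columns
end

test_svd17    = zeros(17, n, 10);
test_svd17res = zeros(10, n);
for k = 1:10
    % Expansion coefficients, then the rank 17 approximation from the hint.
    test_svd17(:, :, k) = train_u(:, :, k)' * test_patterns;
    approx = train_u(:, :, k) * test_svd17(:, :, k);
    % Euclidean norm of the residual of every test image (assumed error measure).
    test_svd17res(k, :) = sqrt(sum((test_patterns - approx).^2, 1));
end

% Classify each test image by the basis with the smallest residual.
[~, test_svd17_class] = min(test_svd17res, [], 1);
```

Because each basis is orthonormal, the squared residual plus the squared norm of the coefficients equals the squared norm of the image, which gives a quick sanity check on the computation.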




(a) For Step 01 explain your understanding of the data structure in which the images of the digits are stored. In particular, include a brief explanation of the difference between the training data and the test data. (This is a simple example of machine learning. These are most likely the first machine learning algorithms to be widely used in the ‘real world’.)

(b) Give an explanation of what you are doing in Step 02, and why you are doing it. You will find some helpful comments concerning Step 02 in Chapter 10.01 of the textbook. Include some thoughts to support your comments.

(c) Comment on the intermediate results at the end of Step 03 and at the end of Step 04. How effective is each algorithm; i.e., for that particular algorithm what percentage of each digit is identified correctly? Which digit is the most difficult to identify correctly? Which digit is the easiest to identify correctly? You can obtain all of this information from the confusion matrices you produced in Step 03 and Step 04. Include some thoughts to support your comments. In particular, in YOUR OWN WORDS explain the theory that is behind the algorithm in (a)-(d). (This is discussed in detail in Chapter 10.2 of the textbook.)

(d) Summarize all of your results in a separate section at the end. Compare your results from Step 03 and Step 04. Which of the two algorithms yields the better result? Why?
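The per-digit percentages asked for in item (c) can be read off a confusion matrix as sketched below. The 3 × 3 matrix C and the variable names are hypothetical; the same expressions should work for the 10 × 10 matrices from Step 03 and Step 04, where position index k corresponds to digit k − 1.

```matlab
% Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
C = [8 1 1;
     0 9 1;
     2 2 6];

% Percentage of each true class that was classified correctly.
per_class = 100 * diag(C) ./ sum(C, 2);

% Overall percentage of correctly classified images.
overall = 100 * trace(C) / sum(C(:));

% Position indices of the hardest and easiest classes.
[~, hardest] = min(per_class);
[~, easiest] = max(per_class);
disp(per_class');   % prints 80 90 60
```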

Step 06 Submit a well documented MATLAB program named

“Digit_Recognition_youremailname.m”

This program should perform all of the tasks in Step 01 to Step 04 above without any user input. It is sufficient to have your program print the various images and tables on the computer screen. In particular, your program does not have to produce a PDF file containing the images of the digits produced in Step 01(b) and Step 02.

Again, here is a description of what is meant by a well documented MATLAB program.

DO NOT submit only the MATLAB source code without comments. Furthermore, DO NOT include the bare minimum of explanation for each subsection of your code. Please consider using an active mind when including comments in your program. In particular, as technicians and highly educated individuals, it is worth your time to describe what you are doing IN YOUR OWN WORDS for each individual segment of the code; i.e., each portion of the code that performs a separate task, even if it is ‘only’ inputting a file. For example: What is the format of the file (binary, text, MATLAB data structures)? What is contained in the file? How is it stored? Relate the algorithm(s) back to the theory we have been studying in lecture and in the homework assignments. When you read your own code, you should be able to easily identify what you have learned from writing this program, and how this relates to the themes presented in lectures and in the textbook.
