Assignment #1 Solution



List of assignment due dates.

The assignment should be submitted via Blackboard.

Note: All data needed for this assignment can be downloaded from

Task 1 (40 points)

Figure 1: Bounding box of the person in frame 62 of the “walkstraight” sequence.

Write a Matlab function called find_bounding_box that takes as its argument the name of an image file from the “walkstraight” sequence and computes the bounding box of the person. The function should RETURN the bounding box as a matrix of four numbers: [top row, bottom row, left column, right column]. Furthermore, as a side effect, the function should display a figure that shows the original image with a yellow (color code: [255 255 0]) rectangle superimposed, representing the detected bounding box. Your function can use data from any frame of the sequence in order to determine the bounding box for the frame in question.

Your function should be named find_bounding_box, and should take a single argument, i.e., the filename specifying a frame of the sequence. For example:

> find_bounding_box('walkstraight/frame0052.tif');

Don’t worry about how the function works when the person is not visible, or is only partially visible.
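One possible core for Task 1 is frame differencing, sketched below under stated assumptions. The helper name bounding_box_from_frames, its arguments, and the threshold are hypothetical and are not part of the course code. The sketch works on three grayscale matrices that have already been read from disk (the frame in question plus an earlier and a later frame), and it does not display anything.

```matlab
function bbox = bounding_box_from_frames(prev, cur, next, threshold)
% Hypothetical helper (not part of the course code): estimates the bounding
% box of a moving object from three grayscale frames of equal size.
% Returns [top, bottom, left, right], or [] if no pixel exceeds threshold.

prev = double(prev);
cur  = double(cur);
next = double(next);

% A pixel counts as "moving" only if the current frame differs from BOTH
% neighbors. This suppresses the ghost left at the object's old position.
motion = min(abs(cur - prev), abs(cur - next));
mask = motion > threshold;   % threshold is an assumption, tune it on real frames

rows = find(any(mask, 2));   % rows containing at least one moving pixel
cols = find(any(mask, 1));   % columns containing at least one moving pixel

if isempty(rows)
    bbox = [];
else
    bbox = [rows(1), rows(end), cols(1), cols(end)];
end
end
```

A complete find_bounding_box would wrap this helper: read the requested frame and two neighboring frames, call the helper, and draw the yellow rectangle on the original image. Raw differencing is sensitive to noise, since a single noisy pixel enlarges the box, so some smoothing or filtering of the mask is likely needed on real data.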

Task 2 (40 points)

Write a Matlab function called person_present that can tell when no person is present. Don’t worry about how your algorithm performs on borderline cases, like frames 5-32 when the person is not fully visible. However, your algorithm should be able to tell, for example, that there is no person at frame 3, and that there is a person at frame 62. The function should return 1 if the person is present, and 0 otherwise.

Your function should be named person_present, and should take a single argument, i.e., the filename specifying a frame of the sequence. For example:

> person_present('walkstraight/frame0052.tif');
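As a rough illustration of one vision-based test for Task 2, the sketch below counts how many pixels change between the frame and two other frames, and reports a person only if enough pixels moved. The helper name motion_present and both thresholds are hypothetical; suitable values would have to be found by experimenting on the sequence.

```matlab
function present = motion_present(prev, cur, next, diff_threshold, min_pixels)
% Hypothetical helper (not part of the course code): returns 1 if at least
% min_pixels pixels of cur differ from both prev and next by more than
% diff_threshold, and 0 otherwise. Inputs are grayscale frames of equal size.

cur = double(cur);
motion = min(abs(cur - double(prev)), abs(cur - double(next)));

% Count the pixels that changed with respect to both comparison frames.
num_moving = sum(motion(:) > diff_threshold);

present = double(num_moving >= min_pixels);
end
```

A person_present function built on this would read the three frames from disk and pass them in. The decision depends only on image content, not on the frame number.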

Task 3 (20 points)

Write a Matlab function called person_speed that returns the average velocity of the person, between two frames. The function should return a 1×2 matrix [rows_per_frame, cols_per_frame], that specifies, in pixels, the velocity of the person along the vertical direction (rows, increasing from top to bottom) and the horizontal direction (columns, increasing from left to right).

Your solution can be built on top of your find_bounding_box function: call find_bounding_box twice, to find the person in both frames, and calculate the velocity based (somehow) on the results of the find_bounding_box function. Notice that your function must return the velocity, which specifies the direction of motion (and that is why it needs to be a 2D vector), NOT the speed, which is a single number.

Your function should be named person_speed, and should take two arguments, i.e., the filenames specifying two frames of the sequence. For example:

> person_speed('walkstraight/frame0052.tif', 'walkstraight/frame0062.tif');

Again, don’t worry about how the function works when the person is not visible, or is only partially visible.
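One way to turn two bounding boxes into a velocity is to divide the displacement of the box centers by the number of frames between them. The sketch below assumes this center-based definition; the helper name velocity_from_boxes is hypothetical, and the frame numbers are assumed to have been extracted from the filenames beforehand.

```matlab
function velocity = velocity_from_boxes(bbox1, frame1, bbox2, frame2)
% Hypothetical helper (not part of the course code).
% bbox1, bbox2: [top, bottom, left, right] for the two frames.
% frame1, frame2: frame numbers, assumed to be different.
% Returns [rows_per_frame, cols_per_frame].

% Center of each box as [row, column].
center1 = [(bbox1(1) + bbox1(2)) / 2, (bbox1(3) + bbox1(4)) / 2];
center2 = [(bbox2(1) + bbox2(2)) / 2, (bbox2(3) + bbox2(4)) / 2];

% Signed displacement per frame: positive rows mean downward motion,
% positive columns mean rightward motion.
velocity = (center2 - center1) / (frame2 - frame1);
end
```

For example, boxes [10 50 20 40] at frame 52 and [12 52 70 90] at frame 62 have centers (30, 30) and (32, 80), so the result is [0.2, 5]: the person moves mostly to the right, about 5 columns per frame.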

Optional Task (just for fun and no credit)

Design and implement a computer vision-based heuristic that can tell us something about the pose of the person. One pose (let’s call it Pose 1) is exemplified at frames 48, 67, and 84 (among others), where one leg is extended forward and the other leg is extended backwards. Another pose (let’s call it Pose 2) is exemplified at frames 40, 58, and 75, where the legs are next to each other. You can incorporate your heuristic into the solution for Task 1 (the function can print, as a side effect, POSE 1 or POSE 2). Again, don’t worry about cases in between, like frames 56 or 78; in those cases, just let your program print its best guess.
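A very simple heuristic, offered only as a starting point, is to compare the width of the bounding box to its height: when the legs are spread apart the box tends to be wider. The helper name guess_pose and the ratio threshold are hypothetical, and whether this separates the two poses on the actual sequence is untested here.

```matlab
function pose = guess_pose(bbox, ratio_threshold)
% Hypothetical helper (not part of the course code).
% bbox: [top, bottom, left, right]. Returns 1 (legs apart) or 2 (legs
% together) and prints the guess as a side effect.

height = bbox(2) - bbox(1) + 1;
width  = bbox(4) - bbox(3) + 1;

% A relatively wide box is taken as a sign of an extended stride.
if width / height > ratio_threshold
    pose = 1;
else
    pose = 2;
end
fprintf('POSE %d\n', pose);
end
```

Swinging arms also widen the box, so a more robust variant might measure the width of only the lower part of the person.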

Hints and Suggestions

  • The walkstraight sequence can be downloaded from here.
  • Most of the solution for Task 1 is included in the code we covered in the introductory slides. You just need to package it up nicely as a single function.
  • Use the addpath function if you need to let Matlab know where to find directories containing user-defined functions. Type help addpath to see how that works, or see examples in the code posted on the course website.
  • In general, familiarize yourselves with the code we used in the introductory slides. You will find lots of Matlab tricks there that can be handy for this assignment.
  • File draw_rectangle.m implements a function that draws a rectangle.
  • Files parse_frame_name.m and make_frame_name.m contain code that you should feel free to use, and that you may find useful if you want your code to automatically figure out the filename of the next frame, or previous frame, and so on. For example, try:
[sequence_name, frame] = parse_frame_name('walkstraight/walkstraight0062.tif');
filename = make_frame_name(sequence_name, frame+1);

Things to note

  • The correct solutions should be functions, not scripts. See the read_gray.m file for an example of a function.
  • There is no single unique answer for any of the tasks. Just make sure your solution behaves reasonably well.
  • It goes without saying that the solutions should be based only on computer vision, not on tricks like using the frame numbers. For example, if your solution for task 2 simply checks if the frame numbers are too small or too large, that is not a computer vision-based solution.

How to submit

Submissions are only accepted via Blackboard. Submit a single ZIP file containing the following files:

  • The Matlab source files implementing your solutions to the programming tasks.
  • Any additional Matlab source files that are needed to run your code. If your code needs any code files available on the course website, please include those files with your submission.
  • A README.txt file containing the following:
    • Name and UTA ID of the student.
    • A description, in text, of your solution for Task 2, and how well it worked (examples where it worked, examples where it didn’t work, if any).
    • A description, in text, of your solution for Task 3, and how well it worked (examples where it worked, examples where it didn’t work, if any).

We try to automate the grading process as much as possible. Not complying precisely with the above instructions and naming conventions causes a significant waste of time during grading, and thus points will be taken off for failure to comply, and/or you may receive a request to resubmit.

Please only include source code in your submissions. Do not include data files.

Code must run in Matlab version 2018b.

The submission should be a ZIP file. Any other file format will not be accepted.