Homework 03 Solution

$26.99

Category:

Description

Introduction

 

This homework will focus on writing functions on lists and proving properties  of them.  This homework is longer and harder  than  the previous two:  start  early!

 

 

1.1     Getting The Homework Assignment

 

The starter files for the homework assignment have been distributed through  our git repos- itory, as usual.

 

 

1.2     Submitting The Homework Assignment

 

Submissions will be handled  through  Autolab,  at

 

https://autolab.cs.cmu.edu

 

In preparation for submission, your hw/03 directory  should contain  a file named exactly

hw03.pdf containing  your written  solutions to the homework.

To submit your solutions, run make from the hw/03 directory (that contains a code folder and a file hw03.pdf). This should produce a file hw03.tar, containing  the files that  should be handed  in for this homework assignment.   Open the Autolab  web site, find the page for this assignment,  and submit  your hw03.tar file via the “Handin  your work” link.

 

The Autolab handin script does some basic checks on your submission:  making sure that the file names are correct; making sure that  no files are missing; making sure that  your code compiles cleanly.  Note that  the handin  script  is not  a grading script—a  timely submission that  passes the handin script will be graded, but will not necessarily receive full credit.  You can view the results of the handin script by clicking the number corresponding to the “check” section of your latest handin on the “Handin History” page. If this number is −10.0, your submission failed the check script; if it is 0.0, it passed.

Remember  that  your written  solutions  must  be submitted in PDF  format—we  do not accept MS Word files or other formats.

 

Your  hw03.sml file must  contain  all the  code that  you want  to  have  graded  for this assignment,  and must compile cleanly.  If you have a function that  happens to be named the same as one of the  required  functions  but  does not  have the  required  type, it will not  be graded.

 

 

 

1.4     Methodology

 

You must  use the five step methodology  discussed in class for writing functions,  for every

function you write in this assignment.  Recall the five step methodology:

 

  1. In the first line of comments, write the name and type of the function.

 

  1. In the second line of comments, specify via a REQUIRES clause any assumptions about the arguments  passed to the function.

 

  1. In the third line of comments, specify via an ENSURES clause what  the function com- putes (what  it returns).

 

  1. Implement the function.

 

  1. Provide testcases, generally in the format

val <return value> = <function> <argument value>. For example, for the factorial function presented  in lecture:

(* fact : int -> int

* REQUIRES: n >= 0

* ENSURES: fact(n) ==> n!

*)

 

fun fact (0 : int) : int = 1

| fact (n : int) : int = n * fact(n-1) (* Tests: *)

val 1 = fact 0

val 720 = fact 6

 

2    Zippidy Doo  Da

 

It’s often convenient to take a pair of lists and make one list of pairs from it.  For instance, if we have the lists

[“a”, “b”]        and        [5, 1, 2, 1]

 

we might be interested in the list

 

[(“a”, 5), (“b”, 1)]

 

 

Task 2.1 (5%).  Write  the function

 

zip : string list * int list -> (string * int) list

 

that  performs the transformation of pairing the nth element from the first list with the nth element of the second list.  If your function is applied  to a pair of lists of different lengths, the length of the returned  list should be the minimum of the lengths of the argument lists.

You should ensure that  zip is a total  function  (but  you do not need to formally prove this fact).

 

Some terminology:

A function f : t -> t’ is total iff the expression f(x) reduces to a value for all values x :

t.

 

Task 2.2 (5%).  Write  the function

 

unzip : (string * int) list -> string list * int list

 

unzip does the opposite of zip, in the sense that  it takes a list of pairs and returns  a pair of lists, where the first list in the pair is the list of first elements and the second list is the list of second elements.  You should ensure that  unzip is a total  function.  You may assume this fact in your proofs but  should cite it if used.

 

Task 2.3 (15%). Prove the following Theorem  by induction  on length of lists.  1.

 

Theorem 1.  For  all values l : (string * int) list, zip(unzip l) = l.

 

Hint:  Look at the way we reasoned about eval decimal n = n in Lecture 3 and 4 notes.

 

Task 2.4 (5%).  Prove or disprove Theorem  2.

 

Theorem 2.  For  all values l1 : string list and l2 : int list,

 

unzip(zip (l1,l2)) = (l1,l2)

 

3    Look And Say

 

3.1     Definition

 

If l is any list of integers, the look-and-say list of l is obtained  by reading off adjacent groups of identical elements in l. For example, the look-and-say list of l = [2, 2, 2] would be read as “three  twos”.  For our implementation of look-and-say, this would be represented  as [(3, 2)].

Similarly, the look-and-say list of

 

l = [1, 1, 2]        is    [(2, 1), (1, 2)]

 

because l is exactly “two ones, then one two.”

We  will use the  term  run  to  mean  a  maximal  length  sublist  of a  list  with  all equal elements.  For example,

 

are both runs of the list but

[1, 1, 1]        and        [5] [1, 1, 1, 5, 2]

[1, 1]        and        [5, 2]        and        [1, 2]

 

 

are not:  [1, 1] is not maximal, [5, 2] has unequal elements, and [1, 2] is not a sublist.

You will now define a function lookSay that  computes the look-and-say sequence of its argument using a helper function and a new pattern of recursion.

 

 

3.2     Implementation

 

Before defining the  lookSay function,  you will write a helper function  runWith. runWith

will remove a “run”  from the front of a list.

 

Task 3.1 (5%).  Write  the function runWith that  satisfies the following spec:

runWith: int*int list -> int list * int list runWith (x, L) = (L1, L2)

where L = L1@L2

and every element of L1 is equal to x and L2 does not begin with x.

 

Note that  L2 is either [ ] or has the form y::R where x = y. For example,

 

runWith(1,[1,2,3]) = ([1], [2,3]) runWith(1,[1,1,2,3]) = ([1,1], [2,3]) runWith(3,[1,2,3]) = ([], [1,2,3])

 

Note that  you can use the function inteq in hw03.sml to compare integers for equality.

 

Task 3.2 (10%). Now, write the function lookSay : int list -> (int * int) list using this helper function.1

Task 3.3 (3%).  Write  the function

 

flatten : (int * int) list -> int list

 

that  will “flatten  out” the list of pairs into a flat list of integers.

For example,

 

flatten([(1, 2)]) = [1,2]

flatten([(1, 2), (3, 4), (5, 6)]) = [1,2,3,4,5,6]

 

 

 

3.3     Cultural Aside

 

Repeated  applications  of our “look and  say” function,  followed by our “flatten”  function, results in a special sequence.

Below are the first 5 steps in the instance  of this sequence beginning with [1]:

 

[1] [1,1] [2,1] [1,2,1,1]

[1,1,1,2,2,1] [3,1,2,2,1,1]

 

This sequence is noted by Conway’s theorem,  which states  that  any element of this sequence will “decay” (by repeated  applications  of lookSay and flatten) into a “compound”  made up of combinations  of “primitive  elements” (there  are 92 of them, plus 2 infinite families) in

24 steps.  If you are interested  in this sequence, you may wish to consult [Conway(1987)] or other papers about  the “look and say” operation.

 

 

 

 

 

 

 

 

 

 

1  Hint:  The recursive  call in the inductive  case of lookSay will sometimes  be on a list that is more than one element shorter.

 

4    Prefix-Sum

 

The prefix-sum of a list l is a list s where the ith index element of s is the sum of the first

i + 1 elements of l. For example,

 

prefixSum [] = []

prefixSum [1,2,3] = [1,3,6]

prefixSum [5,3,1] = [5,8,9]

 

Note that  the first element of list is regarded as position 0.

 

Task 4.1 (5%).  Implement the function

 

prefixSum : int list -> int list

 

that  computes the prefix-sum.  You must  use the addToEach function provided, which adds an integer  to each element  of a list, and your solution  must  be in O(n2)  but  not  in O(n). This implementation will be simple, but  inefficient.

 

Task 4.2 (5%). Write a recurrence for the work of prefixSum, WprefixSum(n), where n is the length of the input  list.  Give a closed form for this recurrence.  Argue that  your closed form does indeed indicate  that  WprefixSum(n) is O(n2).

You may use variables k0, k1, . . . for constants.  You should assume that  addToEach is a linear time function:  addToEach l evaluates  to a value in kn steps where n is the length of l and k is some constant;  your recurrence should involve the constant k.

 

 

 

In order to compute  the  prefix sum operation  in linear time,  we will use the  technique  of adding an additional  argument:  harder  problems can be easier.

 

Task 4.3  (10%).  Write  the  prefixSumHelp function  that  uses an additional  argument  to compute  the prefix sum operation  in linear time.  You must  determine  what  the additional argument should be. Once you have defined prefixSumHelp, use it to define the function

 

prefixSumFast : int list -> int list

 

that  computes the prefix sum.

 

Task 4.4 (5%). Write a recurrence for the work of prefixSumFast, WprefixSumFast(n), where n is the  length  of the  input  list.  Give a closed form for this  recurrence.   Argue that  your closed form does indeed indicate  that  prefixSumFast is in O(n).

 

5    Sublist

 

When programming  with lists, we often need to work with a segment of a larger list.  For example, one might need to access only the last three  elements of a list or only the middle element.  Any such segment is called a sublist.

More formally:  if L is any list, we say that  S is a sublist of L starting  at i if and only if there exist l1 and l2 such that

 

 

 

and

l1@S@l2 = L

 

 

length l1 = i

 

 

For example, [1, 2] is a sublist of [1, 2, 3] starting  at 0 because

 

[ ]@[1, 2]@[3] = [1, 2, 3]       and        length [ ] = 0

 

 

Task 5.1 (3%). The spec for a function sublist that  computes sublists as defined above will have the form:

 

For  all  l:int list, i:int, k:int, if                                                  then  there exists an S such that  S is the sublist of l starting  at i, and

 

length S = k   and        sublist(i, k, l) = S

 

 

The blank is called the  preconditions,  and represents  assumptions  about  the input.   Fill in the  blank  to complete  this  spec correctly.   Note that  this  formal spec should very closely resemble any REQUIRES  and ENSURES clauses on your function.

 

Task 5.2 (7%).  Implement a function

 

sublist : int * int * int list -> int list

 

that  meets the spec you gave above.

Because the  spec has the  form of an  implication,  in the  body of sublist you should assume that  whatever  preconditions  you required in Task 5.1 are met:  if they are not, your function can do anything  you want and still meet its spec!

Note that  the definition above implies that we index lists from zero, so

 

sublist (0, 3, [1,2,3,4]) = [1,2,3]

 

The spec that  you completed above is good because it closely mirrors the abstract notion of a sublist,  but  bad because it’s very stringent:   any code calling sublist  must  ensure that the assumptions  about  the input  hold or else it will fail.  Since the exact mode of failure is not documented  in the type or in the spec, this can produce behaviour  that’s  very hard  to debug.

Sometimes, the caller will be able to prove that  these assumptions  hold because of other specification-level information.   Other  times, the information  available at  compile-time will not be enough to ensure that  these assumptions  are met.  In these circumstances,  you can use a run-time  check to bridge the  gap, which is something  we will be able to implement once we see exceptions.

 

6    Subset sum

 

A multiset  is a slight generalization  of a set where elements can appear  more than  once. A submultiset of a multiset  M is a multiset,  all of whose elements are elements of M . To avoid too many awkward sentences, we will use the term subset to mean submultiset.

It  follows from the definition  that  if U is a sub(multi)set of M , and  some element  x appears  in U k times, then  x appears  in M  at least k times.  If M  is any finite multiset  of integers, the sum of M is

X x

x∈M

With these definitions, the multiset  subset sum problem is answering the following question.

 

Let M be a finite multiset  of integers and n a target  value.  Does there exists any subset U of M such that  the sum of U is exactly n?

 

Consider the subset sum problem given by

 

M = {1, 2, 1, −6, 10}                n = 4

 

The answer is “yes” because there exists a subset of M that  sums to 4, specifically

 

U1 = {1, 1, 2}

 

It’s also yes because

 

U1 = {−6, 10}

 

 

sums to 4 and is a subset of M . However,

 

U3 = {2, 2}

 

is not a witness to the solution to this instance.  While U3 sums to 4 and each of its elements occurs in M , it is not a subset of M because 2 occurs only once in M but  twice in U2.

 

 

Representation   You’ll implement three solutions to the subset sum problem.  In all three, we represent multisets  of integers as SML values of type int list, where the integers may be negative.   You should think  of these  lists as just  an  enumeration of the  elements  of a particular multiset.  The order that  the elements appear  in the list is not important.

 

 

6.1     Basic solution

 

 

Task 6.1 (10%). Write  the function

 

subsetSum : int list * int -> bool

 

that  returns  true if and only if the input  list has a subset that  sums to the target  number. As a convention,  the  empty  list [ ] has a sum of 0.  Start  from the  following useful fact: each element of the set is in the subset, or it isn’t.2

 

 

6.2     NP-completeness and certificates

 

Subset sum is an interesting  problem because it is NP-complete.  NP-completeness  has to do with the time-complexity  of algorithms,  and is covered in more detail in courses like 15-251, but  here’s the basic idea:

 

  • A problem is in P if there is a polynomial-time algorithm for it—that is, an algorithm one whose work is in O(n),  or O(n2),  or O(n14 ), etc.
  • A problem is in NP if an affirmative answer can be verified in polynomial time. Subset sum is in NP. Suppose that you’re presented with a multiset  M , another  multiset

U , and an integer n.  You can easily check that  the sum of U is actually  n and that  U is a

subset of M in polynomial time.  This is exactly what the definition of NP requires.

This means we can write an implementation of subset sum which produces a certificate  on affirmative instances  of the  problem—an  easily-checked witness that  the computed  answer is correct.  Negative instances  of the problem—when there is no subset that  sums to n—are not so easily checked.

You will now prove that  subsetSum is in NP  by implementing  a certificate-generating version.

 

Task 6.2 (7%).  Write  the function

 

subsetSumCert : int list * int -> bool * int list

 

such  that   for  all  values  M:int list and  n:int, if  M has  a  subset  that   sums  to  n,

subsetSumCert (M, n) = (true, U) where U is a subset of M which sums to n.

If no such subset exists, subsetSumCert (M, n) = (false, nil). 3

 

Task 6.3 (Extra  Credit).The P = NP problem, one of the biggest open problems in computer science, asks whether  there  are polynomial-time  algorithms  for all of the  problems in NP. Right now, there  are problems in NP, such as subset sum, for which only exponential-time algorithms  are known.  However, it is known that  subset sum is NP-complete,  which means that  if you could solve it in polynomial  time,  then  you could solve all problems  in NP  in polynomial time, so P = NP. So, for extra  credit,  several million dollars, and a PhD,  define in SML a function that  solves the subset sum problem and has polynomial time work.

 

2  Hint:   It’s easy to  produce  correct  and  unnecessarily  complicated  functions  to  compute  subset  sums. It’s almost  certain  that your  solution  will have  O(2n ) work, so don’t  try  to optimize  your  code too much. There  is a very clean way to write this in a few (5-10ish) elegant lines.

3 You’ll note  that the  empty  list  returned when a qualifying  subset  does not  exist  is superfluous;  soon, we’ll cover a better way to handle  these kinds of situations, called option types.

 

References

 

[Conway(1987)]  J. Conway.  The weird and wonderful chemistry  of audioactive  decay.  In T. Cover and B. Gopinath, editors,  Open Problems  in Communication and Computation, pages 173–188. Springer-Verlag,  1987.