Homework #7 Solution




1    Introduction


1.1     Getting The Homework Assignment


The starter files for the homework assignment have been distributed through  our git repos- itory, as usual.



1.2     Submitting The Homework Assignment


Submissions will be handled  through  Autolab,  at




In preparation for submission, your hw/07 directory  should contain  a file named exactly

hw07.pdf containing  your written  solutions to the homework.

To submit your solutions, run make from the hw/07 directory (that contains a code folder and a file hw07.pdf). This should produce a file hw07.tar, containing  the files that  should be handed  in for this homework assignment.   Open the Autolab  web site, find the page for this assignment,  and submit  your hw07.tar file via the “Handin  your work” link.


The Autolab handin script does some basic checks on your submission:  making sure that the file names are correct; making sure that  no files are missing; making sure that  your code compiles cleanly.  Note that  the handin  script  is not  a grading script—a  timely submission that  passes the handin script will be graded, but will not necessarily receive full credit.  You can view the results  of the handin  script  by clicking the number  (usually  either  0.0 or 1.0) corresponding  to the  “check” section of your latest  handin  on the  “Handin  History”  page. If this number  is 0.0, your submission failed the check script; if it is 1.0, it passed.

Remember  that  your written  solutions  must  be submitted in PDF  format—we  do not accept MS Word files or other formats.

Your  hw07.sml file must  contain  all the  code that  you want  to  have  graded  for this assignment,  and must compile cleanly.  If you have a function that  happens to be named the same as one of the  required  functions  but  does not  have the  required  type, it will not  be graded.


1.3     Due Date





1.4     Methodology


You must  use the five step methodology  discussed in class for writing functions,  for every

function you write in this assignment.  Recall the five step methodology:


  1. In the first line of comments, write the name and type of the function.


  1. In the second line of comments, specify via a REQUIRES clause any assumptions about the arguments  passed to the function.


  1. In the third line of comments, specify via an ENSURES clause what  the function com- putes (what  it returns).


  1. Implement the function.


  1. Provide testcases, generally in the format

val <return value> = <function> <argument value>. For example, for the factorial function presented  in lecture:

(* fact : int -> int

* REQUIRES: n >= 0

* ENSURES: fact(n) ==> n!



fun fact (0 : int) : int = 1

| fact (n : int) : int = n * fact(n-1) (* Tests: *)

val 1 = fact 0

val 720 = fact 6


2    Regular Expressions


In class, we introduced  six different  constructors  to describe regular  expressions.   We will be extending  this definition with two new constructors: Whatever and Both. The extended datatype is given below.


datatype regexp = Zero

| One

| Char of char

| Plus of regexp * regexp

| Times of regexp * regexp

| Star of regexp

| Whatever

| Both of regexp * regexp


The new constructors  have the following definitions:


  • Whatever is a string wildcard that accepts whatever, that  is any finite list of characters:


L (Whatever) = {L | L is a list of characters}.



  • Both(R1, R2) accepts a string if and only if it is in both L (R1) and L (R2):


L (Both(R1 , R2)) = {L | L ∈ L (R1)  and L ∈ L (R2 )}.


The  regular  expression  matcher  match from class is included  in the  support  code for the  assignment.   We have extended  the  datatype definition of regexp to include the  new constructors   Whatever and  Both.   Your  job  is to  extend  match to  deal  with  these  new constructors, and  prove the  correctness  of your implementation.  We define correctness  as follows:


Theorem 1 (Correctness). For  R : regexp, let P (R)  be the following statement: For  all values L : char list and total functions p : char list → bool,

  1. match R L p = true if there exist L1, L2 : char list such that L = L1@L2, L1  ∈ L (R),  and p(L2) = true,
  2. match R L p = false otherwise.


Then,  match is correct  if for all R : regexp, P(R)  holds.



In each of the following coding tasks, we strongly recommend that  you think through  the correctness spec when you are writing the code. If you’re stuck on the implementation, try doing the proof first—this will guide you to the answer.


Task 2.1 (8%).  Implement the case of match for Whatever, the string wildcard.



Task 2.2 (10%). Implement the case of match for Both(R1 , R2).



Task 2.3  (7%).   We give part  (b) of the  correctness  proof of Times(R1, R2)  below, with certain  parts  left out.  Fill in the seven sections of blanks with the appropriate statements. Note that  some of the sections have more than  one blank, and some of the sections are used more than  once. Each numbered  section is worth 1 point.


Proof.  Assume L, p such that  match (Times(R1, R2))  L p = true. We need to show that

∃L1, L2 such that  L1@L2 = L with L1 ∈ L (Times(R1 , R2))  and p (L2) = true.


Inductive Hypotheses Assume P (R1)  and P (R2),  i.e.



  1. ∀L1 : char list, p1 : char list → bool s.t. p1 total,  if match R1  L1 p1 = true,

then ∃L11, L21  s.t.  L11 @L21  = L1,  L11  ∈ L (R1 ),  p1  L21  = true

  1. ∀L2 : char list, p2 : char list → bool s.t. p2 total,  if match R2  L2 p2 = true,

then ∃L12, L22  s.t.L12 @L22  = L2,  L12  ∈ L (R2 ),  p2  L22  = true.


By stepping:


match (Times(R1, R2 )) L p

= case Times(R1,R2 ) of … | Times(R1,R2) => … | …

= (1)                            


By assumption,  match (Times(R1, R2))  L p = true. So by transitivity


(i)   (1)                             = true


Thus,  by (2)                             , taking p1  to be


(fn L’ => match R2   L’ p),


L1 to be L, and using (i) to satisfy the premise, we know that  ∃L11 , L21 such that  (3)                             ,

                            , and                              .



By stepping


(fn L’ => match R2   L’ p)L21

= match R2   L21   p


Thus,  by transitivity,


(ii)   match R2   L21   p = true


By  (4)                             ,  taking  p2   to  be  p  and  L2   to  be  L21   and  using  (ii)  to  satisfy the  premise, we know that  ∃L12 , L22  such that (5)                             ,                              , and


Since L11  ∈ L (R1 ) and L12  ∈ L (R2 ), (6)                             ∈ L (Times(R1, R2 )) by (7)                             . Now, take L1 to be (6)                             and L2 to be L22 . Then, L1 = (6)                             ∈

L (Times(R1, R2 )),  and  L1@L2 = (6)                             @L22  = L11@L21  = L, and  p L2 =

p L22  =true.  So, the theorem holds for this case.




Task 2.4  (15%).  Do part  (b) of the correctness proof of Both(R1, R2).  Your proof should follow the  format  of the  proof given above.   You should  assume  P (R1 ), P (R2),  and  that match is total.  Then,  proving (b) is the same as proving the following theorem:


For all values L : char list and total  functions p : char list → bool,

If match (Both(R1, R2 )) L  p  = true, then  there  exist  L1 , L2   such  that   L  =

L1 @L2, L1  ∈ L (Both(R1 , R2)),  and p(L2) = true.


Do  this proof carefully! There is  a  plausible-looking, but incorrect, implemen- tation of  Both; this case  of the proof will  fail  if your code has  this bug.


3    Irregular Expressions


Turns out, our implementation of a regular expression matcher  can recognize more than  just regular languages1.  In this section, we will explore this idea further.


Task 3.1 (10%). Write the function halfmatch which takes two regular expressions, R1  and R2, a character  list, L, and evaluates  to true if and only if there  exist L1, L2  : char list such that:


  • L = L1 @ L2


  • length L1  = length  L2


  • L1 ∈ L (R1)  and L2  ∈ L (R2).


For this task,  we want to use the power of the regular expression system we have in place; you should  define your  answer  in terms of regular  expressions  and  match.  Your solution should not  be recursive,  and  you should not  define any  helper functions.   Do not  use list functions apart  from length.



Task 3.2 (0%).  Now, let irregular = halfmatch (Star(Char #“0”)) (Star(Char #“1”)). Then, L (irregular) = {0n1n  |  n ∈ N}. This language turns  out to be irregular.  (You may prove this fact in higher level courses like 15-251 or 15-453.)






























1 It is actually  Turing  complete.