Homework #5 Solution



1* Given two strings X = x1x2 : : : xm, Y = y1y2 : : : yn of integers, a common interlacing subse-quence is a sequence of indices (i1; i2; : : : ; ik) and (j1; j2; : : : ; jk), 1 i1 < i2 < < ik m and 1 j1 < j2 < < jk n such that xi1 < yj1 < xi2 < yj2 < < xik < yjk .

Give a polynomial-time algorithm that given X; Y as inputs nds a longest common inter-lacing subsequence between X and Y . You don’t have to prove correctness or analyze the time-complexity of the algorithm.

For instance, if X = [1; 3; 2; 5] and Y = [2; 6; 4; 7], then if you take i1 = 1; i2 = 2; i3 = 4 and j1 = 1; j2 = 3; j3 = 4, you get an interlacing subsequence|1 < 2 < 3 < 4 < 5 < 7|of length 6. In fact, in this case 6 is the best possible answer. [.75 points]

[Hint: One approach is to sub-problems like we did for edit distance (with one modi cation) and then try to develop a recurrence relation based on a case-analysis of what the last two symbols of the interlacing sequence would be.]

  • Consider the complete rooted quaternary tree of depth n. That is take a rooted tree where the root has four children, each child has exactly 4 children and so on for n levels. So for example for n = 1, you have the root connected to its four children (with 5 nodes in total);

for n = 2 you have 21 nodes in total; and more generally, the total number of nodes in the tree of depth n is exactly 1 + 4 + 42 + + 4n = (4n+1 1)=3.

Now, suppose that you delete each edge of the tree independently with probability 1=2. Let X be the number of nodes which are still connected to the root after the deletion of the edges. Compute the expectation of X (with proof). [.75 points]

[Hint: One clean approach is to write X as a sum of `simpler’ random variables and use linearity of expectation to compute the expectation.]

  • Suppose you are given n apples and n oranges. The apples are all of di erent weights and all the oranges have di erent weight. However, for each apple there is a corresponding orange of the same weight and vice versa. You are also given a weighing machine that (counter to common intuition) will only compare apples to oranges: that is, if you feed an apple and an orange to the machine it will tell you whether the apple or the orange is heavier or if they have the same weight. Your goal is to use this machine to pair up each apple with the orange of the same weight with as few uses of the machine as possible.

Give a randomized algorithm to determining the pairing. For full-credit, your algorithm should only use the machine O(n log n) times in expectation. You do not need to prove correctness or analyze the time-complexity of the algorithm. [.75 points]

(Remember that you cannot compare an apple to an apple or an orange to an orange.)

  • Suppose you are writing a plagiarism detector. Students submit documents as part of a home-work and each document is an (ordered) sequence of words. For some parameter m decided by the provost, we say two documents are copies of each other if one of them uses a sequences

of m words (in that given order) from the other. Give an algorithm which given an integer m and N documents D1; : : : ; DN as input, ags all submissions which are copies of some other

submission. Your algorithm should run in expected O(m + N + total-length of documents) time (i.e., expected O(m + N + n1 + : : : + nN ) time where nj is the length of j’th document) and be always correct. [.75 points]

Additional Problems. DO NOT turn in answers for the following problems – they are meant for your curiosity and understanding.

  1. There are four types of brackets: (, ), <, and >. We de ne what it means for a string made up of these four characters to be well-nested in the following way:

    1. The empty string is well-nested.

    1. If A is well-nested, then so are <A> and (A).

    1. If S, T are both well-nested, then so is their concatenation ST.

For example, (), < () >, ((< >)), () <> are all well-nested. Meanwhile, (, <>), )(, (<)>, and (<(> are not well-nested.

Devise an algorithm that takes as input a string s = (s1; s2; :::; sn) of length n made up of these four types of characters. The output should be the length of the shortest well-nested string that contains s as a subsequence. For example, if (<(> is the input, then the answer is 6; a shortest string containing s as a subsequence is ()<()>. For full-credit, your algorithm should run in time O(n3) but you don’t have to prove its correctness or analyze the time complexity.

(Hint: For this problem, try to use sub-problems that solve the problem for substrings of the

original s: (si; si+1; : : : ; sj) for all i < j.)

  1. Call a sequence of coin tosses \monotone” if the sequence never changes from Heads to Tails when parsed left to right. For example, the sequences T T T HHHH, T T T T are monotone

whereas T T T HHHT is not. Consider a sequence of n coin tosses of a fair coin. For integer k > 0, let Yk be the number of monotone sub-sequences of length k in the n coin tosses. Compute the expectation of Yk (with proof).

Recall that a sequence x is a sub-sequence of y if x can be obtained from y by deleting some symbols of y without changing the order of the remaining elements.

3. De ne a random graph G = (V; E) as follows. The vertex set of G is V = f1; : : : ; ng. Now for each pair fi; jg with i 6= j 2 [n], add the pair fi; jg to E with probability 1=2 independent of the choice for every other pair. Let X5 be the number of independent sets of size 5 in the graph G. Compute the expectation of X5 (with proof).

Recall that an independent set is a collection of vertices I in G such that no two vertices in I have an edge between them.