Homework 1 Solution


Instructions: Solutions to problems 1 and 2 are to be submitted via Quercus and only PDF files will be accepted. You are strongly encouraged to do problems 3 and 4 but these are not to be submitted for grading.

1. Suppose that $Z$ is an $m \times n$ pixel image where $m = 2^k$ and $n = 2^{\ell}$. If $H_m$ and $H_n$ are $m \times m$ and $n \times n$ Hadamard matrices then we can define the Walsh-Hadamard transform of $Z$ by

$$\widehat{Z} = H_m Z H_n.$$

In this problem, we will explore using this Walsh-Hadamard transform to denoise an image.

  (a) Show that $Z = H_m \widehat{Z} H_n/(mn)$. (This gives the inverse Walsh-Hadamard transform.)

  (b) To denoise an image using the Walsh-Hadamard transform, we first compute $\widehat{Z}$ and then perform hard- or soft-thresholding to obtain a thresholded (or shrunken) transform $\widehat{Z}_\lambda$ (which will typically be "sparser" than $\widehat{Z}$). Our hope is that the components of $\widehat{Z}$ corresponding to noise in the image are eliminated and so the denoised image can be obtained by applying the inverse transform to $\widehat{Z}_\lambda$.

Using the R function fwht2d (available on Quercus) to compute the Walsh-Hadamard transform, write R functions to perform both hard- and soft-thresholding (dependent on a parameter $\lambda$).
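For illustration, here is a minimal Python sketch of the two thresholding rules applied to a 2D Walsh-Hadamard transform. The function names (`wht2d`, `hard_threshold`, `soft_threshold`) are my own, and the transform is built directly from Sylvester's Hadamard construction rather than from the course's fwht2d routine:

```python
import numpy as np

def hadamard(n):
    # Sylvester construction: n must be a power of 2
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def wht2d(Z):
    # forward Walsh-Hadamard transform: Zhat = Hm Z Hn
    m, n = Z.shape
    return hadamard(m) @ Z @ hadamard(n)

def iwht2d(Zhat):
    # inverse transform: Z = Hm Zhat Hn / (m n)
    m, n = Zhat.shape
    return hadamard(m) @ Zhat @ hadamard(n) / (m * n)

def hard_threshold(Zhat, lam):
    # zero out coefficients with |value| <= lam, keep the rest unchanged
    return np.where(np.abs(Zhat) > lam, Zhat, 0.0)

def soft_threshold(Zhat, lam):
    # shrink every coefficient toward zero by lam
    return np.sign(Zhat) * np.maximum(np.abs(Zhat) - lam, 0.0)
```

Denoising is then `iwht2d(hard_threshold(wht2d(Z), lam))` (or the soft variant), with $\lambda$ tuned by eye as the problem suggests.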

  (c) The file boats.txt contains a noisy $256 \times 256$ pixel grayscale image of sailboats. Its entries are numbers between 0 and 1 where 0 represents black and 1 represents white. The data can be read into R and displayed as an image as follows:

    boats <- matrix(scan("boats.txt"), ncol=256, byrow=T)
    image(boats, axes=F, col=grey(seq(0,1,length=256)))

Using your functions from part (b), try to denoise the image as best as possible. (This is quite subjective but try different methods and parameter values.)

Note: You should not expect your noise-reduced image to be a drastic improvement over the noisy image; in fact, connoisseurs of pointillism may find that they prefer the noisy image. There are two issues here: First, we are applying noise reduction to the whole image rather than dividing the image into smaller sub-images and applying noise reduction to each sub-image. Second, even very good noise reduction tends to wash out finer details, thereby rendering the noise-reduced image less visually appealing.

2. Suppose that $X_1, X_2, \ldots$ is an infinite sequence of independent identically distributed random variables with some distribution function $F$ and $N$ is a Poisson random variable with mean $\lambda$ that is independent of the sequence $\{X_i\}$. Then we can define a compound Poisson random variable $Y$ by

$$Y = \sum_{i=1}^{N} X_i$$

where $Y = 0$ if $N = 0$. This distribution arises naturally in risk theory and insurance; for example, if $N$ represents the number of claims and $X_1, X_2, \ldots$ the amounts paid for each claim then $Y$ is the total sum paid. For the purposes of risk management, it is useful to know the distribution of $Y$, particularly its tail.

  (a) Suppose that $\{X_i\}$ are discrete integer-valued random variables with probability generating function $\varphi(s) = E(s^{X_i})$. Show that the probability generating function of $Y$ is $g(\varphi(s))$ where $g(s)$ is the probability generating function of $N$, which is given by

$$g(s) = \exp(-\lambda(1 - s)).$$

  (b) The distribution of $Y$ can be approximated using the Fast Fourier Transform. Assume that the random variables $\{X_i\}$ have a distribution $p(x)$ on the integers $0, 1, \ldots, \ell$. The complication is that, unlike the distribution of $S = X_1 + \cdots + X_n$, the distribution of $Y$ is not concentrated on a finite set of integers. Therefore, we need to find an integer $M$ such that $P(Y \ge M)$ is smaller than some pre-determined threshold $\epsilon$; $M$ will depend on the Poisson mean $\lambda$ as well as the integer $\ell$ (or more precisely, the distribution $p(x)$). (Also, to optimize the FFT algorithm, $M$ should be a power of 2 although this isn't absolutely necessary unless $M$ is very large.) Show that if $P(N \ge m) \le \epsilon$ then $P(Y \ge m\ell) \le \epsilon$ and so we can take $M \le m\ell$.
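As a quick numerical check of this bound, one can accumulate the Poisson pmf until the tail drops below $\epsilon$. A Python sketch (the function name is my own), using the values that appear later in this problem ($\lambda = 3$, $\epsilon = 10^{-5}$, $\ell = 10$):

```python
import math

def poisson_tail_bound_m(lam, eps):
    # smallest m with P(N >= m) <= eps, accumulating the Poisson pmf
    pmf = math.exp(-lam)        # P(N = 0)
    cdf = pmf                   # P(N <= 0)
    m = 1
    while 1.0 - cdf > eps:      # P(N >= m) = 1 - P(N <= m - 1)
        pmf *= lam / m          # advance pmf from P(N = m-1) to P(N = m)
        cdf += pmf
        m += 1
    return m
```

For $\lambda = 3$ and $\epsilon = 10^{-5}$ this gives $m = 14$, so with $\ell = 10$ the conservative bound is $M \le 140$.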

  (c) The bound $M$ determined in part (b) is typically very conservative and can be decreased substantially. One approach to determining a better bound is based on the probability generating function of $Y$ derived in part (a) and Markov's inequality. Specifically, if $s > 1$ we have

$$P(Y \ge M) = P(s^Y \ge s^M) \le \frac{E(s^Y)}{s^M} = \frac{\exp(-\lambda(1 - \varphi(s)))}{s^M}.$$

Use this fact to show that for $P(Y \ge M) < \epsilon$, we can take

$$M = \inf_{s > 1} \frac{\lambda(\varphi(s) - 1) - \ln(\epsilon)}{\ln(s)}.$$

  (d) Given $M$ (which depends on $\epsilon$), the algorithm for determining the distribution of $Y$ goes as follows:

1. Evaluate the DFT of $\{p(x) : x = 0, \ldots, M-1\}$:

$$\widehat{p}(j) = \sum_{x=0}^{M-1} p(x) \exp\left(-\frac{2\pi i j x}{M}\right)$$

where $p(\ell+1) = \cdots = p(M-1) = 0$.

2. Evaluate $g(\widehat{p}(j)) = \exp(-\lambda(1 - \widehat{p}(j)))$ for $j = 0, \ldots, M-1$.

3. Evaluate $P(Y = y)$ by computing the inverse FFT:

$$P(Y = y) = \frac{1}{M} \sum_{j=0}^{M-1} g(\widehat{p}(j)) \exp\left(\frac{2\pi i j y}{M}\right).$$

Write an R function to implement this algorithm where $M$ is determined using the method in part (c) with $\epsilon = 10^{-5}$. Use this function to evaluate the distribution of $Y$ in the case where $\lambda = 3$ and the distribution of $\{X_i\}$ is

$$p(x) = P(X_i = x) = \frac{11 - x}{66} \quad \text{for } x = 0, \ldots, 10.$$

(You do not need to evaluate the bound $M$ from part (c) with great precision; for example, a simple approach is to take a discrete set of points $S = \{1 < s_1 < s_2 < \cdots < s_k\}$ and define

$$M = \min_{s \in S} \frac{\lambda(\varphi(s) - 1) - \ln(\epsilon)}{\ln(s)}$$

where the spacings $s_{i+1} - s_i$ and the endpoint $s_k$ are determined graphically (that is, by plotting the appropriate function) so that you are convinced that the value of $M$ is close to the actual infimum.)

On Blackboard, there is a paper by Embrechts and Frei, "Panjer recursion versus FFT for compound distributions", which (not surprisingly) compares the FFT approach to an alternative approach (Panjer recursion). This paper includes some R code for the FFT algorithm described above and discusses some modifications to the basic FFT algorithm; feel free to use this R code as a template for your function. (Feel free to explore some of the other aspects of this paper for your own edification.)

Supplemental problems (not to hand in):

3. An alternative to the Quicksort algorithm is Mergesort (see p. 122 of the text). The Mergesort algorithm works by merging pairs of ordered lists. If the two lists contain $r$ and $s$ elements, respectively, then the number of comparisons needed to merge the two lists into a single list lies between $\min(r, s)$ and $r + s - 1$.

Define $C(n)$ to be the (random) number of comparisons in the Mergesort algorithm. It can be shown that the expected number of comparisons $E[C(n)]$ satisfies the recursive formula

$$E[C(n)] = E(Z_n) + E[C(\lfloor n/2 \rfloor)] + E[C(\lceil n/2 \rceil)]$$

where $\lfloor n/2 \rfloor = \lceil n/2 \rceil = n/2$ if $n$ is even with $\lfloor n/2 \rfloor = (n-1)/2$, $\lceil n/2 \rceil = (n+1)/2$ if $n$ is odd, and $Z_n$ is a random variable whose distribution is

$$P(Z_n = k) = \frac{\displaystyle\binom{k-1}{\lfloor n/2 \rfloor - 1} + \binom{k-1}{\lceil n/2 \rceil - 1}}{\displaystyle\binom{n}{\lfloor n/2 \rfloor}} \quad \text{for } \lfloor n/2 \rfloor \le k \le n - 1$$

and whose expected value is

$$E(Z_n) = \lfloor n/2 \rfloor \lceil n/2 \rceil \left( \frac{1}{\lfloor n/2 \rfloor + 1} + \frac{1}{\lceil n/2 \rceil + 1} \right).$$

($Z_n$ represents the number of comparisons needed to merge two ordered lists of lengths $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$, respectively.)

(a) In class, we derived the following recursive formula for $E[C(n)]$ for the Quicksort algorithm:

$$E[C(n)] = n - 1 + \frac{2}{n} \sum_{k=1}^{n} E[C(k-1)]$$

where $E[C(0)] = E[C(1)] = 0$. Using R, compute $E[C(n)]$ for $n = 1, \ldots, 1000$ and compare the average performance of Quicksort to that of Mergesort.
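Both recursions can be tabulated directly. The homework asks for R, so the following Python sketch is only a cross-check of the two recursions as stated:

```python
N = 1000

# Quicksort: E[C(n)] = n - 1 + (2/n) * sum_{k=1}^{n} E[C(k-1)]
qc = [0.0] * (N + 1)
total = 0.0                       # running sum E[C(0)] + ... + E[C(n-1)]
for n in range(1, N + 1):
    total += qc[n - 1]
    qc[n] = (n - 1) + 2.0 * total / n

# Mergesort: E[C(n)] = E(Z_n) + E[C(floor(n/2))] + E[C(ceil(n/2))]
mc = [0.0] * (N + 1)
for n in range(2, N + 1):
    f, c = n // 2, (n + 1) // 2
    ez = f * c * (1.0 / (f + 1) + 1.0 / (c + 1))   # E(Z_n)
    mc[n] = ez + mc[f] + mc[c]
```

Comparing `qc[n]` and `mc[n]` for large `n` shows Mergesort using fewer comparisons on average (roughly $n \log_2 n$ versus roughly $1.39\, n \log_2 n$ for Quicksort).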

(b) Calculate the worst case behaviour of C(n) for Mergesort.
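For part (b), the worst case occurs when every merge costs the full $r + s - 1$ comparisons, which gives a recursion of its own. A small Python sketch of that recursion (a numerical check, not a closed-form answer):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def worst_case(n):
    # each merge of floor(n/2) and ceil(n/2) elements costs at most n - 1
    if n <= 1:
        return 0
    return (n - 1) + worst_case(n // 2) + worst_case((n + 1) // 2)
```

Evaluating this for powers of 2 suggests the familiar $n \log_2 n - n + 1$ pattern, which part (b) asks you to establish.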

4. As noted in lecture, catastrophic cancellation in the subtraction $x - y$ can occur when $x$ and $y$ are subject to round-off error. Specifically, if $\mathrm{fl}(x) = x(1+u)$ and $\mathrm{fl}(y) = y(1+v)$ then

$$\mathrm{fl}(x) - \mathrm{fl}(y) = x - y + (xu - yv)$$

where the absolute error $|xu - yv|$ can be very large if both $x$ and $y$ are large; in some cases, this error may swamp the object we are trying to compute, namely $x - y$, particularly if $|x - y|$ is relatively small compared to $|x|$ and $|y|$. For example, if we compute the sample variance using the right hand side of the identity

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2 \qquad (1)$$

a combination of round-off errors from the summations and catastrophic cancellation in the subtraction may result in the computation of a negative sample variance! (In older versions of Microsoft Excel, certain statistical calculations were prone to this unpleasant phenomenon.) In this problem, we will consider two algorithms for computing the sample variance that avoid this catastrophic cancellation. Both are "one pass" algorithms, in the sense that we only need to cycle once through the data (as is the case if we use the right hand side of (1)); to use the left hand side of (1), we need two passes since we need to first compute $\bar{x}$ before computing the sum on the left hand side of (1). In parts (a) and (b) below, define $\bar{x}_k$ to be the sample mean of $x_1, \ldots, x_k$ and note that

$$\bar{x}_{k+1} = \frac{k}{k+1} \bar{x}_k + \frac{1}{k+1} x_{k+1}$$

with $\bar{x} = \bar{x}_n$.

(a) Show that $\sum_{i=1}^{n} (x_i - \bar{x})^2$ can be computed using the recursion

$$\sum_{i=1}^{k+1} (x_i - \bar{x}_{k+1})^2 = \sum_{i=1}^{k} (x_i - \bar{x}_k)^2 + \frac{k}{k+1} (x_{k+1} - \bar{x}_k)^2$$

for $k = 1, \ldots, n-1$. (This is known as West's algorithm.)
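The recursion above is a "show that" exercise, but it is easy to check numerically. A Python sketch of West's one-pass update (the function name is mine):

```python
def west_variance_sum(xs):
    # one-pass computation of sum((x_i - xbar_n)^2) via West's recursion
    mean = xs[0]                               # xbar_1
    ss = 0.0                                   # sum of squares about the mean
    for k, x in enumerate(xs[1:], start=1):    # k = 1, ..., n-1
        ss += k / (k + 1) * (x - mean) ** 2    # uses the OLD mean xbar_k
        mean += (x - mean) / (k + 1)           # update xbar_k -> xbar_{k+1}
    return ss
```

Note the order of operations: the sum of squares must be updated with the old mean $\bar{x}_k$ before the mean itself is advanced.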

(b) A somewhat simpler one-pass method replaces $\bar{x}$ by some estimate $x_0$ and then corrects for the error in estimation. Specifically, if $x_0$ is an arbitrary number, show that

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i - x_0)^2 - n(x_0 - \bar{x})^2.$$

(c) The key in using the formula in part (b) is to choose $x_0$ to avoid catastrophic cancellation; that is, $x_0$ should be close to $\bar{x}$. How might you choose $x_0$ (without first computing $\bar{x}$) to minimize the possibility of catastrophic cancellation? Ideally, $x_0$ should be calculated using $o(n)$ operations.
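A Python sketch of the shifted one-pass formula, using the first observation as one cheap $O(1)$ choice of $x_0$ (one reasonable answer to the question above; the mean of a small prefix of the data would work too). The function name is mine:

```python
def shifted_variance_sum(xs, x0=None):
    # one-pass: sum((x_i - xbar)^2) = sum((x_i - x0)^2) - n*(x0 - xbar)^2
    if x0 is None:
        x0 = xs[0]              # a cheap O(1) estimate of xbar
    n = len(xs)
    s1 = s2 = 0.0
    for x in xs:
        d = x - x0
        s1 += d                 # sum of (x_i - x0)
        s2 += d * d             # sum of (x_i - x0)^2
    # xbar - x0 = s1/n, so n*(x0 - xbar)^2 = s1^2/n
    return s2 - s1 * s1 / n
```

Shifting by $x_0$ keeps the accumulated sums small even when the $x_i$ themselves are large, which is exactly what defuses the cancellation in identity (1).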

(An interesting paper on computational algorithms for computing the variance is "Algorithms for computing the sample variance: analysis and recommendations" by Chan, Golub, and LeVeque; this paper is available on Quercus.)