Assignment #2 STA410H1F/2102H1F Solution


Instructions: Solutions to problems 1 and 2 are to be submitted on Blackboard (PDF files strongly preferred). You are strongly encouraged to do problems 3–5, but these are not to be submitted for grading.

 

1. An interesting variation of rejection sampling is the ratio of uniforms method. We start by taking a bounded function $h$ with $h(x) \ge 0$ for all $x$ and $\int_{-\infty}^{\infty} h(x)\,dx < \infty$. We then define the region

$$C_h = \left\{ (u,v) : 0 \le u \le \sqrt{h(v/u)} \right\}$$

and generate $(U,V)$ uniformly distributed on $C_h$. We then define the random variable $X = V/U$.

(a) The joint density of $(U,V)$ is

$$f(u,v) = \frac{1}{|C_h|} \qquad \text{for } (u,v) \in C_h$$

where $|C_h|$ is the area of $C_h$. Show that the joint density of $(U,X)$ is

$$g(u,x) = \frac{u}{|C_h|} \qquad \text{for } 0 \le u \le \sqrt{h(x)}$$

and that the density of $X$ is $c\,h(x)$ for some constant $c > 0$.
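(As an aside on part (a), not part of the assignment text: the result follows from a standard change of variables. Writing $x = v/u$, so that $v = ux$, the Jacobian of $(u,x) \mapsto (u,v)$ is $u$, giving

$$g(u,x) = f(u, ux)\,u = \frac{u}{|C_h|} \qquad \text{for } 0 \le u \le \sqrt{h(x)},$$

and integrating out $u$ gives $\int_0^{\sqrt{h(x)}} u/|C_h|\,du = h(x)/(2|C_h|)$, so the density of $X$ is proportional to $h$.)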

 

(b) The implementation of this method is somewhat complicated by the fact that it is typically difficult to sample $(U,V)$ from a uniform distribution on $C_h$. However, it is usually possible to find a rectangle of the form $D_h = \{(u,v) : 0 \le u \le u_+,\ v_- \le v \le v_+\}$ such that $C_h$ is contained within $D_h$. Thus to draw $(U,V)$ from a uniform distribution on $C_h$, we can use rejection sampling where we draw proposals $(U^*, V^*)$ from a uniform distribution on the rectangle $D_h$; note that the proposals $U^*$ and $V^*$ are independent random variables with $\text{Unif}(0, u_+)$ and $\text{Unif}(v_-, v_+)$ distributions, respectively. Show that we can define $u_+$, $v_-$, and $v_+$ as follows:

$$u_+ = \max_x \sqrt{h(x)}, \qquad v_- = \min_x x\sqrt{h(x)}, \qquad v_+ = \max_x x\sqrt{h(x)}.$$

(Hint: It suffices to show that if $(u,v) \in C_h$ then $(u,v) \in D_h$, where $D_h$ is defined using $u_+$, $v_-$, and $v_+$ above.)
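(A sketch of the containment argument the hint suggests, offered as an aside: if $(u,v) \in C_h$ with $u > 0$, write $x = v/u$; then $0 \le u \le \sqrt{h(x)} \le \max_x \sqrt{h(x)} = u_+$, and multiplying $0 \le u \le \sqrt{h(x)}$ by $x$ shows that $v = ux$ lies between $0$ and $x\sqrt{h(x)}$, hence $v_- \le v \le v_+$ since $v_- \le 0 \le v_+$.)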

 

(c) Implement (in R) the method above for the standard normal distribution taking $h(x) = \exp(-x^2/2)$. In this case, $u_+ = 1$, $v_- = -\sqrt{2/e} = -0.8577639$, and $v_+ = \sqrt{2/e} = 0.8577639$. What is the probability that proposals are accepted?
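A minimal R sketch of this sampler, assuming the bounds stated above; the function name rou.norm and the sample sizes are illustrative choices, not part of the assignment:

# Ratio-of-uniforms sampler for the standard normal with h(x) = exp(-x^2/2).
# Proposals (U, V) are uniform on the rectangle [0, u+] x [v-, v+] and are
# accepted when U <= sqrt(h(V/U)), i.e. when U^2 <= exp(-(V/U)^2 / 2).
rou.norm <- function(n) {
  vplus <- sqrt(2 / exp(1))              # v+ = sqrt(2/e); v- = -v+
  x <- numeric(0)
  nprop <- 0
  while (length(x) < n) {
    m <- n - length(x)
    u <- runif(m, 0, 1)
    v <- runif(m, -vplus, vplus)
    accept <- u^2 <= exp(-(v / u)^2 / 2)
    x <- c(x, (v / u)[accept])
    nprop <- nprop + m
  }
  list(x = x[1:n], acceptance = n / nprop)
}

out <- rou.norm(10000)
out$acceptance   # empirical acceptance rate (the quantity the problem asks about)
qqnorm(out$x)    # informal check that the draws look standard normal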

 

 


 

2. Suppose we observe $y_1, \ldots, y_n$ where

$$y_i = \theta_i + \varepsilon_i \qquad (i = 1, \ldots, n)$$

where $\{\varepsilon_i\}$ is a sequence of random variables with mean 0 and finite variance representing noise. We will assume that $\theta_1, \ldots, \theta_n$ are dependent in the sense that $|\theta_i - \theta_{i-1}|$ is small for most values of $i$. The least squares estimates of $\theta_1, \ldots, \theta_n$ are trivial ($\widehat\theta_i = y_i$ for all $i$), but we can modify least squares in a number of ways to accommodate the "smoothness" assumption on $\{\theta_i\}$. In this problem, we will consider estimating $\{\theta_i\}$ by minimizing

$$\sum_{i=1}^n (y_i - \theta_i)^2 + \lambda \sum_{i=2}^n (\theta_i - \theta_{i-1})^2$$

where $\lambda > 0$ is a tuning parameter. (The term $\lambda \sum_{i=2}^n (\theta_i - \theta_{i-1})^2$ is a "roughness" penalty. In practice, it is more common to use $\sum_{i=2}^n |\theta_i - \theta_{i-1}|$ as it better estimates jumps in $\{\theta_i\}$.)
(a) Show the estimates $\widehat\theta_1, \ldots, \widehat\theta_n$ satisfy the equations

$$y_1 = (1+\lambda)\widehat\theta_1 - \lambda\widehat\theta_2$$
$$y_j = -\lambda\widehat\theta_{j-1} + (1+2\lambda)\widehat\theta_j - \lambda\widehat\theta_{j+1} \qquad (j = 2, \ldots, n-1)$$
$$y_n = -\lambda\widehat\theta_{n-1} + (1+\lambda)\widehat\theta_n.$$

(Note that if we write this in matrix form $y = A\widehat\theta$, the matrix $A$ is sparse.)
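(Aside, not part of the assignment: these equations are the first-order conditions obtained by differentiating the objective above with respect to each $\theta_j$. For an interior index $j$,

$$-2(y_j - \theta_j) + 2\lambda(\theta_j - \theta_{j-1}) - 2\lambda(\theta_{j+1} - \theta_j) = 0,$$

which rearranges to $y_j = -\lambda\theta_{j-1} + (1+2\lambda)\theta_j - \lambda\theta_{j+1}$; the cases $j = 1$ and $j = n$ lose one penalty term each, giving the first and last equations.)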

 

(b) Show (using results from class) that both the Jacobi and Gauss-Seidel algorithms can be used to compute the estimates defined in part (a).

 

(c) Write a function in R to implement the Gauss-Seidel algorithm above. The inputs for this function are a vector of responses y and the tuning parameter lambda. Test your function (for various tuning parameters) on data generated by the following command:

 

> y <- c(rep(0,250),rep(1,250),rep(0,50),rep(1,450)) + rnorm(1000,0,0.1)

 

How does varying $\lambda$ change the estimates, particularly the estimates of the transitions from 0 to 1 in the step function?
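A hedged R sketch of one way to write such a Gauss-Seidel solver; the function name gauss.seidel.smooth, the starting values, and the convergence tolerance are illustrative choices rather than requirements of the assignment:

# Gauss-Seidel iteration for the tridiagonal system in part (a):
#   y[1] = (1 + lambda) theta[1] - lambda theta[2]
#   y[j] = -lambda theta[j-1] + (1 + 2 lambda) theta[j] - lambda theta[j+1]
#   y[n] = -lambda theta[n-1] + (1 + lambda) theta[n]
gauss.seidel.smooth <- function(y, lambda, tol = 1e-8, maxit = 10000) {
  n <- length(y)
  theta <- y                               # start at the unpenalized estimates
  for (it in 1:maxit) {
    old <- theta
    theta[1] <- (y[1] + lambda * theta[2]) / (1 + lambda)
    for (j in 2:(n - 1)) {
      theta[j] <- (y[j] + lambda * (theta[j - 1] + theta[j + 1])) / (1 + 2 * lambda)
    }
    theta[n] <- (y[n] + lambda * theta[n - 1]) / (1 + lambda)
    if (max(abs(theta - old)) < tol) break  # stop when updates are negligible
  }
  theta
}

# Test on the data from the command above, for two values of lambda:
y <- c(rep(0,250), rep(1,250), rep(0,50), rep(1,450)) + rnorm(1000, 0, 0.1)
plot(y, pch = ".", col = "grey")
lines(gauss.seidel.smooth(y, lambda = 10), col = "red")
lines(gauss.seidel.smooth(y, lambda = 200), col = "blue")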

 

Supplemental problems (not to hand in):

 

3. To generate random variables from some distributions, we need to generate two independent random variables $Y$ and $V$ where $Y$ has a uniform distribution over some finite set and $V$ has a uniform distribution on $[0,1]$. It turns out that $Y$ and $V$ can be generated from a single $\text{Unif}(0,1)$ random variable $U$.

 

 

 

 


 

(a) Suppose for simplicity that the finite set is $\{0, 1, \ldots, n-1\}$ for some integer $n \ge 2$. For $U \sim \text{Unif}(0,1)$, define

$$Y = \lfloor nU \rfloor \quad \text{and} \quad V = nU - Y$$

where $\lfloor x \rfloor$ is the integer part of $x$. Show that $Y$ has a uniform distribution on the set $\{0, 1, \ldots, n-1\}$, $V$ has a uniform distribution on $[0,1]$, and $Y$ and $V$ are independent.
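A brief R illustration of this construction (the value of n and the sample size are arbitrary; this is only a numerical sanity check, not a proof):

# Split one Unif(0,1) draw U into Y = floor(n*U) and V = n*U - Y.
n <- 8
U <- runif(100000)
Y <- floor(n * U)
V <- n * U - Y

table(Y) / length(Y)   # each of 0, ..., n-1 should have frequency near 1/n
hist(V, breaks = 20)   # V should look approximately Unif(0,1)
round(cor(Y, V), 3)    # near 0, consistent with independence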

(b) What happens to the precision of $V$ defined in part (a) as $n$ increases? (For example, if $U$ has 16 decimal digits and $n = 10^6$, how many decimal digits will $V$ have?) Is the method in part (a) particularly feasible if $n$ is very large?

 

4. One issue with rejection sampling is a lack of efficiency due to the rejection of random variables generated from the proposal density. An alternative is the acceptance-complement (A-C) method described below.

 

Suppose we want to generate a continuous random variable from a density $f(x)$ and that

$$f(x) = f_1(x) + f_2(x)$$

(where both $f_1$ and $f_2$ are non-negative) with $f_1(x) \le g(x)$ for some density function $g$. Then the A-C method works as follows:

 

1. Generate two independent random variables $U \sim \text{Unif}(0,1)$ and $X$ with density $g$.

2. If $U \le f_1(X)/g(X)$ then return $X$.

3. Otherwise (that is, if $U > f_1(X)/g(X)$) generate $X$ from the density

$$f_2^*(x) = \frac{f_2(x)}{\int_{-\infty}^{\infty} f_2(t)\,dt}.$$

 

Note that we must be able to easily sample from $g$ and $f_2^*$ in order for the A-C method to be efficient; in some cases, they can both be taken to be uniform distributions.
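A hedged R sketch of the A-C recipe as stated in steps 1-3; the function r.ac and its arguments (f1 and g as density evaluators, rg and rf2star as samplers) are illustrative names, not part of the assignment:

# Acceptance-complement sampler following steps 1-3 above.
#   f1, g   : functions evaluating f1(x) and g(x)
#   rg      : function returning one draw from the density g
#   rf2star : function returning one draw from the normalized density f2*(x)
r.ac <- function(n, f1, g, rg, rf2star) {
  x <- numeric(n)
  for (i in 1:n) {
    u <- runif(1)
    xg <- rg()
    if (u <= f1(xg) / g(xg)) {
      x[i] <- xg          # step 2: keep the draw from g
    } else {
      x[i] <- rf2star()   # step 3: replace it with a draw from f2*
    }
  }
  x
}

For the truncated Cauchy example in part (b) below, both rg and rf2star would simply draw from the uniform distribution on $[-1,1]$ once an admissible value of $k$ is chosen.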

 

(a) Show that the A-C method generates a random variable with density $f$. What is the probability that the $X$ generated in step 1 (from $g$) is "rejected" in step 2?

 

(b) Suppose we want to sample from the truncated Cauchy density

$$f(x) = \frac{2}{\pi(1+x^2)} \qquad (-1 \le x \le 1)$$

using the A-C method with $f_2(x) = k$, a constant, for $-1 \le x \le 1$ (so that $f_2^*(x) = 1/2$ is a uniform density on $[-1,1]$) with

$$f_1(x) = f(x) - f_2(x) = f(x) - k \qquad (-1 \le x \le 1).$$

If $g(x)$ is also a uniform density on $[-1,1]$, for what range of values of $k$ can the A-C method be applied?
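(A hedged sketch of where the constraints come from, to be verified independently: the A-C setup requires $f_1(x) \ge 0$ and $f_1(x) \le g(x) = 1/2$ on $[-1,1]$, i.e.

$$k \le \min_{|x| \le 1} f(x) = \frac{1}{\pi} \qquad \text{and} \qquad k \ge \max_{|x| \le 1} f(x) - \frac{1}{2} = \frac{2}{\pi} - \frac{1}{2},$$

which would give the admissible range $2/\pi - 1/2 \le k \le 1/\pi$.)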

 

 

 


 

(c) Defining $f_1$, $f_2$, and $g$ as in part (b), what value of $k$ minimizes the probability that $X$ generated in step 1 of the A-C algorithm is rejected?

 

5. Suppose we want to generate a random variable $X$ from the tail of a standard normal distribution, that is, a normal distribution conditioned to be greater than some constant $b$. The density in question is

$$f(x) = \frac{\exp(-x^2/2)}{\sqrt{2\pi}\,(1 - \Phi(b))} \qquad \text{for } x \ge b$$

 

with $f(x) = 0$ for $x < b$, where $\Phi(x)$ is the standard normal distribution function. Consider rejection sampling using the shifted exponential proposal density

$$g(x) = b\exp(-b(x-b)) \qquad \text{for } x \ge b.$$

 

(a) Define $Y$ to be an exponential random variable with mean 1 and $U$ a uniform random variable on $[0,1]$ independent of $Y$. Show that the rejection sampling scheme defines $X = b + Y/b$ if

$$Y^2 \le -2\,b^2 \ln(U).$$

(Hint: Note that $b + Y/b$ has density $g$.)
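A hedged R sketch of this scheme (the function name rnorm.tail is an illustrative choice), which can also be used to look at the acceptance rate discussed in part (b):

# Rejection sampler for the standard normal conditioned to exceed b,
# using the shifted exponential proposal X = b + Y/b with Y ~ Exp(1).
rnorm.tail <- function(n, b) {
  x <- numeric(0)
  nprop <- 0
  while (length(x) < n) {
    m <- n - length(x)
    y <- rexp(m, rate = 1)
    u <- runif(m)
    accept <- y^2 <= -2 * b^2 * log(u)   # acceptance condition from part (a)
    x <- c(x, b + y[accept] / b)
    nprop <- nprop + m
  }
  list(x = x[1:n], acceptance = n / nprop)
}

out <- rnorm.tail(10000, b = 3)
out$acceptance   # compare with the expression in part (b)
min(out$x)       # every draw should exceed b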

 

(b) Show the probability of acceptance is given by

$$\frac{\sqrt{2\pi}\, b\,(1 - \Phi(b))}{\exp(-b^2/2)}.$$

What happens to this probability for large values of $b$? (Hint: You need to evaluate $M = \max f(x)/g(x)$.)

 

(c) Suppose we replace the proposal density $g$ defined above by

$$g_\lambda(x) = \lambda\exp(-\lambda(x-b)) \qquad \text{for } x \ge b.$$

(Note that $g_\lambda$ is also a shifted exponential density.) What value of $\lambda$ maximizes the probability of acceptance? (Hint: Note that you are trying to solve the problem

$$\min_{\lambda > 0}\ \max_{x \ge b}\ \frac{f(x)}{g_\lambda(x)}$$

for $\lambda$. Because the density $g_\lambda(x)$ has heavier tails, the minimax problem above will have the same solution as the maximin problem

$$\max_{x \ge b}\ \min_{\lambda > 0}\ \frac{f(x)}{g_\lambda(x)},$$

which may be easier to solve.)

 

 

 
