Nature's rejection sampler, 30 points. A light source emits photons whose angle $x$ is normally distributed around $\theta$:

$p(x \mid \theta) = \mathcal{N}(x \mid \mu = \theta, \sigma^2 = 4^2)$

Then the photons pass through a grating which absorbs photons with a probability that depends on their angle. We denote the event that a photon makes it through the grating by saying that the Bernoulli variable $g = 1$. For this particular grating, $p(g \mid x, \theta) = \mathrm{Ber}(g \mid f(x, \theta))$ where
(1.1)    $f(x, \theta) = \dfrac{\sin^2(5(x - \theta))}{25 \sin^2(x - \theta)}$
The resulting unnormalized probability, $p(x, g = 1 \mid \theta = 0)$, is shown below as a function of $x$.
Estimate the fraction of photons that get absorbed on average, $p(g = 0 \mid \theta = 0)$. That is, estimate $\int p(x, g = 0 \mid \theta = 0)\,dx$ by summing over 10,000 values of $x$, equally spaced from $-20$ to $20$.
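This numerical estimate can be sketched as follows (a minimal sketch: the helper names `f` and `gaussian_pdf` are mine, and I assume the joint factors as $p(x, g=0 \mid \theta) = p(x \mid \theta)\,(1 - f(x, \theta))$):

```python
import numpy as np

# Transmission probability from Eq. (1.1); at the 0/0 points the limit is 1.
def f(x, theta=0.0):
    delta = x - theta
    den = 25 * np.sin(delta) ** 2
    return np.where(den == 0, 1.0, np.sin(5 * delta) ** 2 / np.where(den == 0, 1.0, den))

# Density of N(mu, sigma^2), used as p(x | theta) with sigma = 4.
def gaussian_pdf(x, mu=0.0, sigma=4.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Riemann sum of p(x, g=0 | theta=0) = p(x | theta=0) * (1 - f(x, 0))
# over 10,000 equally spaced points in [-20, 20].
xs = np.linspace(-20, 20, 10_000)
dx = xs[1] - xs[0]
absorbed = np.sum(gaussian_pdf(xs) * (1 - f(xs)) * dx)
print(f"estimated p(g=0 | theta=0): {absorbed:.3f}")
```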
Hint: it's a pmf, so $p(g \mid x, \theta = 0) \le 1$. Another hint: doing ancestral sampling in the above model forms a rejection sampler. Plot a histogram of 10,000 accepted samples, and print the fraction of samples accepted.
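The ancestral-sampling rejection sampler described in the hint can be sketched like this (variable names are mine; I draw a large batch of proposals and keep the accepted ones, rather than looping to exactly 10,000 acceptances):

```python
import numpy as np

rng = np.random.default_rng(0)

# Transmission probability from Eq. (1.1); the 0/0 points have limit 1.
def f(x, theta=0.0):
    delta = x - theta
    den = 25 * np.sin(delta) ** 2
    return np.where(den == 0, 1.0, np.sin(5 * delta) ** 2 / np.where(den == 0, 1.0, den))

# Ancestral sampling: draw x ~ N(0, 4^2), then g ~ Ber(f(x, 0)).
# Keeping only the x's with g = 1 is exactly a rejection sampler
# for p(x | g = 1, theta = 0).
n = 100_000
x = rng.normal(0.0, 4.0, size=n)
g = rng.random(n) < f(x)
accepted = x[g]                    # histogram these, e.g. with plt.hist(accepted)
print(f"fraction accepted: {g.mean():.3f}")
```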
Estimate the fraction of photons that get absorbed, $p(g = 0 \mid \theta = 0)$, using importance sampling. Use $p(x \mid \theta = 0)$ as a proposal distribution. As a reminder, the formula for a self-normalized importance sampler for the expectation of a function $f$ under an (unnormalized) target distribution $\tilde p(z)$ and proposal distribution $q(z)$ is

(1.2)    $\hat e(x_1, x_2, \ldots, x_K) = \dfrac{\frac{1}{K}\sum_{i=1}^{K} f(x_i)\,\frac{\tilde p(x_i)}{q(x_i)}}{\frac{1}{K}\sum_{j=1}^{K} \frac{\tilde p(x_j)}{q(x_j)}}, \quad \text{where } x_i \overset{\text{iid}}{\sim} q(x)$
If you can derive a simpler expression by canceling terms, you don’t have to implement the entire formula. Print the estimate of the fraction of photons that are absorbed, using 1000 samples.
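A sketch of this estimate using the cancellation the text permits: with proposal $q(x) = p(x \mid \theta = 0)$, the importance weights $\tilde p(x_i)/q(x_i)$ in Eq. (1.2) all cancel, leaving a plain Monte Carlo average of the absorption probability (the helper name `f` is mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Transmission probability from Eq. (1.1); the 0/0 points have limit 1.
def f(x, theta=0.0):
    delta = x - theta
    den = 25 * np.sin(delta) ** 2
    return np.where(den == 0, 1.0, np.sin(5 * delta) ** 2 / np.where(den == 0, 1.0, den))

# Because the proposal equals the distribution over x, Eq. (1.2) collapses
# to the sample mean of 1 - f(x) over K draws from the proposal.
K = 1000
x = rng.normal(0.0, 4.0, size=K)
estimate = np.mean(1 - f(x))
print(f"estimated p(g=0 | theta=0): {estimate:.3f}")
```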
(d) [4 points] A physicist is trying to estimate the location of the center of a light source.
She has a Cauchy prior $p(\theta) = \dfrac{1}{10\pi\left(1 + (\theta/10)^2\right)}$. She observes a photon emitted from the particle at position $x = 1.7$. Plot the unnormalized density $p(x = 1.7, g = 1, \theta)$ as a function of $\theta$.
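This can be sketched as follows, assuming the unnormalized density factors as $p(\theta)\,p(x = 1.7 \mid \theta)\,p(g = 1 \mid x = 1.7, \theta)$; the grid and helper names are mine, and the plotting call is left as a comment:

```python
import numpy as np

# Transmission probability from Eq. (1.1); the 0/0 points have limit 1.
def f(x, theta):
    delta = x - theta
    den = 25 * np.sin(delta) ** 2
    return np.where(den == 0, 1.0, np.sin(5 * delta) ** 2 / np.where(den == 0, 1.0, den))

def cauchy_prior(theta):
    return 1.0 / (10 * np.pi * (1 + (theta / 10) ** 2))

def gaussian_pdf(x, mu, sigma=4.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

thetas = np.linspace(-30, 30, 2001)
# Unnormalized joint: prior(theta) * p(x=1.7 | theta) * p(g=1 | x=1.7, theta)
density = cauchy_prior(thetas) * gaussian_pdf(1.7, thetas) * f(1.7, thetas)
# e.g. plt.plot(thetas, density) to produce the requested figure
print(f"mode of the unnormalized density near theta = {thetas[np.argmax(density)]:.2f}")
```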
The standard REINFORCE, or score-function, estimator is defined as:
(2.1)    $\hat g^{SF}[f] = f(b)\,\dfrac{\partial}{\partial\theta} \log p(b \mid \theta), \quad b \sim p(b \mid \theta)$
The takeaway is that you can use a baseline to reduce the variance of REINFORCE, but not one that depends on the current action.
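As an illustration (a toy example of mine, not part of the assignment): REINFORCE for $\frac{d}{d\theta}\,\mathbb{E}_{b \sim \mathrm{Ber}(\theta)}[b]$, whose true value is 1, with and without a constant baseline $c = \mathbb{E}[f(b)]$. Both versions are unbiased, but the baseline cuts the variance:

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 0.3
n = 100_000
b = (rng.random(n) < theta).astype(float)

# Score of a Bernoulli: d/dtheta log p(b|theta) = b/theta - (1-b)/(1-theta)
score = b / theta - (1 - b) / (1 - theta)

g_plain = b * score            # f(b) * score, no baseline
g_base = (b - theta) * score   # constant baseline c = E[f(b)] = theta

print(f"plain:    mean {g_plain.mean():.3f}, var {g_plain.var():.2f}")
print(f"baseline: mean {g_base.mean():.3f}, var {g_base.var():.2f}")
```

The baseline is a constant (it depends only on $\theta$, not on the sampled action $b$), which is exactly what keeps the estimator unbiased.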
We'll look at which gradient estimators scale to large numbers of parameters, by computing their variance as a function of dimension.
For simplicity, we'll consider a toy problem. The goal will be to estimate the gradients of the expectation of a sum of $D$ independent one-dimensional Gaussians. Each Gaussian has unit variance, and its mean is given by an element of the $D$-dimensional parameter vector $\theta$:
(3.1)    $f(x) = \sum_{d=1}^{D} x_d$

(3.2)    $L(\theta) = \mathbb{E}_{x \sim p(x \mid \theta)}[f(x)] = \mathbb{E}_{x \overset{\text{iid}}{\sim} \mathcal{N}(\theta, I)}\left[\sum_{d=1}^{D} x_d\right]$
(3.3)    $\hat L_{MC} = \sum_{d=1}^{D} x_d, \quad \text{where each } x_d \overset{\text{iid}}{\sim} \mathcal{N}(\theta_d, 1)$
That is, compute $\mathbb{V}\!\left[\hat L_{MC}\right]$ as a function of $D$.
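A quick empirical check of this quantity (my own sketch): $\hat L_{MC}$ is a sum of $D$ independent unit-variance Gaussians, so the simulated variance should grow linearly with $D$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate L_hat_MC = sum_d x_d with x_d ~ N(theta_d, 1) many times
# and report its empirical variance for several dimensions D.
for D in [1, 10, 100]:
    theta = np.zeros(D)                                # value of theta is irrelevant to the variance
    samples = rng.normal(theta, 1.0, size=(50_000, D)).sum(axis=1)
    print(f"D = {D:3d}: empirical var = {samples.var():.1f}")
```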
(b) [5 points] The score-function, or REINFORCE estimator with a baseline has the form:
(3.4)    $\hat g^{SF}[f] = \left[f(x) - c(\theta)\right] \dfrac{\partial}{\partial\theta} \log p(x \mid \theta), \quad x \sim p(x \mid \theta)$
Derive a closed form for this gradient estimator for the objective defined above as a deterministic function of $\epsilon$, a $D$-dimensional vector of standard normals. Set the baseline to $c(\theta) = \sum_{d=1}^{D} \theta_d$. Hint: when simplifying $\frac{\partial}{\partial\theta} \log p(x \mid \theta)$, you shouldn't take the derivative through $x$, even if it depends on $\theta$. To help keep track of what is being differentiated, you can use the notation $\frac{\partial}{\partial\theta} g(\theta, \theta)$ to denote taking the derivative only w.r.t. the second $\theta$.
Since gradients are $D$-dimensional vectors, their covariance is a $D \times D$ matrix. To make things easier, we'll consider only the variance of the gradient with respect to the first element of the parameter vector, $\theta_1$. That is, derive the scalar value $\mathbb{V}\!\left[\hat g^{SF}_1\right]$ as a function of $D$. Hint: the third moment of a standard normal is 0, and the fourth moment is 3. As a sanity check, consider the case where $D = 1$.
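An empirical sanity check for the variance you derive (my sketch; it uses the substitution $x = \theta + \epsilon$, under which $f(x) - c(\theta) = \sum_d \epsilon_d$ and the $\theta_1$-score is $\epsilon_1$, both of which follow directly from the definitions above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Empirical variance of the first coordinate of the REINFORCE estimator
# with baseline c(theta) = sum(theta), as a function of D.
for D in [1, 10, 100]:
    eps = rng.normal(size=(100_000, D))   # x = theta + eps; theta cancels
    g1 = eps.sum(axis=1) * eps[:, 0]      # [f(x) - c(theta)] * score_1
    print(f"D = {D:3d}: empirical Var[g1_SF] = {g1.var():.1f}")
```

Comparing the printed values against your closed form is a good way to catch algebra slips.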
(3.5)    $\hat g^{REPARAM}[f] = \dfrac{\partial f}{\partial x}\dfrac{\partial x}{\partial \theta}, \quad \epsilon \sim p(\epsilon)$
In this case, we can use the reparameterization $x = \theta + \epsilon$, with $\epsilon \sim \mathcal{N}(0, I)$. Derive this gradient estimator for $\nabla_\theta L(\theta)$, and give $\mathbb{V}\!\left[\hat g^{REPARAM}_1\right]$ as a function of $D$.
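A matching empirical check for the reparameterization estimator (my sketch): for this objective $\frac{\partial f}{\partial x_d} = 1$ for every $d$ and $\frac{\partial x}{\partial \theta}$ is the identity, so the per-sample gradient does not depend on $\epsilon$ at all:

```python
import numpy as np

rng = np.random.default_rng(0)

D = 100
n = 10_000
eps = rng.normal(size=(n, D))      # x = theta + eps, one row per sample
df_dx = np.ones((n, D))            # df/dx_d = 1 for every draw of eps
g = df_dx * 1.0                    # dx_d/dtheta_d = 1, so g = df/dx
print(f"Var[g1_REPARAM] = {g[:, 0].var():.1f}")
```

The contrast with the score-function check above is the point of this question: one estimator's variance grows with $D$, the other's does not.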