NTU Institute of Applied Mathematical Sciences (Master's Program), ROC Year 103 (2014) General Entrance Exam: Probability and Statistics Solutions

Exam paper: https://exam.lib.ntu.edu.tw/sites/default/files/exam/graduate/103/103061.pdf

1 . Let X_1, X_2, X_3 be a random sample from {\rm Po}(\lambda). Moreover, let Y_1 = X_1+X_3, Y_2=X_2+X_3 and Z_i = \mathbb{1}_{\{Y_i=0\}}. Compute the correlation between Z_1 and Z_2.

(Solution)

First, consider V[Z_1]. It is important to note that Z_1^2=Z_1. So we have

V[Z_1] = E[Z_1^2] - E[Z_1]^2 = E[Z_1] - E[Z_1]^2 = E[Z_1] (1-E[Z_1]).

Here,

E[Z_1] = P(Y_1=0)=P(X_1+X_3=0) = e^{-2\lambda}.

So we have:

V[Z_1] = e^{-2\lambda} (1-e^{-2\lambda}).

By a similar argument, we also have:

V[Z_2] = e^{-2\lambda} (1-e^{-2\lambda}).

So now it is sufficient to find Cov[Z_1, Z_2]:

Cov[Z_1,Z_2] = E[Z_1Z_2]-E[Z_1]E[Z_2] =P(Y_1=0, Y_2=0) - e^{-4\lambda}.

P(Y_1=0, Y_2=0) can further be rewritten as:
= P(X_1+X_3=0, X_2+X_3=0) = P(X_1=0, X_2=0, X_3=0) = e^{-3\lambda},
since the X_i are nonnegative and independent.

Finally

\rho = \frac{Cov[Z_1,Z_2]}{V[Z_1]^{1/2} V[Z_2]^{1/2}} = \frac{e^{-3\lambda}-e^{-4\lambda}}{e^{-2\lambda} (1- e^{-2\lambda})} = \frac{e^{-3\lambda}(1-e^{-\lambda})}{e^{-2\lambda}(1-e^{-\lambda})(1+e^{-\lambda})} = \frac{e^{-\lambda}}{1+e^{-\lambda}}.
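As a quick numerical sanity check (not part of the exam solution), the closed form above can be verified by Monte Carlo simulation; the value of \lambda and the number of replications below are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of rho = e^{-lambda} / (1 + e^{-lambda}) for Problem 1.
rng = np.random.default_rng(0)
lam, n_sim = 0.7, 1_000_000          # arbitrary lambda and replication count

x1, x2, x3 = rng.poisson(lam, size=(3, n_sim))
z1 = (x1 + x3 == 0).astype(float)    # Z_1 = 1{Y_1 = 0}
z2 = (x2 + x3 == 0).astype(float)    # Z_2 = 1{Y_2 = 0}

empirical = np.corrcoef(z1, z2)[0, 1]
theoretical = np.exp(-lam) / (1 + np.exp(-lam))
print(empirical, theoretical)        # the two values should be close
```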

 

2 . Let (X,Y)^\top be a bivariate random vector with finite variances. Show that
a.) Cov[X, Y-E[Y\mid X]]=0
b.) V[Y-E[Y\mid X]] = E[V[Y\mid X]]

(Approach) The law of iterated expectations (tower property), i.e., E[E[Y\mid X]] = E[Y].

Consider a probability space (\Omega, \mathcal{F}, P) and let X, Y be random variables, i.e., measurable functions on this space. Consider the sub-\sigma-algebra of \mathcal{F} generated by X, \sigma[X] \stackrel{\rm def}{=} \{X^{-1}(B) \mid B \in \mathcal{B}\}, where \mathcal{B} denotes the Borel \sigma-algebra.

In measure-theoretic probability, the conditional expectation is defined as follows:

E[Y \mid X] is a \sigma[X]-measurable function (one may think of it as g(X) with g a Borel function) such that, for every A \in \sigma[X], the following equality holds:

\int_{A} E[Y \mid X] dP = \int_{A} Y dP.

Taking A = \Omega recovers the law of iterated expectations.
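As an illustration of this defining property (not part of the solution), the sketch below checks it by simulation for a toy model in which E[Y \mid X] is known explicitly; the choice Y = X^2 + \varepsilon and the event A = \{X > 0\} are arbitrary.

```python
import numpy as np

# Check the defining property  E[E[Y|X] * 1_A] = E[Y * 1_A]  for A = {X > 0},
# using Y = X^2 + eps with E[eps | X] = 0, so that E[Y | X] = X^2.
rng = np.random.default_rng(1)
n_sim = 1_000_000

x = rng.standard_normal(n_sim)
y = x**2 + rng.standard_normal(n_sim)

a = (x > 0)                                # an event A in sigma[X]
print(np.mean(x**2 * a), np.mean(y * a))   # both ~ E[Y * 1_A]
print(np.mean(x**2), np.mean(y))           # A = Omega: the tower property
```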

(Solution a.) By bilinearity of covariance, the left hand side is

Cov[X, Y-E[Y\mid X]] = Cov[X,Y]-Cov[X, E[Y \mid X]].
Note that Cov[X, E[Y \mid X]] can be rewritten as:
= E[X \cdot E[Y \mid X]] - E[X] \cdot E[E[Y\mid X]] = E[E[XY \mid X]] - E[X]E[Y] = E[XY]-E[X]E[Y],
which is equal to Cov[X,Y]. Hence the two terms cancel, and we have the desired result.

(Solution b.) The left hand side is:
V[Y-E[Y\mid X]] = E[(Y-E[Y\mid X])^2] - E[Y - E[Y \mid X]]^2 = E[(Y-E[Y\mid X])^2] - 0^2,
since E[Y - E[Y \mid X]] = E[Y] - E[E[Y \mid X]] = 0 by the law of iterated expectations.

Furthermore, by the tower property, the remaining term can be written as:
E[(Y-E[Y\mid X])^2] = E[E[(Y-E[Y\mid X])^2 \mid X]] = E[V[Y \mid X]],
where the last step uses that E[Y \mid X] is the conditional mean of Y given X. This is the desired conclusion.
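Both identities can also be checked numerically. The sketch below (not part of the proof) uses a standard bivariate normal pair, for which E[Y \mid X] = \rho X and V[Y \mid X] = 1-\rho^2 in closed form; the value of \rho is arbitrary.

```python
import numpy as np

# Numerical check of Cov[X, Y - E[Y|X]] = 0 and V[Y - E[Y|X]] = E[V[Y|X]]
# for a standard bivariate normal, where E[Y|X] = rho * X and V[Y|X] = 1 - rho^2.
rng = np.random.default_rng(2)
rho, n_sim = 0.6, 1_000_000

x = rng.standard_normal(n_sim)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n_sim)

resid = y - rho * x                 # Y - E[Y | X]
print(np.cov(x, resid)[0, 1])       # ~ 0            (part a)
print(resid.var(), 1 - rho**2)      # ~ E[V[Y | X]]  (part b)
```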

3 . Let (X_1, Y_1), \ldots, (X_n, Y_n) be a random sample from a bivariate normal distribution with correlation \rho. Using the fact that \sqrt{n}(r-\rho) \stackrel{d}{\rightarrow} N(0, (1-\rho^2)^2), where r is the sample correlation coefficient, find g(r) such that \sqrt{n}(g(r)-g(\rho)) converges to a normal distribution whose variance does not depend on \rho.

(Approach) This is the variance stabilizing transformation. We apply the \delta-method and solve a simple differential equation.

(Solution) Let g: \mathbb{R} \rightarrow \mathbb{R} be a differentiable function. By the \delta-method, we have

\sqrt{n}(g(r) - g(\rho)) \stackrel{d}{\rightarrow} N(0, (1-\rho^2)^2 \cdot g'(\rho)^2).

Since we want (1-\rho^2)^2 \cdot g'(\rho)^2 to be free of \rho, we may, for example, set:

(1-\rho^2)^2 \cdot g'(\rho)^2 = 1,

thus, one of the solutions is:

g'(\rho) = \frac{1}{1-\rho^2} = \frac{1}{2}(\frac{1}{1-\rho}+\frac{1}{1+\rho}).

By solving this differential equation, we may take g(\rho) = \frac{1}{2} \ln\left(\frac{1+\rho}{1-\rho}\right) (Fisher's z-transformation), so that \sqrt{n}(g(r)-g(\rho)) \stackrel{d}{\rightarrow} N(0,1), which satisfies the requirement of the problem.
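A small simulation (not required by the problem) illustrates the stabilization: across arbitrarily chosen values of \rho, the variance of \sqrt{n}\,(g(r)-g(\rho)) stays near 1, while that of \sqrt{n}\,(r-\rho) varies with \rho.

```python
import numpy as np

# Check that g(r) = arctanh(r) = (1/2) ln((1+r)/(1-r)) stabilizes the variance.
rng = np.random.default_rng(3)
n, n_rep = 200, 5_000                      # sample size and replications (arbitrary)

for rho in (0.0, 0.5, 0.9):
    cov = [[1.0, rho], [rho, 1.0]]
    r = np.empty(n_rep)
    for k in range(n_rep):
        xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        r[k] = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]
    print(rho,
          np.var(np.sqrt(n) * (r - rho)),                          # ~ (1 - rho^2)^2
          np.var(np.sqrt(n) * (np.arctanh(r) - np.arctanh(rho))))  # ~ 1
```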

4 . Let X_1, \ldots, X_n be a random sample from a population with a probability density function f(x\mid \theta) = \theta x^{\theta-1} \mathbb{1}_{(0,1)}(x). Find the UMVUE for \theta.

(Solution) See the solutions to the ROC Year 102 general entrance exam.

5 . Let X_1, \ldots, X_n be a random sample from N(\theta, \sigma^2). Find a size \alpha unbiased test for hypotheses: H_0: \theta \in [\theta_1, \theta_2] vs H_1: \theta \notin [\theta_1, \theta_2].

(Solution) See the solutions to the ROC Year 102 general entrance exam.

6 . Let X_1, \ldots, X_n be a random sample from {\rm Beta}(\theta,1) and \theta have a marginal distribution {\rm Gamma}(\alpha, \beta) where \alpha,\beta are known constants. Find the Bayes estimator for \theta under the squared error loss function.

(Approach) Compute the posterior mean of \theta.

(Solution) Let \delta(\bm{x}) be an estimator. We must find \delta(\bm{x}) such that the average (Bayes) risk:

R(\delta) \stackrel{\rm def}{=} E[E[(\delta(\bm{x})-\theta)^2\mid \theta]_{x_1,\ldots, x_n \sim {\rm Beta}(\theta,1)}]_{\theta\sim{\rm Gamma}(\alpha,\beta)}

is minimized. By swapping the order of integration, the above expression can be rewritten as follows:

E[E[(\delta(\bm{x})-\theta)^2\mid \bm{x}]_{\theta \sim \pi(\theta\mid \bm{x})}]_{\bm{x} \sim m(\bm{x})},

where \pi(\theta\mid \bm{x}) denotes the posterior density of \theta and m(\bm{x}) the marginal density of the data.

In the above equation, let us consider the minimization of the conditional expectation given \bm{x}:

E[(\delta(\bm{x})-\theta)^2\mid \bm{x}].

This can be rewritten as:
E[(\delta(\bm{x})-E[\theta\mid\bm{x}]+E[\theta\mid\bm{x}]-\theta)^2\mid \bm{x}],

which can further be rearranged, since the cross term E[(\delta(\bm{x})-E[\theta\mid\bm{x}])(E[\theta\mid\bm{x}]-\theta)\mid \bm{x}] vanishes (given \bm{x}, the first factor is constant and E[E[\theta\mid\bm{x}]-\theta \mid \bm{x}] = 0), as:

= E[(\delta(\bm{x})-E[\theta\mid\bm{x}])^2\mid \bm{x}]+ E[(\theta-E[\theta\mid\bm{x}])^2\mid \bm{x}].

Note that E[(\theta-E[\theta\mid\bm{x}])^2\mid \bm{x}] = V[\theta\mid\bm{x}].

So we can rewrite it as:

= E[(\delta(\bm{x})-E[\theta\mid\bm{x}])^2\mid \bm{x}]+ V[\theta \mid \bm{x}] \geq V[\theta\mid\bm{x}].

In the above inequality, the equality holds when \delta(\bm{x}) = E[\theta\mid\bm{x}] (a.s.).

Therefore, the Bayes estimator under the squared error loss is the posterior mean. Now we consider the posterior distribution of \theta. Note that

\pi(\theta\mid \bm{x}) \propto f(\bm{x}\mid \theta) \pi(\theta),

where \propto denotes equality up to a multiplicative factor not depending on \theta.

The right hand side is:
\propto \theta^n (x_1 \cdots x_n)^\theta \cdot \theta^{\alpha-1} \exp(-\beta \theta),

which, using (x_1 \cdots x_n)^\theta = \exp(\theta \sum_{i=1}^n \ln x_i), can be rearranged as:

= \theta^{n+\alpha-1} \exp\left(-\left(\beta-\sum_{i=1}^n \ln x_i\right)\theta\right).

This implies that the posterior distribution is {\rm Gamma}(n+\alpha, \beta-\sum_{i=1}^n \ln x_i) (in the rate parameterization, matching the prior).

So its posterior mean is \frac{n + \alpha}{\beta-\sum_{i=1}^n \ln x_i}. And this is the desired Bayes estimator.
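Below is a minimal sketch (not part of the exam answer) of how this estimator would be computed from data, with the closed form checked against a brute-force numerical posterior mean; the values of \alpha, \beta, the true \theta, and the grid are arbitrary choices.

```python
import numpy as np

# Bayes estimator (posterior mean) for Problem 6:
#   delta(x) = (n + alpha) / (beta - sum(log x_i)),
# checked against a numerical posterior mean on a theta-grid.
rng = np.random.default_rng(4)
alpha, beta, theta_true, n = 2.0, 3.0, 1.5, 50    # arbitrary choices

x = rng.beta(theta_true, 1.0, size=n)             # Beta(theta, 1) sample
s = np.log(x).sum()

closed_form = (n + alpha) / (beta - s)

theta = np.linspace(1e-6, 20, 200_000)            # grid over theta
log_post = (n + alpha - 1) * np.log(theta) - (beta - s) * theta   # unnormalized log-posterior
w = np.exp(log_post - log_post.max())
numerical = (theta * w).sum() / w.sum()

print(closed_form, numerical)                     # should agree closely
```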

7 . Let X \sim f(x) and generate Y_1, \ldots, Y_n \stackrel{\rm iid}{\sim} g(y), where f, g are probability density functions. Given Y_1, \ldots, Y_n, define a random variable X^* by P(X^*=Y_i \mid Y_1, \ldots, Y_n) = q_i, where
q_i = \frac{f(Y_i)/g(Y_i)}{\sum_{j=1}^n f(Y_j)/g(Y_j)}.
Show that P(X^* \leq x \mid Y_1, \ldots, Y_n) \stackrel{P}{\rightarrow} P(X \leq x).

(Approach) Apply the law of large numbers.

(Solution) First, it is important to note that P(X^* \leq x \mid Y_1, \ldots, Y_n) is equal to the sum of the q_i over the indices i with Y_i \leq x. So we have

P(X^* \leq x \mid Y_1, \ldots, Y_n) = \sum_{i=1}^n q_i \mathbb{1}(Y_i \leq x),
which can further be rearranged as:

= \frac {\frac{1}{n}\sum_{i=1}^n f(Y_i)/g(Y_i) \cdot \mathbb{1}(Y_i \leq x)} {\frac{1}{n}\sum_{i=1}^n f(Y_i)/g(Y_i)}.

Let A_i \stackrel{\rm def}{=} f(Y_i)/g(Y_i) \cdot \mathbb{1}(Y_i \leq x) and B_i \stackrel{\rm def}{=} f(Y_i)/g(Y_i).

By the Strong Law of Large Numbers,
\frac{1}{n} \sum_{i=1}^n A_i \stackrel{a.s.}{\rightarrow} E[f(Y)/g(Y) \mathbb{1}(Y \leq x)]_{Y \sim g},
which (assuming, as usual for importance sampling, that g(y) > 0 whenever f(y) > 0) is equal to:
= \int f(y)/g(y) \cdot \mathbb{1}(y \leq x) \cdot g(y)\,dy = \int_{-\infty}^x f(y)\,dy = P(X \leq x).

And similarly, we also have:
\frac{1}{n} \sum_{i=1}^n B_i \stackrel{a.s.}{\rightarrow} E[f(Y)/g(Y)]_{Y \sim g} = \int f(y)/g(y) \cdot g(y)dy = \int f(y) dy = 1.

Since the denominator converges a.s. to 1 > 0, the ratio also converges a.s., so P(X^* \leq x \mid Y_1,\ldots, Y_n) \stackrel{a.s.}{\rightarrow} P(X \leq x), which in turn implies convergence in probability.
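This construction is the weighting step of sampling importance resampling (SIR). The sketch below (not part of the proof) illustrates the convergence with an arbitrary choice of target f = N(0,1) and proposal g = N(0, 2^2).

```python
import numpy as np
from math import erf, sqrt

# Weighted empirical CDF from Problem 7 with target f = N(0,1), proposal g = N(0, 2^2).
def std_normal_pdf(t):
    return np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

rng = np.random.default_rng(5)
x0 = 1.0                                        # evaluate P(X <= x0)
target_cdf = 0.5 * (1 + erf(x0 / sqrt(2)))      # exact P(X <= x0) under f

for n in (100, 10_000, 1_000_000):
    y = rng.normal(0.0, 2.0, size=n)            # Y_1, ..., Y_n ~ g
    w = std_normal_pdf(y) / (std_normal_pdf(y / 2) / 2)   # f(Y_i) / g(Y_i)
    q = w / w.sum()
    print(n, q[y <= x0].sum(), target_cdf)      # weighted CDF -> P(X <= x0)
```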
