Problems: https://exam.lib.ntu.edu.tw/sites/default/files/exam/graduate/103/103061.pdf
1 . Let X_1, X_2, X_3 be a random sample from {\rm Po}(\lambda). Moreover, let Y_1 = X_1+X_3, Y_2=X_2+X_3 and Z_i = \mathbb{1}_{\{Y_i=0\}}. Compute the correlation between Z_1, Z_2.
(Solution)
First, consider V[Z_1]. It is important to note that Z_1^2 = Z_1. So we have
V[Z_1] = E[Z_1^2] - E[Z_1]^2 = E[Z_1] - E[Z_1]^2 = E[Z_1] (1-E[Z_1]).
Here, since X_1 + X_3 \sim {\rm Po}(2\lambda),
E[Z_1] = P(Y_1=0) = P(X_1+X_3=0) = e^{-2\lambda}.
So we have:
V[Z_1] = e^{-2\lambda} (1-e^{-2\lambda}).
By a similar argument, we also have:
V[Z_2] = e^{-2\lambda} (1-e^{-2\lambda}).
So now it is sufficient to find Cov[Z_1, Z_2]:
Cov[Z_1,Z_2] = E[Z_1Z_2]-E[Z_1]E[Z_2] = P(Y_1=0, Y_2=0) - e^{-4\lambda}.
Here P(Y_1=0, Y_2=0) can further be rewritten as:
= P(X_1+X_3=0, X_2+X_3=0) = P(X_1=0, X_2=0, X_3=0) = e^{-3\lambda}.
Finally
\rho = \frac{Cov[Z_1,Z_2]}{V[Z_1]^{1/2} V[Z_2]^{1/2}} = \frac{e^{-3\lambda}-e^{-4\lambda}}{e^{-2\lambda} (1- e^{-2\lambda})} = \frac{e^{-\lambda}}{1+e^{-\lambda}}
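As a quick sanity check of this closed form (not part of the exam solution), here is a Monte Carlo sketch in Python; the value \lambda = 0.5 and the sample size are illustrative choices:

```python
import numpy as np

# Monte Carlo check of rho = e^{-lambda} / (1 + e^{-lambda}).
# lambda = 0.5 and n = 200_000 are illustrative choices.
rng = np.random.default_rng(0)
lam, n = 0.5, 200_000

x1, x2, x3 = rng.poisson(lam, (3, n))
z1 = (x1 + x3 == 0).astype(float)   # Z_1 = 1{Y_1 = 0}
z2 = (x2 + x3 == 0).astype(float)   # Z_2 = 1{Y_2 = 0}

empirical = np.corrcoef(z1, z2)[0, 1]
theoretical = np.exp(-lam) / (1 + np.exp(-lam))
print(empirical, theoretical)
```

The empirical and theoretical correlations should agree to about two decimal places at this sample size.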
2 . Let (X,Y)^\top be a bivariate random vector with finite variances. Show that
a.) Cov[X, Y-E[Y\mid X]]=0
b.) V[Y-E[Y\mid X]] = E[V[Y\mid X]]
(Method) The law of iterated expectations, i.e., EE[Y\mid X] = E[Y].
Consider a probability space (\Omega, \mathcal{F}, P) and let X, Y be random variables, i.e., measurable functions on that space. Consider the sub-\sigma-algebra of \mathcal{F} defined by \sigma[X] \stackrel{\rm def}{=} \{X^{-1}(B) \mid B \in \mathcal{B}\}, where \mathcal{B} is the Borel \sigma-algebra.
In measure-theoretic probability, the conditional expectation is defined as follows:
E[Y \mid X] is the \sigma[X]-measurable function (which can be thought of as g(X) for some Borel function g) such that for every A \in \sigma[X]:
\int_{A} E[Y \mid X] dP = \int_{A} Y dP.
When A = \Omega, this reduces to the law of iterated expectations.
(Solution a.) The left-hand side is
Cov[X, Y-E[Y\mid X]] = Cov[X,Y] - Cov[X, E[Y \mid X]].
Note that Cov[X, E[Y \mid X]] can be rewritten as:
= E[X \cdot E[Y \mid X]] - E[X] \cdot EE[Y\mid X] = EE[XY \mid X] - E[X]E[Y] = E[XY]-E[X]E[Y],
which is equal to Cov[X,Y], and we have the desired result.
(Solution b.) The left-hand side is:
V[Y-E[Y\mid X]] = E[(Y-E[Y\mid X])^2] - E[Y - E[Y \mid X]]^2 = E[(Y-E[Y\mid X])^2] - 0^2,
since E[Y - E[Y \mid X]] = E[Y] - EE[Y \mid X] = 0.
Furthermore, by the law of iterated expectations, the remaining term is:
= E[E[(Y-E[Y\mid X])^2 \mid X]] = E[V[Y \mid X]],
which is our desired conclusion.
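Both identities can be checked numerically in a toy model where E[Y\mid X] has a closed form. The model below (my illustrative choice, not from the problem) takes X \sim N(0,1) and Y = X^2 + \varepsilon with \varepsilon \sim N(0,1) independent, so E[Y\mid X] = X^2 and V[Y\mid X] = 1:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Toy model (illustrative): X ~ N(0,1), eps ~ N(0,1) independent,
# Y = X**2 + eps, so E[Y|X] = X**2 and V[Y|X] = 1 for every x.
x = rng.standard_normal(n)
y = x**2 + rng.standard_normal(n)
resid = y - x**2                    # Y - E[Y|X]

cov_xr = np.cov(x, resid)[0, 1]     # part a: should be ~ 0
var_resid = resid.var()             # part b: should be ~ E[V[Y|X]] = 1
print(cov_xr, var_resid)
```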
3 . Let (X_1, Y_1), \ldots, (X_n, Y_n) be a random sample from a bivariate normal distribution with correlation \rho. Using the fact that \sqrt{n}(r-\rho) \stackrel{d}{\rightarrow} N(0, (1-\rho^2)^2), where r is the sample correlation coefficient, find a transformation g(r) that converges to a normal distribution with constant variance.
(Method) This is known as a variance stabilizing transformation. We apply the \delta-method and solve a simple differential equation.
(Solution) Let g: \mathbb{R} \mapsto \mathbb{R} be a differentiable function. By the \delta-method, we have
\sqrt{n}(g(r) - g(\rho)) \stackrel{d}{\rightarrow} N(0, (1-\rho^2)^2 \cdot g'(\rho)^2).
Since we want the limiting variance (1-\rho^2)^2 \cdot g'(\rho)^2 not to depend on \rho, we may, for example, set:
(1-\rho^2)^2 \cdot g'(\rho)^2 = 1,
and thus one of the solutions is:
g'(\rho) = \frac{1}{1-\rho^2} = \frac{1}{2}\left(\frac{1}{1-\rho}+\frac{1}{1+\rho}\right).
By solving this differential equation, we can eventually consider g(\rho) = \frac{1}{2} \ln\left(\frac{1+\rho}{1-\rho}\right) (Fisher's z-transformation), which satisfies the requirement of the problem.
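A simulation sketch of the stabilizing effect: for several values of \rho, the variance of \sqrt{n}\, g(r) with g(r) = {\rm arctanh}(r) should be roughly 1 regardless of \rho (in fact about n/(n-3) for finite n). The sample size n = 200 and the \rho grid below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 3000

results = {}
for rho in (0.0, 0.5, 0.9):
    cov = [[1.0, rho], [rho, 1.0]]
    z = np.empty(reps)
    for i in range(reps):
        xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        r = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]
        z[i] = 0.5 * np.log((1 + r) / (1 - r))   # g(r) = arctanh(r)
    results[rho] = n * z.var()   # variance of sqrt(n) * g(r), should be ~ 1
    print(rho, results[rho])
```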
4 . Let X_1, \ldots, X_n be a random sample from a population with a probability density function f(x\mid \theta) = \theta x^{\theta-1} \mathbb{1}_{(0,1)}(x). Find the UMVUE for \theta.
(Solution) See the solution to the 102 academic year general exam.
5 . Let X_1, \ldots, X_n be a random sample from N(\theta, \sigma^2). Find a size \alpha unbiased test for hypotheses: H_0: \theta \in [\theta_1, \theta_2] vs H_1: \theta \notin [\theta_1, \theta_2].
(Solution) See the solution to the 102 academic year general exam.
6 . Let X_1, \ldots, X_n be a random sample from {\rm Beta}(\theta,1) and let \theta have the prior distribution {\rm Gamma}(\alpha, \beta), where \alpha,\beta are known constants. Find the Bayes estimator for \theta under the squared error loss function.
(Method) Compute the posterior mean of \theta.
(Solution) Let \delta(\bm{x}) be an estimator. We are required to find \delta(\bm{x}) such that the average risk:
R(\delta) \stackrel{\rm def}{=} E[E[(\delta(\bm{x})-\theta)^2\mid \theta]_{x_1,\ldots, x_n \sim {\rm Beta}(\theta,1)}]_{\theta\sim{\rm Gamma}(\alpha,\beta)}
is minimized. By swapping the order of integration, the above can be written as follows:
E[E[(\delta(\bm{x})-\theta)^2\mid \bm{x}]_{\theta \sim \pi(\theta\mid \bm{x})}]_{\bm{x} \sim m(\bm{x})}.
In the above equation, let us consider minimizing the conditional expectation given \bm{x}:
E[(\delta(\bm{x})-\theta)^2\mid \bm{x}].
This can be rewritten as:
E[(\delta(\bm{x})-E[\theta\mid\bm{x}]+E[\theta\mid\bm{x}]-\theta)^2\mid \bm{x}],
which can further be rearranged as:
= E[(\delta(\bm{x})-E[\theta\mid\bm{x}])^2\mid \bm{x}] + E[(\theta-E[\theta\mid\bm{x}])^2\mid \bm{x}],
where the cross term vanishes because E[\theta - E[\theta\mid\bm{x}] \mid \bm{x}] = 0. Note that E[(\theta-E[\theta\mid\bm{x}])^2\mid \bm{x}] = V[\theta\mid\bm{x}].
So we can rewrite it as:
= E[(\delta(\bm{x})-E[\theta\mid\bm{x}])^2\mid \bm{x}] + V[\theta \mid \bm{x}] \geq V[\theta\mid\bm{x}].
In the above inequality, equality holds when \delta(\bm{x}) = E[\theta\mid\bm{x}] (a.s.).
Therefore, the Bayes estimator under the squared error loss is the posterior mean. Now we consider the posterior distribution of \theta. Note that
\pi(\theta\mid \bm{x}) \propto f(\bm{x}\mid \theta) \pi(\theta), where \propto denotes equality up to a factor that does not depend on \theta.
The right-hand side is:
\propto \theta^n (x_1 \cdots x_n)^\theta \cdot \theta^{\alpha-1} \exp(-\beta \theta),
since f(\bm{x}\mid \theta) = \prod_{i=1}^n \theta x_i^{\theta-1} \propto \theta^n (x_1 \cdots x_n)^\theta. This can be rearranged as:
= \theta^{n+\alpha-1} \exp\left(-\left(\beta-\sum_{i=1}^n \ln x_i\right)\theta\right).
This implies that the posterior distribution is {\rm Gamma}(n+\alpha, \beta-\sum_{i=1}^n \ln x_i) (shape-rate parameterization); note that \beta-\sum_{i=1}^n \ln x_i > 0 since each x_i \in (0,1).
So its posterior mean is \frac{n + \alpha}{\beta-\sum_{i=1}^n \ln x_i}. And this is the desired Bayes estimator.
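The closed-form posterior mean can be cross-checked against a brute-force quadrature of the unnormalized posterior. The hyperparameters and the true \theta below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta, n = 2.0, 3.0, 20
theta_true = 1.5                     # illustrative value used only to simulate data

# Beta(theta, 1) has cdf x**theta on (0,1), so sample by inverse transform.
x = rng.uniform(size=n) ** (1.0 / theta_true)
s = np.log(x).sum()                  # sum of ln x_i (negative)

# Closed-form Bayes estimator derived above: (n + alpha) / (beta - sum ln x_i).
closed = (n + alpha) / (beta - s)

# Numerical check: posterior mean by quadrature over a theta grid.
theta = np.linspace(1e-6, 30.0, 200_001)
log_post = (n + alpha - 1) * np.log(theta) - (beta - s) * theta
w = np.exp(log_post - log_post.max())        # unnormalized posterior density
numerical = (theta * w).sum() / w.sum()

print(closed, numerical)
```

The two values should agree to several decimal places, confirming the conjugate update.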
7 . Let X \sim f(x) and generate Y_1, \ldots, Y_n \stackrel{\rm iid}{\sim} g(y), where f, g are probability density functions. Given Y_1, \ldots, Y_n, define a random variable X^* by P(X^*=Y_i \mid Y_1, \ldots, Y_n) = q_i, where
q_i = \frac{f(Y_i)/g(Y_i)}{\sum_{j=1}^n f(Y_j)/g(Y_j)}.
Show that P(X^* \leq x \mid Y_1, \ldots, Y_n) \stackrel{P}{\rightarrow} P(X \leq x).
(Method) Apply the law of large numbers.
(Solution) First, it is important to note that P(X^* \leq x \mid Y_1, \ldots, Y_n) equals the sum of those q_i for which Y_i \leq x. So we have
P(X^* \leq x \mid Y_1, \ldots, Y_n) = \sum_{i=1}^n q_i \mathbb{1}(Y_i \leq x).
Let A_i \stackrel{\rm def}{=} f(Y_i)/g(Y_i) \cdot \mathbb{1}(Y_i \leq x) and B_i \stackrel{\rm def}{=} f(Y_i)/g(Y_i), so that the sum above can be rearranged as:
\sum_{i=1}^n q_i \mathbb{1}(Y_i \leq x) = \frac{\frac{1}{n}\sum_{i=1}^n A_i}{\frac{1}{n}\sum_{i=1}^n B_i}.
By the Strong Law of Large Numbers,
\frac{1}{n} \sum_{i=1}^n A_i \stackrel{a.s.}{\rightarrow} E[f(Y)/g(Y) \mathbb{1}(Y \leq x)]_{Y \sim g},
which is equal to:
= \int f(y)/g(y) \mathbb{1}(y \leq x) \cdot g(y)dy = \int_{-\infty}^x f(y)dy = P(X \leq x).
And similarly, we also have:
\frac{1}{n} \sum_{i=1}^n B_i \stackrel{a.s.}{\rightarrow} E[f(Y)/g(Y)]_{Y \sim g} = \int f(y)/g(y) \cdot g(y)dy = \int f(y) dy = 1.
Taking the ratio of the two limits, we have P(X^* \leq x \mid Y_1,\ldots, Y_n) \stackrel{a.s.}{\rightarrow} P(X \leq x)/1 = P(X \leq x), which also implies convergence in probability.
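This is the weighting step of sampling/importance resampling (SIR). A small simulation with target f = N(0,1) and proposal g = N(0, 2^2) (my illustrative choice, for which f/g is well behaved) shows the weighted conditional CDF matching P(X \leq x):

```python
import math
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Target f = N(0,1); proposal g = N(0, 2**2), an illustrative choice.
y = rng.normal(0.0, 2.0, size=n)
w = 2.0 * np.exp(-3.0 * y**2 / 8.0)   # f(y)/g(y); shared constants cancel on normalizing
q = w / w.sum()                       # the q_i from the problem

x = 1.0
weighted_cdf = q[y <= x].sum()                       # P(X* <= x | Y_1,...,Y_n)
target_cdf = 0.5 * (1 + math.erf(x / math.sqrt(2)))  # P(X <= x) = Phi(1)
print(weighted_cdf, target_cdf)
```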