\[ \newcommand{\nRV}[2]{{#1}_1, {#1}_2, \ldots, {#1}_{#2}} \newcommand{\pnRV}[3]{{#1}_1^{#3}, {#1}_2^{#3}, \ldots, {#1}_{#2}^{#3}} \newcommand{\onRV}[2]{{#1}_{(1)} \le {#1}_{(2)} \le \ldots \le {#1}_{(#2)}} \newcommand{\RR}{\mathbb{R}} \newcommand{\Prob}[1]{\mathbb{P}\left({#1}\right)} \newcommand{\PP}{\mathcal{P}} \newcommand{\iidd}{\overset{\mathsf{iid}}{\sim}} \newcommand{\X}{\times} \newcommand{\EE}[1]{\mathbb{E}\left[{#1}\right]} \newcommand{\Var}[1]{\mathsf{Var}\left({#1}\right)} \newcommand{\Ber}[1]{\mathsf{Ber}\left({#1}\right)} \newcommand{\Geom}[1]{\mathsf{Geom}\left({#1}\right)} \newcommand{\Bin}[1]{\mathsf{Bin}\left({#1}\right)} \newcommand{\Poi}[1]{\mathsf{Pois}\left({#1}\right)} \newcommand{\Exp}[1]{\mathsf{Exp}\left({#1}\right)} \newcommand{\SD}[1]{\mathsf{SD}\left({#1}\right)} \newcommand{\sgn}[1]{\mathsf{sgn}} \newcommand{\dd}[1]{\operatorname{d}\!{#1}} \]
4.3 Likelihood Approach
In the general approach, we make the following assumptions:
- We assume that a random variable \(X\) has a probability density function (pdf) or probability mass function (pmf) \(f(\cdot | p)\) with \(p \in \PP \subseteq \RR\).
- We have a sample \(X_1, X_2, \dots, X_n \iidd X\).
- The likelihood function of the sample \(X_1, X_2, \dots, X_n\) is defined as \(L(p; X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} f(X_i | p)\).
Recall that the maximum likelihood estimator (MLE) is given by \(\hat{p} = \argmax_{p \in \PP}L(p; X_1, X_2, \dots, X_n)\).
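The definitions above can be tried out numerically. The following is a minimal sketch, assuming a Bernoulli model \(f(x | p) = p^x (1-p)^{1-x}\) with \(\PP = (0, 1)\); the model, the made-up sample, and the helper names `bernoulli_log_likelihood` and `grid_mle` are illustrative choices, not part of the text. It maximizes the logarithm of \(L\), which has the same maximizer as \(L\) because \(\log\) is increasing.

```python
import math

def bernoulli_log_likelihood(p, sample):
    """Log of L(p; x_1, ..., x_n) = prod_i f(x_i | p) for the Bernoulli pmf."""
    return sum(math.log(p if x == 1 else 1.0 - p) for x in sample)

def grid_mle(sample, grid_size=999):
    """Approximate the argmax over (0, 1) by scanning an evenly spaced grid."""
    grid = [k / (grid_size + 1) for k in range(1, grid_size + 1)]
    return max(grid, key=lambda p: bernoulli_log_likelihood(p, sample))

sample = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # made-up data: 7 successes in 10 trials
p_hat = grid_mle(sample)
print(p_hat, sum(sample) / len(sample))
```

For this sample the grid maximizer coincides with the sample proportion, which is the closed-form MLE in the Bernoulli model. A grid search is used only to keep the sketch short; it is not an efficient optimizer.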
We can view a hypothesis test as a restriction of \(\PP\) to a smaller subset \(\PP_0\). For example, in the “intuitive approach” discussed earlier, we have \(\PP_0 = \{c\}\) and the hypotheses are \(H_0: \mu \in \PP_0\) and \(H_A: \mu \not\in \PP_0\).
4.3.1 MLE Approach under Null Hypothesis with \(p \in {\PP}_0 \subset \PP\)
To apply the MLE approach under the null hypothesis, we restrict the maximization to \(\PP_0\) and consider \(\hat{p}_0 = \argmax_{p \in \PP_0} L(p; X_1, X_2, \dots, X_n)\).
We define the likelihood ratio as follows: \[ \lambda(X_1, X_2, \dots, X_n) = \frac{L(\hat{p}_0; X_1, X_2, \dots, X_n)}{L(\hat{p}; X_1, X_2, \dots, X_n)} \] and the log-likelihood ratio as: \[ \Lambda(X_1, X_2, \dots, X_n) = -\log{\lambda(X_1, X_2, \dots, X_n)} = -\log{\frac{L(\hat{p}_0; X_1, X_2, \dots, X_n)}{L(\hat{p}; X_1, X_2, \dots, X_n)}} \]
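Since \(\PP_0 \subseteq \PP\), the maximum of \(L\) over \(\PP_0\) cannot exceed its maximum over \(\PP\), so \(\lambda(X_1, X_2, \dots, X_n) \leq 1\) and \(\Lambda(X_1, X_2, \dots, X_n) \geq 0\). Equivalently, \(\Lambda(X_1, X_2, \dots, X_n) = \log{\frac{L(\hat{p}; X_1, X_2, \dots, X_n)}{L(\hat{p}_0; X_1, X_2, \dots, X_n)}}\).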
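If \(L(\hat{p}; X_1, X_2, \dots, X_n)\) is much larger than \(L(\hat{p}_0; X_1, X_2, \dots, X_n)\), then no \(p \in \PP_0\) explains the sample nearly as well as \(\hat{p}\) does. Larger values of \(\Lambda\) are therefore stronger evidence against the null hypothesis.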
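To make the ratio concrete, here is a small sketch that computes \(\Lambda\) by maximizing the log-likelihood separately over \(\PP\) and over \(\PP_0\). It assumes a Bernoulli model with \(\PP = (0, 1)\) and \(\PP_0 = (0, 0.5]\), both approximated by grids; the model, the data, and the function names are hypothetical and chosen only for illustration.

```python
import math

def log_likelihood(p, sample):
    """Bernoulli log-likelihood: sum of log f(x_i | p)."""
    return sum(math.log(p if x == 1 else 1.0 - p) for x in sample)

def log_likelihood_ratio(sample, null_grid, full_grid):
    """Lambda = log L(p_hat) - log L(p_hat_0), with both maxima taken over grids."""
    best_full = max(log_likelihood(p, sample) for p in full_grid)
    best_null = max(log_likelihood(p, sample) for p in null_grid)
    return best_full - best_null

full_grid = [k / 1000 for k in range(1, 1000)]   # stands in for P = (0, 1)
null_grid = [p for p in full_grid if p <= 0.5]   # stands in for P_0 = (0, 0.5]

sample_a = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # made-up: proportion 0.7, outside P_0
sample_b = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]  # made-up: proportion 0.3, inside P_0

lam_a = log_likelihood_ratio(sample_a, null_grid, full_grid)
lam_b = log_likelihood_ratio(sample_b, null_grid, full_grid)
print(lam_a, lam_b)
```

When the unrestricted maximizer already lies in \(\PP_0\), as for `sample_b`, the two maxima coincide and \(\Lambda = 0\); otherwise \(\Lambda\) is strictly positive.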
4.3.2 \(Z\)-Test with Log-Likelihood
Consider \(X \sim N(\mu, \sigma^2)\) with \(\mu \in \PP = \RR\) and \(\sigma\) known. The null and alternate hypotheses are defined as \(H_0: \mu = c\) and \(H_A: \mu \neq c\), where \(\PP_0 = \{c\}\).
For a given sample \(X_1, X_2, \dots, X_n\), the likelihood function for \(\mu\) is: \[ L(\mu; X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} \frac{e^{-\frac{(X_i - \mu)^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma} \]
With \(\hat{\mu} = \overline{X}\) and \(\hat{\mu}_0 = c\), we can compute the log-likelihood ratio as follows: \[\begin{align*} \Lambda(X_1, X_2, \dots, X_n) &= \log{\frac{L(\hat{\mu}; X_1, X_2, \dots, X_n)}{L(\hat{\mu}_0; X_1, X_2, \dots, X_n)}} \\ &= \log{\frac{L(\overline{X}; X_1, X_2, \dots, X_n)}{L(c; X_1, X_2, \dots, X_n)}} \\ &= \log{\frac{\prod_{i=1}^{n} \frac{e^{-\frac{(X_i - \overline{X})^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma}}{\prod_{i=1}^{n} \frac{e^{-\frac{(X_i - c)^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma}}} \\ &= \frac{1}{2} \frac{n}{\sigma^2}(\overline{X} - c)^2 \\ &= \frac{1}{2} \left( \frac{\sqrt{n}(\overline{X} - c)}{\sigma} \right)^2 \end{align*}\]
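Let \(Y_1, Y_2, \dots, Y_n\) be iid random variables that imitate the sample under \(H_0\), that is, \(Y_i \sim N(c, \sigma^2)\). The \(p\)-value is the probability that such a sample produces a log-likelihood ratio at least as large as the observed one: \[ \Prob{ \Lambda(Y_1, Y_2, \dots, Y_n) \geq \Lambda(X_1, X_2, \dots, X_n)} \]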
We know that: \[\begin{align*} \Lambda(Y_1, Y_2, \dots, Y_n) &= \frac{1}{2} \left( \frac{\sqrt{n}(\overline{Y} - c)}{\sigma} \right)^2 \\ &= \frac{Z^2}{2} \end{align*}\] where \(Z = \frac{\sqrt{n}(\overline{Y} - c)}{\sigma} \sim N(0,1)\). Thus, the \(p\)-value can be computed as \(\Prob{Z^2 \geq \left( \frac{\sqrt{n}(\overline{X} - c)}{\sigma} \right)^2}\).
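The computation above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function name `z_test_log_likelihood` and the sample values are made up, and the tail probability uses the identity \(\Prob{Z^2 \geq z^2} = \Prob{|Z| \geq |z|} = \operatorname{erfc}(|z| / \sqrt{2})\) for \(Z \sim N(0,1)\), evaluated with the standard library function `math.erfc`.

```python
import math

def z_test_log_likelihood(sample, c, sigma):
    """Return (Lambda, p-value) for H_0: mu = c against H_A: mu != c, sigma known."""
    n = len(sample)
    x_bar = sum(sample) / n
    z = math.sqrt(n) * (x_bar - c) / sigma      # standardized sample mean
    lam = 0.5 * z * z                           # Lambda = z^2 / 2
    # P(Z^2 >= z^2) = P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for Z ~ N(0, 1)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return lam, p_value

# Made-up sample with mean 5.3; we test H_0: mu = 5 assuming sigma = 0.3.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.4, 5.3]
lam, p_value = z_test_log_likelihood(sample, c=5.0, sigma=0.3)
print(lam, p_value)
```

Here \(n = 9\) and \(\overline{X} = 5.3\), so the standardized statistic is \(3\), \(\Lambda = 4.5\), and the \(p\)-value is about \(0.0027\), the familiar two-sided tail probability beyond three standard deviations.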
4.3.3 Testing if the Mean is Larger than \(c\)
Consider \(X \sim N(\mu, \sigma^2)\) with \(\sigma\) known. We have the hypotheses \(H_0: \mu \leq c\) and \(H_A: \mu > c\).
Given a sample \(X_1, X_2, \dots, X_n\) from the population, we can compute the log-likelihood ratio: \[ \Lambda(X_1, X_2, \dots, X_n) = \log{\frac{L(\hat{\mu}; X_1, X_2, \dots, X_n)}{L(\hat{\mu}_0; X_1, X_2, \dots, X_n)}} \] where \(\hat{\mu}_0 = \argmax_{\mu \in \PP_0} L(\mu; X_1, \ldots, X_n)\) with \(\PP_0 = (-\infty, c]\) and \(\hat{\mu} = \argmax_{\mu \in \PP} L(\mu; X_1, \ldots, X_n)\) with \(\PP = \RR\).
Finally, we can compute the \(p\)-value as \(\Prob{\frac{\sqrt{n}(\overline{Y} - c)}{\sigma} \geq \frac{\sqrt{n}(\overline{X} - c)}{\sigma}} = \Prob{Z \geq \frac{\sqrt{n}(\overline{X} - c)}{\sigma}}\).
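A hedged sketch of the one-sided computation follows. It uses the restricted maximizer \(\hat{\mu}_0 = \min\{\overline{X}, c\}\) stated in the exercises below, so that \(\Lambda = \frac{n(\overline{X} - \hat{\mu}_0)^2}{2\sigma^2}\), and evaluates \(\Prob{Z \geq z} = \frac{1}{2}\operatorname{erfc}(z / \sqrt{2})\) with `math.erfc`. The function name `one_sided_test` and the data are hypothetical.

```python
import math

def one_sided_test(sample, c, sigma):
    """Return (Lambda, p-value) for H_0: mu <= c against H_A: mu > c, sigma known."""
    n = len(sample)
    x_bar = sum(sample) / n
    mu_hat_0 = min(x_bar, c)                    # maximizer of L over (-inf, c]
    lam = n * (x_bar - mu_hat_0) ** 2 / (2.0 * sigma ** 2)
    z = math.sqrt(n) * (x_bar - c) / sigma
    # Upper tail of the standard normal: P(Z >= z) = erfc(z / sqrt(2)) / 2
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return lam, p_value

sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.4, 5.3]  # made-up data, mean 5.3
lam_hi, p_hi = one_sided_test(sample, c=5.0, sigma=0.3)  # sample mean above c
lam_lo, p_lo = one_sided_test(sample, c=5.6, sigma=0.3)  # sample mean below c
print(lam_hi, p_hi)
print(lam_lo, p_lo)
```

When \(\overline{X} \leq c\) the sample is consistent with \(H_0\), so \(\Lambda = 0\) and the \(p\)-value is at least \(1/2\); when \(\overline{X} > c\) the \(p\)-value is the upper tail beyond the standardized statistic.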
Exercises
Exercise 4.12 (From Examples) Do the following while reading the examples:
- Show that: \[ \textsf{If } \PP_0 \subseteq \PP \textsf{ then } 0 \leq \frac{L(\hat{p}_0; X_1, X_2, \dots, X_n)}{L(\hat{p}; X_1, X_2, \dots, X_n)} \leq 1 \]
- Show that, \[\begin{align*} \hat{\mu} = \argmax_{\mu \in \PP} \ L(\mu; X_1, X_2, \dots, X_n) = \overline{X} \end{align*}\] and, \[\begin{align*} \hat{\mu}_0 = \argmax_{\mu \in \PP_0} L(\mu; X_1, X_2, \dots, X_n) = c \end{align*}\]
- Prove that, \[ \log{\frac{\prod_{i=1}^{n} \frac{e^{-\frac{(X_i - \overline{X})^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma}}{\prod_{i=1}^{n} \frac{e^{-\frac{(X_i - c)^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma}}} = \frac{1}{2}\frac{n}{\sigma^2}(\overline{X} - c)^2 \]
- For the one-sided test with \(\PP_0 = (-\infty, c]\), show the following:
- \(\hat{\mu} = \overline{X}\)
- \(\hat{\mu}_0 = \argmax_{\mu \in \PP_0} \prod_{i=1}^{n} \frac{e^{-\frac{(X_i - \mu)^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma} = \min{\{\overline{X}, c\}}\)
- \[\begin{align*} \Lambda(X_1, X_2, \dots, X_n) &= \log{\frac{L(\hat{\mu}; X_1, X_2, \dots, X_n)}{L(\hat{\mu}_0; X_1, X_2, \dots, X_n)}} \\ &= \begin{cases} 0 &\textsf{if } \ \overline{X} \leq c \\ \frac{n(\overline{X} - c)^2}{2\sigma^2} &\textsf{if } \ \overline{X} > c \end{cases} \end{align*}\]