3.3 Maximum likelihood estimate

Definition 3.2 (Maximum likelihood estimator) The likelihood function for the sample \(X_1, X_2, \dots, X_n\) is \[\begin{align*} L: \PP \X \RR^n \to \RR \textsf{ given by } \\ L(p|{\bf X}) := \prod_{i=1}^n f(X_i|p). \end{align*}\] For a given \({\bf X} = (X_1, X_2, \dots, X_n)\), suppose \(\hat{p} \equiv \hat{p}({\bf X})\) is a point such that \(L(\hat{p}|{\bf X}) = \sup_{p \in \PP} L(p|{\bf X})\), i.e., \(L\) attains its maximum at \(\hat p\) as a function of \(p\). Then \(\hat{p}\) is called a maximum likelihood estimator of \(p\) (abbreviated as MLE of \(p\)) given the sample \(X_1, X_2, \dots, X_n\).

Example 3.5 (Normal with one parameter) Let \(X \sim N(p,1)\), and let \(X_1, X_2, \dots, X_n\) be an \(\rm iid\) sample from \(X\). Then the likelihood function \(L: \PP \X \R^n \to \R\) is given by \[\begin{align*} \tag{3.9} L(p|{\bf X}) &= \prod_{i=1}^n \frac{e^{-\frac{(X_i-p)^2}{2}}}{\sqrt{2\pi}} \\ &= \left( \frac{1}{\sqrt{2\pi}} \right)^n e^{-\sum_{i=1}^n \frac{(X_i-p)^2}{2}} \end{align*}\] To find the MLE, treat \(X_1, X_2, \dots, X_n\) as fixed and maximize \(L\) as a function of \(p\).
Observe that \(\exp(-x)\) is a strictly decreasing function of \(x\).

Figure 3.1: \(\exp(-x)\)

Moreover, \(\left( \frac{1}{\sqrt{2\pi}} \right)^n\) is constant with respect to \(p\). So maximizing \(L(p|{\bf X})\) is equivalent to minimizing \(g : \RR \to \R\) given by \(g(p) = \sum_{i=1}^n (X_i-p)^2\).

Method 1 (Using inequality)

\[\begin{align*} g(p) &= \sum_{i=1}^n (X_i-p)^2 \\ &= \sum_{i=1}^n \left(X_i - \overline{X} +\overline{X} - p\right)^2 \\ &= \sum_{i=1}^n \left(X_i - \overline{X}\right)^2 + n(\overline{X} - p)^2 \end{align*}\] where the last equality is the subject of Exercise 3.4. Observe that \(g(p) \geq \sum_{i=1}^n \left(X_i-\overline{X}\right)^2\) for all \(p\in\R\), with equality at \(p = \overline{X}\), since \(g(\overline{X}) = \sum_{i=1}^n \left(X_i-\overline{X}\right)^2\).
Thus, the MLE of \(p\) given \(X_1, X_2, \dots, X_n\) is \(\hat{p} = \overline{X}\).
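
As a quick numerical sanity check of the decomposition used in Method 1, the following R sketch compares both sides of the identity for a simulated sample. The sample size, the seed, the mean of the simulated data and the point `p0` are arbitrary choices made here for illustration; they are not part of the example, and a numerical check is of course no substitute for the algebra.

```r
# Numerical check of:
#   sum((x - p)^2) = sum((x - mean(x))^2) + n * (mean(x) - p)^2
set.seed(1)                        # arbitrary seed, for reproducibility only
x  <- rnorm(20, mean = 2, sd = 1)  # hypothetical sample (assumption)
n  <- length(x)
p0 <- 0.7                          # arbitrary value of p at which to compare

lhs <- sum((x - p0)^2)
rhs <- sum((x - mean(x))^2) + n * (mean(x) - p0)^2

isTRUE(all.equal(lhs, rhs))        # agreement up to floating-point error
```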

Method 2 (Using calculus)

\[\begin{align*} g'(p) &= -2\sum_{i=1}^n \left(X_i-p\right) \\ \textsf{ and, } g''(p) &= 2n > 0 \end{align*}\] Now, \[\begin{align*} g'(p) = 0 \implies \sum_{i=1}^n \left(X_i-p\right) = 0 \implies p = \overline{X} \textsf{ and, } g''(\overline{X}) > 0 \end{align*}\] \(\therefore p=\overline{X}\) is a local minimum of \(g\), and in fact the global minimum (see Exercise 3.4). Hence, \(\hat{p}=\overline{X}\) is the MLE of \(p\) given \(X_1, X_2, \dots, X_n\).
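
The conclusion of Example 3.5 can also be checked numerically. The sketch below maximizes the likelihood (3.9) with R's `optimize()` and compares the result with the sample mean. The simulated sample, the seed and the helper name `likelihood` are assumptions made for illustration; the search interval `range(x)` is used because the sample mean always lies between the smallest and largest observation.

```r
# Likelihood of N(p, 1) for a fixed sample x, as in equation (3.9)
likelihood <- function(p, x) prod(dnorm(x, mean = p, sd = 1))

set.seed(42)                       # arbitrary seed
x <- rnorm(25, mean = 3, sd = 1)   # hypothetical sample (assumption)

# Maximize L over an interval that is guaranteed to contain mean(x)
fit <- optimize(likelihood, interval = range(x), x = x, maximum = TRUE)

fit$maximum   # numerical maximizer of L
mean(x)       # closed-form MLE from Example 3.5
```

The two printed values should agree up to the tolerance of `optimize()`.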

Exercises

Exercise 3.4 (From Example 3.5) Show the following:

  • Using elementary algebra, \[\sum_{i=1}^n \left(X_i - p\right)^2 = \sum_{i=1}^n \left(X_i-\overline{X}\right)^2 + n(\overline{X} - p)^2\]
  • \(p=\overline{X}\) is the global minimum of \(g\).

Exercise 3.5 (Rayleigh Distribution) Gobarkanth collects \(\rm iid\) measurements \(X_1, X_2, \dots, X_n\) of radiation from the Gobar Gas plant of the canteen. He assumes that the observations follow a Rayleigh distribution with parameter \(\alpha\), with pdf given by \[\begin{equation*} f(x) = \begin{cases} \alpha x \exp{\left( -\frac{1}{2}\alpha x^2 \right)} & \textsf{ if } x \geq 0 \\ 0 & \textsf{ otherwise } \end{cases} \end{equation*}\] Find the maximum likelihood estimate for \(\alpha\), providing appropriate justification.

Exercise 3.6 (Capture-recapture and hypergeometric distribution) Biologists use a technique called capture-recapture to estimate the size of the population of a species that cannot be directly counted. The following exercise illustrates the role a hypergeometric distribution plays in such an estimate.

Suppose there is a species of unknown population size \(N\). Suppose fifty members of the species are selected and given an identifying mark. Some time later a sample of size twenty is taken from the population, and it is found that four of the twenty were previously marked.

  1. Let \(N\) be the size of the population in the wild. Write down the likelihood function for \(N\) given the above data.
  2. Plot the likelihood function for \(N\).
  3. Use the optimize() function in R to find the maximum likelihood estimate for \(N\).
  4. Can you compute the MLE for \(N\) using calculus?

The basic idea behind capture-recapture is that, since the sample showed \(\frac{4}{20} = 20\%\) marked members, this should also be a good estimate of the fraction of marked members in the species as a whole. For the whole species that fraction is \(\frac{50}{N}\), and setting \(\frac{50}{N} = \frac{4}{20}\) gives the population estimate \(N \approx 250\).
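
The proportion argument can be set beside the hypergeometric likelihood in a few lines of R. This is only a sketch: the variable names, the helper `lik` and the three candidate values of \(N\) are choices made here for illustration, and it does not replace the plot and the optimization asked for in the exercise.

```r
marked   <- 50   # members marked in the first capture
sampled  <- 20   # size of the second sample
recaught <- 4    # marked members found in the second sample

# Proportion estimate: solve 50 / N = 4 / 20 for N
marked * sampled / recaught

# Hypergeometric likelihood of a population size N (hypothetical helper):
# probability of drawing `recaught` marked members in a sample of size
# `sampled` from `marked` marked and N - marked unmarked members
lik <- function(N) dhyper(recaught, marked, N - marked, sampled)

# Likelihood at a few candidate population sizes
sapply(c(150, 250, 400), lik)
```

Among these three candidates the likelihood is largest at \(N = 250\), in line with the proportion estimate.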

Exercise 3.7 (Head Tail) A coin is flipped \(100\) times and \(55\) heads occur. Assume that the coin lands heads with probability \(p\).

  1. Write an R-function that will compute the value of the likelihood function for any value of \(p\).
  2. Plot the likelihood function for \(p \in (0,1)\).
  3. Use the optimize() function in R to find the maximum likelihood estimate for \(p\).
  4. Compute the MLE for \(p\) using calculus and see how close it is to the answer in the previous step.
  5. Repeat the above steps when the number of observed heads is \(30\), and when it is \(70\).

Exercise 3.8 (Binomial Distribution) Suppose we have \(n\) samples \(X_1, X_2, \dots, X_n\) from \(\Bin{N,p}\). We are told the value of \(M = \max{\left\{ X_1, X_2, \dots, X_n \right\}}\).

  1. Find the probability mass function of \(M\).
  2. Write an R-function that will compute the value of the likelihood function \(L(p | M, N, n)\) for any value of \(p\).
  3. Suppose \(M = 30, N = 50, n = 10\).
    1. Plot the likelihood function for \(p \in (0,1)\).
    2. Use the optimize() function in R to find the maximum likelihood estimate for \(p\).
    3. Can you compute the MLE for \(p\) using calculus?
  4. Repeat the previous step with \(M = 20\), and with \(M = 40\).

Exercise 3.9 (Diploid Genotype) Suppose that a particular gene occurs as one of two alleles (\(A\) and \(a\)), where allele \(A\) has frequency \(\theta\) in the population. That is, a random copy of the gene is \(A\) with probability \(\theta\) and \(a\) with probability \((1 - \theta)\). Since a diploid genotype consists of two genes, the probability of each genotype is given by:

\[\begin{array}{l|ccc} \textsf{Genotype} & AA & Aa & aa \\ \hline \textsf{Probability} & \theta^2 & 2\theta(1-\theta) & (1-\theta)^2 \end{array}\]
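
As a check that these three probabilities form a distribution, note that they are the terms of a binomial expansion: \[\theta^2 + 2\theta(1-\theta) + (1-\theta)^2 = \bigl(\theta + (1-\theta)\bigr)^2 = 1.\]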

Suppose we test a random sample of the population and find that \(k\) are \(AA\), \(l\) are \(Aa\), and \(m\) are \(aa\).

  1. Write an R-function that will compute the value of the likelihood function for any value of \(\theta\) given \(k, l, m\).
  2. Suppose \(k = 10, l = 30, m = 20\).
    1. Plot the likelihood function for \(\theta \in (0,1)\).
    2. Use the optimize() function in R to find the maximum likelihood estimate for \(\theta\).
    3. Compute the MLE for \(\theta\) using calculus and see how close it is to the answer in the previous step.
  3. Repeat the previous step with \(k = l = m = 30\).