\[ \newcommand{\nRV}[2]{{#1}_1, {#1}_2, \ldots, {#1}_{#2}} \newcommand{\pnRV}[3]{{#1}_1^{#3}, {#1}_2^{#3}, \ldots, {#1}_{#2}^{#3}} \newcommand{\onRV}[2]{{#1}_{(1)} \le {#1}_{(2)} \le \ldots \le {#1}_{(#2)}} \newcommand{\RR}{\mathbb{R}} \newcommand{\Prob}[1]{\mathbb{P}\left({#1}\right)} \newcommand{\PP}{\mathcal{P}} \newcommand{\iidd}{\overset{\mathsf{iid}}{\sim}} \newcommand{\X}{\times} \newcommand{\EE}[1]{\mathbb{E}\left[{#1}\right]} \newcommand{\Var}[1]{\mathsf{Var}\left({#1}\right)} \newcommand{\Ber}[1]{\mathsf{Ber}\left({#1}\right)} \newcommand{\Geom}[1]{\mathsf{Geom}\left({#1}\right)} \newcommand{\Bin}[1]{\mathsf{Bin}\left({#1}\right)} \newcommand{\Poi}[1]{\mathsf{Pois}\left({#1}\right)} \newcommand{\Exp}[1]{\mathsf{Exp}\left({#1}\right)} \newcommand{\SD}[1]{\mathsf{SD}\left({#1}\right)} \newcommand{\sgn}[1]{\mathsf{sgn}} \newcommand{\dd}[1]{\operatorname{d}\!{#1}} \]
3.2 Method of moments
Let \(X_1, X_2, \dots, X_n\) be an \(\rm iid\) sample from \(X\), where \(X \sim f(\circ | {\bf p})\) with \({\bf p} = (p_1, \ldots, p_d) \in \RR^d, d \geq 1\). For \(k \geq 1\), let \(m_k : \RR^n \to \RR\) be given by \[ m_k({\bf x}) = \frac{1}{n} \sum_{i=1}^n x_i^k \]
Note that \(m_k({\bf X}) = \frac{1}{n} \sum_{i=1}^n X_i^k\) is referred to as the \(k\)th moment of the sample,
and \(\mu_k = \EE{X^k}\) is the \(k\)th moment of the distribution of \(X\).
For \(X \sim f(\circ | {\bf p})\), we may view \(\mu_k\) as a function of \({\bf p}\), i.e., \(\mu_k \equiv \mu_k(p_1, \ldots, p_d)\).
The Method of moments estimator for \({\bf p}\) is obtained by equating the first \(d\) moments of the sample to the corresponding moments of the distribution.
Specifically, \[\begin{equation} \tag{3.1} \mu_k({\bf p}) = m_k({\bf X}) \qquad \qquad k = 1,\ldots, d \end{equation}\]
This gives \(d\) equations in the \(d\) unknowns \(p_1, \ldots, p_d\). There is no guarantee that a solution is unique or that it can be written in closed form.
Denote the solution(s) of (3.1) by \(\hat{p}_1,\ldots,\hat{p}_d\); they are expressed in terms of the realised sample moments \(m_k({\bf X}), k = 1,\ldots, d\).
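As a minimal sketch of the sample-moment map \(m_k\) defined above, the following Python function computes \(\frac{1}{n}\sum_{i=1}^n x_i^k\) for a list of observations. The function name `sample_moment` and the toy data are illustrative assumptions and do not come from these notes.

```python
def sample_moment(xs, k):
    """Return the k-th sample moment m_k(x) = (1/n) * sum(x_i ** k)."""
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    return sum(x ** k for x in xs) / n


# Hypothetical toy sample, used only to illustrate the call.
toy = [1, 2, 3, 4]
m1 = sample_moment(toy, 1)  # first sample moment (the sample mean)
m2 = sample_moment(toy, 2)  # second sample moment
print(m1, m2)
```

The method of moments then sets \(\mu_k({\bf p})\) equal to these values for \(k = 1, \ldots, d\) and solves for \({\bf p}\).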
Example 3.3 (Binomial Distribution) Suppose we have an \(\rm iid\) sample \(X_1, X_2, \dots, X_{10}\) from \(X \sim \Bin{N,p}\) (both \(N\) and \(p\) unknown).
Assume that the realised values are
\[{\bf X} = (X_1, X_2, \dots, X_{10}) = (8,7,6,11,8,5,3,7,6,9)\]
From the sample, we compute directly
\[\begin{equation}
\tag{3.2}
m_1({\bf X}) = \frac{1}{10} \sum_{i=1}^{10} X_i = 7
\end{equation}\]
\[\begin{equation}
\tag{3.3}
m_2({\bf X}) = \frac{1}{10} \sum_{i=1}^{10} X_i^2 = 53.4
\end{equation}\]
Also, we know that,
\[\begin{align*}
&f(k|N,p) = \binom{N}{k}p^k(1-p)^{N-k}, k=0,1,\ldots, N \\
&\mu_1 = \EE{X} = Np \\
&\mu_2 = \EE{X^2} = \Var{X} + \left(\EE{X}\right)^2 = Np(1-p) + (Np)^2
\end{align*}\]
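Equating \(\mu_1 = m_1\) and \(\mu_2 = m_2\) (method of moments),
we obtain \(Np = 7\) and \(Np(1-p) + (Np)^2 = 53.4\).
Substituting \(Np = 7\) into the second equation gives \(7(1-p) + 49 = 53.4\), so \(1 - p = 4.4/7\) and \(\hat{p} = 2.6/7 \approx 0.371\). Then \(\hat{N} = 7/\hat{p} \approx 18.85\), which we round to \(\hat{N} \approx 19\) since \(N\) is an integer.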
So, for the sample \[{\bf X} = (8,7,6,11,8,5,3,7,6,9)\] the Method of moments estimates are \[ \hat{N} \approx 19 \textsf{ and } \hat{p} \approx 0.371 \]
More generally, suppose \(X \sim \Bin{N,p}\).
Then, sampling \(X_1, X_2, \dots, X_n\) \(\rm iid\) from \(X\) and equating \(\mu_1 = Np = m_1\) and \(\mu_2 = Np(1-p) + (Np)^2 = m_2\), we get
\[\begin{equation}
\tag{3.4}
\hat{N} = \frac{m_1^2}{m_1 - (m_2-m_1^2)}
\end{equation}\]
and,
\[\begin{equation}
\tag{3.5}
\hat{p} = \frac{m_1 - (m_2-m_1^2)}{m_1}
\end{equation}\]
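Formulas (3.4) and (3.5) can be checked numerically against the sample from Example 3.3. The sketch below is one possible implementation; the function name `binomial_mom` is a hypothetical choice, and the guard on the denominator is an added assumption, since (3.4) yields no valid estimate when \(m_1 - (m_2 - m_1^2) \le 0\).

```python
def binomial_mom(xs):
    """Method of moments estimates (N_hat, p_hat) for a Bin(N, p) sample.

    Implements (3.4) and (3.5): with v = m_2 - m_1^2,
    N_hat = m_1^2 / (m_1 - v) and p_hat = (m_1 - v) / m_1.
    """
    n = len(xs)
    m1 = sum(xs) / n                  # first sample moment
    m2 = sum(x * x for x in xs) / n   # second sample moment
    denom = m1 - (m2 - m1 ** 2)       # shared term m_1 - (m_2 - m_1^2)
    if denom <= 0:
        # Assumption of this sketch: reject samples for which the
        # formulas would give a non-positive or undefined estimate.
        raise ValueError("moment equations have no valid solution")
    return m1 ** 2 / denom, denom / m1


sample = [8, 7, 6, 11, 8, 5, 3, 7, 6, 9]  # data from Example 3.3
N_hat, p_hat = binomial_mom(sample)
print(round(N_hat, 3), round(p_hat, 3))  # prints: 18.846 0.371
```

Rounding \(\hat{N}\) to the nearest integer recovers \(\hat{N} \approx 19\) from the example.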
Example 3.4 (Normal Distribution) Suppose \(X \sim N(\mu, \sigma^2)\), and let \(X_1, X_2, \dots, X_n\) be an \(\rm iid\) sample from \(X\).
From the sample we get the sample moments \(m_1({\bf X})\) and \(m_2({\bf X})\).
We know,
\[\begin{align*}
\tag{3.6}
\mu_1 &= \EE{X} = \mu \\
\mu_2 &= \EE{X^2} = \mu^2 + \sigma^2
\end{align*}\]
Equating,
\[\begin{align*}
\tag{3.7}
\mu &= m_1 \\
\mu^2 + \sigma^2 &= m_2
\end{align*}\]
and solving, we obtain,
\[\begin{align*}
\tag{3.8}
\hat{\mu} &= m_1 = \overline{X} \\
\hat{\sigma} &= \sqrt{m_2 - m_1^2} \equiv \sqrt{\frac{n-1}{n}}S
\end{align*}\]
\(\therefore\) The sample mean \(\overline{X}\) and \(\sqrt{\frac{n-1}{n}}\,S\) (the square root of the sample variance \(S^2\) scaled by the factor \(\frac{n-1}{n}\)) are the Method of moments estimates for \(\mu\) and \(\sigma\), respectively, from the sample \(X_1, X_2, \dots, X_n\).
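The estimates in (3.8) can be sketched in Python as follows. The function name `normal_mom` and the data are hypothetical, chosen only for illustration. The comparison uses `statistics.stdev` from the standard library, which computes \(S\) with the \(n-1\) divisor, to check the identity \(\sqrt{m_2 - m_1^2} = \sqrt{\frac{n-1}{n}}\,S\).

```python
import math
import statistics


def normal_mom(xs):
    """Method of moments estimates (mu_hat, sigma_hat) for a N(mu, sigma^2) sample."""
    n = len(xs)
    m1 = sum(xs) / n                  # first sample moment
    m2 = sum(x * x for x in xs) / n   # second sample moment
    # Clamp at zero to guard against tiny negative values from rounding.
    sigma_hat = math.sqrt(max(m2 - m1 ** 2, 0.0))
    return m1, sigma_hat


# Hypothetical data, not taken from the notes.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma_hat = normal_mom(data)

# Scaled sample standard deviation: sqrt((n - 1) / n) * S.
n = len(data)
scaled_S = math.sqrt((n - 1) / n) * statistics.stdev(data)
print(mu_hat, sigma_hat, scaled_S)
```

For this data the two expressions for \(\hat{\sigma}\) agree up to floating-point rounding.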