About
The maximum likelihood method estimates parameters by maximizing the probability of observing the given data, assuming a specific probability distribution for the data-generating process.
Core Concept
The likelihood function is the joint probability density viewed as a function of the parameters $\theta$, with the data fixed:
$$L(\theta) = f(y_1, y_2, \dots, y_T; \theta)$$
The maximum likelihood estimator (MLE) is:
$$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} L(\theta)$$
Usually, we maximize the log-likelihood $\ell(\theta) = \ln L(\theta)$ for computational convenience: the logarithm turns the product into a sum and is maximized at the same $\theta$.
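As a minimal numerical sketch of this idea (the normal model, simulated data, seed, and starting values below are illustrative assumptions, not part of the method), we can recover the mean and standard deviation of a Gaussian sample by maximizing the log-likelihood with SciPy:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Simulated sample with illustrative true values mu = 2.0, sigma = 1.5.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=500)

def neg_log_likelihood(params, y):
    # Optimize over log(sigma) so that sigma stays positive.
    mu, log_sigma = params
    return -np.sum(norm.logpdf(y, loc=mu, scale=np.exp(log_sigma)))

# Minimizing the negative log-likelihood = maximizing the likelihood.
res = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]), args=(data,))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)  # should land near the true values 2.0 and 1.5
```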
Advantages
- Uses all information in the data (not just moments)
- Asymptotically efficient (attains the lowest possible asymptotic variance, the Cramér-Rao bound)
- Consistent and asymptotically normal
- Works well for large samples
Limitations
- Requires specification of full distributional form
- Computationally intensive for complex models
Example: AR(1) Model
Assume the AR(1) model $y_t = \delta + \phi y_{t-1} + \varepsilon_t$ with $|\phi| < 1$ and white noise $\varepsilon_t \overset{iid}{\sim} \mathcal{N}(0, \sigma^2)$.
Recall that for $X \sim \mathcal{N}(\mu, \sigma^2)$, the pdf is:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
Step 1: Joint Density of Errors
The AR(1) errors are white noise: $\varepsilon_t \overset{iid}{\sim} \mathcal{N}(0, \sigma^2)$ for $t = 2, \dots, T$. Since they are independent and identically distributed, the joint density equals the product of the individual densities.
Derivation:
Recall the pdf of a normal random variable $X \sim \mathcal{N}(\mu, \sigma^2)$:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
Since $\varepsilon_t$ has mean zero and variance $\sigma^2$:
$$f(\varepsilon_t) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\varepsilon_t^2}{2\sigma^2}\right)$$
The joint density is the product of the independent densities:
$$f(\varepsilon_2, \dots, \varepsilon_T) = \prod_{t=2}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\varepsilon_t^2}{2\sigma^2}\right) = (2\pi\sigma^2)^{-\frac{T-1}{2}} \exp\left(-\frac{1}{2\sigma^2} \sum_{t=2}^{T} \varepsilon_t^2\right)$$
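To make the factorization concrete, here is a quick numerical check (the value of $\sigma$ and the series length are assumptions of this sketch): the product of the individual densities matches the closed form above.

```python
import numpy as np
from scipy.stats import norm

# Draw white-noise errors eps_2, ..., eps_T (here T = 10, sigma = 0.8).
rng = np.random.default_rng(1)
sigma = 0.8
eps = rng.normal(0.0, sigma, size=9)

# Product of the individual N(0, sigma^2) densities...
product_form = np.prod(norm.pdf(eps, loc=0.0, scale=sigma))
# ...equals the closed-form joint density (exponent is -(T-1)/2).
closed_form = (2 * np.pi * sigma**2) ** (-len(eps) / 2) * np.exp(
    -np.sum(eps**2) / (2 * sigma**2)
)
print(np.isclose(product_form, closed_form))  # True
```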
Step 2: Marginal Density of $y_1$
For a stationary AR(1) process, the first observation follows the stationary (marginal) distribution. The AR(1) model is:
$$y_t = \delta + \phi y_{t-1} + \varepsilon_t, \qquad |\phi| < 1$$
The stationary distribution is derived by noting that in the long run the mean and variance no longer change:
$$E[y_t] = E[y_{t-1}] = \mu, \qquad \mathrm{Var}(y_t) = \mathrm{Var}(y_{t-1})$$
From the model equation:
$$\mu = \delta + \phi \mu \;\Rightarrow\; \mu = \frac{\delta}{1 - \phi}, \qquad \mathrm{Var}(y_t) = \phi^2 \,\mathrm{Var}(y_{t-1}) + \sigma^2 \;\Rightarrow\; \mathrm{Var}(y_t) = \frac{\sigma^2}{1 - \phi^2}$$
The stationary mean is $\mu = \frac{\delta}{1 - \phi}$ and the stationary variance is $\frac{\sigma^2}{1 - \phi^2}$. Thus:
$$y_1 \sim \mathcal{N}\left(\frac{\delta}{1 - \phi}, \frac{\sigma^2}{1 - \phi^2}\right)$$
Derivation:
Substituting into the normal pdf with mean $\frac{\delta}{1 - \phi}$ and variance $\frac{\sigma^2}{1 - \phi^2}$:
$$f(y_1) = \left(\frac{1 - \phi^2}{2\pi\sigma^2}\right)^{1/2} \exp\left(-\frac{(1 - \phi^2)\left(y_1 - \frac{\delta}{1 - \phi}\right)^2}{2\sigma^2}\right)$$
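A short simulation makes the stationary moments concrete (the parameter values and series length below are illustrative assumptions):

```python
import numpy as np

# Illustrative parameters; any |phi| < 1 gives a stationary process.
rng = np.random.default_rng(2)
delta, phi, sigma = 1.0, 0.6, 0.5
T = 200_000

y = np.empty(T)
y[0] = delta / (1 - phi)  # start at the stationary mean
for t in range(1, T):
    y[t] = delta + phi * y[t - 1] + rng.normal(0.0, sigma)

print(y.mean(), delta / (1 - phi))       # sample mean vs delta/(1-phi) = 2.5
print(y.var(), sigma**2 / (1 - phi**2))  # sample variance vs sigma^2/(1-phi^2) = 0.390625
```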
Step 3: Complete Likelihood
The complete likelihood is derived using the chain rule of probability (multiplication rule). This factorizes the joint density of all observations.
Derivation:
By the definition of conditional probability:
$$f(A, B) = f(A \mid B)\, f(B)$$
Applying this repeatedly to the joint density of the observations:
$$f(y_1, \dots, y_T) = f(y_1) \prod_{t=2}^{T} f(y_t \mid y_{t-1}, \dots, y_1)$$
Since $\varepsilon_t = y_t - \delta - \phi y_{t-1}$ for $t \geq 2$, all randomness in $y_t$ given $y_{t-1}$ comes from $\varepsilon_t$; observations before $y_{t-1}$ enter only through the deterministic model relationship, not through additional stochastic dependence. Therefore:
$$f(y_t \mid y_{t-1}, \dots, y_1) = f(y_t \mid y_{t-1}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_t - \delta - \phi y_{t-1})^2}{2\sigma^2}\right)$$
The complete likelihood is:
$$L(\delta, \phi, \sigma^2) = f(y_1) \prod_{t=2}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_t - \delta - \phi y_{t-1})^2}{2\sigma^2}\right)$$
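This factorization translates directly into code. The sketch below (the function name ar1_likelihood and its interface are my own choices, not from the text) evaluates the exact likelihood as $f(y_1)$ times the product of conditional densities; note that the raw product can underflow to 0.0 for long series, which is the computational motivation for moving to logs in Step 4.

```python
import numpy as np
from scipy.stats import norm

def ar1_likelihood(y, delta, phi, sigma2):
    # Marginal density of y_1 under the stationary distribution (Step 2).
    f_y1 = norm.pdf(y[0],
                    loc=delta / (1 - phi),
                    scale=np.sqrt(sigma2 / (1 - phi**2)))
    # Conditional densities f(y_t | y_{t-1}) for t = 2, ..., T (Step 3).
    cond = norm.pdf(y[1:], loc=delta + phi * y[:-1], scale=np.sqrt(sigma2))
    return f_y1 * np.prod(cond)
```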
Step 4: Log-Likelihood
The log-likelihood is obtained by combining Steps 1, 2, and 3, and taking the natural logarithm.
Derivation:
First, multiply the marginal density from Step 2 by the joint error density from Step 1 (with $\varepsilon_t = y_t - \delta - \phi y_{t-1}$ substituted in):
$$L(\delta, \phi, \sigma^2) = \left(\frac{1 - \phi^2}{2\pi\sigma^2}\right)^{1/2} \exp\left(-\frac{(1 - \phi^2)\left(y_1 - \frac{\delta}{1 - \phi}\right)^2}{2\sigma^2}\right) (2\pi\sigma^2)^{-\frac{T-1}{2}} \exp\left(-\frac{1}{2\sigma^2} \sum_{t=2}^{T} (y_t - \delta - \phi y_{t-1})^2\right)$$
Taking the natural logarithm:
$$\ell(\delta, \phi, \sigma^2) = -\frac{T}{2} \ln(2\pi) - \frac{T}{2} \ln \sigma^2 + \frac{1}{2} \ln(1 - \phi^2) - \frac{S(\delta, \phi)}{2\sigma^2}$$
where the unconditional sum of squares is:
$$S(\delta, \phi) = (1 - \phi^2)\left(y_1 - \frac{\delta}{1 - \phi}\right)^2 + \sum_{t=2}^{T} (y_t - \delta - \phi y_{t-1})^2$$
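A direct transcription of this closed form into code (the function names S and ar1_loglik are illustrative, not a standard API):

```python
import numpy as np

def S(y, delta, phi):
    # Unconditional sum of squares from Step 4.
    mu = delta / (1 - phi)
    resid = y[1:] - delta - phi * y[:-1]
    return (1 - phi**2) * (y[0] - mu) ** 2 + np.sum(resid**2)

def ar1_loglik(y, delta, phi, sigma2):
    T = len(y)
    return (-T / 2 * np.log(2 * np.pi)
            - T / 2 * np.log(sigma2)
            + 0.5 * np.log(1 - phi**2)
            - S(y, delta, phi) / (2 * sigma2))
```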
Step 5: Estimate $\sigma^2$
The estimate of $\sigma^2$ is obtained by maximizing the log-likelihood with respect to $\sigma^2$, treating $\delta$ and $\phi$ as already estimated.
Derivation:
From Step 4, the log-likelihood is:
$$\ell(\delta, \phi, \sigma^2) = -\frac{T}{2} \ln(2\pi) - \frac{T}{2} \ln \sigma^2 + \frac{1}{2} \ln(1 - \phi^2) - \frac{S(\delta, \phi)}{2\sigma^2}$$
Taking the partial derivative with respect to $\sigma^2$:
$$\frac{\partial \ell}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{S(\delta, \phi)}{2\sigma^4}$$
Setting this equal to zero for maximization:
$$-\frac{T}{2\sigma^2} + \frac{S(\delta, \phi)}{2\sigma^4} = 0 \;\Rightarrow\; \sigma^2 = \frac{S(\delta, \phi)}{T}$$
After obtaining $\hat{\delta}$ and $\hat{\phi}$ (by maximizing the concentrated likelihood), the estimator is:
$$\hat{\sigma}^2 = \frac{S(\hat{\delta}, \hat{\phi})}{T}$$
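Putting the five steps together, here is one possible end-to-end sketch (simulated data with illustrative true values; the optimizer, seed, and starting values are assumptions of this sketch). It maximizes the concentrated log-likelihood over $(\delta, \phi)$, obtained by substituting $\sigma^2 = S/T$ back into $\ell$ and dropping constants, and then recovers $\hat{\sigma}^2 = S/T$:

```python
import numpy as np
from scipy.optimize import minimize

# Simulate an AR(1) path with illustrative true values (not real data).
rng = np.random.default_rng(3)
delta0, phi0, sigma0 = 1.0, 0.6, 0.5
n = 2_000
y = np.empty(n)
y[0] = delta0 / (1 - phi0)
for t in range(1, n):
    y[t] = delta0 + phi0 * y[t - 1] + rng.normal(0.0, sigma0)

def S(y, delta, phi):
    # Unconditional sum of squares from Step 4.
    mu = delta / (1 - phi)
    return (1 - phi**2) * (y[0] - mu) ** 2 + np.sum((y[1:] - delta - phi * y[:-1]) ** 2)

def neg_concentrated_loglik(params, y):
    # Profile sigma^2 out as S/T; additive constants are dropped
    # because they do not move the argmax. |phi| < 1 enforced by a barrier.
    delta, phi = params
    if abs(phi) >= 1:
        return np.inf
    T = len(y)
    return T / 2 * np.log(S(y, delta, phi) / T) - 0.5 * np.log(1 - phi**2)

res = minimize(neg_concentrated_loglik, x0=np.array([0.0, 0.0]),
               args=(y,), method="Nelder-Mead")
delta_hat, phi_hat = res.x
sigma2_hat = S(y, delta_hat, phi_hat) / len(y)
print(delta_hat, phi_hat, np.sqrt(sigma2_hat))  # should be near 1.0, 0.6, 0.5
```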