About

The maximum likelihood method estimates parameters by maximizing the probability of observing the given data, under an assumed probability distribution for the data-generating process.

Core Concept

The likelihood function $L(\theta)$ is the joint probability density of the data, $f(y_1, \dots, y_T; \theta)$, viewed as a function of the parameters $\theta$ with the data held fixed.

The maximum likelihood estimator (MLE) is:

$$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \, L(\theta \mid y_1, \dots, y_T)$$

In practice, one usually maximizes the log-likelihood $\ell(\theta) = \ln L(\theta)$ instead: it has the same maximizer, and sums are easier to differentiate than products.
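
As a concrete illustration, here is a minimal sketch (not part of the original text) of numerically maximizing a Gaussian log-likelihood on simulated data; the function and variable names are illustrative choices, and the maximization is done by minimizing the negative log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=500)    # simulated i.i.d. sample

def neg_log_likelihood(params, y):
    # Work with log(sigma) so the optimizer stays in the valid region sigma > 0.
    mu, log_sigma = params
    return -np.sum(norm.logpdf(y, loc=mu, scale=np.exp(log_sigma)))

res = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(data,))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)    # close to the true values 2.0 and 1.5
```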

Advantages

  • Uses all information in the data (not just moments)
  • Asymptotically efficient (attains the Cramér–Rao lower bound on asymptotic variance)
  • Consistent and asymptotically normal
  • Works well for large samples

Limitations

  • Requires specification of full distributional form
  • Computationally intensive for complex models

Example: AR(1) Model

Assume the AR(1) model $y_t = c + \phi y_{t-1} + \varepsilon_t$ with $|\phi| < 1$ and Gaussian white noise $\varepsilon_t \sim N(0, \sigma^2)$.

Recall that for $X \sim N(\mu, \sigma^2)$, the pdf is:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$$

Step 1: Joint Density of Errors

The AR(1) errors are white noise: $\varepsilon_t \sim N(0, \sigma^2)$ for $t = 2, \dots, T$. Since they are independent and identically distributed, the joint density equals the product of the individual densities.

Derivation:

Recall the pdf of a normal random variable given above. Since $\varepsilon_t$ has mean zero and variance $\sigma^2$:

$$f(\varepsilon_t) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{\varepsilon_t^2}{2\sigma^2} \right)$$

The joint density is the product of the independent densities:

$$f(\varepsilon_2, \dots, \varepsilon_T) = \prod_{t=2}^{T} f(\varepsilon_t) = (2\pi\sigma^2)^{-(T-1)/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{t=2}^{T} \varepsilon_t^2 \right)$$
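
As a quick sanity check, the sketch below (with illustrative values for $\sigma$ and the errors) verifies numerically that the sum of individual normal log-densities matches the closed-form joint log-density above:

```python
import numpy as np
from scipy.stats import norm

sigma = 1.3
eps = np.array([0.5, -1.2, 0.3, 2.0])     # stand-ins for eps_2, ..., eps_T
m = len(eps)                              # m = T - 1 error terms

# Sum of individual log-densities vs. the closed-form joint log-density.
log_joint_product = np.sum(norm.logpdf(eps, loc=0.0, scale=sigma))
log_joint_closed = (-(m / 2) * np.log(2 * np.pi * sigma**2)
                    - np.sum(eps**2) / (2 * sigma**2))
assert np.isclose(log_joint_product, log_joint_closed)
```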

Step 2: Marginal Density of $y_1$

For a stationary AR(1) process, the first observation follows the stationary (marginal) distribution. The AR(1) model is:

$$y_t = c + \phi y_{t-1} + \varepsilon_t, \qquad |\phi| < 1$$

The stationary distribution is derived by noting that in the long run the moments do not change over time:

$$E[y_t] = E[y_{t-1}] = \mu, \qquad \operatorname{Var}(y_t) = \operatorname{Var}(y_{t-1}) = \gamma_0$$

From the model equation, taking expectations and variances (using the independence of $\varepsilon_t$ from $y_{t-1}$):

$$\mu = c + \phi \mu, \qquad \gamma_0 = \phi^2 \gamma_0 + \sigma^2$$

The stationary mean is $\mu = \frac{c}{1-\phi}$ and the stationary variance is $\gamma_0 = \frac{\sigma^2}{1-\phi^2}$. Thus:

$$y_1 \sim N\left( \frac{c}{1-\phi}, \ \frac{\sigma^2}{1-\phi^2} \right)$$

Derivation:

Substituting into the normal pdf with mean $\frac{c}{1-\phi}$ and variance $\frac{\sigma^2}{1-\phi^2}$:

$$f(y_1) = \sqrt{\frac{1-\phi^2}{2\pi\sigma^2}} \exp\left( -\frac{(1-\phi^2)}{2\sigma^2} \left( y_1 - \frac{c}{1-\phi} \right)^2 \right)$$
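
The stationary moments can also be checked by simulation. Below is a minimal sketch with illustrative parameter values; the long-run sample mean and variance should approach $c/(1-\phi)$ and $\sigma^2/(1-\phi^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)
c, phi, sigma = 0.5, 0.8, 1.0     # illustrative parameters (|phi| < 1)
n = 200_000
y = np.empty(n)
y[0] = c / (1 - phi)              # start at the stationary mean
for t in range(1, n):
    y[t] = c + phi * y[t - 1] + rng.normal(scale=sigma)

print(y.mean(), c / (1 - phi))            # both approximately 2.5
print(y.var(), sigma**2 / (1 - phi**2))   # both approximately 2.78
```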

Step 3: Complete Likelihood

The complete likelihood is derived using the chain rule of probability (multiplication rule). This factorizes the joint density of all observations.

Derivation:

By the definition of conditional probability:

$$f(A, B) = f(A \mid B)\, f(B)$$

Applying this recursively to the joint density of the $T$ observations:

$$f(y_1, \dots, y_T) = f(y_1) \prod_{t=2}^{T} f(y_t \mid y_{t-1}, \dots, y_1)$$

Since $y_t = c + \phi y_{t-1} + \varepsilon_t$ for $t \geq 2$ and the errors are independent across time, the distribution of $y_t$ depends on the past only through $y_{t-1}$ (the Markov property). Therefore:

$$f(y_t \mid y_{t-1}, \dots, y_1) = f(y_t \mid y_{t-1}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_t - c - \phi y_{t-1})^2}{2\sigma^2} \right)$$

The complete likelihood is:

$$L(c, \phi, \sigma^2) = f(y_1) \prod_{t=2}^{T} f(y_t \mid y_{t-1})$$
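
For instance, with $T = 3$ the factorization reads $f(y_1, y_2, y_3) = f(y_1)\, f(y_2 \mid y_1)\, f(y_3 \mid y_2)$.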

Step 4: Log-Likelihood

The log-likelihood is obtained by combining Steps 1, 2, and 3, and taking the natural logarithm.

Derivation:

First, multiply the marginal density of $y_1$ from Step 2 by the joint error density from Step 1, substituting $\varepsilon_t = y_t - c - \phi y_{t-1}$:

$$L(c, \phi, \sigma^2) = \sqrt{\frac{1-\phi^2}{2\pi\sigma^2}} \exp\left( -\frac{(1-\phi^2)}{2\sigma^2} \left( y_1 - \frac{c}{1-\phi} \right)^2 \right) \cdot (2\pi\sigma^2)^{-(T-1)/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{t=2}^{T} (y_t - c - \phi y_{t-1})^2 \right)$$

Taking the natural logarithm:

$$\ell(c, \phi, \sigma^2) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\sigma^2 + \frac{1}{2}\ln(1-\phi^2) - \frac{S(c, \phi)}{2\sigma^2}$$

where the unconditional sum of squares is:

$$S(c, \phi) = (1-\phi^2)\left( y_1 - \frac{c}{1-\phi} \right)^2 + \sum_{t=2}^{T} (y_t - c - \phi y_{t-1})^2$$
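
Translating Step 4 directly into code, the sketch below (with names of my own choosing) evaluates the exact log-likelihood; it assumes `y` is a 1-D NumPy array of observations, $|\phi| < 1$, and $\sigma^2 > 0$:

```python
import numpy as np

def ar1_exact_loglik(params, y):
    """Exact (unconditional) AR(1) log-likelihood; params = (c, phi, sigma2)."""
    c, phi, sigma2 = params
    T = len(y)
    mu = c / (1 - phi)                                  # stationary mean
    S = ((1 - phi**2) * (y[0] - mu)**2                  # unconditional sum of squares
         + np.sum((y[1:] - c - phi * y[:-1])**2))
    return (-(T / 2) * np.log(2 * np.pi)
            - (T / 2) * np.log(sigma2)
            + 0.5 * np.log(1 - phi**2)
            - S / (2 * sigma2))
```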

Step 5: Estimate $\sigma^2$

The estimate of $\sigma^2$ is obtained by maximizing the log-likelihood with respect to $\sigma^2$, treating $c$ and $\phi$ as already estimated.

Derivation:

From Step 4, the log-likelihood is:

$$\ell(c, \phi, \sigma^2) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\sigma^2 + \frac{1}{2}\ln(1-\phi^2) - \frac{S(c, \phi)}{2\sigma^2}$$

Taking the partial derivative with respect to $\sigma^2$:

$$\frac{\partial \ell}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{S(c, \phi)}{2\sigma^4}$$

Setting this equal to zero and solving:

$$-\frac{T}{2\sigma^2} + \frac{S(c, \phi)}{2\sigma^4} = 0 \quad \Longrightarrow \quad \sigma^2 = \frac{S(c, \phi)}{T}$$

After obtaining $\hat{c}$ and $\hat{\phi}$ (by maximizing the concentrated likelihood), the estimator is:

$$\hat{\sigma}^2 = \frac{S(\hat{c}, \hat{\phi})}{T}$$
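
To close the loop, here is a minimal sketch (simulated data; names, starting values, and bounds are illustrative) that maximizes the exact log-likelihood numerically and confirms that the fitted $\hat{\sigma}^2$ equals $S(\hat{c}, \hat{\phi})/T$ at the optimum:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, y):
    # Negative of the exact AR(1) log-likelihood from Step 4.
    c, phi, sigma2 = params
    T = len(y)
    mu = c / (1 - phi)
    S = ((1 - phi**2) * (y[0] - mu)**2
         + np.sum((y[1:] - c - phi * y[:-1])**2))
    return ((T / 2) * np.log(2 * np.pi) + (T / 2) * np.log(sigma2)
            - 0.5 * np.log(1 - phi**2) + S / (2 * sigma2))

# Simulate an AR(1) path (illustrative true parameters).
rng = np.random.default_rng(2)
c0, phi0, sigma0 = 0.5, 0.8, 1.0
T = 2_000
y = np.empty(T)
y[0] = c0 / (1 - phi0)
for t in range(1, T):
    y[t] = c0 + phi0 * y[t - 1] + rng.normal(scale=sigma0)

# Maximize the log-likelihood (minimize its negative) over (c, phi, sigma2).
res = minimize(neg_loglik, x0=[0.0, 0.5, 1.0], args=(y,),
               bounds=[(None, None), (-0.99, 0.99), (1e-6, None)])
c_hat, phi_hat, sigma2_hat = res.x

# At the optimum, sigma2_hat should equal S(c_hat, phi_hat) / T.
mu_hat = c_hat / (1 - phi_hat)
S_hat = ((1 - phi_hat**2) * (y[0] - mu_hat)**2
         + np.sum((y[1:] - c_hat - phi_hat * y[:-1])**2))
print(sigma2_hat, S_hat / T)   # the two should agree closely
```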