Akaike Information Criterion

  • AIC balances model fit and complexity by combining RSS with a penalty based on parameter count.
  • Lower AIC indicates a preferred model when comparing alternatives.
  • Commonly used for model selection (choose the model with the smallest AIC).

Akaike information criterion, also known as AIC, is a statistical measure used to evaluate the quality of a model by weighing its goodness of fit against the number of parameters it uses. In the simplified form used throughout this article, AIC is computed by adding the residual sum of squares (RSS) to twice the number of parameters. (The standard definition is AIC = 2k − 2 ln L̂, where k is the parameter count and L̂ the maximized likelihood; the RSS-based form follows the same fit-versus-complexity logic.) AIC is often used in model selection, where the goal is to choose the model with the smallest AIC value.

AIC quantifies a trade-off between how well a model fits the observed data (measured here by RSS) and the model complexity (measured by the number of parameters). For two competing models, compute each model’s RSS, add twice its parameter count, and select the model with the smaller AIC value. This procedure favors models that achieve good fit with fewer parameters.
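This comparison procedure can be sketched in a few lines of Python, using the article's simplified AIC (RSS plus twice the parameter count). The RSS values and model names below are hypothetical, for illustration only:

```python
def aic(rss, n_params):
    """Simplified AIC used in this article: RSS plus twice the parameter count."""
    return rss + 2 * n_params

# Hypothetical competing models: name -> (RSS, number of parameters).
models = {"x1 only": (12.4, 2), "x1 and x2": (11.9, 3)}

# Score each model and keep the one with the smallest AIC.
scores = {name: aic(rss, k) for name, (rss, k) in models.items()}
best = min(scores, key=scores.get)
```

Note that the model with two predictors has the smaller RSS here, yet the single-predictor model wins: the 0.5 drop in RSS does not pay for the extra penalty of 2.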

Suppose a dataset has two predictor variables, x1 and x2, and one response variable, y. Fit two linear regression models:

  • Model with only x1: y = \beta_0 + \beta_1 x_1

  • Model with x1 and x2: y = \beta_0 + \beta_1 x_1 + \beta_2 x_2

Compute the residual sum of squares (RSS) for each model by summing the squared differences between observed and predicted y:

\text{RSS}_1 = \sum (y - \hat{y}_1)^2

\text{RSS}_2 = \sum (y - \hat{y}_2)^2

where \hat{y}_1 and \hat{y}_2 are the fitted values from the first and second model, respectively.
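The RSS computation above is a one-liner; a minimal helper, with the observed and predicted values passed as array-likes:

```python
import numpy as np

def rss(y, y_hat):
    """Residual sum of squares: sum of squared differences
    between observed and predicted y."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sum((y - y_hat) ** 2))
```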

Then compute AIC for each model by adding RSS and twice the number of parameters in the model:

\text{AIC}_1 = \text{RSS}_1 + 2(2)

where the 2 in the parentheses is the number of parameters in the model (i.e., \beta_0 and \beta_1).

\text{AIC}_2 = \text{RSS}_2 + 2(3)

where the 3 in the parentheses is the number of parameters in the model (i.e., \beta_0, \beta_1, and \beta_2).

Choose the model with the smaller AIC value. In this example, the model with only x1 is chosen whenever adding x2 reduces RSS by less than 2, the increase in the penalty term; if the RSS reduction exceeds 2, the larger model is preferred.
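Putting the whole example together: the sketch below fits both regressions by least squares on synthetic data (where y truly depends only on x1) and compares them with the article's simplified AIC. The data-generating choices are assumptions for illustration:

```python
import numpy as np

# Synthetic data: y depends only on x1, so the extra x2 term
# should not usually pay for its penalty. (Illustrative assumption.)
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)

def fit_rss(X, y):
    """Least-squares fit; returns the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

X1 = np.column_stack([np.ones(n), x1])       # intercept + x1: 2 parameters
X2 = np.column_stack([np.ones(n), x1, x2])   # intercept + x1 + x2: 3 parameters

aic1 = fit_rss(X1, y) + 2 * 2   # simplified AIC: RSS + 2 * (parameter count)
aic2 = fit_rss(X2, y) + 2 * 3
best = "x1 only" if aic1 < aic2 else "x1 and x2"
```

Because the models are nested, RSS for the larger model is never greater than for the smaller one; AIC's penalty is what lets the simpler model win.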

  • Model selection: choose among competing models by selecting the one with the smallest AIC.
  • Residual sum of squares (RSS): the sum of squared differences between observed and predicted values, used here as the goodness-of-fit term in AIC.