Breusch-Pagan test:
The Breusch-Pagan test is a statistical test used to assess the presence of heteroscedasticity in a linear regression model. Heteroscedasticity refers to unequal variance of the error terms across observations; it does not bias the estimated regression coefficients themselves, but it does make the usual standard errors, and therefore the inference based on them, unreliable.
To understand the Breusch-Pagan test, it is helpful to first review the assumptions of linear regression. One of these assumptions is that the error terms are homoscedastic, meaning that they have constant variance across different values of the independent variables. Under this assumption (together with the other Gauss-Markov conditions), the ordinary least squares (OLS) estimates of the regression coefficients are unbiased and have the smallest variance among all linear unbiased estimators.
However, if the error terms are heteroscedastic, the OLS estimates are still unbiased but they are no longer the best (minimum-variance) linear unbiased estimators. More importantly, the variance of the error terms now changes with the values of the independent variables, so the usual OLS standard errors are wrong and inference based on them is inaccurate. To deal with this problem, we can use weighted least squares (WLS), which weights the observations according to their error variances, or we can keep the OLS estimates and use heteroscedasticity-robust standard errors.
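For example, robust (White-type) standard errors take only a couple of lines in R. The following is a minimal sketch, assuming the sandwich and lmtest packages are installed and that fit is a hypothetical, already-estimated lm object:
# Minimal sketch: robust (HC1) standard errors for an existing lm fit.
# Assumes the sandwich and lmtest packages are installed; "fit" is a hypothetical lm object.
library(sandwich)
library(lmtest)
coeftest(fit, vcov = vcovHC(fit, type = "HC1"))   # same coefficients, corrected standard errors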
The Breusch-Pagan test is based on the null hypothesis that the error terms are homoscedastic and the alternative hypothesis that they are heteroscedastic. The test statistic, in its commonly used studentized (Koenker) form, is the number of observations times the R-squared of an auxiliary regression in which the squared OLS residuals are regressed on the independent variables. Under the null hypothesis, this statistic is asymptotically distributed as chi-squared with degrees of freedom equal to the number of independent variables in the model.
To conduct the Breusch-Pagan test, we first fit the linear regression model of interest with the dependent variable and the independent variables. Then, we compute the residuals of the model and square them. Next, we fit an auxiliary regression with the squared residuals as the dependent variable and the same independent variables as regressors. Finally, we compute the test statistic as the number of observations times the R-squared of this auxiliary regression and compare it to the critical value of the chi-squared distribution with the appropriate degrees of freedom.
If the computed statistic is larger than the critical value, we reject the null hypothesis and conclude that the error terms are heteroscedastic. If it is smaller, we fail to reject the null hypothesis, and the data provide no evidence against homoscedasticity.
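As a sketch, the procedure just described can be wrapped in a small helper function. The implementation below is hypothetical (it is not taken from any package) and computes the studentized version of the statistic:
# Hypothetical helper implementing the studentized Breusch-Pagan test described above.
bp_test_manual <- function(fit) {
  aux_data <- as.data.frame(model.matrix(fit)[, -1, drop = FALSE])  # regressors without the intercept column
  aux_data$u2 <- residuals(fit)^2                                   # squared OLS residuals
  aux <- lm(u2 ~ ., data = aux_data)                                # auxiliary regression
  stat <- nobs(fit) * summary(aux)$r.squared                        # n times R-squared
  df <- ncol(model.matrix(fit)) - 1                                 # number of independent variables
  p_value <- pchisq(stat, df, lower.tail = FALSE)
  list(statistic = stat, df = df, p.value = p_value)
}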
Let’s illustrate the Breusch-Pagan test with an example. Suppose we have a dataset with the daily closing prices of a stock and the daily returns of the stock market index. We want to investigate whether there is a relationship between the stock returns and the market returns, and whether the error terms are homoscedastic. We can use the Breusch-Pagan test to assess this.
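No actual dataset accompanies this example, so the code below generates a purely illustrative one: simulated market and stock returns in which the error variance grows with the market return, so heteroscedasticity is present by construction.
# Purely illustrative simulated data (no real dataset is provided):
# the error standard deviation grows with the market return, so heteroscedasticity is built in.
set.seed(123)
n <- 250
market_returns <- rnorm(n, mean = 0, sd = 0.01)
stock_returns <- 0.0005 + 1.2 * market_returns +
  rnorm(n, mean = 0, sd = 0.002 * exp(80 * market_returns))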
First, we fit a linear regression model with the stock returns as the dependent variable and the market returns as the independent variable. We can use the lm() function in R to fit the model and the summary() function to obtain the results.
model <- lm(stock_returns ~ market_returns)
summary(model)
The output shows the estimated coefficients, their standard errors, t-statistics, and p-values. We can plot the residuals against the market returns to check for any pattern in their spread.
plot(market_returns, model$residuals)
The plot shows that the spread of the residuals is not even across different values of the independent variable, suggesting the presence of heteroscedasticity.
Next, we compute the squared residuals and fit an auxiliary regression with the squared residuals as the dependent variable and the market returns as the independent variable. We can use the lm() function again to fit the model and the summary() function to obtain the results.
squared_residuals <- model$residuals^2
model2 <- lm(squared_residuals ~ market_returns)
summary(model2)
Here the quantities of interest are the R-squared of this auxiliary regression and the significance of the market_returns coefficient: if the market returns explain a substantial share of the variation in the squared residuals, that is evidence of heteroscedasticity. We can also plot the squared residuals against the market returns to see the relationship directly.
plot(market_returns, squared_residuals)
The plot shows that the squared residuals tend to increase with the market returns, consistent with heteroscedastic errors.
Finally, we compute the Breusch-Pagan statistic as the number of observations times the R-squared of the auxiliary regression and compare it to the critical value of the chi-squared distribution with degrees of freedom equal to the number of independent variables (here, one). We can read the R-squared from summary(), use qchisq() to obtain the critical value, and pchisq() to obtain the p-value.
n_obs <- length(model$residuals)                 # number of observations
r2_aux <- summary(model2)$r.squared              # R-squared of the auxiliary regression
bp_stat <- n_obs * r2_aux                        # Breusch-Pagan (studentized) statistic
df <- length(coef(model)) - 1                    # number of independent variables
critical_value <- qchisq(0.95, df)               # 5% critical value
p_value <- pchisq(bp_stat, df, lower.tail = FALSE)
bp_stat; critical_value; p_value
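As a sanity check, the same test is available ready-made: assuming the lmtest package is installed, bptest() computes the studentized Breusch-Pagan statistic in one call and should agree closely with the manual calculation above.
# Cross-check with the packaged implementation (assumes the lmtest package is installed).
library(lmtest)
bptest(model)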
The output shows the value of the test statistic, the critical value, and the p-value. If the statistic exceeds the critical value (equivalently, if the p-value is below the chosen significance level), we reject the null hypothesis and conclude that the error terms are heteroscedastic. Otherwise, we fail to reject the null hypothesis of homoscedasticity.
In this example, the computed statistic is larger than the critical value, indicating the presence of heteroscedasticity in the original regression model. Therefore, we should use weighted least squares or heteroscedasticity-robust standard errors to obtain valid inference.
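One way to act on this conclusion is feasible weighted least squares: model the error variance, then reweight each observation by the inverse of its estimated variance. The sketch below assumes the variance is roughly an exponential function of the market return, which is just one of many possible modelling choices.
# Sketch of feasible WLS, assuming log(variance) is roughly linear in the market return.
log_var_fit <- lm(log(squared_residuals) ~ market_returns)   # model the (log) error variance
est_var <- exp(fitted(log_var_fit))                          # estimated variance per observation
wls_model <- lm(stock_returns ~ market_returns, weights = 1 / est_var)
summary(wls_model)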
In summary, the Breusch-Pagan test is a statistical test used to assess the presence of heteroscedasticity in a linear regression model. It is based on the null hypothesis that the error terms are homoscedastic and the alternative hypothesis that they are heteroscedastic. The test statistic is the number of observations times the R-squared of an auxiliary regression in which the squared residuals are regressed on the independent variables, and under the null hypothesis it follows a chi-squared distribution with degrees of freedom equal to the number of independent variables in the model. If the computed statistic is larger than the critical value, we reject the null hypothesis and conclude that the error terms are heteroscedastic; if it is smaller, we fail to reject the null hypothesis of homoscedasticity.