# Breusch-Pagan test


The Breusch-Pagan test is a statistical test used to assess the presence of heteroscedasticity in a linear regression model. Heteroscedasticity refers to unequal variance of the error terms across observations. It does not bias the estimated regression coefficients, but it invalidates the usual standard errors, leading to unreliable hypothesis tests and confidence intervals.
To understand the Breusch-Pagan test, it is helpful to first review the assumptions of linear regression. One of these assumptions is that the error terms are homoscedastic, meaning that they have constant variance across different values of the independent variables. Under this assumption (and the other Gauss-Markov conditions), the ordinary least squares (OLS) estimates of the regression coefficients are unbiased and have the smallest variance among all linear unbiased estimators.
However, if the error terms are heteroscedastic, the OLS estimates are still unbiased but are no longer the most efficient, and the usual OLS standard errors are wrong, leading to inaccurate inference. To deal with this problem, we can use weighted least squares (WLS), which weights each observation by the inverse of its error variance, or we can keep the OLS estimates and compute heteroscedasticity-robust standard errors.
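As a minimal illustration of the WLS remedy (on simulated data; the variable names and the assumed variance structure are ours, purely for illustration), base R's `lm()` accepts a `weights` argument:

```r
set.seed(7)
n <- 100
x <- runif(n, 1, 10)
# construct errors whose standard deviation grows with x (heteroscedastic by design)
y <- 1 + 2 * x + rnorm(n, sd = x)

ols <- lm(y ~ x)                      # ordinary least squares
wls <- lm(y ~ x, weights = 1 / x^2)   # weight each point by 1/variance (var proportional to x^2)

coef(ols)   # compare the coefficient estimates
coef(wls)
```

Both estimators are unbiased here; the WLS estimate is simply more precise because it down-weights the noisy high-`x` observations.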
The Breusch-Pagan test formalizes this check. It is based on the null hypothesis that the error terms are homoscedastic, against the alternative that their variance depends on the independent variables. The test is a Lagrange multiplier test: the squared OLS residuals are regressed on the independent variables, and the test statistic is n times the R² of this auxiliary regression, where n is the number of observations. Under the null hypothesis, this statistic is asymptotically distributed as chi-squared with degrees of freedom equal to the number of independent variables in the auxiliary regression.
To conduct the Breusch-Pagan test, we first fit a linear regression model of the dependent variable on the independent variables. Then, we compute the residuals of the model and square them. Next, we fit an auxiliary regression with the squared residuals as the dependent variable and the same independent variables as regressors. Finally, we compute n times the R² of the auxiliary regression and compare it to the critical value of the chi-squared distribution with the appropriate degrees of freedom.
If the computed statistic is larger than the critical value (equivalently, if its p-value is below the chosen significance level), we reject the null hypothesis and conclude that the error terms are heteroscedastic. If it is smaller, we fail to reject the null hypothesis: the data provide no evidence of heteroscedasticity.
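Putting the steps above together, here is a minimal base-R sketch on simulated data (the variables `x` and `y` are illustrative, not from a real dataset):

```r
set.seed(42)
n <- 200
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = 0.5 * x)    # error variance grows with x

fit <- lm(y ~ x)                           # step 1: original regression
aux <- lm(I(resid(fit)^2) ~ x)             # steps 2-3: squared residuals on x

bp_stat  <- n * summary(aux)$r.squared     # LM statistic: n * R-squared
critical <- qchisq(0.95, df = 1)           # 5% critical value, 1 regressor
p_value  <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
bp_stat > critical                         # TRUE here: reject homoscedasticity
```

Because the simulated error variance grows strongly with `x`, the statistic comfortably exceeds the critical value and the test correctly detects the heteroscedasticity.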
Let’s illustrate the Breusch-Pagan test with an example. Suppose we have a dataset with the daily returns of a stock and the daily returns of a stock market index. We want to investigate whether there is a relationship between the stock returns and the market returns, and whether the error terms of that regression are homoscedastic. We can use the Breusch-Pagan test to assess this.
First, we fit a linear regression model with the stock returns as the dependent variable and the market returns as the independent variable. We can use the `lm()` function in R to fit the model and the `summary()` function to obtain the results:

```r
model <- lm(stock_returns ~ market_returns)
summary(model)
```
The output shows the estimated coefficients, their standard errors, t-statistics, and p-values, along with summary statistics of the residuals. We can plot the residuals against the independent variable to check for any patterns or outliers:

```r
plot(market_returns, model$residuals)
```

In our example, the spread of the residuals is not constant across values of the market returns, suggesting the presence of heteroscedasticity.
Next, we compute the squared residuals and fit an auxiliary regression with them as the dependent variable and the market returns as the independent variable, again using `lm()` and `summary()`:

```r
squared_residuals <- model$residuals^2
model2 <- lm(squared_residuals ~ market_returns)
summary(model2)
```
The output shows the estimated coefficient on the market returns in the auxiliary regression, along with its t-statistic and p-value. What matters for the Breusch-Pagan test is how much of the variation in the squared residuals this regression explains: a significant coefficient (equivalently, a non-negligible R²) indicates that the residual variance changes with the market returns, i.e. heteroscedasticity.
Finally, we compute the Breusch-Pagan statistic as n times the R² of the squared-residuals regression and compare it to the critical value of the chi-squared distribution with one degree of freedom (one independent variable). In base R, `summary(model2)$r.squared` gives the R² and `qchisq(0.95, df = 1)` gives the 5% critical value; in practice, the `bptest()` function from the lmtest package performs the entire test in one call.
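To make the final step concrete, here is the complete calculation in base R. The `stock_returns` and `market_returns` vectors below are simulated placeholders so the snippet runs stand-alone; with real data you would substitute your own series:

```r
set.seed(1)
n <- 250
market_returns <- rnorm(n, mean = 0, sd = 0.01)
# placeholder stock returns whose error variance rises with the market return
stock_returns <- 0.0002 + 1.2 * market_returns +
  rnorm(n, sd = 0.01 * pmax(2 + 100 * market_returns, 0.2))

model  <- lm(stock_returns ~ market_returns)
model2 <- lm(I(resid(model)^2) ~ market_returns)

bp_stat  <- n * summary(model2)$r.squared   # Breusch-Pagan LM statistic
critical <- qchisq(0.95, df = 1)            # 5% critical value, 1 regressor
p_value  <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
c(statistic = bp_stat, critical = critical, p.value = p_value)
```

If the lmtest package is installed, `lmtest::bptest(model)` reproduces the same statistic and p-value in one call.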