High Breakdown Methods:
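In robust statistics, high breakdown methods are estimators with a high breakdown point. The breakdown point is the largest fraction of the data that can be replaced by arbitrary values (outliers) before the estimate can be made arbitrarily wrong. For example, a single extreme value can move the mean as far as we like, so its breakdown point tends to 0 as the sample grows, while the median tolerates contamination of just under half the data, giving it a breakdown point of 50%. Methods with a high breakdown point are therefore resistant to outliers, which are data points that differ markedly from the majority of the data.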
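The contrast between a low and a high breakdown point can be seen in a small sketch. The data values below are made up purely for illustration; the only calls used are `mean` and `median` from Python's standard `statistics` module.

```python
import statistics

# Hypothetical measurements clustered around 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2]

# Replace one observation with a gross outlier.
contaminated = clean[:-1] + [1000.0]

mean_clean = statistics.mean(clean)
mean_bad = statistics.mean(contaminated)      # dragged far away from 10
median_clean = statistics.median(clean)
median_bad = statistics.median(contaminated)  # stays at the middle value

print("mean:  ", mean_clean, "->", mean_bad)
print("median:", median_clean, "->", median_bad)
```

One corrupted value out of five is enough to ruin the mean, while the median does not move at all in this example.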
Two examples of high breakdown methods are the median absolute deviation (MAD) and the Tukey biweight.
The median absolute deviation (MAD) is a measure of the dispersion of a dataset, playing a role similar to the standard deviation. The standard deviation is built from the mean and from squared deviations, both of which are strongly affected by outliers. The MAD instead uses medians at both steps, so extreme values have little effect on it.
To calculate the MAD, we first find the median of the data. Then, for each data point, we calculate the absolute difference between the data point and the median. Finally, we take the median of these absolute differences. In symbols, MAD = median(|x_i - median(x)|).
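The steps above can be sketched in a few lines of Python. The function name `mad` and its `scale` parameter are illustrative choices, not part of any particular library. Passing `scale=1.4826` applies the usual consistency factor that makes the MAD comparable to the standard deviation for normally distributed data.

```python
import statistics

def mad(data, scale=1.0):
    """Median absolute deviation of a sequence of numbers.

    scale is an optional multiplier (an assumption of this sketch);
    1.4826 is the usual choice for comparability with the standard
    deviation under a normal distribution.
    """
    med = statistics.median(data)                    # step 1: median of the data
    abs_dev = [abs(x - med) for x in data]           # step 2: absolute deviations
    return scale * statistics.median(abs_dev)        # step 3: median of deviations

# Hypothetical sample with one gross outlier.
sample = [2.0, 3.0, 4.0, 5.0, 100.0]
print("MAD:  ", mad(sample))
print("stdev:", statistics.stdev(sample))
```

In this sample the outlier at 100 inflates the standard deviation to several dozen, while the MAD still reflects the spread of the four typical values.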
The MAD has a breakdown point of 50%, the highest possible for a scale estimate of this kind: up to just under half of the observations can be arbitrarily bad before the MAD becomes meaningless. When it is used as a substitute for the standard deviation under roughly normal data, it is commonly multiplied by the constant 1.4826 so that the two are on the same scale.
Another example is the Tukey biweight (also called the bisquare). It is used to estimate the location and scale of a dataset, in the roles usually played by the mean and standard deviation, but unlike those estimates it is designed so that outliers have little or no influence on the result.
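The Tukey biweight uses a weighting function that gives smaller weights to data points that are further from the center of the data, and zero weight to points beyond a cutoff. Writing u for the distance of a point from the center, divided by the cutoff distance, the weight is w(u) = (1 - u^2)^2 when |u| < 1 and 0 otherwise. Moderate outliers therefore have a reduced influence on the estimate, and gross outliers have none.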
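As a minimal sketch, the biweight weight function is commonly written as w(u) = (1 - u^2)^2 for |u| < 1 and 0 otherwise, where u is the distance from the center expressed as a fraction of the cutoff. The function name `biweight_weight` is an illustrative choice.

```python
def biweight_weight(u):
    """Tukey biweight weight for a scaled residual u.

    u is assumed to be (x - center) / cutoff, so |u| >= 1 means the
    point lies at or beyond the cutoff and is ignored entirely.
    """
    if abs(u) >= 1.0:
        return 0.0
    return (1.0 - u * u) ** 2

# Weights fall smoothly from 1 at the center to 0 at the cutoff.
scaled_residuals = (0.0, 0.5, 0.9, 1.5)
weights = [biweight_weight(u) for u in scaled_residuals]
for u, w in zip(scaled_residuals, weights):
    print(f"u = {u:>4}: weight = {w:.4f}")
```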
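To calculate the Tukey biweight estimate of location, we first need a starting estimate of the center and a robust estimate of scale; the median and the MAD are the usual choices. Then, for each data point, we calculate the difference between the data point and the center and divide it by the scale multiplied by a tuning constant, which sets the cutoff beyond which points are ignored. Next, we apply the weighting function to these scaled differences to obtain a weight for each point. Finally, we compute the weighted average of the data points, that is, the sum of each point times its weight divided by the sum of the weights. This is the same as adding to the current center the sum of the weighted differences divided by the sum of the weights. The result becomes the new center, and the steps are repeated until the estimate stops changing.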
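The procedure can be sketched as an iteratively reweighted average. This is a simplified illustration of the location estimate only, not a reference implementation: the function name, the tuning constant `c = 6.0` (a value often quoted for use with the unscaled MAD), the convergence tolerance, and the choice to keep the scale fixed at the initial MAD are all assumptions of this sketch.

```python
import statistics

def tukey_biweight_location(data, c=6.0, tol=1e-8, max_iter=50):
    """Biweight location estimate via iterative reweighting (sketch)."""
    center = statistics.median(data)                            # robust start
    scale = statistics.median([abs(x - center) for x in data])  # MAD, kept fixed
    if scale == 0:
        return center  # more than half the points coincide; nothing to refine

    for _ in range(max_iter):
        weights = []
        for x in data:
            u = (x - center) / (c * scale)       # scaled difference
            weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
        total = sum(weights)
        if total == 0:
            break                                # every point beyond the cutoff
        new_center = sum(w * x for w, x in zip(weights, data)) / total
        converged = abs(new_center - center) < tol
        center = new_center
        if converged:
            break
    return center

# Hypothetical sample with one gross outlier.
sample = [2.0, 3.0, 4.0, 5.0, 100.0]
estimate = tukey_biweight_location(sample)
print("biweight location:", round(estimate, 4))
print("mean:             ", statistics.mean(sample))
```

Here the value 100 lies far beyond the cutoff and receives zero weight, so the estimate settles near the middle of the four typical values, while the mean is pulled above 20.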
Because points beyond the cutoff receive zero weight, gross outliers drop out of the Tukey biweight estimate entirely rather than merely being down-weighted. This robustness depends on starting from a robust center and scale, such as the median and the MAD; if the starting values were the mean and standard deviation, they could already be distorted by the outliers the method is meant to resist.
In conclusion, the median absolute deviation (MAD) and the Tukey biweight are examples of high breakdown methods in robust statistics. Both remain reliable when a substantial fraction of the data consists of outliers, whereas traditional estimates such as the mean and the standard deviation can be ruined by a single extreme value.