Interquartile range:
The interquartile range (IQR) is a measure of the dispersion of a dataset. It is calculated as the difference between the 75th and 25th percentiles of the dataset. These percentiles are also known as the third and first quartiles (Q3 and Q1), hence the name of the measure.
For example, consider the following dataset:
1, 2, 3, 4, 5, 6, 7, 8, 9
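To calculate the IQR, we first need to find the 25th and 75th percentiles of the dataset. Conventions for computing percentiles differ slightly; here we use the median-of-halves method: sort the data in ascending order, split it at the median (leaving the median itself out when the number of values is odd), and take the median of each half. Sorted in ascending order, the data are: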
1, 2, 3, 4, 5, 6, 7, 8, 9
The median is 5, so the lower half (excluding the median) is 1, 2, 3, 4 and the upper half is 6, 7, 8, 9.
The 25th percentile is the median of the lower half, which is the average of the 2nd and 3rd values in the dataset: (2+3)/2 = 2.5
The 75th percentile is the median of the upper half, which is the average of the 7th and 8th values in the dataset: (7+8)/2 = 7.5
To calculate the IQR, we simply subtract the 25th percentile from the 75th percentile: 7.5 – 2.5 = 5
In this example, the IQR is 5, meaning that the middle 50% of the values in the dataset span 5 units.
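The steps above can be sketched in Python. This is a minimal illustration rather than a standard routine: the function names `quartiles` and `iqr` are made up for this example, and the code assumes the median-of-halves convention, in which the median is left out of both halves when the number of values is odd. Library routines that interpolate between values may return slightly different quartiles for the same data.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to form two halves")
    half = len(data) // 2
    lower = data[:half]               # values below the median position
    upper = data[len(data) - half:]   # values above the median position
    return median(lower), median(upper)


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1


first = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(quartiles(first))  # (2.5, 7.5)
print(iqr(first))        # 5.0
```

Sorting inside `quartiles` means the caller does not have to pass the data in order.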
As another example, consider the following dataset:
1, 1, 2, 2, 3, 3, 4, 4, 5, 5
Sorting the data in ascending order, we get:
1, 1, 2, 2, 3, 3, 4, 4, 5, 5
With ten values, the median falls between the 5th and 6th values, so the lower half is 1, 1, 2, 2, 3 and the upper half is 3, 4, 4, 5, 5.
The 25th percentile is the median of the lower half, which is the 3rd value in the dataset: 2
The 75th percentile is the median of the upper half, which is the 8th value in the dataset: 4
To calculate the IQR, we again subtract the 25th percentile from the 75th percentile: 4 – 2 = 2
In this example, the IQR is 2, meaning that the middle 50% of the values in the dataset span 2 units.
A key property of the IQR is that it is resistant to extreme values or outliers in a dataset. For example, if we add an extremely large value to the second dataset above, such as 100, the median becomes 3, the lower half stays 1, 1, 2, 2, 3, and the upper half becomes 4, 4, 5, 5, 100. The quartiles are then 2 and 5, so the IQR rises only from 2 to 3. It would be exactly the same if the added value were 1,000,000 instead of 100, because only the position of the extreme value in the sorted data matters, not its magnitude.
In contrast, measures such as the range, which is simply the difference between the maximum and minimum values in a dataset, are greatly affected by the presence of extreme values. For the second dataset above, adding the value 100 increases the range from 4 to 99. This makes the IQR a more robust measure of dispersion than the range, as it is far less sensitive to outliers.
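The contrast can be checked numerically. The sketch below is illustrative only: it assumes the median-of-halves quartile convention, and the helper names `iqr` and `value_range` are made up for this example. It adds a single extreme value to the second dataset and prints both measures, so the IQR and the range can be compared side by side as the added value grows.

```python
from statistics import median


def iqr(values):
    """IQR under the median-of-halves convention (an assumption of this sketch)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return median(upper) - median(lower)


def value_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)


base = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
with_outlier = base + [100]
with_huge_outlier = base + [1_000_000]

for label, data in [
    ("original", base),
    ("plus 100", with_outlier),
    ("plus 1,000,000", with_huge_outlier),
]:
    print(f"{label}: IQR = {iqr(data)}, range = {value_range(data)}")
```

The IQR depends only on values near the quartile positions, so the size of the added value has no influence on it once that value sits at the end of the sorted data, whereas the range tracks the added value directly.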
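The IQR is often used in conjunction with measures of central tendency, such as the mean and median, to provide a more complete picture of the distribution of a dataset. For example, reporting the median together with the IQR describes both the center of the data and the spread of its middle half. If the median lies much closer to one quartile than to the other, or if the mean differs noticeably from the median, the distribution is likely skewed; if the median sits roughly midway between the quartiles, the middle half of the data is roughly symmetric.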
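As a rough illustration of combining these measures, the sketch below summarizes a small dataset with its mean, median, quartiles, and IQR. The data are invented for this example, the function name `summarize` is hypothetical, and the quartiles again assume the median-of-halves convention.

```python
from statistics import mean, median


def summarize(values):
    """Return center and spread measures for a dataset (illustrative helper)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q2 = median(data)
    q3 = median(data[len(data) - half:])
    return {"mean": mean(data), "median": q2, "q1": q1, "q3": q3, "iqr": q3 - q1}


# Invented sample with a long right tail
sample = [1, 2, 2, 3, 3, 4, 6, 9, 15]
s = summarize(sample)
print(s)

lower_gap = s["median"] - s["q1"]   # distance from Q1 up to the median
upper_gap = s["q3"] - s["median"]   # distance from the median up to Q3
print(lower_gap, upper_gap)         # 1.0 4.5
```

Here the median (3) lies much closer to the first quartile (2.0) than to the third (7.5), and the mean (5) exceeds the median, both of which point to a right-skewed dataset.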
Overall, the IQR is a valuable tool for understanding the dispersion of a dataset and comparing it to other datasets. It is a robust measure that is resistant to extreme values, and it can be used in conjunction with other measures to better understand the distribution of a dataset.