Standard Deviation

What is Standard Deviation?

Standard deviation is a statistical measure that quantifies the amount of variation or dispersion of a set of data. It is an important tool in statistical analysis as it gives an indication of how spread out the data is and how much the individual values deviate from the mean.

To understand standard deviation, it is first necessary to understand what the mean is. The mean is simply the average of a set of data, calculated by adding all the values together and dividing by the total number of values. For example, if we have a set of five values: 2, 4, 6, 8, and 10, the mean would be (2+4+6+8+10)/5 = 6.
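The same calculation can be sketched in a few lines of Python. This is only an illustration; the variable names are chosen here for the example.

```python
# Mean of the first example set: add the values, then divide by how many there are.
values = [2, 4, 6, 8, 10]
mean = sum(values) / len(values)
print(mean)  # 6.0
```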

Now, let’s say we have a second set of five values: 1, 3, 5, 7, and 9. The mean of this set is (1+3+5+7+9)/5 = 5. Comparing the two sets, the mean of the first is higher than the mean of the second. However, the mean describes only the center of each set; it says nothing about how widely the values are spread around that center. To understand the spread of the data, we can use standard deviation.

Standard deviation is calculated by taking the square root of the variance, which is the average of the squared differences between each value and the mean. In other words, it measures how much the values deviate from the mean.
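As a minimal sketch, the definition above can be written out directly in Python. The function names `population_variance` and `population_std` are made up for this example. The sketch assumes the form described in the text, where the squared differences are divided by the number of values.

```python
import math

def population_variance(values):
    """Average of the squared differences from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

first_set = [2, 4, 6, 8, 10]
print(population_variance(first_set))       # 8.0
print(round(population_std(first_set), 2))  # 2.83
```

Some tools divide by n - 1 instead of n (the sample standard deviation), which gives a slightly larger result for the same data, so it is worth checking which form a given function uses.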

For our first set of values (2, 4, 6, 8, and 10), the variance would be calculated as follows:

((2-6)^2 + (4-6)^2 + (6-6)^2 + (8-6)^2 + (10-6)^2) / 5

This simplifies to:

(16+4+0+4+16)/5 = 40/5 = 8

The standard deviation is then the square root of the variance, which is sqrt(8) ≈ 2.83.

For our second set of values (1, 3, 5, 7, and 9), the variance would be calculated as follows:

((1-5)^2 + (3-5)^2 + (5-5)^2 + (7-5)^2 + (9-5)^2) / 5

This simplifies to:

(16+4+0+4+16)/5 = 40/5 = 8

The standard deviation is then the square root of the variance, which is sqrt(8) ≈ 2.83.

We can see that, even though the mean of the first set is higher than the mean of the second set, the standard deviation is the same for both sets. This is because the standard deviation measures the dispersion of the data, not the overall value. In both sets, the values are equally spread out from the mean, with some values being higher and some being lower.
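One way to see why the two results match is to note that the second set is the first set with 1 subtracted from every value. The sketch below uses Python's standard `statistics` module, whose `pstdev` function computes the population standard deviation, to check that shifting every value changes the mean but not the spread. The variable names are chosen for this example.

```python
import statistics

first_set = [2, 4, 6, 8, 10]
second_set = [x - 1 for x in first_set]  # gives [1, 3, 5, 7, 9]

# The means differ by exactly the amount of the shift.
print(statistics.mean(first_set), statistics.mean(second_set))

# The population standard deviations are the same.
print(round(statistics.pstdev(first_set), 2))   # 2.83
print(round(statistics.pstdev(second_set), 2))  # 2.83
```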

Standard deviation is often used in statistical analysis to estimate how likely a value is to fall within a given range. On its own it describes only spread; turning it into a probability also requires an assumption about the shape of the distribution. For example, if we are studying the heights of a group of people, we can use the mean and standard deviation to estimate how likely it is that a person in the group is above or below a certain height. If the standard deviation is large, the heights are widely dispersed, and we are more likely to encounter people who are much taller or shorter than average. If the standard deviation is small, the heights are clustered closely around the average, and such people are less common.
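The height example can be made concrete with a rough sketch. It assumes, purely for illustration, that heights follow a normal distribution. The mean of 170 cm, the threshold of 185 cm, and the two standard deviations are hypothetical numbers, not data from any real group. `NormalDist` is part of Python's standard `statistics` module (Python 3.8 and later).

```python
from statistics import NormalDist

mean_height = 170  # hypothetical average height in cm
threshold = 185    # hypothetical height of interest in cm

# Same mean, two different standard deviations (both made up for illustration).
narrow = NormalDist(mu=mean_height, sigma=5)
wide = NormalDist(mu=mean_height, sigma=10)

# Probability that a height exceeds the threshold under each assumed model.
p_narrow = 1 - narrow.cdf(threshold)
p_wide = 1 - wide.cdf(threshold)

print(f"standard deviation 5:  {p_narrow:.4f}")
print(f"standard deviation 10: {p_wide:.4f}")
```

Under these assumptions, the chance of exceeding 185 cm is roughly 0.1% with the smaller standard deviation and about 6.7% with the larger one, which matches the intuition that a larger spread makes values far from the mean more common.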


Data Science Wiki

Unlocking the power of data science, one term at a time.
