Z Score
- Expresses how far a data point lies from the dataset mean in units of the standard deviation.
- Enables comparison of values within a dataset and across different datasets by standardizing scores.
- Can be used to estimate how likely a value is to occur: for approximately normally distributed data, about 68% of values fall within 1 standard deviation of the mean and about 95% within 2.
Definition
A z-score, also known as a standard score, measures how many standard deviations a data point lies from the mean of its dataset. It is used to compare how far a data point is from the mean and to estimate how likely such a value is to occur.
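The z-score is computed by subtracting the mean from the observation and dividing by the standard deviation:

$$z = \frac{x - \mu}{\sigma}$$

where $x$ is the observation, $\mu$ is the mean, and $\sigma$ is the standard deviation.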
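As a minimal Python sketch of this computation (the function name `z_score` and its parameter names are illustrative and do not come from the source), subtract the mean and divide by the standard deviation, rejecting a non-positive standard deviation because the ratio would be undefined or meaningless:

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Return how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


print(z_score(60, 50, 10))  # 1.0
```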
Explanation
A positive z-score indicates the data point lies above the mean; a negative z-score indicates it lies below the mean; a z-score of 0 means it equals the mean. Converting observations to z-scores standardizes values so that they can be compared directly, even when they come from different datasets with different means and standard deviations. When the data are approximately normally distributed, z-scores can also be interpreted in terms of how likely a value is to occur within the distribution (as illustrated in the examples).
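Standardizing a whole dataset can be sketched as follows. This is an illustration under stated assumptions: the helper name `standardize` and the sample data are made up, and the population standard deviation (`statistics.pstdev`) is used because the source does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert each value to a z-score using the dataset's own mean
    and population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean 5, population sd 2
z = standardize(scores)
print(z)  # values below the mean come out negative, values above it positive
```

After standardizing, the z-scores have mean 0 and standard deviation 1, which is what makes values from different datasets directly comparable.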
Examples
Heights example
For a group whose mean height is 5 feet 6 inches (66 inches) and standard deviation is 2 inches:
- A person who is 6 feet (72 inches) tall:
  $$z = \frac{72 - 66}{2} = 3$$
This means the person’s height is 3 standard deviations above the mean.
- A person who is 5 feet 2 inches (62 inches) tall:
  $$z = \frac{62 - 66}{2} = -2$$
This means the person’s height is 2 standard deviations below the mean.
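The arithmetic in the heights example can be checked with a short sketch. The helper `to_inches` is a hypothetical name introduced here; its only purpose is to put every height in the same unit before the mean is subtracted.

```python
def to_inches(feet: int, inches: int = 0) -> int:
    """Convert a height in feet and inches to inches."""
    return feet * 12 + inches


mean_height = to_inches(5, 6)  # 66 inches
std_dev = 2                    # inches

tall = (to_inches(6) - mean_height) / std_dev      # (72 - 66) / 2
short = (to_inches(5, 2) - mean_height) / std_dev  # (62 - 66) / 2
print(tall, short)  # 3.0 -2.0
```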
Test scores example
Comparing scores from two groups that took the same test:
- First group: mean 80, standard deviation 10. A student who scored 90:
  $$z = \frac{90 - 80}{10} = 1$$
- Second group: mean 70, standard deviation 5. A student who scored 90:
  $$z = \frac{90 - 70}{5} = 4$$
Although both students scored 90, the student in the second group has a higher z-score, indicating their score is more unusual relative to their group’s mean.
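A brief sketch of the same comparison, using an illustrative `z_score` helper (not a name from the source), shows how one raw score maps to different z-scores depending on the group's mean and standard deviation:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / std_dev


first = z_score(90, mean=80, std_dev=10)
second = z_score(90, mean=70, std_dev=5)
print(first, second)  # 1.0 4.0
```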
Likelihood example
For a dataset with mean 50 and standard deviation 10, assuming the values are approximately normally distributed:
- A data point with z-score 1 (value 60) is 1 standard deviation above the mean. About 68% of values fall within 1 standard deviation of the mean (between 40 and 60).
- A data point with z-score 2 (value 70) is 2 standard deviations above the mean. About 95% of values fall within 2 standard deviations of the mean (between 30 and 70).
- A data point with z-score -1 (value 40) is 1 standard deviation below the mean. About 32% of values lie more than 1 standard deviation from the mean in either direction.
- A data point with z-score -2 (value 30) is 2 standard deviations below the mean. About 5% of values lie more than 2 standard deviations from the mean in either direction.
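The 68% and 95% figures, and their complements of roughly 32% and 5%, describe the share of values within (or beyond) 1 and 2 standard deviations of the mean, and they hold only if the data follow a normal distribution. That assumption is made here for illustration. The sketch below derives the figures from the standard normal cumulative distribution function, written with `math.erf`; the function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)


for k in (1, 2):
    inside = fraction_within(k)
    print(f"within {k} sd: {inside:.1%}, outside: {1 - inside:.1%}")
# within 1 sd: 68.3%, outside: 31.7%
# within 2 sd: 95.4%, outside: 4.6%
```

For data that are strongly skewed or heavy-tailed, these percentages may not apply, even though the z-scores themselves can still be computed.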
Use cases
- Comparing individual data points to the mean of a dataset.
- Standardizing and comparing data from different datasets or populations.
- Estimating how likely a particular data point is to occur within a distribution.
Related terms
- Standard score (synonym)
- Mean
- Standard deviation