Five-number Summary
- Condenses a distribution into five key statistics: minimum, first quartile (Q1), median, third quartile (Q3), and maximum.
- Computed after ordering the data; useful for quick inspection of spread and center.
- Helps identify potential outliers by comparing values to these extremes and quartiles.
Definition
A five-number summary describes a dataset with five statistics: the minimum, the first quartile (Q1), the median, the third quartile (Q3), and the maximum.
Explanation
To compute a five-number summary:
- Order the observations from least to greatest.
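- Report:
  - The minimum (lowest) value.
  - The first quartile (Q1), the median of the lower half of the ordered data; about 25% of the observations fall at or below it.
  - The median, the middle value of the ordered dataset (or the mean of the two middle values when the number of observations is even).
  - The third quartile (Q3), the median of the upper half of the ordered data; about 75% of the observations fall at or below it.
  - The maximum (highest) value.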
These five numbers provide a rough sketch of the data distribution and can be used to identify potential outliers.
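The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `five_number_summary` is hypothetical, and it assumes the "median of halves" convention for quartiles, which excludes the middle value from both halves when the count is odd. Other conventions give slightly different Q1 and Q3.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a non-empty sequence.

    Assumption: quartiles are the medians of the lower and upper halves,
    with the middle value left out of both halves when the count is odd.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must contain at least one value")

    half = n // 2
    lower = ordered[:half]           # values below the middle position
    upper = ordered[half + n % 2:]   # values above the middle position

    # With a single observation both halves are empty; fall back to that value.
    q1 = median(lower) if lower else ordered[0]
    q3 = median(upper) if upper else ordered[-1]
    return ordered[0], q1, median(ordered), q3, ordered[-1]


print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))  # (1, 3, 7, 11, 13)
```

Sorting inside the function means callers can pass the observations in any order.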
Examples
Generic example (100 numbers)
For a dataset of 100 numbers, after ordering:
- Minimum: 1
- First quartile (Q1): 25
- Median: 50
- Third quartile (Q3): 75
- Maximum: 100
With these five numbers, one can quickly summarize the data and judge whether a new observation looks unusual (for example, a value far below 1 or far above 100).
Class test scores (20 students)
Raw scores: 80, 85, 90, 95, 90, 85, 80, 75, 70, 65, 60, 55, 50, 45, 40, 35, 30, 25, 20, 15
Ordered: 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 80, 85, 85, 90, 90, 95
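Five-number summary (quartiles taken as the medians of the lower and upper halves of the ordered scores):
- Minimum: 15
- First quartile (Q1): 37.5 (mean of the 5th and 6th scores, 35 and 40)
- Median: 62.5 (mean of the 10th and 11th scores, 60 and 65)
- Third quartile (Q3): 82.5 (mean of the 15th and 16th scores, 80 and 85)
- Maximum: 95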
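A new score of 100 would exceed the reported maximum of 95, so it falls outside the observed range. That alone does not make it an outlier: it is only 5 points above the maximum, and under the common 1.5 × IQR rule it would not be flagged, since it lies well within 1.5 times the interquartile range above Q3.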
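As a check on the arithmetic, the summary for the 20 test scores can be computed with Python's standard `statistics` module. This is a sketch that assumes the median-of-halves convention for quartiles. The last lines show that an interpolating convention (`statistics.quantiles` with `method="inclusive"`) gives somewhat different quartiles for the same data, so the convention in use should always be stated.

```python
from statistics import median, quantiles

scores = [80, 85, 90, 95, 90, 85, 80, 75, 70, 65,
          60, 55, 50, 45, 40, 35, 30, 25, 20, 15]

ordered = sorted(scores)
half = len(ordered) // 2  # 20 scores, so each half holds 10 values

summary = {
    "min": ordered[0],
    "q1": median(ordered[:half]),   # median of the 10 lowest scores
    "median": median(ordered),      # mean of the 10th and 11th scores
    "q3": median(ordered[half:]),   # median of the 10 highest scores
    "max": ordered[-1],
}
print(summary)
# {'min': 15, 'q1': 37.5, 'median': 62.5, 'q3': 82.5, 'max': 95}

# A different quartile convention: linear interpolation between data points.
q1_inc, _, q3_inc = quantiles(ordered, n=4, method="inclusive")
print(q1_inc, q3_inc)  # 38.75 81.25
```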
Use cases
- Quickly summarizing the central tendency and spread of a dataset.
- Identifying potential outliers.
- Supporting simple visualizations of distribution (e.g., as an input to a boxplot).
Notes or pitfalls
- The five-number summary provides a rough sketch and may not capture all distribution details.
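- The examples identify outliers by comparing values with the reported minimum and maximum (e.g., values much lower than 1 or much higher than 100 in the generic example, or a score of 100 against a maximum of 95 in the test-score example). This comparison is informal: the minimum and maximum are themselves observed values, so an extreme value already in the data is simply reported as one of them, and a value only slightly outside the range is not necessarily an outlier.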
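One widely used formal rule, which this entry does not itself describe, is Tukey's fences: a value is flagged when it lies more than 1.5 times the interquartile range (IQR = Q3 − Q1) below Q1 or above Q3. The sketch below is illustrative only. The function names are hypothetical, and the quartiles 37.5 and 82.5 are the test-score quartiles under the median-of-halves convention.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_potential_outlier(value, q1, q3, k=1.5):
    """True when value lies strictly outside the fences."""
    lower, upper = tukey_fences(q1, q3, k)
    return value < lower or value > upper


# Assumed quartiles for the 20 test scores (median-of-halves convention).
q1, q3 = 37.5, 82.5

print(tukey_fences(q1, q3))               # (-30.0, 150.0)
print(is_potential_outlier(100, q1, q3))  # False
print(is_potential_outlier(160, q1, q3))  # True
```

Under this rule a score of 100 is not flagged, because the upper fence sits at 150. The multiplier `k=1.5` is a convention, not a law; larger values such as 3 are sometimes used to flag only extreme points.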
Related terms
- Median
- Quartile
- Outlier