
Correlation

  • Quantifies how strongly two variables move together and in what direction (same or opposite).
  • Common measures are the Pearson correlation coefficient (measures linear association, ranges from -1 to 1) and the Spearman rank correlation (works on ranks, so it suits non-normal data and monotonic but non-linear relationships).
  • A correlation does not imply causation; correlated variables may be linked by other factors.

Correlation is a statistical measure of the strength and direction of the relationship between two variables. It describes the extent to which the two variables vary together.

A positive correlation means the two variables tend to increase together; a negative correlation means one tends to decrease as the other increases. Common measures include:

  • Pearson correlation coefficient: a widely used measure of linear association that ranges from -1 to 1. A value of -1 indicates a perfect negative linear correlation, 1 indicates a perfect positive linear correlation, and 0 indicates no linear relationship between the variables (a non-linear relationship may still exist).
  • Spearman rank correlation coefficient: used when variables are not normally distributed or when the relationship is monotonic but not linear. It is calculated by ranking the values of each variable and measuring the differences between paired ranks. It also ranges from -1 to 1: a value of 1 indicates a perfectly increasing monotonic relationship, while a value of -1 indicates a perfectly decreasing one.
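Both coefficients can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `pearson`, `ranks`, and `spearman` and the sample data are made up for the example, and Spearman is computed here as the Pearson coefficient of the ranks, which agrees with the rank-difference formula when there are no ties.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Illustrative data (assumed for this example): y grows with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print("Pearson: ", pearson(x, y))   # below 1, because the relationship is curved
print("Spearman:", spearman(x, y))  # 1.0, because the ordering is perfectly preserved

# A symmetric curve: y depends entirely on x, yet the linear correlation is zero.
x_sym = [-2, -1, 0, 1, 2]
y_sym = [v ** 2 for v in x_sym]
print("Pearson on y = x^2:", pearson(x_sym, y_sym))
```

The first data set shows why a rank-based measure can be preferable for curved but monotonic relationships; the second shows that a Pearson coefficient of 0 rules out only a linear relationship.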

Correlation does not necessarily imply causation: two correlated variables may have no direct causal relationship, and other factors may be responsible for the observed association.

Taller people tend to weigh more than shorter people, largely because a larger frame carries more bone, muscle, and other tissue. Height and weight are therefore positively correlated.

As a person’s income increases, the amount they spend on luxury goods tends to increase as well, because a higher income typically leaves more disposable income. This is another example of positive correlation.

Income and time spent watching television (cautionary example)


There may be a positive correlation between a person’s income and the amount of time they spend watching television, but this does not necessarily mean income causes more television watching. Other factors such as lifestyle or personal preferences may explain the association.
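A small simulation can make this concrete. The sketch below is illustrative only: the data are randomly generated in made-up units, the names `hidden`, `income`, and `tv_hours` are hypothetical, and the hidden factor stands in for something like lifestyle. Neither variable is computed from the other, yet they come out strongly correlated because both depend on the same hidden factor.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
n = 2000

# Hypothetical hidden factor (for example, a lifestyle score), simulated data.
hidden = [rng.gauss(0, 1) for _ in range(n)]

# Each variable is the hidden factor plus its own independent noise.
# Neither one is an input to the other.
income = [h + rng.gauss(0, 0.5) for h in hidden]
tv_hours = [h + rng.gauss(0, 0.5) for h in hidden]

r_raw = pearson(income, tv_hours)

# Subtract the hidden factor's contribution and correlate what remains.
# (Possible here only because the simulation knows the true contribution.)
income_resid = [a - h for a, h in zip(income, hidden)]
tv_resid = [b - h for b, h in zip(tv_hours, hidden)]
r_resid = pearson(income_resid, tv_resid)

print(f"correlation of income and TV time:   {r_raw:.2f}")
print(f"after removing the hidden factor:    {r_resid:.2f}")
```

With these noise levels the first value should land near 0.8 and the second near 0. In real data the hidden factor is usually unmeasured or only partly measured, so its contribution cannot be subtracted this directly; the simulation only shows how such an association can arise.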

  • Describing the extent and direction of relationships between two variables.
  • Identifying trends and patterns in data.
  • Correlation does not imply causation.
  • Use Spearman rank correlation when variables are not normally distributed or when the relationship may be monotonic but not linear.
  • Observed correlations can be influenced by other factors (for example, lifestyle or personal preferences) that do not reflect a direct causal link.
  • Pearson correlation coefficient
  • Spearman rank correlation coefficient
  • Positive correlation
  • Negative correlation
  • Causation