Calibration
- Adjusts model outputs so predicted probabilities become reliable for decision making.
- Assessed with a calibration curve (predicted probability vs. observed event frequency); an ideal curve lies on the 45-degree line.
- Common calibration methods include Platt scaling and isotonic regression.
Definition
Calibration is the process of adjusting a model so that its predictions are accurate and reliable; specifically, it ensures that the model's predicted probabilities match the true probabilities of the events it predicts.
Explanation
Calibration checks whether a model's reported probabilities correspond to observed frequencies. Without calibration, a model may be systematically overconfident or underconfident, which can harm decisions based on those probabilities. A calibration curve plots predicted probabilities against observed event frequencies; a well-calibrated model's curve stays close to the 45-degree line. Which calibration method to use depends on the model type and its outputs (a short code sketch of both methods follows the list):
- Platt scaling: fits a sigmoid function to the model's raw output scores, mapping them into the [0, 1] probability range. Suited to models that produce continuous scores, such as logistic regression or support vector machines.
- Isotonic regression: fits a monotonic (nondecreasing) step function to the model's outputs. Suited to models that produce continuous or discrete outputs, such as decision trees or random forests.
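A minimal sketch of both methods using scikit-learn's CalibratedClassifierCV, which implements Platt scaling as method="sigmoid" and isotonic regression as method="isotonic". The synthetic dataset and the choice of LinearSVC and RandomForestClassifier are illustrative, not prescriptive:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Illustrative synthetic data; any binary classification dataset works.
X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Platt scaling ("sigmoid"): fit a sigmoid to the SVM's decision scores.
platt = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=5)
platt.fit(X_train, y_train)

# Isotonic regression: fit a nondecreasing step function to the
# forest's raw probability estimates.
iso = CalibratedClassifierCV(
    RandomForestClassifier(random_state=0), method="isotonic", cv=5
)
iso.fit(X_train, y_train)

# Both wrappers now expose calibrated probabilities.
print(platt.predict_proba(X_test)[:3])
print(iso.predict_proba(X_test)[:3])
```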
Examples
Coin flip probability
A calibrated model predicting the probability that a fair coin lands on heads would output 0.5 (50%), because that is the true probability. An uncalibrated model might predict 0.6 (60%), expressing more confidence than the evidence warrants.
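As a quick, self-contained illustration, the sketch below simulates a fair coin and measures how far each constant prediction (the 0.5 and 0.6 from the example above) sits from the observed frequency of heads:

```python
import random

random.seed(0)
# Simulate 10,000 fair coin flips; True means heads.
flips = [random.random() < 0.5 for _ in range(10_000)]
observed = sum(flips) / len(flips)

# Compare each model's constant predicted probability to what was observed.
for predicted in (0.5, 0.6):
    gap = abs(predicted - observed)
    print(f"predicted={predicted:.2f}  observed={observed:.3f}  gap={gap:.3f}")
```

The calibrated prediction's gap shrinks toward zero as the number of flips grows, while the 0.6 prediction stays roughly 0.1 away.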
Calibration curve behavior
A calibration curve plots predicted probabilities against observed event frequencies. A well-calibrated model's curve lies close to the 45-degree line; a curve that deviates above or below that line indicates predicted probabilities that are systematically too high or too low.
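A minimal sketch of drawing such a curve with sklearn.calibration.calibration_curve, assuming scikit-learn and matplotlib are available. The Gaussian naive Bayes model and synthetic data are placeholders, chosen because naive Bayes is often cited as poorly calibrated:

```python
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]

# Bin the predictions and compute the observed event frequency per bin.
frac_pos, mean_pred = calibration_curve(y_test, probs, n_bins=10)

plt.plot(mean_pred, frac_pos, marker="o", label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="perfectly calibrated")
plt.xlabel("Mean predicted probability")
plt.ylabel("Observed event frequency")
plt.legend()
plt.show()
```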
Use cases
- Model evaluation: assessing and correcting probability estimates so that predictions can be trusted for decision making.
Notes or pitfalls
- An uncalibrated model may be overconfident or underconfident, which can lead to poor decision making.
Related terms
- Calibration curve
- Platt scaling
- Isotonic regression
- Logistic regression
- Support vector machines
- Decision trees
- Random forests