Classification Matrix

  • A tabular tool (often 2x2) that compares predicted vs actual class labels to evaluate a classifier.
  • Summarizes counts of true positives, false positives, true negatives, and false negatives.
  • Enables calculation of metrics like accuracy, precision, recall, and F1 score and helps identify error patterns (e.g., many false positives).

A classification matrix, also known as a confusion matrix, is a tool used in machine learning and data mining to evaluate the performance of a classification model. It cross-tabulates the model’s predicted class labels against the actual labels, making it easy to see where the predictions agree with reality and where they do not. The classification matrix is most often applied to binary classification, where it takes the form of a 2x2 table whose four cells represent the combinations of predicted and actual values.

The matrix displays how often the model’s predictions match the actual labels by counting occurrences for each outcome:

  • True positive (TP): predicted positive and actually positive.
  • False positive (FP): predicted positive but actually negative.
  • True negative (TN): predicted negative and actually negative.
  • False negative (FN): predicted negative but actually positive.
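
To make the counting concrete, here is a minimal sketch in plain Python; the label lists are made up for illustration:

    # Hypothetical example data: 1 = positive class, 0 = negative class
    actual    = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 1, 0, 1, 0]

    # Tally the four outcomes by comparing each prediction to the truth
    tp = sum(1 for a, p in zip(actual, predicted) if p == 1 and a == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if p == 1 and a == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if p == 0 and a == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if p == 0 and a == 1)

    # Lay the counts out as the 2x2 classification (confusion) matrix
    print("           predicted +   predicted -")
    print(f"actual +   TP = {tp}        FN = {fn}")
    print(f"actual -   FP = {fp}        TN = {tn}")

In practice, libraries already provide this tabulation; scikit-learn’s sklearn.metrics.confusion_matrix, for example, returns the same counts as an array.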

From these counts you can compute several performance metrics. For example, accuracy is the proportion of correct predictions:

\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
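
Carrying the sketch forward, the four counts convert directly into the usual metrics; the numbers below are the hypothetical counts from the example above, not results from a real model:

    tp, fp, tn, fn = 3, 1, 3, 1  # hypothetical counts from the sketch above

    accuracy  = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions that are correct
    precision = tp / (tp + fp)                    # share of positive predictions that are correct
    recall    = tp / (tp + fn)                    # share of actual positives that are caught
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

    print(f"accuracy={accuracy:.2f}  precision={precision:.2f}  "
          f"recall={recall:.2f}  f1={f1:.2f}")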

The classification matrix also helps identify where a model errs; for instance, a high number of false positives may indicate a need to adjust the model’s decision threshold.
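
As a sketch of what adjusting the threshold looks like, the snippet below scores the same items at three thresholds; the predicted probabilities and labels are invented for illustration:

    # Hypothetical predicted probabilities of the positive class, with true labels
    probs  = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
    actual = [1,    1,    0,    1,    0,    0,    1,    0]

    def fp_fn_at(threshold):
        # Classify as positive when the probability clears the threshold
        predicted = [1 if p >= threshold else 0 for p in probs]
        fp = sum(1 for a, pr in zip(actual, predicted) if pr == 1 and a == 0)
        fn = sum(1 for a, pr in zip(actual, predicted) if pr == 0 and a == 1)
        return fp, fn

    for t in (0.3, 0.5, 0.7):
        fp, fn = fp_fn_at(t)
        print(f"threshold={t}: FP={fp}, FN={fn}")

On this made-up data, raising the threshold from 0.3 to 0.7 drops false positives from 3 to 0 while false negatives rise from 1 to 2, which is exactly the trade-off the classification matrix makes visible.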

For example, consider a model that screens patients for cancer (the positive class):

  • True positive (TP): The model correctly predicts that the patient has cancer.
  • False positive (FP): The model incorrectly predicts that the patient has cancer.
  • True negative (TN): The model correctly predicts that the patient does not have cancer.
  • False negative (FN): The model incorrectly predicts that the patient does not have cancer.
Common uses of a classification matrix include:

  • Evaluating the performance of binary classification models.
  • Calculating metrics such as accuracy, precision, recall, and F1 score.
  • Identifying areas for model improvement; for example, a high number of false positives may call for adjusting the model’s decision threshold to reduce incorrect positive predictions.
Related terms:

  • Confusion matrix (alternate name)
  • Accuracy
  • Precision
  • Recall
  • F1 score