Probability Calibration

Whether a model's predicted probabilities mean what they say — and how to fix them with Platt scaling or isotonic regression when they don't.

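Figure: a reliability diagram, plotting predicted probability (x-axis) against observed frequency (y-axis). Points below the diagonal mean the model is overconfident.

Platt scaling (fit a sigmoid to the model's scores) or isotonic regression (fit a monotonic step function) can pull a miscalibrated curve back toward the diagonal.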
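To make the isotonic option concrete, here is a minimal from-scratch sketch of the pool-adjacent-violators idea: sort by score, then merge neighbouring groups whenever their observed frequencies would decrease. The function names `fit_isotonic` and `calibrate`, the toy scores and labels, and the choice to assign each score to the step whose upper edge it falls under are all assumptions made for illustration; a library implementation may interpolate between steps instead.

```python
from collections import defaultdict

def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw scores to observed frequencies."""
    # Pool tied scores first so each distinct score gets a single value.
    pooled = defaultdict(lambda: [0.0, 0])
    for score, label in zip(scores, labels):
        pooled[score][0] += label
        pooled[score][1] += 1

    blocks = []  # each block: [sum of labels, count, largest score covered]
    for score in sorted(pooled):
        total, count = pooled[score]
        blocks.append([total, count, score])
        # Merge backwards while the previous block's mean exceeds the new one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values

def calibrate(score, uppers, values):
    """Return the value of the first step whose upper edge is at or above the score."""
    for upper, value in zip(uppers, values):
        if score <= upper:
            return value
    return values[-1]

# Toy held-out data (invented): raw model scores and the true outcomes.
scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

uppers, values = fit_isotonic(scores, labels)
print(list(zip(uppers, values)))
print(calibrate(0.7, uppers, values))
```

In practice the mapping would be fitted on data the original model was not trained on, so that the correction reflects how the scores behave on unseen examples. Platt scaling follows the same workflow but fits a two-parameter sigmoid instead of a step function.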

Definition

A classifier is calibrated if its predicted probabilities mean what they say: among all the times it predicts "70% chance of rain," it should actually rain about 70% of the time. A model can be highly accurate (correctly classifying most examples) while being badly calibrated (its probability numbers are misleading) — accuracy and calibration measure different things.
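The definition can be checked numerically by grouping predictions into probability bins and comparing each bin's average prediction with how often the event actually happened. The sketch below does that with equal-width bins; the helper name `reliability_bins`, the bin count, and the forecast data are hypothetical and chosen only to illustrate the idea.

```python
def reliability_bins(probs, labels, n_bins=10):
    """Return (mean predicted probability, observed frequency, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Equal-width bins on [0, 1]; a prediction of exactly 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows

# Invented forecasts: ten "70%" days on which it rained 7 times (calibrated),
# and ten "90%" days on which it rained only 6 times (overconfident).
probs = [0.7] * 10 + [0.9] * 10
labels = [1] * 7 + [0] * 3 + [1] * 6 + [0] * 4

rows = reliability_bins(probs, labels)
for mean_pred, observed, count in rows:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

Plotting the first two columns against each other gives a reliability diagram: the "70%" bin sits on the diagonal, while the "90%" bin falls below it.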

Overconfidence in practice

A spam filter that assigns "99% spam" to every email it flags, but is right about only 80% of those emails, is overconfident: it is a useful classifier but a poor probability estimator. Calibration closes that gap by making the stated probability match the observed frequency.
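The size of that gap is simple arithmetic, sketched below for a hypothetical batch of 100 flagged emails. The helper `calibration_gap` and the batch size are assumptions for illustration; the 99% and 80% figures come from the example above.

```python
def calibration_gap(stated_probs, outcomes):
    """Mean stated probability minus observed frequency; positive means overconfident."""
    mean_stated = sum(stated_probs) / len(stated_probs)
    observed = sum(outcomes) / len(outcomes)
    return mean_stated - observed

# Hypothetical batch: 100 flagged emails, of which 80 really are spam.
flagged_outcomes = [1] * 80 + [0] * 20
stated = [0.99] * len(flagged_outcomes)

gap_before = calibration_gap(stated, flagged_outcomes)  # about 0.19

# Simplest possible fix for this one group: report the observed frequency instead.
# (In practice the frequency would be estimated on separate held-out data.)
observed_rate = sum(flagged_outcomes) / len(flagged_outcomes)
recalibrated = [observed_rate] * len(flagged_outcomes)
gap_after = calibration_gap(recalibrated, flagged_outcomes)  # about 0

print(f"gap before: {gap_before:.2f}, gap after: {gap_after:.2f}")
```

Note that the set of flagged emails is unchanged, so the filter's accuracy stays the same; only the probability attached to each flag moves.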

Try it

Could a classifier be perfectly calibrated but have poor accuracy?

Solution

Yes. A model that always predicts "50% chance" for everything, where the true base rate genuinely is 50%, is perfectly calibrated — its stated probability matches the long-run frequency — but it provides zero discriminative power between classes, so its accuracy (as a hard classifier) is no better than a coin flip.
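The scenario in the solution can be reproduced in a few lines. This is a toy sketch with an invented, perfectly balanced dataset; the 0.5 decision threshold used to turn probabilities into hard labels is also an assumption.

```python
# Invented balanced dataset: labels alternate 0, 1, 0, 1, ... so the base rate is 50%.
labels = [i % 2 for i in range(1000)]

# The uninformative model: it predicts "50% chance" for every example.
probs = [0.5] * len(labels)

base_rate = sum(labels) / len(labels)   # 0.5
mean_pred = sum(probs) / len(probs)     # 0.5, matches the base rate

# Turn probabilities into hard predictions with a 0.5 threshold.
hard = [1 if p >= 0.5 else 0 for p in probs]
accuracy = sum(h == y for h, y in zip(hard, labels)) / len(labels)

print(f"stated {mean_pred}, observed {base_rate}, accuracy {accuracy}")
```

The stated probability matches the observed frequency exactly, yet the hard predictions are right only half the time, because the model never distinguishes one example from another.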

Related concepts

Machine Learning · Supervised Learning
Logistic Regression: Modelling the probability of a binary outcome using the sigmoid function, fitted by maximum likelihood or gradient descent.

Machine Learning · Model Training
Model Evaluation: Confusion matrices, accuracy, precision, recall, F1 score, ROC curves, and AUC; the toolkit for measuring classifier and regressor performance.

Machine Learning · Supervised Learning
Naive Bayes: A probabilistic classifier that applies Bayes' theorem with the (often unrealistic) assumption that features are conditionally independent given the class.

Machine Learning · Supervised Learning
Decision Trees: Flowchart-like models that recursively partition the feature space by asking yes/no questions; interpretable but prone to overfitting.