Probability Calibration
Whether a model's predicted probabilities mean what they say — and how to fix them with Platt scaling or isotonic regression when they don't.
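On a reliability diagram, which plots observed frequency against predicted probability, a perfectly calibrated model lies on the diagonal. Platt scaling (fit a sigmoid) or isotonic regression (fit a monotonic step function) can pull a miscalibrated curve back toward that diagonal.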
A classifier is calibrated if its predicted probabilities mean what they say: among all the times it predicts "70% chance of rain," it should actually rain about 70% of the time. A model can be highly accurate (correctly classifying most examples) while being badly calibrated (its probability numbers are misleading) — accuracy and calibration measure different things.
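One way to check the "70% means 70%" property is to group predictions into probability bins and compare each bin's average prediction with the fraction of positives actually observed. The sketch below is a minimal illustration, not a reference implementation: the function name `reliability_table`, the equal-width binning, and the rain data are all assumptions made for the example.

```python
from collections import defaultdict

def reliability_table(probs, labels, n_bins=10):
    """Bin predictions by probability and compare the mean predicted
    probability in each bin with the observed positive rate."""
    bins = defaultdict(list)
    for p, y in zip(probs, labels):
        # Clamp so that p == 1.0 lands in the last bin rather than a new one.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    rows = []
    for idx in sorted(bins):
        members = bins[idx]
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows

# Hypothetical forecasts: ten "70% chance of rain" days, seven of them rainy.
probs = [0.7] * 10
labels = [1] * 7 + [0] * 3
rows = reliability_table(probs, labels)
for mean_pred, observed, count in rows:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

When the predicted and observed columns agree in every bin, the model is calibrated on that data; a large difference in any bin marks where the probabilities are misleading.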
A spam filter that assigns "99% spam" to every email it flags, but is right about only 80% of those emails, is overconfident: it is a useful classifier but a poor probability estimator. Calibration is about fixing that gap.
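As a rough illustration of closing that gap, the sketch below applies a simplified form of Platt scaling: it fits a sigmoid `sigmoid(a * score + b)` to the model's raw outputs by plain gradient descent on log loss. Several things here are assumptions for the example rather than part of any particular library: the helper names, the use of the log-odds of the raw probability as the score, the learning rate and step count, and the toy data. It also omits refinements such as target smoothing, and in practice the fit would use held-out data rather than the training set.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    # Log-odds of a probability strictly between 0 and 1.
    return math.log(p / (1.0 - p))

def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            # Derivative of the log loss with respect to (a * s + b).
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical held-out emails from an overconfident filter:
# ten scored 0.99 (eight are spam) and ten scored 0.01 (two are spam).
raw_probs = [0.99] * 10 + [0.01] * 10
labels = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8

scores = [logit(p) for p in raw_probs]
a, b = fit_platt(scores, labels)

calibrated_high = sigmoid(a * logit(0.99) + b)
calibrated_low = sigmoid(a * logit(0.01) + b)
print(f"raw 0.99 -> calibrated {calibrated_high:.3f}")
print(f"raw 0.01 -> calibrated {calibrated_low:.3f}")
```

On this toy data the fitted sigmoid maps the raw 0.99 to roughly 0.80 and the raw 0.01 to roughly 0.20, matching the observed frequencies. The ranking of emails is unchanged, so the filter's accuracy stays the same while its probabilities become more honest.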
Could a classifier be perfectly calibrated but have poor accuracy?
Solution
Yes. A model that always predicts "50% chance" for everything, where the true base rate genuinely is 50%, is perfectly calibrated — its stated probability matches the long-run frequency — but it provides zero discriminative power between classes, so its accuracy (as a hard classifier) is no better than a coin flip.
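The constant-predictor argument can be checked numerically. The following sketch uses a made-up, perfectly balanced label set and an assumed decision threshold of 0.5 to turn probabilities into hard predictions; both are illustrative choices.

```python
# Hypothetical balanced dataset: exactly half the labels are positive.
labels = [0, 1] * 5000

# The model outputs the same probability for every example.
probs = [0.5] * len(labels)

# Calibration: the stated probability versus the observed positive rate.
observed_rate = sum(labels) / len(labels)
calibration_gap = abs(probs[0] - observed_rate)

# Accuracy: threshold at 0.5, so every example is predicted positive.
hard_preds = [1 if p >= 0.5 else 0 for p in probs]
accuracy = sum(int(h == y) for h, y in zip(hard_preds, labels)) / len(labels)

print(f"calibration gap: {calibration_gap:.2f}")
print(f"accuracy: {accuracy:.2f}")
```

The calibration gap is zero while the accuracy is 0.5, which is what a coin flip would achieve on this data.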