Ensemble Methods

Combining many weak models into one strong one: bagging mainly reduces variance, boosting mainly reduces bias, and random forests and stacking build on the same principle.

Ensembles combine diverse imperfect models so individual errors cancel out
[Figure: same task, different model views. Five models, T1 to T5; 3 of 5 agree.]

Core idea

The ensemble is strongest when models make different errors. Five identical models are just one model repeated.

Voting handles classification; averaging handles regression or probabilities.
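
As a minimal sketch of those two combination rules (the helper names and the example predictions are made up for illustration, not taken from any library), voting and averaging might look like this:

```python
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Return the label predicted by the most models (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def average_prediction(values):
    """Return the plain average of numeric predictions or probabilities."""
    return mean(values)


# Classification: five hypothetical models label the same email.
class_votes = ["spam", "ham", "spam", "spam", "ham"]
ensemble_label = majority_vote(class_votes)  # 3 of 5 say "spam"

# Regression or probabilities: three hypothetical models score the same input.
model_scores = [0.25, 0.5, 0.75]
ensemble_score = average_prediction(model_scores)
```

Both rules weight every model equally here; weighted votes or weighted averages are a common variation.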

Bagging (random forests): train on random subsets in parallel, which reduces variance.
Boosting (AdaBoost, gradient boosting): train sequentially, each fixing the last one’s errors, which reduces bias.
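
The bagging half can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions: the base model is a one-threshold "stump" on a single feature, the data is a toy set, and `fit_stump`, `bagged_stumps`, and `predict` are hypothetical helper names. A real random forest grows full decision trees and also samples features at each split; only the resample-then-vote part is shown here.

```python
import random
from collections import Counter


def fit_stump(points):
    """Pick the threshold t that best fits 'predict 1 when x > t' on (x, y) pairs."""
    candidates = sorted({x for x, _ in points})
    best_t, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum((x > t) == bool(y) for x, y in points)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def bagged_stumps(points, n_models, rng):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    thresholds = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        thresholds.append(fit_stump(sample))
    return thresholds


def predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Toy data: the label is 1 exactly when x > 5.
data = [(x, int(x > 5)) for x in range(11)]
thresholds = bagged_stumps(data, n_models=25, rng=random.Random(0))
low, high = predict(thresholds, 0), predict(thresholds, 10)
```

Because each stump sees a different resample, the learned thresholds vary from model to model, and the vote smooths out that variation.
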
Definition

An ensemble method combines predictions from multiple models ("weak learners") into one stronger prediction. The core insight: if individual models make different mistakes, averaging or voting cancels many of those mistakes out, so the group is typically more reliable than any single member.

The two dominant families are bagging (train many models independently and average/vote) and boosting (train models sequentially, each one focusing on what the previous ones got wrong).
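
To make the sequential idea concrete, here is a simplified AdaBoost-style sketch, not a production implementation. It assumes labels of +1 and -1, a single numeric feature, threshold stumps as weak learners, and a small hand-made dataset; the function names are illustrative.

```python
import math


def stump_predict(stump, x):
    """A stump (threshold, sign) predicts `sign` when x > threshold, else `-sign`."""
    threshold, sign = stump
    return sign if x > threshold else -sign


def best_stump(xs, ys, weights):
    """Return the stump with the lowest weighted error, plus that error."""
    best, best_err = None, float("inf")
    for t in [min(xs) - 1] + sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict((t, sign), x) != y)
            if err < best_err:
                best, best_err = (t, sign), err
    return best, best_err


def adaboost(xs, ys, rounds):
    """Fit stumps one after another, upweighting the examples each one gets wrong."""
    weights = [1 / len(xs)] * len(xs)
    model = []
    for _ in range(rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # more accurate stumps get a larger say
        model.append((alpha, stump))
        # Misclassified points gain weight; correctly classified points lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def boosted_predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    return 1 if score >= 0 else -1


xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]  # no single threshold separates these

one_round = adaboost(xs, ys, rounds=1)
three_rounds = adaboost(xs, ys, rounds=3)
correct_1 = sum(boosted_predict(one_round, x) == y for x, y in zip(xs, ys))
correct_3 = sum(boosted_predict(three_rounds, x) == y for x, y in zip(xs, ys))
```

On this toy data a single stump gets 7 of 10 training points right, while three boosted stumps fit all 10, which is the bias reduction that boosting is known for.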

Why averaging helps

Imagine five decision trees, each 70% accurate but making different errors on different examples. If their errors are independent, a majority vote on a binary task is correct about 84% of the time, because wrong answers from different trees rarely line up on the same examples.
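
The arithmetic behind that claim is a binomial sum. The sketch below assumes a binary task, an odd number of models, and fully independent errors, which real models rarely achieve; the function name is illustrative.

```python
from math import comb


def majority_vote_accuracy(n_models, p_correct):
    """Probability that more than half of n independent models are correct."""
    needed = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


five_trees = majority_vote_accuracy(5, 0.7)  # about 0.837
one_tree = majority_vote_accuracy(1, 0.7)    # a lone model is just 0.7
```

With correlated errors the gain shrinks, so this figure is best read as an upper bound for five such trees.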

Try it

Would an ensemble of five identical models (always agreeing) be expected to outperform a single one of them?

Solution

No. If the models always make the same predictions, voting changes nothing: there's no diversity of errors to cancel out. Ensembling only helps when the individual models are reasonably accurate and make somewhat independent mistakes.
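
A small, deliberately contrived check of this answer: the labels and error positions below are invented so that the diverse models never have three errors on the same example, which is a best case rather than a typical one.

```python
from collections import Counter


def with_errors(truth, wrong_indices):
    """Copy the true binary labels, flipping the ones at `wrong_indices`."""
    return [1 - t if i in wrong_indices else t for i, t in enumerate(truth)]


def vote(predictions_per_model):
    """Majority vote per example across all models."""
    return [Counter(column).most_common(1)[0][0]
            for column in zip(*predictions_per_model)]


def accuracy(predictions, truth):
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


truth = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

# Five copies of one model that is wrong on examples 3, 6, and 9.
single = with_errors(truth, {3, 6, 9})
identical_acc = accuracy(vote([single] * 5), truth)

# Five models, each wrong on 3 examples, but on different ones.
diverse = [with_errors(truth, wrong)
           for wrong in ({0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {9, 0, 1}, {2, 3, 4})]
diverse_acc = accuracy(vote(diverse), truth)
```

Every model here is 70% accurate on its own. The identical ensemble stays at 70%, while the diverse one is correct on every example because no example collects more than two wrong votes.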

Related concepts