7. References and Further Study
Use these references to deepen your understanding of specific topics after working through the handbook.
Core documentation
- scikit-learn: Decision Trees
- scikit-learn: Ensemble Methods
- scikit-learn: Permutation Feature Importance
- XGBoost documentation
- LightGBM documentation
- CatBoost documentation
Foundational reading
- Random Forests (Breiman, 2001). Leo Breiman's original random forest paper.
- Extremely Randomized Trees (Geurts, Ernst, and Wehenkel, 2006). The original ExtraTrees paper.
- Greedy Function Approximation: A Gradient Boosting Machine (Friedman, 2001). Jerome Friedman's classic gradient boosting paper.
- Understanding Random Forests: From Theory to Practice (Louppe, 2014). A practical and readable reference for tree ensembles, bias-variance intuition, and random forest variants.
Deep learning versus trees on tabular data
- Tabular Data: Deep Learning Is Not All You Need (Shwartz-Ziv and Armon, 2022). A useful benchmark-style paper showing that XGBoost often outperformed several deep tabular models on the evaluated datasets while requiring less tuning effort.
- Why Do Tree-Based Models Still Outperform Deep Learning on Typical Tabular Data? (Grinsztajn, Oyallon, and Varoquaux, 2022). A strong reference for the argument that tree-based models remain the default baseline on many medium-sized tabular tasks, with a detailed discussion of why tabular neural networks struggle.
Modern tree libraries
- XGBoost: A Scalable Tree Boosting System (Chen and Guestrin, 2016)
- LightGBM: A Highly Efficient Gradient Boosting Decision Tree (Ke et al., 2017)
- CatBoost: Unbiased Boosting with Categorical Features (Prokhorenkova et al., 2018)
Topics worth following up
- post-pruning strategies for single trees
- out-of-bag error versus cross-validation
- correlated features and importance instability
- when permutation importance is more trustworthy than impurity importance
- handling categorical features across boosted-tree libraries
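Two of the topics above, correlated or high-cardinality features inflating impurity importance and the case for permutation importance, can be seen in a few lines. This is a minimal sketch using scikit-learn on synthetic data; the dataset sizes and hyperparameters are illustrative choices, not recommendations.

```python
# Sketch: impurity importance vs permutation importance when a pure-noise
# feature with many unique values is present. Assumes scikit-learn is installed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

# Append a continuous noise feature: it carries no signal, but its many
# candidate split points make impurity-based importance prone to overrating it.
rng = np.random.RandomState(0)
X = np.hstack([X, rng.rand(X.shape[0], 1)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Impurity-based importance: free (computed during training), but measured
# on training data and biased toward high-cardinality features.
print("impurity:   ", model.feature_importances_.round(3))

# Permutation importance: shuffles one feature at a time on held-out data
# and records the score drop, so the noise feature scores near zero.
perm = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
print("permutation:", perm.importances_mean.round(3))
```

Comparing the last entry of each array usually makes the point: the noise feature receives a visibly nonzero impurity importance but a near-zero permutation importance.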
Suggested next step after this course
Take one real tabular workflow from your own work and compare:
- a regularized single tree
- a random forest or ExtraTrees model
- one boosting implementation
Then write down:
- what changed in performance
- what changed in interpretability
- what changed in tuning burden
That comparison usually teaches more than reading another round of definitions.