Learning Objectives
- Explain how boosting differs from bagging and why sequential error correction makes GBMs effective for financial prediction
- Select among XGBoost, LightGBM, and CatBoost based on categorical structure, compute environment, and latency needs
- Choose appropriate GBM objectives and constraints for financial tasks, including pointwise regression, learning to rank, and monotonic constraints
- Tune GBMs efficiently with Optuna using pruning, multi-objective search, and time-series-aware validation
- Use TreeSHAP to analyze feature effects, interactions, instability, and drift in deployed tree-based models
- Evaluate when tabular deep learning alternatives such as TabPFN, TabM, and TabR are worth considering relative to GBMs
- Interpret cross-case-study evidence to decide when nonlinear tree models earn their added complexity relative to linear baselines
From Decision Trees to Ensembles
The section builds the conceptual foundation for gradient boosting through three stages: how decision trees recursively partition feature space to capture nonlinear interactions (like momentum conditional on volatility), how Random Forests reduce variance by averaging decorrelated trees but cannot correct systematic bias, and why sequential error-correction through boosting is needed to address that limitation. It establishes Random Forests as the baseline that GBMs must demonstrably outperform to justify their additional complexity.
1 notebook
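The bias-variance contrast above can be illustrated with a minimal scikit-learn sketch on synthetic data shaped like the momentum-conditional-on-volatility example: a fully grown single tree memorizes noise, while an averaged forest reduces variance. The feature names and coefficients here are toy assumptions, not figures from the case studies.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 4000
vol = rng.uniform(0.1, 0.5, n)   # synthetic "volatility" feature
mom = rng.normal(0.0, 1.0, n)    # synthetic "momentum" feature
# Target: momentum predicts returns only in the low-volatility regime,
# a nonlinear interaction a single linear model cannot represent.
y = np.where(vol < 0.3, 0.5 * mom, 0.0) + rng.normal(0.0, 0.5, n)
X = np.column_stack([vol, mom])

X_tr, X_te, y_tr, y_te = X[:3000], X[3000:], y[:3000], y[3000:]

# Fully grown tree: low bias on the training set, high variance out of sample.
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
# Forest: averaging decorrelated trees cuts variance (but not systematic bias).
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print(f"single tree  R^2: {tree.score(X_te, y_te):.3f}")
print(f"random forest R^2: {forest.score(X_te, y_te):.3f}")
```

Boosting enters where this sketch stops: averaging cannot fix what every tree gets wrong in the same direction, which motivates the sequential error correction of the next section.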
The Workhorse: Gradient Boosting Machines
This section covers the shared gradient boosting framework (Friedman 2001) and the distinctive innovations of XGBoost (regularized objective, sparsity-aware splits, second-order approximation), LightGBM (GOSS sampling, feature bundling, leaf-wise growth), and CatBoost (ordered target statistics to prevent categorical leakage, symmetric trees for fast inference). It provides a practical library comparison showing that accuracy gaps between libraries are smaller than gaps between good and bad hyperparameter configurations. The section also covers Learning to Rank via LambdaMART for strategies that trade only cross-sectional extremes, and monotonic constraints as theory-driven regularization that prevents economically implausible nonlinear artifacts.
1 notebook
Deep Learning Alternatives for Tabular Data
The section surveys the 2024-2026 landscape of deep learning for tabular data, covering tabular foundation models (TabPFN for zero-shot prototyping), parameter-efficient neural ensembles (TabM's rank-1 adapters achieving competitive performance with architectural simplicity), and retrieval-augmented models (TabR's nearest-neighbor hybridization with temporal leakage warnings). It synthesizes benchmark evidence from TabArena, TabReD, and others into a practical decision framework organized by data regime, noting the critical caveat that attention-heavy architectures degrade faster than GBMs under temporal distribution shift, and that most major benchmarks assume IID splits irrelevant to financial walk-forward validation.
1 notebook
Advanced Hyperparameter Tuning with Optuna
This section develops Optuna's Bayesian optimization for the larger hyperparameter space of GBMs, covering the TPE sampler, the define-by-run API with conditional parameters, and pruning strategies that can halve computation without sacrificing quality. It provides a GBM-specific tuning taxonomy (tree structure, boosting dynamics, regularization) with the practical insight that regularization parameters often have the largest impact on out-of-sample performance. The section also covers multi-objective optimization for the IC-turnover Pareto frontier and time-series-aware tuning protocols that prevent the subtle leakage of selecting hyperparameters with future information.
4 notebooks
Model Explainability with SHAP
The section extends SHAP from linear models (Chapter 11) to tree-based models via TreeSHAP, which computes exact Shapley values efficiently enough to run on every walk-forward fold as standard diagnostic infrastructure. It introduces TreeSHAP's unique capability for exact interaction decomposition, revealing that momentum's predictive power in the ETF case study is regime-conditional (collapsing when volatility exceeds the 90th percentile). The section develops SHAP-based drift monitoring as an early warning system that detects mechanism changes before they manifest in performance metrics, addresses the Rashomon effect (equally good models producing different explanations), and connects SHAP to conformal prediction for an integrated explainability-uncertainty feedback loop.
4 notebooks
GBMs Across Nine Asset Classes
Systematic evaluation across all nine case studies with 30+ experiments reveals five patterns: GBMs beat linear baselines in seven or eight of nine primary-label comparisons (with the largest gain in CME futures, where nonlinear term-structure interactions dominate); shallow-to-moderate trees trained with an MAE loss win most comparisons; horizon and label specification affect IC as much as model choice (with winsorization gains often exceeding the GBM-vs-linear improvement); walk-forward validation generalizes to holdout data without catastrophic breakdowns; and TreeSHAP resolves disagreements between native gain and split-count importance metrics. TabM beats GBMs on several case studies, indicating that tree-based inductive bias is not the only viable path.
1 notebook
Key Takeaways
Related Case Studies
See where these chapter concepts get applied in end-to-end trading workflows.
ETF Cross-Asset Exposures
All six model families compared across 100 ETFs spanning 9 asset classes
Crypto Perpetuals Funding
Alternative data and non-standard frequencies in 24/7 crypto markets
NASDAQ-100 Microstructure
Intraday microstructure signals across 114 stocks at 15-minute frequency
S&P 500 Equity + Option Analytics
Combining options-derived features with equity data for multi-source prediction
US Firm Characteristics
Classic factor investing with ML on monthly fundamental data
FX Spot Pairs
Momentum and carry factors in the world's most liquid market
CME Futures
Carry signals across 30 products — data quality as the critical variable
S&P 500 Options (Straddles)
Direct options trading and why equity-style cost models fail for options
US Equities Panel
Large-scale cross-sectional prediction across 3,200 stocks with 16 walk-forward folds