Chapter 12

Advanced Models for Tabular Data

7 sections · 12 notebooks · 16 references

Learning Objectives

  • Explain how boosting differs from bagging and why sequential error correction makes GBMs effective for financial prediction
  • Select among XGBoost, LightGBM, and CatBoost based on categorical structure, compute environment, and latency needs
  • Choose appropriate GBM objectives and constraints for financial tasks, including pointwise regression, learning to rank, and monotonic constraints
  • Tune GBMs efficiently with Optuna using pruning, multi-objective search, and time-series-aware validation
  • Use TreeSHAP to analyze feature effects, interactions, instability, and drift in deployed tree-based models
  • Evaluate when tabular deep learning alternatives such as TabPFN, TabM, and TabR are worth considering relative to GBMs
  • Interpret cross-case-study evidence to decide when nonlinear tree models earn their added complexity relative to linear baselines
Figure 12.1
12.1

From Decision Trees to Ensembles

The section builds the conceptual foundation for gradient boosting through three stages: how decision trees recursively partition feature space to capture nonlinear interactions (like momentum conditional on volatility), how Random Forests reduce variance by averaging decorrelated trees but cannot correct systematic bias, and why sequential error-correction through boosting is needed to address that limitation. It establishes Random Forests as the baseline that GBMs must demonstrably outperform to justify their additional complexity.

1 notebook
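The three-stage progression can be sketched with scikit-learn on synthetic data (a stand-in for the chapter's notebook; the "momentum"/"volatility" feature names and the regime-dependent target are illustrative assumptions): a shallow single tree underfits the interaction, bagging (Random Forest) reduces variance but keeps the shared bias of its shallow base trees, and boosting corrects residual errors sequentially.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 4000
momentum = rng.normal(size=n)
volatility = rng.uniform(size=n)
# Regime-dependent interaction: momentum predicts returns only when volatility is low.
y = np.where(volatility < 0.5, momentum, -0.3 * momentum) + rng.normal(scale=0.5, size=n)
X = np.column_stack([momentum, volatility])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "single tree": DecisionTreeRegressor(max_depth=3, random_state=0),
    # Bagging: averages decorrelated trees, reducing variance but not shared bias.
    "random forest": RandomForestRegressor(n_estimators=200, max_depth=3, random_state=0),
    # Boosting: each tree fits the residual errors of the ensemble so far.
    "gradient boosting": GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    print(f"{name}: test MSE = {mean_squared_error(y_te, m.predict(X_te)):.3f}")
```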

12.2

The Workhorse: Gradient Boosting Machines

This section covers the shared gradient boosting framework (Friedman 2001) and the distinctive innovations of XGBoost (regularized objective, sparsity-aware splits, second-order approximation), LightGBM (GOSS sampling, feature bundling, leaf-wise growth), and CatBoost (ordered target statistics to prevent categorical leakage, symmetric trees for fast inference). It provides a practical library comparison showing that accuracy gaps between libraries are smaller than gaps between good and bad hyperparameter configurations. The section also covers Learning to Rank via LambdaMART for strategies that trade only cross-sectional extremes, and monotonic constraints as theory-driven regularization that prevents economically implausible nonlinear artifacts.

1 notebook

12.3

Deep Learning Alternatives for Tabular Data

The section surveys the 2024-2026 landscape of deep learning for tabular data, covering tabular foundation models (TabPFN for zero-shot prototyping), parameter-efficient neural ensembles (TabM's rank-1 adapters achieving competitive performance with architectural simplicity), and retrieval-augmented models (TabR's nearest-neighbor hybridization with temporal leakage warnings). It synthesizes benchmark evidence from TabArena, TabReD, and others into a practical decision framework organized by data regime, noting the critical caveat that attention-heavy architectures degrade faster than GBMs under temporal distribution shift, and that most major benchmarks assume IID splits, which do not reflect financial walk-forward validation.

1 notebook
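The regime-organized decision framework might be distilled into a toy rule of thumb; the function, its thresholds, and its return strings are illustrative assumptions, not benchmark-derived cutoffs from the text.

```python
def suggest_tabular_model(n_samples: int, temporal_shift: bool,
                          latency_critical: bool = False) -> str:
    """Toy heuristic for the data-regime decision framework (illustrative only)."""
    if temporal_shift:
        # Attention-heavy architectures degrade faster than GBMs under drift,
        # and most tabular DL benchmarks assume IID splits.
        return "GBM with walk-forward validation"
    if n_samples <= 10_000:
        # Small, IID-ish problems: zero-shot prototyping territory.
        return "TabPFN"
    if latency_critical:
        return "GBM (e.g. CatBoost's symmetric trees for fast inference)"
    # Larger IID datasets: parameter-efficient neural ensembles are competitive;
    # retrieval-augmented models need leakage checks on neighbor lookups.
    return "TabM (consider TabR with temporal-leakage checks)"

print(suggest_tabular_model(5_000, temporal_shift=False))
print(suggest_tabular_model(500_000, temporal_shift=True))
```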

12.4

Advanced Hyperparameter Tuning with Optuna

This section develops Optuna's Bayesian optimization for the larger hyperparameter spaces of GBMs, covering the TPE sampler, the define-by-run API with conditional parameters, and pruning strategies that can halve computation without sacrificing quality. It provides a GBM-specific tuning taxonomy (tree structure, boosting dynamics, regularization) with the practical insight that regularization parameters often have the largest impact on out-of-sample performance. The section also covers multi-objective optimization for the IC-turnover Pareto frontier and time-series-aware tuning protocols that prevent the subtle leakage of selecting hyperparameters with future information.

4 notebooks

12.5

Model Explainability with SHAP

The section extends SHAP from linear models (Chapter 11) to tree-based models via TreeSHAP, which computes exact Shapley values efficiently enough to run on every walk-forward fold as standard diagnostic infrastructure. It introduces TreeSHAP's unique capability for exact interaction decomposition, revealing that momentum's predictive power in the ETF case study is regime-conditional (collapsing when volatility exceeds the 90th percentile). The section develops SHAP-based drift monitoring as an early warning system that detects mechanism changes before they manifest in performance metrics, addresses the Rashomon effect (equally good models producing different explanations), and connects SHAP to conformal prediction for an integrated explainability-uncertainty feedback loop.

4 notebooks
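A minimal sketch of SHAP-based drift monitoring, assuming per-fold SHAP matrices (rows = samples, columns = features) have already been computed, e.g. with `shap.TreeExplainer`; here they are simulated so the sketch is self-contained, and the 0.2 alert threshold is an illustrative assumption.

```python
import numpy as np

def shap_importance(shap_values):
    """Mean |SHAP| per feature, normalized to a probability-like profile."""
    imp = np.abs(shap_values).mean(axis=0)
    return imp / imp.sum()

def drift_score(imp_prev, imp_curr):
    """Total-variation distance between consecutive folds' importance profiles."""
    return 0.5 * float(np.abs(imp_prev - imp_curr).sum())

rng = np.random.default_rng(3)
# Two stable folds, then a fold where the attribution mechanism flips between
# features 0 and 2 -- a change that precedes any visible performance decay.
stable = [rng.normal(loc=[1.0, 0.5, 0.1], scale=0.05, size=(500, 3)) for _ in range(2)]
shifted = rng.normal(loc=[0.1, 0.5, 1.0], scale=0.05, size=(500, 3))

profiles = [shap_importance(s) for s in stable + [shifted]]
scores = [drift_score(a, b) for a, b in zip(profiles, profiles[1:])]
alerts = [s > 0.2 for s in scores]  # illustrative alert threshold
print(scores, alerts)
```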

12.6

GBMs Across Nine Asset Classes

Systematic evaluation across all nine case studies, spanning more than 30 experiments, reveals five patterns: GBMs beat linear baselines in seven or eight of the nine primary-label comparisons (with the largest gain in CME futures, where nonlinear term-structure interactions dominate); shallow-to-moderate trees with MAE loss win most comparisons; horizon and label specification affect IC as much as model choice (with winsorization gains often exceeding the GBM-versus-linear improvement); walk-forward validation generalizes to holdout data without catastrophic breakdowns; and TreeSHAP resolves disagreements between native gain and split-count importance metrics. TabM beats GBMs on several case studies, indicating that tree-based inductive bias is not the only viable path.

1 notebook
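Label winsorization, whose gains the evaluation finds often exceed the GBM-versus-linear improvement, is itself a small transform. A minimal sketch on synthetic heavy-tailed returns, with illustrative 1%/99% cutoffs:

```python
import numpy as np

def winsorize(returns, lower=0.01, upper=0.99):
    """Clip labels at cross-sectional quantiles to tame heavy tails."""
    lo, hi = np.quantile(returns, [lower, upper])
    return np.clip(returns, lo, hi)

rng = np.random.default_rng(4)
r = rng.standard_t(df=3, size=10_000) * 0.02  # heavy-tailed daily-return-like labels
rw = winsorize(r)
print(f"raw max {r.max():.3f} -> winsorized max {rw.max():.3f}")
```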

12.7

Key Takeaways