Chapter 11

The ML Pipeline

6 sections · 20 notebooks · 25 references

Learning Objectives

  • Choose between regression and classification formulations based on how predictions will be translated into trading decisions
  • Fit leakage-safe regularized linear models, including Ridge, LASSO, Elastic Net, and logistic regression, using point-in-time preprocessing and standardization
  • Tune and evaluate linear models with walk-forward validation, temporal buffers, and, when needed, nested cross-validation to reduce selection bias
  • Interpret model behavior with SHAP-based diagnostics to assess feature importance, economic plausibility, and stability across refits
  • Construct and evaluate conformal prediction intervals or prediction sets, and monitor where coverage degrades under non-stationary market conditions
  • Use cross-case-study evidence to judge when linear models provide a strong baseline and when weak linear signal motivates more flexible models
Figure 11.3
11.1

From Inference to Prediction

This section argues that the shift from econometric inference to predictive modeling changes which estimator properties matter: the Gauss-Markov ideal of best linear unbiased estimation gives way to minimizing total prediction error via the bias-variance tradeoff. It develops the case through the two-cultures framework (Breiman 2001), showing how high dimensionality, multicollinearity, and low signal-to-noise ratios in financial features make unconstrained OLS unsuitable for forecasting. The section introduces Ridge, LASSO, and Elastic Net as principled responses that encode different structural priors about how signal is distributed across features, and maps the label families from Chapter 7 to regression and classification tasks.
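The three penalized objectives can be written compactly. This is one common parameterization; library implementations (e.g. scikit-learn) scale the terms slightly differently:

```latex
\begin{aligned}
\hat{\beta}^{\text{ridge}} &= \arg\min_{\beta}\; \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2 \\
\hat{\beta}^{\text{lasso}} &= \arg\min_{\beta}\; \|y - X\beta\|_2^2 + \lambda \|\beta\|_1 \\
\hat{\beta}^{\text{enet}}  &= \arg\min_{\beta}\; \|y - X\beta\|_2^2
  + \lambda \bigl( \alpha \|\beta\|_1 + (1-\alpha) \|\beta\|_2^2 \bigr)
\end{aligned}
```

The ℓ2 penalty shrinks all coefficients toward zero without zeroing them (diffuse-signal prior); the ℓ1 penalty sets some coefficients exactly to zero (sparse-signal prior); Elastic Net interpolates between the two via α.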

1 notebook

11.2

Regularized Regression

The section presents Ridge, LASSO, and Elastic Net with their formal objectives, geometric intuitions, and distinctive behaviors on correlated financial features. It covers the full practical machinery required for deployment: standardization with leakage-safe protocols, hyperparameter optimization via Optuna with nested cross-validation to guard against validation overfitting, loss function choice (MSE, MAE, Huber, quantile), evaluation metrics (IC, ICIR, RMSE), and sample weighting for overlapping labels and recency adaptation. Empirical results on the ETF case study demonstrate that Ridge achieves a 1.5x ICIR improvement over OLS at optimal regularization, while LASSO degrades performance because the ETF signal is diffusely distributed across correlated features.
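A minimal sketch of the leakage-safe protocol described above, on synthetic data: the scaler is fit inside each training fold only, and a temporal gap separates train from test to buffer overlapping labels. The data, the `alpha` value, and the 5-bar gap are illustrative assumptions, not the chapter's actual configuration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))                      # synthetic features
y = X @ rng.standard_normal(10) * 0.05 + rng.standard_normal(500)  # weak signal

# Pipeline refits the scaler on each training fold only -> no look-ahead leakage
model = make_pipeline(StandardScaler(), Ridge(alpha=10.0))

# gap= leaves a temporal buffer between train and test (hypothetical 5-bar horizon)
cv = TimeSeriesSplit(n_splits=5, gap=5)
ics = []
for train_idx, test_idx in cv.split(X):
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    # information coefficient: rank correlation of predictions vs realized outcomes
    ic = np.corrcoef(pred.argsort().argsort(),
                     y[test_idx].argsort().argsort())[0, 1]
    ics.append(ic)
```

In a full run, the per-fold ICs feed the ICIR (mean IC divided by its standard deviation), and `alpha` would be tuned, e.g. with Optuna, inside a nested version of this loop.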

2 notebooks

11.3

Predicting Direction with Logistic Regression

This section develops logistic regression as the regularized classification baseline for discrete trading decisions, covering binary and multinomial formulations, probability calibration, and the conversion from predicted probabilities to trading signals via threshold-based, probability-weighted, and rank-based methods. It explains why regularization is even more critical for classification than regression (the maximum likelihood objective diverges under near-perfect separability) and addresses class imbalance through inverse-frequency weighting. The section provides practical guidance on when classification outperforms regression (binary trading decisions, noisy return targets) versus when it underperforms (asymmetric payoffs, magnitude-dependent position sizing).
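The three probability-to-signal conversions can be sketched as follows; the thresholds, data, and class-weighting choice are illustrative assumptions rather than the chapter's exact settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.standard_normal((400, 8))
y = (X[:, 0] * 0.3 + rng.standard_normal(400) > 0).astype(int)  # noisy up/down label

# L2-regularized logistic baseline; class_weight counters imbalance via
# inverse-frequency weighting
clf = LogisticRegression(C=1.0, class_weight="balanced").fit(X, y)
p_up = clf.predict_proba(X)[:, 1]

# 1) threshold-based: trade only with conviction, flat in the middle band
threshold_sig = np.where(p_up > 0.55, 1, np.where(p_up < 0.45, -1, 0))
# 2) probability-weighted: position proportional to the edge over 50/50
prob_weighted = 2.0 * p_up - 1.0
# 3) rank-based: cross-sectional rank mapped to [-0.5, 0.5]
rank_sig = p_up.argsort().argsort() / (len(p_up) - 1) - 0.5
```

Threshold signals discard magnitude information, which is one reason classification underperforms when payoffs are asymmetric or positions must scale with expected move size.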

1 notebook

11.4

Inside the Black Box: Model Interpretability with SHAP

The section establishes SHAP as the primary interpretability framework, grounded in Shapley values from cooperative game theory, which uniquely satisfy efficiency, symmetry, null-player, and linearity axioms. It develops a four-layer economic narrative protocol — sign consistency, magnitude plausibility, stability across walk-forward windows, and regime-conditional analysis — that transforms SHAP from a visualization tool into a continuous diagnostic for distinguishing genuine signal from overfitting. The section demonstrates both global feature importance and local waterfall explanations on the ETF case study, and addresses limitations including causal misinterpretation and impossible coalitions from marginalizing correlated features.
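For a linear model with independent features, Shapley values have a closed form, which makes the axioms easy to verify directly. A minimal NumPy sketch with made-up coefficients (what `shap.LinearExplainer` computes for this model class):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((300, 5))
beta = np.array([0.5, -0.2, 0.0, 0.1, 0.3])   # illustrative coefficients
f = X @ beta + 0.01                            # linear model predictions

# Exact Shapley values for a linear model (feature-independence assumption):
# phi_i(x) = beta_i * (x_i - E[x_i])
phi = (X - X.mean(axis=0)) * beta

# Efficiency axiom: attributions sum to prediction minus baseline prediction
assert np.allclose(phi.sum(axis=1), f - f.mean())

# Global importance = mean |SHAP value| per feature;
# the null-player axiom forces feature 2 (beta = 0) to zero importance
global_importance = np.abs(phi).mean(axis=0)
```

The marginalization over `E[x_i]` is also where the "impossible coalitions" limitation enters: with correlated features, replacing one feature by its marginal mean can produce joint values never observed in the data.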

1 notebook

11.5

Quantifying Predictive Uncertainty

This section introduces conformal prediction as a distribution-free framework that wraps prediction intervals around any base model, without parametric assumptions on the error distribution. It progresses from split-conformal prediction (fixed-width intervals with finite-sample marginal coverage guarantees) through Conformalized Quantile Regression (adaptive-width intervals) to Adaptive Conformal Inference (online coverage correction for non-stationary data). Empirical results on the ETF case study show that CQR+ACI progressively closes the conditional coverage gap during high-volatility periods (from 82.3% to 88.1% for a 90% target), and the section previews how interval width maps to position sizing in Chapter 19.
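Split-conformal prediction, the starting point of that progression, reduces to a quantile of held-out nonconformity scores. A sketch on simulated calibration data (the base model's predictions are faked here; in practice they come from the fitted Ridge model):

```python
import numpy as np

rng = np.random.default_rng(3)
n_cal = 500
y_cal = rng.standard_normal(n_cal)                      # calibration outcomes
pred_cal = 0.8 * y_cal + 0.2 * rng.standard_normal(n_cal)  # imperfect base-model predictions

scores = np.abs(y_cal - pred_cal)        # nonconformity scores (absolute residuals)

alpha = 0.10                             # target 90% marginal coverage
# finite-sample-corrected quantile: ceil((n+1)(1-alpha))-th order statistic
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q_hat = np.sort(scores)[k - 1]

def interval(pred):
    """Fixed-width split-conformal interval for a new prediction."""
    return pred - q_hat, pred + q_hat
```

The fixed width is exactly the limitation CQR addresses: every interval has half-width `q_hat` regardless of local volatility, and ACI then adjusts the effective `alpha` online when realized coverage drifts under non-stationarity.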

1 notebook

11.6

Linear Models Across Nine Case Studies

Running the same regularized pipeline across all nine case studies reveals that linear signal availability varies dramatically by asset class and market structure: delta-hedged options and ETFs show the strongest ICs, while CME futures and crypto are near zero on primary labels. The section demonstrates that label preprocessing (winsorization, horizon selection) often dominates model selection in its effect on IC, that Ridge wins the clear majority of primary-label comparisons due to the pervasive correlation structure of financial features, and that classification can outperform raw-return regression on noisy targets. A pedagogical backtest shows that Ridge's IC advantage over simple momentum is eroded by higher turnover, establishing the IC-cost tradeoff as the central tension for Chapters 16-19.
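The IC-cost tension described above can be made concrete with back-of-envelope arithmetic. All numbers below are illustrative assumptions, not the chapter's backtest results:

```python
# Hypothetical information coefficients and annualized turnover levels
ic_ridge, ic_momentum = 0.05, 0.03
turnover_ridge, turnover_momentum = 2.0, 0.5   # Ridge rebalances far more often
cost_bps = 10                                  # cost per unit of turnover

# Grinold-style proxy: gross alpha proportional to IC (same breadth/vol assumed)
gross_ridge = ic_ridge * 100
gross_mom = ic_momentum * 100

net_ridge = gross_ridge - turnover_ridge * cost_bps
net_mom = gross_mom - turnover_momentum * cost_bps
# a higher-IC model can lose net of costs once its extra turnover is charged
```

Under these stylized numbers, momentum's lower gross alpha survives costs better than Ridge's, which is the tradeoff Chapters 16-19 take up in full.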

2 notebooks