Chapter 11

The ML Pipeline

6 sections · 20 notebooks · 25 references

Learning Objectives

  • Choose between regression and classification formulations based on how predictions will be translated into trading decisions
  • Fit leakage-safe regularized linear models, including Ridge, LASSO, Elastic Net, and logistic regression, using point-in-time preprocessing and standardization
  • Tune and evaluate linear models with walk-forward validation, temporal buffers, and, when needed, nested cross-validation to reduce selection bias
  • Interpret model behavior with SHAP-based diagnostics to assess feature importance, economic plausibility, and stability across refits
  • Construct and evaluate conformal prediction intervals or prediction sets, and monitor where coverage degrades under non-stationary market conditions
  • Use cross-case-study evidence to judge when linear models provide a strong baseline and when weak linear signal motivates more flexible models
Figure 11.3
11.1

From Inference to Prediction

This section argues that the shift from econometric inference to predictive modeling changes which estimator properties matter: the Gauss-Markov ideal of best linear unbiased estimation gives way to minimizing total prediction error via the bias-variance tradeoff. It develops the case through the two-cultures framework (Breiman 2001), showing how high dimensionality, multicollinearity, and low signal-to-noise ratios in financial features make unconstrained OLS unsuitable for forecasting. The section introduces Ridge, LASSO, and Elastic Net as principled responses that encode different structural priors about how signal is distributed across features, and maps the label families from Chapter 7 to regression and classification tasks.
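The three penalized objectives can be written compactly. This is one common parameterization; library implementations (e.g. scikit-learn) scale the terms slightly differently:

```latex
\begin{aligned}
\hat{\beta}^{\text{ridge}} &= \arg\min_{\beta}\; \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2 \\
\hat{\beta}^{\text{lasso}} &= \arg\min_{\beta}\; \|y - X\beta\|_2^2 + \lambda \|\beta\|_1 \\
\hat{\beta}^{\text{enet}}  &= \arg\min_{\beta}\; \|y - X\beta\|_2^2
  + \lambda \bigl( \alpha \|\beta\|_1 + (1-\alpha) \|\beta\|_2^2 \bigr)
\end{aligned}
```

The ℓ2 penalty shrinks all coefficients toward zero without zeroing them (diffuse-signal prior); the ℓ1 penalty sets some coefficients exactly to zero (sparse-signal prior); Elastic Net interpolates between the two via α.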

1 notebook

11.2

Regularized Regression

The section presents Ridge, LASSO, and Elastic Net with their formal objectives, geometric intuitions, and distinctive behaviors on correlated financial features. It covers the full practical machinery required for deployment: standardization with leakage-safe protocols, hyperparameter optimization via Optuna with nested cross-validation to guard against validation overfitting, loss function choice (MSE, MAE, Huber, quantile), evaluation metrics (IC, ICIR, RMSE), and sample weighting for overlapping labels and recency adaptation. Empirical results on the ETF case study demonstrate that Ridge achieves a 1.5x ICIR improvement over OLS at optimal regularization, while LASSO degrades performance because the ETF signal is diffusely distributed across correlated features.
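A minimal sketch of the leakage-safe protocol described above, on synthetic data: the scaler is fit inside each training fold only, and a temporal gap separates train from test to buffer overlapping labels. The data, the `alpha` value, and the 5-bar gap are illustrative assumptions, not the chapter's actual configuration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))                      # synthetic features
y = X @ rng.standard_normal(10) * 0.05 + rng.standard_normal(500)  # weak signal

# Pipeline refits the scaler on each training fold only -> no look-ahead leakage
model = make_pipeline(StandardScaler(), Ridge(alpha=10.0))

# gap= leaves a temporal buffer between train and test (hypothetical 5-bar horizon)
cv = TimeSeriesSplit(n_splits=5, gap=5)
ics = []
for train_idx, test_idx in cv.split(X):
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    # information coefficient: rank correlation of predictions vs realized outcomes
    ic = np.corrcoef(pred.argsort().argsort(),
                     y[test_idx].argsort().argsort())[0, 1]
    ics.append(ic)
```

In a full run, the per-fold ICs feed the ICIR (mean IC divided by its standard deviation), and `alpha` would be tuned, e.g. with Optuna, inside a nested version of this loop.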

2 notebooks

11.3

Predicting Direction with Logistic Regression

This section develops logistic regression as the regularized classification baseline for discrete trading decisions, covering binary and multinomial formulations, probability calibration, and the conversion from predicted probabilities to trading signals via threshold-based, probability-weighted, and rank-based methods. It explains why regularization is even more critical for classification than regression (the maximum likelihood objective diverges under near-perfect separability) and addresses class imbalance through inverse-frequency weighting. The section provides practical guidance on when classification outperforms regression (binary trading decisions, noisy return targets) versus when it underperforms (asymmetric payoffs, magnitude-dependent position sizing).
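The three probability-to-signal conversions can be sketched as follows; the thresholds, data, and class-weighting choice are illustrative assumptions rather than the chapter's exact settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.standard_normal((400, 8))
y = (X[:, 0] * 0.3 + rng.standard_normal(400) > 0).astype(int)  # noisy up/down label

# L2-regularized logistic baseline; class_weight counters imbalance via
# inverse-frequency weighting
clf = LogisticRegression(C=1.0, class_weight="balanced").fit(X, y)
p_up = clf.predict_proba(X)[:, 1]

# 1) threshold-based: trade only with conviction, flat in the middle band
threshold_sig = np.where(p_up > 0.55, 1, np.where(p_up < 0.45, -1, 0))
# 2) probability-weighted: position proportional to the edge over 50/50
prob_weighted = 2.0 * p_up - 1.0
# 3) rank-based: cross-sectional rank mapped to [-0.5, 0.5]
rank_sig = p_up.argsort().argsort() / (len(p_up) - 1) - 0.5
```

Threshold signals discard magnitude information, which is one reason classification underperforms when payoffs are asymmetric or positions must scale with expected move size.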

1 notebook

11.4

Inside the Black Box: Model Interpretability with SHAP

The section establishes SHAP as the primary interpretability framework, grounded in Shapley values from cooperative game theory, which uniquely satisfy efficiency, symmetry, null-player, and linearity axioms. It develops a four-layer economic narrative protocol — sign consistency, magnitude plausibility, stability across walk-forward windows, and regime-conditional analysis — that transforms SHAP from a visualization tool into a continuous diagnostic for distinguishing genuine signal from overfitting. The section demonstrates both global feature importance and local waterfall explanations on the ETF case study, and addresses limitations including causal misinterpretation and impossible coalitions from marginalizing correlated features.
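For a linear model with independent features, Shapley values have a closed form, which makes the axioms easy to verify directly. A minimal NumPy sketch with made-up coefficients (what `shap.LinearExplainer` computes for this model class):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((300, 5))
beta = np.array([0.5, -0.2, 0.0, 0.1, 0.3])   # illustrative coefficients
f = X @ beta + 0.01                            # linear model predictions

# Exact Shapley values for a linear model (feature-independence assumption):
# phi_i(x) = beta_i * (x_i - E[x_i])
phi = (X - X.mean(axis=0)) * beta

# Efficiency axiom: attributions sum to prediction minus baseline prediction
assert np.allclose(phi.sum(axis=1), f - f.mean())

# Global importance = mean |SHAP value| per feature;
# the null-player axiom forces feature 2 (beta = 0) to zero importance
global_importance = np.abs(phi).mean(axis=0)
```

The marginalization over `E[x_i]` is also where the "impossible coalitions" limitation enters: with correlated features, replacing one feature by its marginal mean can produce joint values never observed in the data.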

1 notebook

11.5

Quantifying Predictive Uncertainty

This section introduces conformal prediction as a distribution-free framework that wraps prediction intervals around any base model, without parametric assumptions on the error distribution. It progresses from split-conformal prediction (fixed-width intervals with finite-sample marginal coverage guarantees) through Conformalized Quantile Regression (adaptive-width intervals) to Adaptive Conformal Inference (online coverage correction for non-stationary data). Empirical results on the ETF case study show that CQR+ACI progressively closes the conditional coverage gap during high-volatility periods (from 82.3% to 88.1% for a 90% target), and the section previews how interval width maps to position sizing in Chapter 19.
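Split-conformal prediction, the starting point of that progression, reduces to a quantile of held-out nonconformity scores. A sketch on simulated calibration data (the base model's predictions are faked here; in practice they come from the fitted Ridge model):

```python
import numpy as np

rng = np.random.default_rng(3)
n_cal = 500
y_cal = rng.standard_normal(n_cal)                      # calibration outcomes
pred_cal = 0.8 * y_cal + 0.2 * rng.standard_normal(n_cal)  # imperfect base-model predictions

scores = np.abs(y_cal - pred_cal)        # nonconformity scores (absolute residuals)

alpha = 0.10                             # target 90% marginal coverage
# finite-sample-corrected quantile: ceil((n+1)(1-alpha))-th order statistic
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q_hat = np.sort(scores)[k - 1]

def interval(pred):
    """Fixed-width split-conformal interval for a new prediction."""
    return pred - q_hat, pred + q_hat
```

The fixed width is exactly the limitation CQR addresses: every interval has half-width `q_hat` regardless of local volatility, and ACI then adjusts the effective `alpha` online when realized coverage drifts under non-stationarity.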

1 notebook

11.6

Linear Models Across Nine Case Studies

Running the same regularized pipeline across all nine case studies reveals that linear signal availability varies dramatically by asset class and market structure: delta-hedged options and ETFs show the strongest ICs, while CME futures and crypto are near zero on primary labels. The section demonstrates that label preprocessing (winsorization, horizon selection) often dominates model selection in its effect on IC, that Ridge wins the clear majority of primary-label comparisons due to the pervasive correlation structure of financial features, and that classification can outperform raw-return regression on noisy targets. A pedagogical backtest shows that Ridge's IC advantage over simple momentum is eroded by higher turnover, establishing the IC-cost tradeoff as the central tension for Chapters 16-19.
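The IC-cost tension described above can be made concrete with back-of-envelope arithmetic. All numbers below are illustrative assumptions, not the chapter's backtest results:

```python
# Hypothetical information coefficients and annualized turnover levels
ic_ridge, ic_momentum = 0.05, 0.03
turnover_ridge, turnover_momentum = 2.0, 0.5   # Ridge rebalances far more often
cost_bps = 10                                  # cost per unit of turnover

# Grinold-style proxy: gross alpha proportional to IC (same breadth/vol assumed)
gross_ridge = ic_ridge * 100
gross_mom = ic_momentum * 100

net_ridge = gross_ridge - turnover_ridge * cost_bps
net_mom = gross_mom - turnover_momentum * cost_bps
# a higher-IC model can lose net of costs once its extra turnover is charged
```

Under these stylized numbers, momentum's lower gross alpha survives costs better than Ridge's, which is the tradeoff Chapters 16-19 take up in full.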

2 notebooks