Methodology Highlight
Demonstrates that data engineering choices, such as the back-adjustment method used to build continuous contracts, can affect results more than model selection, which underscores the importance of data pipeline validation.

This case study uses daily data from Databento for 30 CME futures products across 7 sectors — equity indices, treasuries, energy, metals, currencies, agriculture, and livestock. Futures have a unique return decomposition (spot return plus roll yield), natural sector groupings, and inherent leverage that make them structurally different from equity cross-sections.
The central lesson is about data quality: how upstream data engineering choices cascade through the entire pipeline. The case study demonstrates how the method used to construct continuous contracts (ratio vs difference back-adjustment) changes model behavior — the same models on the same features produce qualitatively different results depending on this single preprocessing decision.
Students learn carry factor construction from term structure data, continuous contract building methods, and sector-grouped cross-sectional analysis. The case study teaches that model selection is secondary to data preparation when the data pipeline itself introduces systematic bias.
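The ratio versus difference contrast described above can be illustrated with a minimal sketch. It assumes a single roll between two contracts and uses made-up prices; the function names `back_adjust` and `pct_returns` are hypothetical and are not taken from the case-study notebooks.

```python
def back_adjust(old, new, roll_idx, method="ratio"):
    """Splice two contract price series at roll_idx and back-adjust the history.

    old, new: prices of the expiring and the next contract on the same dates.
    roll_idx: first index at which the continuous series uses the new contract.
    """
    if method == "ratio":
        factor = new[roll_idx] / old[roll_idx]   # multiplicative roll gap
        history = [p * factor for p in old[:roll_idx]]
    elif method == "difference":
        gap = new[roll_idx] - old[roll_idx]      # additive roll gap
        history = [p + gap for p in old[:roll_idx]]
    else:
        raise ValueError(f"unknown method: {method}")
    return history + list(new[roll_idx:])


def pct_returns(prices):
    """Simple one-period percentage returns."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


# Toy prices (hypothetical): the next contract trades above the expiring one.
old = [50.0, 51.0, 52.0, 53.0]
new = [55.0, 56.0, 57.0, 58.5]

ratio_series = back_adjust(old, new, roll_idx=2, method="ratio")
diff_series = back_adjust(old, new, roll_idx=2, method="difference")

ratio_returns = pct_returns(ratio_series)
diff_returns = pct_returns(diff_series)
```

In this toy case the ratio method keeps the expiring contract's percentage returns intact (51/50 - 1 = 2%), while the difference method keeps price changes intact and therefore reports 56/55 - 1 for the same day. Features and labels built from percentage returns inherit that discrepancy.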

Strategy Summary

Long-short carry-ranked strategy across 30 CME products. Weekly Friday-close decisions with Monday-open execution. Carry (roll yield from term structure) is the primary signal, combined with momentum. The strategy exploits contango and backwardation patterns across the 7 sectors. Cost model includes commission, spread, and roll slippage.

Data Sources

Databento (CME futures)

ML Techniques

Carry factor from term structure
Continuous contract construction
GBM with sector features
Data quality sensitivity analysis

ML Pipeline

Universe & Setup
1 notebook
30 CME futures products across 7 sectors (equity indices, treasuries, energy, metals, currencies, agriculture, livestock). Weekly Friday-close / Monday-open cadence. Uses `product` (not `symbol`) as entity identifier. Return decomposition into spot + roll components, with Panama back-adjustment for continuous series. 5 CV folds (8Y train, 1Y val). Commission, spread, and roll slippage costs.
Universe & Protocol Setup Ch 6
Defines the weekly Friday-close / Monday-open decision cadence for CME products across multiple sectors. Documents roll conventions, continuous series construction via back-adjustment, and return decomposition into spot and carry components. Builds walk-forward CV splits.
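A walk-forward layout with 5 folds, 8 training years, and 1 validation year could be generated along the following lines. This is only a sketch: the rolling (rather than expanding) training window, the calendar-year granularity, the start year 2011, and the helper name `walk_forward_folds` are all assumptions made for illustration.

```python
def walk_forward_folds(first_year, n_folds=5, train_years=8, val_years=1):
    """Return inclusive (start_year, end_year) ranges for each fold.

    Assumes a rolling training window that slides forward by val_years per fold.
    """
    folds = []
    for k in range(n_folds):
        train_start = first_year + k * val_years
        train_end = train_start + train_years - 1
        val_start = train_end + 1
        val_end = val_start + val_years - 1
        folds.append({"train": (train_start, train_end), "val": (val_start, val_end)})
    return folds


# Hypothetical start year, chosen only to make the example concrete.
folds = walk_forward_folds(first_year=2011)
for fold in folds:
    print(fold)
```

Every validation year starts strictly after its training window ends, which is the property that keeps the evaluation out-of-sample.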
Labels & Evaluation
2 notebooks
5-day forward return from ratio back-adjusted continuous prices (primary), 21-day variant. Return decomposed into spot and carry components. Evaluation covers 71+ features with special focus on the back-adjustment sensitivity: ratio vs difference methods produce dramatically different IC landscapes. HAC-adjusted IC with FDR correction across 30 products and 7 sectors.
Label Engineering Ch 7
Constructs 5-day and 21-day forward return labels from continuous futures prices. Decomposes total return into spot and carry components via front-deferred spread. Generates walk-forward CV splits with purge and embargo buffers and computes ATR-scaled triple-barrier labels.
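The forward-return label and the purge step can be sketched as follows. The sketch assumes integer row positions on a single continuous price series and treats the embargo as extra rows dropped before validation; `forward_returns` and `purge_train` are hypothetical names, not the notebook's API.

```python
def forward_returns(prices, horizon=5):
    """Label at row t is the return from t to t + horizon; None where the future is unknown."""
    n = len(prices)
    return [
        prices[t + horizon] / prices[t] - 1.0 if t + horizon < n else None
        for t in range(n)
    ]


def purge_train(train_idx, val_start, horizon, embargo=0):
    """Keep training rows whose label window, plus embargo, ends before validation starts."""
    return [i for i in train_idx if i + horizon + embargo < val_start]


# Toy continuous price series (hypothetical values).
prices = [100.0 + i for i in range(10)]
labels = forward_returns(prices, horizon=5)

kept = purge_train(range(8), val_start=8, horizon=5)
kept_with_embargo = purge_train(range(8), val_start=8, horizon=5, embargo=1)
```

Without the purge, the last few training rows would carry labels computed from prices inside the validation window, which leaks information across the split.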
Feature Evaluation Ch 7
Evaluates all features against the 5-day forward return label using HAC-adjusted Information Coefficients with Benjamini-Hochberg FDR correction. Diagnoses feature shape via quantile monotonicity and redundancy via pairwise correlation across front-month products. Triages features as PROCEED / REVISE / STOP for downstream modeling.
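The two statistical steps named above can be sketched in plain Python. The sketch assumes a Newey-West estimator with a Bartlett kernel for the HAC adjustment (the notebook may use a different kernel or lag rule) and the standard Benjamini-Hochberg step-up procedure; the IC values and p-values are invented for illustration.

```python
import math


def newey_west_tstat(series, lags):
    """t-statistic of the mean using a Bartlett-kernel long-run variance."""
    n = len(series)
    mean = sum(series) / n
    e = [x - mean for x in series]
    lrv = sum(v * v for v in e) / n                      # lag-0 autocovariance
    for j in range(1, lags + 1):
        gamma_j = sum(e[t] * e[t - j] for t in range(j, n)) / n
        lrv += 2.0 * (1.0 - j / (lags + 1)) * gamma_j    # Bartlett weight
    return mean / math.sqrt(lrv / n)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a reject flag per hypothesis, controlling FDR at alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank                                # largest passing rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= cutoff
    return reject


# Hypothetical per-period IC values for one feature.
weekly_ic = [0.02, 0.04, 0.03, 0.05, 0.01]
t_plain = newey_west_tstat(weekly_ic, lags=0)
t_hac = newey_west_tstat(weekly_ic, lags=2)

# Hypothetical p-values for four features.
flags = benjamini_hochberg([0.001, 0.02, 0.04, 0.5], alpha=0.05)
```

With overlapping 5-day labels the IC series is autocorrelated by construction, which is why a plain t-statistic (the `lags=0` case) is not a safe basis for feature triage.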
Feature Engineering
2 notebooks
71 features unique to futures: carry (basis, roll yield, curve slope), term structure shape, cross-sectional carry normalization, seasonal features for agriculture and energy products, and roll proximity indicators. These carry and term structure features are unavailable in equity case studies. Temporal features from ARIMA for carry z-score mean-reversion, FFT for seasonal cycle detection, and HMM for portfolio-level carry regime switching.
Feature Engineering Ch 8
Builds futures-specific features: carry (basis, roll yield, curve slope, z-scores), term structure curvature, cross-sectional carry rank, momentum at multiple horizons, Yang-Zhang volatility, seasonal indicators for agriculture and energy, and roll proximity. Computes sector-conditional normalization across the multi-sector universe.
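A carry feature and its cross-sectional rank might be computed as in the sketch below. The sign convention (positive carry when the front contract trades above the deferred one), the 365-day annualization, and the quotes are assumptions for illustration; the case study's exact `carry_pct` definition is not given here.

```python
def annualized_carry(front, deferred, days_between):
    """Annualized percentage gap between front and deferred prices.

    Positive in backwardation (front > deferred), negative in contango.
    """
    return (front / deferred - 1.0) * 365.0 / days_between


def cross_sectional_rank(values):
    """Map each product to a rank in [0, 1] on one date (needs >= 2 products, no tie handling)."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {name: pos / (n - 1) for pos, name in enumerate(ordered)}


# Hypothetical (front price, deferred price, days between expiries) per product.
quotes = {
    "CL": (80.0, 78.0, 30),
    "GC": (2000.0, 2020.0, 60),
    "ZC": (450.0, 448.0, 90),
}

carry = {product: annualized_carry(*q) for product, q in quotes.items()}
ranks = cross_sectional_rank(carry)
```

Ranking within each date puts products with very different price levels and curve steepness on a common scale, which is what a carry-ranked long-short strategy needs.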
Temporal Features Ch 9
Fits three temporal model families using expanding-window walk-forward discipline: AutoARIMA on carry z-score per product for mean-reversion forecasts, rolling FFT for seasonal cycle detection in carry percentage, and a 2-state Gaussian HMM on portfolio-level carry for regime switching. Produces temporal features merged with the Ch8 feature matrix.
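The seasonal-cycle step can be illustrated with a rolling dominant-period estimate. To stay dependency-free, this sketch uses a direct discrete Fourier transform in place of an FFT library call; the window length, the synthetic 12-period sine, and the function names are assumptions, not the notebook's settings.

```python
import cmath
import math


def dominant_period(window):
    """Period (in samples) of the strongest non-zero frequency in the window."""
    n = len(window)
    mean = sum(window) / n
    x = [v - mean for v in window]                 # remove the zero-frequency term
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


def rolling_dominant_period(series, window):
    """One estimate per window end; each uses only data up to that point."""
    return [
        dominant_period(series[end - window:end])
        for end in range(window, len(series) + 1)
    ]


# Synthetic carry series with a 12-period cycle (hypothetical data).
series = [math.sin(2 * math.pi * t / 12) for t in range(60)]
periods = rolling_dominant_period(series, window=48)
```

Because each estimate only looks backward from its window end, the resulting feature can be merged into a walk-forward feature matrix without look-ahead.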
Modeling
8 notebooks
The strongest nonlinearity diagnostic in the book: linear models produce negative IC on the same features where GBM achieves +0.043. The carry signal exists only in feature interactions -- shallow trees (7 leaves) are optimal for 30 products. Three latent factor approaches: general extraction, SDF pricing kernel, and term structure PCA. Causal DML tests whether carry signal causes returns or proxies for risk. Switching the back-adjustment method flips linear-model IC from positive to negative and destroys causal significance entirely.
Linear Models Ch 11
Trains Ridge, LASSO, and ElasticNet on walk-forward folds across CME products using the `product` entity column. Stores per-fold coefficients and generates backtesting-ready predictions. Registers results to the model registry for cross-family comparison.
GBM Grid Search Ch 12
Searches LightGBM configurations across leaf-count profiles and objectives with IC evaluated at iteration checkpoints to detect overfitting. Trains on GPU with walk-forward folds across the multi-sector cross-section. Registers best checkpoint per config for downstream backtest.
Tabular DL (TabM) Ch 12
Trains TabM (rank-1 adapter MLP ensemble) in small/medium/large variants on walk-forward folds with IC checkpoints. Compares TabM's learned feature interactions against GBM's tree-based splits on the futures cross-section. Registers predictions for backtest.
LSTM Ch 13
Trains LSTM with gated recurrence on 60-day lookback windows across CME products. Tests whether sequential memory captures carry regime transitions (contango to backwardation) from the temporal trajectory of term structure features. Loads prior linear and GBM baselines for comparison.
Latent Factor Models Ch 14
Runs CAE and related latent factor models via walk-forward CV on a balanced panel of CME products. Applies rank normalization and extracts latent factors, testing whether the multi-sector structure (equities, rates, energy, metals, currencies, agriculture, livestock) emerges as interpretable factors.
Stochastic Discount Factor Ch 14
Trains a stochastic discount factor network that maps observable futures characteristics to pricing weights satisfying the Euler equation. Uses a compact architecture suited to the futures panel. Compares SDF predictive IC against PCA and linear baselines via walk-forward CV.
Term Structure PCA Ch 14
Applies PCA to rank-normalized cross-sectional characteristics of CME futures products via walk-forward CV. Extracts and interprets level/slope/curvature-like factors across commodity sectors. Compares predictive IC at multiple horizons against linear and GBM baselines.
Causal DML Ch 15
Applies Double Machine Learning to carry_pct (treatment) across CME products with vol_21d, momentum_composite, and carry_rank as confounders. Runs walk-forward DML estimation with placebo permutations and classifies refutation robustness. Registers causal effect estimates to the model registry.
Strategy Pipeline
5 notebooks
Long-short carry-ranked strategy with equal-risk and score-weighted allocation. Sector concentration limits tested across the 7 natural groupings. Holdout Sharpe decays ~50% from +0.61 to +0.30. Commission, spread, and roll slippage impact quantified. Weekly cadence keeps turnover manageable.
Model Analysis Ch 11
Compares best-in-family IC across all model families (linear, GBM, TabM, LSTM, latent factors, causal DML) trained on CME futures. Evaluates checkpoint sensitivity, fold stability, prediction bucket monotonicity, and cross-family prediction correlation. Produces the ranking that determines which models advance to backtest.
Backtest & Signal Evaluation Ch 16
Runs plumbing test (random signal verification), then sweeps all (prediction x entry scheme) combinations across CME products at weekly cadence. Computes Deflated Sharpe Ratio for multiple-testing correction and visualizes the IC-vs-Sharpe scatter. Registers all signal-stage backtest results.
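The multiple-testing correction can be sketched with the Deflated Sharpe Ratio formula of Bailey and López de Prado. The inputs below (per-period Sharpe, number of trials, variance of Sharpe ratios across trials) are hypothetical, and the notebook's own implementation may differ in how it estimates them.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials, sharpe_variance):
    """Expected best Sharpe among n_trials (>= 2) skill-less strategies."""
    norm = NormalDist()
    a = norm.inv_cdf(1.0 - 1.0 / n_trials)
    b = norm.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe_ratio(sharpe, n_obs, n_trials, sharpe_variance,
                          skew=0.0, kurtosis=3.0):
    """Probability that the true Sharpe exceeds the expected maximum under the null.

    sharpe is per period (not annualized); kurtosis is raw (3.0 for a normal).
    """
    sr0 = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1.0 - skew * sharpe + (kurtosis - 1.0) / 4.0 * sharpe ** 2)
    z = (sharpe - sr0) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z)


# Hypothetical sweep: same observed Sharpe, different numbers of backtests tried.
dsr_10 = deflated_sharpe_ratio(0.10, n_obs=250, n_trials=10, sharpe_variance=0.0025)
dsr_100 = deflated_sharpe_ratio(0.10, n_obs=250, n_trials=100, sharpe_variance=0.0025)
```

The same observed Sharpe looks less convincing as the number of (prediction x entry scheme) combinations grows, which is the point of applying the correction to a full sweep.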
Portfolio Construction Ch 17
Sweeps top signal-stage predictions x TOP_K concentration levels x 6 allocators (equal-weight, score-weighted, inverse-vol, risk-parity, MVO, HRP) for CME products. Tests how allocator choice interacts with sector concentration in a narrow universe where high concentration is inherent.
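One of the allocators, inverse-volatility weighting inside a top-K long-short book, might look like the sketch below. The dollar-neutral split (each leg at 50% gross), the scores, and the volatilities are assumptions for illustration; `long_short_inverse_vol` is a hypothetical helper.

```python
def long_short_inverse_vol(scores, vols, top_k):
    """Long the top_k scores, short the bottom_k, weighting each leg by 1/vol."""
    if 2 * top_k > len(scores):
        raise ValueError("top_k too large: long and short legs would overlap")
    ranked = sorted(scores, key=scores.get)          # ascending by score
    shorts, longs = ranked[:top_k], ranked[-top_k:]

    def leg(names, gross):
        inv = {name: 1.0 / vols[name] for name in names}
        total = sum(inv.values())
        return {name: gross * v / total for name, v in inv.items()}

    weights = leg(longs, 0.5)
    weights.update(leg(shorts, -0.5))
    return weights


# Hypothetical model scores and annualized volatilities per product.
scores = {"CL": 0.9, "GC": 0.1, "ZC": 0.5, "ES": 0.7, "ZN": 0.3}
vols = {"CL": 0.40, "GC": 0.15, "ZC": 0.25, "ES": 0.20, "ZN": 0.05}

weights = long_short_inverse_vol(scores, vols, top_k=2)
```

With only 30 products, a small TOP_K leaves each leg holding a handful of names, so the weighting rule inside the leg has a visible effect on sector concentration.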
Transaction Costs Ch 18
Sweeps a cost grid on top allocation-stage combinations for CME futures. Measures Sharpe decay from gross to net, identifies breakeven cost level, and tests viability at institutional futures execution costs. Weekly cadence limits turnover.
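The gross-to-net decay and the breakeven cost can be sketched with a linear cost model, net = gross - cost x turnover. The weekly returns, turnover figures, and cost grid are invented; a real sweep would separate commission, spread, and roll slippage rather than use a single rate.

```python
import statistics


def net_returns(gross, turnover, cost_per_turnover):
    """Subtract a proportional cost on the notional traded each period."""
    return [g - cost_per_turnover * t for g, t in zip(gross, turnover)]


def sharpe(returns, periods_per_year=52):
    """Annualized Sharpe ratio from per-period returns (zero risk-free rate)."""
    return statistics.mean(returns) / statistics.stdev(returns) * periods_per_year ** 0.5


def breakeven_cost(gross, turnover):
    """Cost rate at which cumulative net return is exactly zero."""
    return sum(gross) / sum(turnover)


# Hypothetical weekly gross returns and turnover (fraction of notional traded).
gross = [0.004, -0.002, 0.006, 0.000]
turnover = [0.5, 0.4, 0.6, 0.5]

cost_grid = [0.0, 0.001, 0.002]
sharpe_by_cost = {c: sharpe(net_returns(gross, turnover, c)) for c in cost_grid}
breakeven = breakeven_cost(gross, turnover)
```

In this toy example the breakeven rate is 40 basis points per unit of turnover; comparing that number with realistic futures execution costs is what decides viability.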
Risk Management Ch 19
Applies position-level (stop-loss, trailing stop, time exit) and portfolio-level (drawdown breaker, daily loss limit) risk controls on top allocation combos. Calibrates trailing stops via MAE/MFE analysis. Tests whether risk overlays improve the drawdown profile for a carry-driven weekly strategy.
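MAE/MFE analysis measures, per trade, the worst and best unrealized return reached before exit. A minimal sketch follows; the trades are invented, and the calibration rule shown (stop distance set to the largest adverse excursion that winning trades survived) is one hypothetical heuristic, not necessarily the rule used in the notebook.

```python
def trade_stats(entry, path, side=1):
    """Return MAE, MFE, and final P&L as fractions of the entry price.

    side is +1 for a long trade and -1 for a short trade.
    """
    moves = [side * (p / entry - 1.0) for p in path]
    return {"mae": min(moves), "mfe": max(moves), "pnl": moves[-1]}


def stop_from_winners(stats):
    """Largest adverse excursion tolerated by trades that ended profitable."""
    winner_mae = [s["mae"] for s in stats if s["pnl"] > 0]
    return -min(winner_mae)


# Hypothetical trades: (entry price, price path after entry, side).
trades = [
    (100.0, [101.0, 98.0, 103.0, 102.0], 1),
    (50.0, [49.0, 51.0, 48.5], -1),
    (200.0, [196.0, 190.0, 192.0], 1),
]

stats = [trade_stats(entry, path, side) for entry, path, side in trades]
stop_distance = stop_from_winners(stats)
```

A stop set tighter than this distance would have closed at least one trade that ended up profitable, which is the trade-off the MAE/MFE view makes explicit.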
Synthesis & Verdict
1 notebook
Data quality teaching case: a single upstream decision (back-adjustment method) cascades through IC, model selection, and causal significance. Verdict: Advance -- after confirming data quality. All 6 model families trained, providing complete comparative evidence alongside ETFs.
Strategy Analysis Ch 20
Synthesizes signal, allocation, cost, and risk results into a structured deployment verdict. Traces the champion strategy through all pipeline stages, computes search risk across the full backtest sweep, evaluates holdout degradation, and produces the per-case-study verdict consumed by Ch20.