Chapter 8

Financial Feature Engineering

6 sections · 8 notebooks · 20 references

Learning Objectives

  • Translate a trading hypothesis into a documented feature specification using horizon alignment, driver hypothesis, and role separation.
  • Choose a feature's reference frame, representation, and aggregation to match the economic claim and execution horizon, and distinguish hypothesis-changing choices from noise-control choices.
  • Distinguish signal features from state variables and identify when each should be used marginally, as an interaction, or as a conditioning variable.
  • Design representative feature specifications across price-derived, structural and cross-instrument, and contextual data families, with explicit timing assumptions and failure modes.
  • Combine signals with state variables using gating, scaling, and conditional variants, and evaluate whether the interaction adds incremental information.
  • Apply point-in-time discipline to slow-moving and revised data, including reporting lags, event timing, and vintage-aware availability rules.
  • Control feature-search degrees of freedom using one-knob-at-a-time exploration, within-family deduplication, and multiple-testing-aware triage.
Figure 8.2
8.1

Capturing and Configuring the Economic Drivers

This section establishes the framework for turning strategy narratives into computable features through three filters: horizon alignment (matching lookback and aggregation to the label horizon), driver hypothesis (mapping to persistence, reversion, risk compensation, or predictable-clock mechanisms), and role separation (classifying each feature as a signal or a state variable). It introduces three configurable knobs — reference frame, representation, and aggregation — and requires a formal feature specification that records name, family, role, driver, inputs, lookback, failure modes, and observability constraints. The reader learns that a feature without a named mechanism and documented failure modes is an input awaiting a hypothesis, not a finished design.

8.2

Price-Derived Features

This section develops five feature families computed from an asset's own price, volume, and trade data: trend/momentum (cumulative returns, moving-average slopes, residual momentum), reversal (distance-to-anchor statistics with explicit anchor and normalization choices), volatility and tail risk (range-based estimators including Parkinson, Garman-Klass, and Yang-Zhang, with 5-14x efficiency gains over close-to-close), liquidity/tradability proxies, and microstructure/order-flow features (OFI, Kyle's lambda, depth metrics). For each family, the section separates meaning-changing knobs from noise-reduction knobs and catalogues typical failure modes. The reader takes away a systematic vocabulary for price-derived features, case-by-case guidance on estimator selection, and the discipline of testing delay sensitivity before optimizing microstructure details.

2 notebooks
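The range-based estimators this section names can be illustrated with a minimal sketch. The Parkinson estimator below is a standard formulation from high/low bars; the 21-day window, the 252-day annualization, and the toy constant-range series are illustrative assumptions, not settings from the chapter's notebooks:

```python
import numpy as np
import pandas as pd

def parkinson_vol(high: pd.Series, low: pd.Series, window: int = 21) -> pd.Series:
    """Rolling annualized Parkinson volatility from high/low ranges.

    Per-bar variance is the squared log high-low range scaled by
    1 / (4 ln 2); averaging ranges is markedly more efficient than the
    close-to-close estimator when bars have no overnight gaps.
    """
    per_bar_var = np.log(high / low) ** 2 / (4.0 * np.log(2.0))
    return np.sqrt(per_bar_var.rolling(window).mean() * 252)

# toy data: a constant 2% intraday high-low range
idx = pd.date_range("2024-01-01", periods=60, freq="B")
high = pd.Series(102.0, index=idx)
low = pd.Series(100.0, index=idx)
vol = parkinson_vol(high, low)
```

The same rolling pattern extends to Garman-Klass and Yang-Zhang by swapping in their per-bar variance terms; the window length is a noise-control knob, while the choice of estimator (which gaps and drifts it admits) is a meaning-changing one.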

8.3

Structural and Cross-Instrument Features

This section covers three feature families that require data beyond a single price series: carry and term structure (roll yield, funding rates, curve shape), cross-asset relative value (peer-mean deviations, factor-neutral residuals, lead-lag dynamics), and options-implied features (ATM IV, implied-realized spread, risk reversals, term-structure slope, put-call skew). A worked example shows that SPY-TLT rolling correlation conditions momentum IC with a 17 percentage-point swing across regimes, demonstrating that cross-instrument features can carry more conditioning power than any single-asset signal. The reader learns that neutralization defines the hypothesis (not cleanup), that surface policy drift is a hidden structural break, and that lead-lag claims require careful validation — the section's own SPY-to-sector test yields an instructive negative result.


1 notebook
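The conditioning pattern behind the worked example can be sketched with synthetic data. The series below stand in for an equity and a bond ETF (not actual SPY/TLT data), and the 63-day correlation window, 126-day momentum lookback, and zero-correlation cut are illustrative assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=500, freq="B")
# synthetic daily returns standing in for an equity ETF and a bond ETF
eq = pd.Series(rng.normal(0.0004, 0.010, len(idx)), index=idx)
bd = pd.Series(rng.normal(0.0002, 0.006, len(idx)), index=idx)

# cross-instrument state variable: rolling 63-day correlation
corr = eq.rolling(63).corr(bd)

# single-asset signal: 126-day momentum on the equity leg
mom = eq.rolling(126).sum()

# condition the signal on the correlation regime (hypothetical cut at 0)
df = pd.DataFrame({"mom": mom, "corr": corr}).dropna()
df["regime"] = np.where(df["corr"] > 0, "pos_corr", "neg_corr")
by_regime = df.groupby("regime")["mom"].mean()
```

In the chapter's setting the per-regime statistic of interest is the momentum IC rather than the mean signal, but the mechanics are the same: the correlation series is a state variable that partitions the sample, not a signal traded on its own.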

8.4

Contextual and Slow-Moving Features

This section covers three externally sourced feature families — fundamentals/characteristics, calendar/event encodings, and macro/policy state — united by low update frequency, strict point-in-time requirements, and a primary role as conditioning variables for faster signals. It emphasizes that repeating quarterly values across daily rows inflates nominal sample size without adding information, and that reporting lags, revision policies, and vintage series are meaning-changing knobs. Calendar features use cyclical sin/cos encodings and time-to-event proximity, while macro state variables (yield-curve slope, credit spreads, VIX percentile) condition when trend and carry signals work. The reader learns that for these features, data integrity is the binding constraint and that point-in-time correctness is non-negotiable.

1 notebook
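Two of the section's mechanics fit in a short sketch: the cyclical sin/cos encoding of calendar position, and shifting a slow-moving series by its reporting lag so each value only appears once it is actually observable. The 45-day lag and quarterly values below are hypothetical placeholders for whatever the data vendor's policy actually is:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=366, freq="D")
df = pd.DataFrame(index=idx)

# cyclical encoding of day-of-year: the sin/cos pair removes the
# artificial discontinuity between Dec 31 and Jan 1 that a raw
# ordinal day-of-year feature would create
doy = idx.dayofyear
df["doy_sin"] = np.sin(2 * np.pi * doy / 365.25)
df["doy_cos"] = np.cos(2 * np.pi * doy / 365.25)

# point-in-time availability for a quarterly fundamental: the Q-end
# value only becomes usable after a reporting lag (45 days here, an
# assumed policy), then is forward-filled across daily rows
q_end = pd.Series([1.0, 1.1, 1.2, 1.3],
                  index=pd.to_datetime(["2024-03-31", "2024-06-30",
                                        "2024-09-30", "2024-12-31"]))
available = q_end.copy()
available.index = available.index + pd.Timedelta(days=45)
df["fundamental"] = available.reindex(df.index).ffill()
```

Note the forward-fill repeats each quarterly value across daily rows: the feature is usable for conditioning, but those repeated rows add no new observations, which is exactly the inflated-sample-size trap the section warns about.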

8.5

Cross-Cutting Feature Types and the Limits of Direct Aggregation

This section identifies two cross-cutting feature types that span families — learned representations (PCA, autoencoders, foundation-model embeddings) and flows/positioning proxies (CoT, on-chain data) — and argues that direct aggregation over trailing windows cannot capture conditional dynamics, latent states, cyclical structure, or path geometry. It marks the transition from deterministic features (Sections 8.2-8.4) to model-based features (Chapter 9), explaining that fitted objects such as GARCH, HMM, FFT, and path signatures extract structure invisible in raw series but require walk-forward discipline for point-in-time correctness.
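The walk-forward discipline required for fitted objects applies even to a representation as simple as PCA: components must be estimated on past data only and then frozen when projecting later observations. A minimal sketch on synthetic one-factor returns (the 200-day fit window and factor structure are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# 250 days x 8 instruments of synthetic returns driven by one common factor
factor = rng.normal(0, 0.010, 250)
loadings = rng.uniform(0.5, 1.5, 8)
rets = np.outer(factor, loadings) + rng.normal(0, 0.004, (250, 8))

# fit PCA on the first 200 days only (point-in-time discipline), then
# project the remaining days onto the frozen first component
train, test = rets[:200], rets[200:]
mu = train.mean(axis=0)                      # centering fit on train only
_, _, vt = np.linalg.svd(train - mu, full_matrices=False)
pc1_test = (test - mu) @ vt[0]               # first-PC exposure as a feature
```

Refitting the decomposition on the full sample would leak future covariance structure into earlier feature values — the same failure mode the chapter flags for GARCH, HMM, and signature-based features in Chapter 9.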

8.6

Combining Features and Controlling Search

This section addresses how to combine signal features with state features through three interaction templates — gating (trade only when favorable), scaling (adjust exposure by state), and conditional variants (signal-in-regime) — and provides a worked example showing momentum IC decaying monotonically across volatility terciles. It prescribes degrees-of-freedom discipline: budget variants by family and role, vary one knob at a time, and deduplicate near-duplicates via hierarchical clustering before modeling. Three implementation choices that change the hypothesis rather than reduce noise — residualization, winsorization bounds, and fractional differencing order — are flagged as requiring walk-forward fitting. The reader learns that feature combination multiplies the searched set and must be accompanied by explicit trial-family accounting.

4 notebooks
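The gating template and the per-tercile IC diagnostic can be sketched together. Everything below is synthetic and illustrative — lookbacks, the 5-day label horizon, and the choice to trade only the low-volatility tercile are assumptions, and on random data the per-tercile ICs will hover near zero rather than reproduce the chapter's monotone decay:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
idx = pd.date_range("2018-01-01", periods=1000, freq="B")
ret = pd.Series(rng.normal(0, 0.01, len(idx)), index=idx)

signal = ret.rolling(60).sum().shift(1)   # lagged momentum signal
vol = ret.rolling(21).std().shift(1)      # lagged volatility state variable
fwd = ret.rolling(5).sum().shift(-5)      # 5-day forward-return label

df = pd.DataFrame({"sig": signal, "vol": vol, "fwd": fwd}).dropna()
df["tercile"] = pd.qcut(df["vol"], 3, labels=["low", "mid", "high"])

# conditional-variant diagnostic: Spearman IC of the signal within each
# volatility tercile -- does the signal-label relation vary with state?
ic = df.groupby("tercile", observed=True)[["sig", "fwd"]].apply(
    lambda g: g["sig"].corr(g["fwd"], method="spearman"))

# gating template: keep the signal only in the low-vol state, else flat
gated = df["sig"].where(df["tercile"] == "low", 0.0)
```

Scaling would replace the hard `where` with a continuous multiplier in the state variable; either way, each gated or scaled variant is a new member of the searched set and belongs in the trial-family accounting the section prescribes.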