Chapter 8

Financial Feature Engineering

6 sections · 8 notebooks · 20 references

Learning Objectives

  • Translate a trading hypothesis into a documented feature specification using horizon alignment, driver hypothesis, and role separation.
  • Choose a feature's reference frame, representation, and aggregation to match the economic claim and execution horizon, and distinguish hypothesis-changing choices from noise-control choices.
  • Distinguish signal features from state variables and identify when each should be used marginally, as an interaction, or as a conditioning variable.
  • Design representative feature specifications across price-derived, structural and cross-instrument, and contextual data families, with explicit timing assumptions and failure modes.
  • Combine signals with state variables using gating, scaling, and conditional variants, and evaluate whether the interaction adds incremental information.
  • Apply point-in-time discipline to slow-moving and revised data, including reporting lags, event timing, and vintage-aware availability rules.
  • Control feature-search degrees of freedom using one-knob-at-a-time exploration, within-family deduplication, and multiple-testing-aware triage.
Figure 8.2
8.1

Capturing and Configuring the Economic Drivers

This section establishes the framework for turning strategy narratives into computable features through three filters: horizon alignment (matching lookback and aggregation to the label horizon), driver hypothesis (mapping to persistence, reversion, risk compensation, or predictable-clock mechanisms), and role separation (classifying each feature as a signal or a state variable). It introduces three configurable knobs — reference frame, representation, and aggregation — and requires a formal feature specification that records name, family, role, driver, inputs, lookback, failure modes, and observability constraints. The reader learns that a feature without a named mechanism and documented failure modes is an input awaiting a hypothesis, not a finished design.

8.2

Price-Derived Features

This section develops five feature families computed from an asset's own price, volume, and trade data: trend/momentum (cumulative returns, moving-average slopes, residual momentum), reversal (distance-to-anchor statistics with explicit anchor and normalization choices), volatility and tail risk (range-based estimators including Parkinson, Garman-Klass, and Yang-Zhang, with 5-14x efficiency gains over close-to-close), liquidity/tradability proxies, and microstructure/order-flow features (OFI, Kyle's lambda, depth metrics). For each family, the section separates meaning-changing knobs from noise-reduction knobs and catalogues typical failure modes. The reader takes away a systematic vocabulary for price-derived features, case-by-case guidance on estimator selection, and the discipline of testing delay sensitivity before optimizing microstructure details.

2 notebooks
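The range-based estimators this section names can be illustrated with a minimal sketch. The Parkinson estimator below is a standard formulation from high/low bars; the 21-day window, the 252-day annualization, and the toy constant-range series are illustrative assumptions, not settings from the chapter's notebooks:

```python
import numpy as np
import pandas as pd

def parkinson_vol(high: pd.Series, low: pd.Series, window: int = 21) -> pd.Series:
    """Rolling annualized Parkinson volatility from high/low ranges.

    Per-bar variance is the squared log high-low range scaled by
    1 / (4 ln 2); averaging ranges is markedly more efficient than the
    close-to-close estimator when bars have no overnight gaps.
    """
    per_bar_var = np.log(high / low) ** 2 / (4.0 * np.log(2.0))
    return np.sqrt(per_bar_var.rolling(window).mean() * 252)

# toy data: a constant 2% intraday high-low range
idx = pd.date_range("2024-01-01", periods=60, freq="B")
high = pd.Series(102.0, index=idx)
low = pd.Series(100.0, index=idx)
vol = parkinson_vol(high, low)
```

The same rolling pattern extends to Garman-Klass and Yang-Zhang by swapping in their per-bar variance terms; the window length is a noise-control knob, while the choice of estimator (which gaps and drifts it admits) is a meaning-changing one.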

8.3

Structural and Cross-Instrument Features

This section covers three feature families that require data beyond a single price series: carry and term structure (roll yield, funding rates, curve shape), cross-asset relative value (peer-mean deviations, factor-neutral residuals, lead-lag dynamics), and options-implied features (ATM IV, implied-realized spread, risk reversals, term-structure slope, put-call skew). A worked example shows that SPY-TLT rolling correlation conditions momentum IC with a 17 percentage-point swing across regimes, demonstrating that cross-instrument features can carry more conditioning power than any single-asset signal. The reader learns that neutralization defines the hypothesis (not cleanup), that surface policy drift is a hidden structural break, and that lead-lag claims require careful validation — the section's own SPY-to-sector test yields an instructive negative result.


1 notebook
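The conditioning pattern behind the worked example can be sketched with synthetic data. The series below stand in for an equity and a bond ETF (not actual SPY/TLT data), and the 63-day correlation window, 126-day momentum lookback, and zero-correlation cut are illustrative assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=500, freq="B")
# synthetic daily returns standing in for an equity ETF and a bond ETF
eq = pd.Series(rng.normal(0.0004, 0.010, len(idx)), index=idx)
bd = pd.Series(rng.normal(0.0002, 0.006, len(idx)), index=idx)

# cross-instrument state variable: rolling 63-day correlation
corr = eq.rolling(63).corr(bd)

# single-asset signal: 126-day momentum on the equity leg
mom = eq.rolling(126).sum()

# condition the signal on the correlation regime (hypothetical cut at 0)
df = pd.DataFrame({"mom": mom, "corr": corr}).dropna()
df["regime"] = np.where(df["corr"] > 0, "pos_corr", "neg_corr")
by_regime = df.groupby("regime")["mom"].mean()
```

In the chapter's setting the per-regime statistic of interest is the momentum IC rather than the mean signal, but the mechanics are the same: the correlation series is a state variable that partitions the sample, not a signal traded on its own.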

8.4

Contextual and Slow-Moving Features

This section covers three externally sourced feature families — fundamentals/characteristics, calendar/event encodings, and macro/policy state — united by low update frequency, strict point-in-time requirements, and a primary role as conditioning variables for faster signals. It emphasizes that repeating quarterly values across daily rows inflates nominal sample size without adding information, and that reporting lags, revision policies, and vintage series are meaning-changing knobs. Calendar features use cyclical sin/cos encodings and time-to-event proximity, while macro state variables (yield-curve slope, credit spreads, VIX percentile) condition when trend and carry signals work. The reader learns that for these features, data integrity is the binding constraint and that point-in-time correctness is non-negotiable.

1 notebook
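Two of the section's mechanics fit in a short sketch: the cyclical sin/cos encoding of calendar position, and shifting a slow-moving series by its reporting lag so each value only appears once it is actually observable. The 45-day lag and quarterly values below are hypothetical placeholders for whatever the data vendor's policy actually is:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=366, freq="D")
df = pd.DataFrame(index=idx)

# cyclical encoding of day-of-year: the sin/cos pair removes the
# artificial discontinuity between Dec 31 and Jan 1 that a raw
# ordinal day-of-year feature would create
doy = idx.dayofyear
df["doy_sin"] = np.sin(2 * np.pi * doy / 365.25)
df["doy_cos"] = np.cos(2 * np.pi * doy / 365.25)

# point-in-time availability for a quarterly fundamental: the Q-end
# value only becomes usable after a reporting lag (45 days here, an
# assumed policy), then is forward-filled across daily rows
q_end = pd.Series([1.0, 1.1, 1.2, 1.3],
                  index=pd.to_datetime(["2024-03-31", "2024-06-30",
                                        "2024-09-30", "2024-12-31"]))
available = q_end.copy()
available.index = available.index + pd.Timedelta(days=45)
df["fundamental"] = available.reindex(df.index).ffill()
```

Note the forward-fill repeats each quarterly value across daily rows: the feature is usable for conditioning, but those repeated rows add no new observations, which is exactly the inflated-sample-size trap the section warns about.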

8.5

Cross-Cutting Feature Types and the Limits of Direct Aggregation

This section identifies two cross-cutting feature types that span families — learned representations (PCA, autoencoders, foundation-model embeddings) and flows/positioning proxies (CoT, on-chain data) — and argues that direct aggregation over trailing windows cannot capture conditional dynamics, latent states, cyclical structure, or path geometry. It marks the transition from deterministic features (Sections 8.2-8.4) to model-based features (Chapter 9), explaining that fitted objects such as GARCH, HMM, FFT, and path signatures extract structure invisible in raw series but require walk-forward discipline for point-in-time correctness.
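The walk-forward discipline required for fitted objects applies even to a representation as simple as PCA: components must be estimated on past data only and then frozen when projecting later observations. A minimal sketch on synthetic one-factor returns (the 200-day fit window and factor structure are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# 250 days x 8 instruments of synthetic returns driven by one common factor
factor = rng.normal(0, 0.010, 250)
loadings = rng.uniform(0.5, 1.5, 8)
rets = np.outer(factor, loadings) + rng.normal(0, 0.004, (250, 8))

# fit PCA on the first 200 days only (point-in-time discipline), then
# project the remaining days onto the frozen first component
train, test = rets[:200], rets[200:]
mu = train.mean(axis=0)                      # centering fit on train only
_, _, vt = np.linalg.svd(train - mu, full_matrices=False)
pc1_test = (test - mu) @ vt[0]               # first-PC exposure as a feature
```

Refitting the decomposition on the full sample would leak future covariance structure into earlier feature values — the same failure mode the chapter flags for GARCH, HMM, and signature-based features in Chapter 9.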

8.6

Combining Features and Controlling Search

This section addresses how to combine signal features with state features through three interaction templates — gating (trade only when favorable), scaling (adjust exposure by state), and conditional variants (signal-in-regime) — and provides a worked example showing momentum IC decaying monotonically across volatility terciles. It prescribes degrees-of-freedom discipline: budget variants by family and role, vary one knob at a time, and deduplicate near-duplicates via hierarchical clustering before modeling. Three implementation choices that change the hypothesis rather than reduce noise — residualization, winsorization bounds, and fractional differencing order — are flagged as requiring walk-forward fitting. The reader learns that feature combination multiplies the searched set and must be accompanied by explicit trial-family accounting.

4 notebooks
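The gating template and the per-tercile IC diagnostic can be sketched together. Everything below is synthetic and illustrative — lookbacks, the 5-day label horizon, and the choice to trade only the low-volatility tercile are assumptions, and on random data the per-tercile ICs will hover near zero rather than reproduce the chapter's monotone decay:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
idx = pd.date_range("2018-01-01", periods=1000, freq="B")
ret = pd.Series(rng.normal(0, 0.01, len(idx)), index=idx)

signal = ret.rolling(60).sum().shift(1)   # lagged momentum signal
vol = ret.rolling(21).std().shift(1)      # lagged volatility state variable
fwd = ret.rolling(5).sum().shift(-5)      # 5-day forward-return label

df = pd.DataFrame({"sig": signal, "vol": vol, "fwd": fwd}).dropna()
df["tercile"] = pd.qcut(df["vol"], 3, labels=["low", "mid", "high"])

# conditional-variant diagnostic: Spearman IC of the signal within each
# volatility tercile -- does the signal-label relation vary with state?
ic = df.groupby("tercile", observed=True)[["sig", "fwd"]].apply(
    lambda g: g["sig"].corr(g["fwd"], method="spearman"))

# gating template: keep the signal only in the low-vol state, else flat
gated = df["sig"].where(df["tercile"] == "low", 0.0)
```

Scaling would replace the hard `where` with a continuous multiplier in the state variable; either way, each gated or scaled variant is a new member of the searched set and belongs in the trial-family accounting the section prescribes.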