Chapter 7

Defining the Learning Task

5 sections, 10 notebooks, 18 references

Learning Objectives

  • Build split-aware preprocessing pipelines that produce stable, auditable inputs for label and feature computation.
  • Define execution-consistent labels, including fixed-horizon and event-style constructions, and diagnose overlap, resolution behavior, and implied trading intensity.
  • Evaluate feature-label bundles fold by fold using appropriate diagnostics for continuous and discrete targets, including stability, shape, and feasibility.
  • Screen candidates for implementation feasibility using turnover, break-even cost, and liquidity or capacity checks.
  • Account for search bias by defining searched sets, separating exploration from confirmation, and applying appropriate multiple-testing adjustments to fold-level summaries.
  • Use mechanism plausibility checks to distinguish potentially stable signal channels from confounded proxies, timing artifacts, and aggregation effects.

Figure 7.1

7.1

Data Preprocessing and Encodings

4 notebooks

7.2

Label Engineering

2 notebooks

7.3

Univariate Feature-Label Evaluation

2 notebooks

7.4

Search Accounting and Multiple Testing

1 notebook

7.5

From Correlation to Causality

1 notebook