Book Guide¶

Use this guide to move between Machine Learning for Trading, Third Edition and ml4t-engineer without guessing which notebook maps to which production API.

ml4t-engineer appears in two distinct ways throughout the book:

pedagogical notebooks that build ideas step by step
reusable library workflows that collapse those ideas into stable APIs

The book teaches the method. Use the library when you want to reuse that method across assets, case studies, and production pipelines.

How to Use This Guide¶

Start from the book if you want intuition, derivations, and plots. Start from the library guides if you want reusable implementations, validated APIs, and pipeline integration.

Use this page when you need to answer one of these questions:

"Which chapter teaches the concept behind this API?"
"Which notebook should I read before using this workflow in production?"
"Which library guide replaces the manual code from the book?"

Chapter Map¶

Chapter 3: Market Microstructure¶

Book path	What the book teaches	Library entry point	Docs page
`03_market_microstructure/code/08_itch_bar_sampling.py`	Why time bars are statistically weak and how tick, volume, and dollar bars improve sampling	`TickBarSampler`, `VolumeBarSampler`, `DollarBarSampler`	Alternative Bars
`03_market_microstructure/code/10_itch_information_bars.py`	Imbalance bars and threshold dynamics	`TickImbalanceBarSampler`, `FixedTickImbalanceBarSampler`, `FixedVolumeImbalanceBarSampler`	Alternative Bars
`03_market_microstructure/code/13_databento_bar_sampling.py`	Applying bar samplers to a modern vendor feed	Same sampler family with production input contracts	Alternative Bars

What changes when you move to the library:

the book focuses on the statistical motivation and diagnostics
the library gives you stable samplers, warnings, and reusable OHLCV outputs
fixed-threshold imbalance bars are the recommended production path
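
To make the sampling idea concrete, here is a minimal, standard-library sketch of dollar-bar construction. It is not the library's `DollarBarSampler`: the `Bar` dataclass, the `dollar_bars` function, the `(price, size)` tick layout, and the choice to drop a trailing partial bar are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """Hypothetical OHLCV container; not the library's output schema."""
    open: float
    high: float
    low: float
    close: float
    volume: float
    dollar_value: float


def dollar_bars(ticks, threshold):
    """Group (price, size) ticks into bars that each trade at least `threshold` dollars."""
    bars = []
    prices, volume, dollars = [], 0.0, 0.0
    for price, size in ticks:
        prices.append(price)
        volume += size
        dollars += price * size
        if dollars >= threshold:
            # Close the bar as soon as cumulative traded value crosses the threshold.
            bars.append(Bar(prices[0], max(prices), min(prices), prices[-1], volume, dollars))
            prices, volume, dollars = [], 0.0, 0.0
    return bars  # assumption: a trailing partial bar is dropped


ticks = [(100.0, 10), (101.0, 20), (99.0, 5), (102.0, 30), (103.0, 1)]
bars = dollar_bars(ticks, threshold=3000.0)
for bar in bars:
    print(bar)
```

Tick and volume bars follow the same pattern with a different accumulator: count ticks or sum `size` instead of summing `price * size`.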

Chapter 7: Defining the Learning Task¶

Book path	What the book teaches	Library entry point	Docs page
`07_defining_learning_task/code/02_preprocessing_pipeline.py`	Leakage-safe preprocessing and split-aware scaling	`StandardScaler`, `MinMaxScaler`, `RobustScaler`, `PreprocessingPipeline`	Preprocessing
`07_defining_learning_task/code/03_label_methods.py`	Triple-barrier, percentile, trend-scanning, meta-labeling, and sample weighting	`LabelingConfig`, `triple_barrier_labels`, `rolling_percentile_binary_labels`, `trend_scanning_labels`, `meta_labels`	Labeling
`07_defining_learning_task/code/04_minimum_favorable_adverse_excursion.py`	Barrier behavior and excursion analysis	`LabelingConfig.triple_barrier()`	Labeling
`07_defining_learning_task/code/10_ml4t_library_ecosystem.py`	The library-oriented view of feature computation, discovery, and dataset building	`compute_features`, `feature_catalog`, `create_dataset_builder`	Features, Feature Discovery, Dataset Builder

What changes when you move to the library:

manual notebook experiments become serialized `LabelingConfig` workflows
feature discovery moves from ad hoc inspection to metadata-driven search
dataset preparation defaults to train-only scaling and splitter-aware folds
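
As a reminder of what the triple-barrier method computes, the sketch below labels each entry point by whichever barrier is touched first. It is a simplified illustration in plain Python, not the library's `triple_barrier_labels`: the function name, the fixed percentage barriers, and the convention of returning 0 at the time barrier are assumptions.

```python
def triple_barrier_label(prices, start, upper_pct, lower_pct, max_holding):
    """Return +1 (upper barrier first), -1 (lower barrier first), or 0 (time barrier)."""
    entry = prices[start]
    upper = entry * (1.0 + upper_pct)
    lower = entry * (1.0 - lower_pct)
    # The vertical barrier is capped at the last available observation.
    end = min(start + max_holding, len(prices) - 1)
    for t in range(start + 1, end + 1):
        if prices[t] >= upper:
            return 1
        if prices[t] <= lower:
            return -1
    return 0  # neither horizontal barrier was touched within the holding window


prices = [100.0, 100.5, 101.0, 102.5, 99.0, 97.0, 98.0]
labels = [triple_barrier_label(prices, i, 0.02, 0.02, 3) for i in range(len(prices) - 1)]
print(labels)
```

Note that every label looks forward up to `max_holding` bars, which is why labels near a train/test boundary need purging or an embargo before cross-validation.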

Chapter 8: Feature Engineering¶

Book path	What the book teaches	Library entry point	Docs page
`08_feature_engineering/code/01_price_volume_features.py`	Momentum, trend, volatility, and volume feature intuition	`compute_features` with registry-backed indicators	Features
`08_feature_engineering/code/02_microstructure_features.py`	Microstructure feature construction and interpretation	Microstructure feature functions and registry metadata	Features
`08_feature_engineering/code/03_structural_cross_instrument_features.py`	Cross-asset and panel relationships	`ml4t.engineer.features.cross_asset`	Features
`08_feature_engineering/code/04_fundamentals_macro_calendar.py`	Lag features, calendar encodings, and ML-oriented transforms	ML feature utilities and preprocessing bridge	Features, ML Readiness

What changes when you move to the library:

the book derives and visualizes features individually
the library lets you request validated feature sets through one computation API
registry and catalog metadata help you choose features systematically
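
The registry idea can be illustrated with a toy example: feature functions register themselves under a name, and one entry point computes whatever names are requested. Everything here (`TOY_REGISTRY`, `register`, `compute_toy_features`, and the two features) is hypothetical and only mirrors the pattern; the real `compute_features` signature and registry are documented in Features and the API Reference.

```python
import math

TOY_REGISTRY = {}  # hypothetical name -> function mapping


def register(name):
    """Decorator that stores a feature function under `name`."""
    def decorator(func):
        TOY_REGISTRY[name] = func
        return func
    return decorator


@register("momentum_2")
def momentum_2(closes):
    # Two-period simple return; None where there is not enough history.
    return [None, None] + [closes[i] / closes[i - 2] - 1.0 for i in range(2, len(closes))]


@register("log_return")
def log_return(closes):
    # One-period log return; None for the first observation.
    return [None] + [math.log(closes[i] / closes[i - 1]) for i in range(1, len(closes))]


def compute_toy_features(closes, names):
    """Compute every requested feature, failing loudly on unknown names."""
    unknown = [n for n in names if n not in TOY_REGISTRY]
    if unknown:
        raise KeyError(f"unknown features: {unknown}")
    return {name: TOY_REGISTRY[name](closes) for name in names}


closes = [100.0, 110.0, 121.0, 133.1]
features = compute_toy_features(closes, ["momentum_2", "log_return"])
print(sorted(features))
```

Validating names up front is the point of the pattern: a typo in a feature specification fails immediately instead of silently producing a dataset with a missing column.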

Chapter 9: Time-Series Analysis¶

Book path	What the book teaches	Library entry point	Docs page
`09_time_series_analysis/code/03_fractional_differencing.py`	The memory-stationarity tradeoff and ADF-based search for `d`	`ffdiff`, `find_optimal_d`, `fdiff_diagnostics`	Fractional Differencing
`09_time_series_analysis/code/08_garch_volatility.py`	Volatility estimators and conditional volatility modeling	Volatility feature family	Features
`09_time_series_analysis/code/09_har_rough_volatility.py`	Multi-horizon volatility structure	Volatility features used downstream in pipelines	Features
`09_time_series_analysis/code/11_hmm_regimes.py`	Regime detection workflows	Regime feature family	Features
`09_time_series_analysis/code/13_regime_as_feature.py`	Turning regimes into model inputs	Regime features in reusable pipelines	Features
`09_time_series_analysis/code/14_panel_features.py`	Cross-sectional panel features	Cross-asset feature functions for multi-asset inputs	Features

What changes when you move to the library:

research notebooks stay focused on method validation
library functions package the same transforms into repeatable feature pipelines
the case studies show how to combine these transforms with labels and CV
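
For the fractional differencing row, the core computation is a truncated weighted sum over past values. The sketch below uses the standard recursion for the binomial weights, w_0 = 1 and w_k = -w_{k-1} (d - k + 1) / k, and drops weights whose magnitude falls below a threshold (the fixed-width window variant). The function names and default threshold are assumptions, and the ADF-based search for `d` is not shown; this is not the library's `ffdiff` implementation.

```python
def ffd_weights(d, threshold=1e-4, max_len=1000):
    """Fractional differencing weights, truncated once |w_k| < threshold."""
    weights = [1.0]
    k = 1
    while k < max_len:
        w = -weights[-1] * (d - k + 1) / k
        if abs(w) < threshold:
            break
        weights.append(w)
        k += 1
    return weights


def frac_diff(series, d, threshold=1e-4):
    """Apply the fixed-width window; positions without full history stay None."""
    w = ffd_weights(d, threshold)
    width = len(w)
    out = [None] * len(series)
    for t in range(width - 1, len(series)):
        out[t] = sum(w[k] * series[t - k] for k in range(width))
    return out


print(ffd_weights(0.5)[:3])                    # weights decay slowly for fractional d
print(frac_diff([1.0, 3.0, 6.0, 10.0], d=1.0))  # d = 1 reduces to a first difference
```

With `d = 1` the weights collapse to `[1.0, -1.0]`, which is a useful sanity check; fractional values of `d` keep a long, slowly decaying window and therefore retain more memory of the level series.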

Case-Study Pipeline Map¶

Most case studies follow the same structure. ml4t-engineer is the handoff point between raw market data and model-ready datasets.

Case-study step	Typical file	Library workflow	Docs page
Labels	`case_studies/<study>/code/02_labels.py`	`LabelingConfig`, barrier labels, percentile labels, fixed-horizon labels	Labeling
Features	`case_studies/<study>/code/03_features.py`	`compute_features` plus feature-specific functions	Features
Temporal prep	`case_studies/<study>/code/04_temporal.py`	fractional differencing and leakage-safe preparation	Fractional Differencing, Dataset Builder

Examples called out in the current integration audit:

ETFs: percentile labels, production `compute_features`, fractional differencing
US Equities Panel: triple-barrier labels, panel features, fractional differencing
CME Futures: ATR-based barriers and futures-aware labeling workflows
NASDAQ-100 Microstructure: manual pedagogical implementations that overlap strongly with the production-ready microstructure features in this library

From Notebook Code to Library API¶

Use this translation when moving from the book to reusable code:

Book pattern	Library equivalent
Manually computing several indicators in sequence	`compute_features(data, feature_spec)`
Notebook-only feature browsing	`feature_catalog.list()`, `feature_catalog.search()`, `registry.get()`
Inline barrier parameters spread across a notebook	`LabelingConfig.triple_barrier(...)` or `LabelingConfig.atr_barrier(...)`
One-off train/test scaling	`create_dataset_builder(..., scaler=...)`
Manual stationarity experiments	`find_optimal_d()` then `ffdiff()`
Bar-construction experiments	sampler classes in `ml4t.engineer.bars`
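
The "one-off train/test scaling" row is where leakage most often creeps in, so a small illustration may help. The sketch below fits standardization statistics on the training slice only and reuses them on the test slice. It uses only the standard library; `fit_standardizer` and `transform` are hypothetical helpers, not the interface of `create_dataset_builder`.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Estimate mean and standard deviation from training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # guard against a constant series
    return mu, sigma


def transform(values, mu, sigma):
    """Apply previously fitted statistics; never refit on new data."""
    return [(v - mu) / sigma for v in values]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 50.0]
train, test = series[:5], series[5:]  # chronological split, no shuffling

mu, sigma = fit_standardizer(train)
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)
print(mu, round(sigma, 4), [round(v, 2) for v in test_scaled])
```

Fitting on the full series instead would let the outlier in the test period shrink every training value, which is exactly the look-ahead effect that split-aware scaling is meant to prevent.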

Maturity and Scope¶

These are the workflows readers should prioritize:

production-ready: feature computation, labeling methods, alternative bars, fractional differencing, feature discovery, dataset builder
advanced: cross-asset feature workflows and pipeline orchestration
experimental or low-priority: DuckDB store and any workflows not yet used in the book or case studies

This matches the current audit: the strongest value in ml4t-engineer is the production API for feature engineering, labeling, and dataset preparation.

Where to Go Next¶

Start with Quickstart if you want a working example first.
Read Features for the core computation API.
Read Labeling if you are building supervised targets.
Read Alternative Bars for microstructure workflows.
Read Dataset Builder for leakage-safe model inputs.
Use API Reference for exact signatures.
