Statistical Methods¶

ml4t-diagnostic implements rigorous statistical methods from the academic literature to address the specific challenges of evaluating ML-based trading strategies. Each method page explains the problem it solves, the mathematics behind it, and how to use it in practice.

Use this section when you want the statistical rationale and assumptions behind a workflow. If you are looking for the fastest path to a working implementation, start with the User Guide and return here when you need method-level justification.

Why These Methods Matter¶

Standard evaluation metrics (Sharpe ratio, accuracy, R-squared) are misleading when applied naively to trading strategies. The core problems:

Multiple testing: Testing many strategies inflates the best result
Temporal dependence: Financial time series are autocorrelated
Information leakage: Forward-looking labels contaminate train/test splits

The methods below address these problems with mathematical rigor.

Method Overview¶

Method	Problem Solved	Key Function	Reference
Deflated Sharpe Ratio	Selection bias from testing many strategies	`deflated_sharpe_ratio()`	Lopez de Prado et al. (2025)
CPCV	Backtest overfitting detection	`CombinatorialCV`	Lopez de Prado (2018)
HAC-adjusted IC	Autocorrelation in IC significance testing	`compute_ic_hac_stats()`	Newey & West (1987)

Methods by Category¶

Multiple Testing Corrections¶

Method	When to Use	Computational Cost
DSR	Quick assessment of best strategy	O(1)
RAS	Correlated strategies, rigorous bounds	O(n_sim x T x N)
FDR	Screening many p-values	O(N log N)
Holm-Bonferroni	Confirmatory analysis, no false positives	O(N log N)

Cross-Validation¶

Method	When to Use	# Paths
WalkForwardCV	Standard time-series validation	N folds
CombinatorialCV	Backtest overfitting detection	C(N,k) paths

Information Coefficient Analysis¶

Method	When to Use	Handles Autocorrelation?
Naive IC	Quick signal assessment	No
HAC-adjusted IC	Publication-grade significance	Yes (Newey-West)
Bootstrap IC	Non-parametric inference	Yes (stationary bootstrap)

Decision Flowchart¶

Is your Sharpe ratio "too good to be true"?
├── Yes → How many strategies did you test?
│   ├── Known → Deflated Sharpe Ratio (DSR)
│   ├── Many & correlated → Rademacher Anti-Serum (RAS)
│   └── Many & independent → False Discovery Rate (FDR)
└── No → Is your IC significant?
    ├── Check with HAC-adjusted IC
    └── Validate with CPCV backtest paths

See It In The Book¶

The book introduces these methods in the validation chapters and then reuses them throughout the case studies:

Ch06 for walk-forward validation and CPCV
Ch07 for HAC-adjusted IC, DSR, and related significance testing
Ch08-Ch09 for feature triage and robustness workflows

Use the Book Guide for the notebook and case-study map.

Next Steps¶

Cross-Validation -- apply CPCV and walk-forward validation in practice
Statistical Tests Guide -- see how these methods fit a broader testing workflow
Four-Tier Validation -- place each method inside the full validation stack
Academic References -- review the underlying papers and citations