3rd Edition

ML Primer

Standalone foundations in machine learning, statistics, and quantitative methods. Reference materials that complement the workflow chapters.

61 Primers
23 Chapters
9 Model-Based Feature Extraction
8 primers
Autoregressive, Moving-Average, and ARIMA Foundations for Feature Engineering
ARIMA is rarely the star predictor in liquid markets, but it is still one of the cleanest ways to separate level, persistence, shock, and forecast uncertainty before downstream models take over.
Bayesian Inference and MCMC for Time Series
A Bayesian time-series model produces a posterior distribution, not just a fitted line, which is why posterior uncertainty can itself become a feature.
Fractional Differencing and Long Memory in Financial Features
Fractional differencing is easy to apply but harder to understand well. This primer covers the operator algebra, asymptotic weight decay, and the precise sense in which the transform preserves low-frequency dependence.
Path Signatures and Log-Signatures for Financial Sequences
Path signatures encode the ordered geometry of multivariate sequences through iterated integrals. This primer covers the algebra, Chen-style composition, and the embedding choices that decide whether the construction carries real information in finance.
State-Space Models and the Kalman Filter
Kalman filter outputs are widely used as trading features. This primer covers the deeper machinery underneath them: the innovation representation, Riccati recursion, and identification choices that determine what those features actually mean.
Structural Break Diagnostics and Time-Since-Break Features
A break test is not asking whether the series is "bad." It is asking whether one stable model is still a reasonable description of the whole sample.
Uncertainty as a Feature: Stochastic Volatility, Forecast Intervals, and Forecast Uncertainty
In trading, two models with the same point forecast are not equivalent if one is much less certain than the other.
Wavelets for Multi-Scale Diagnostics and Causal Feature Design
Wavelets are often best used to discover where the signal lives, then translated into safer causal proxies, rather than deployed naively as production features.
