3rd Edition
ML Primer
Standalone foundations in machine learning, statistics, and quantitative methods. Reference materials that complement the workflow chapters.
61
Primers
23
Chapters
3
Market Microstructure
1 primer
4
Fundamental and Alternative Data
3 primers
Time-Valid Security Masters and Identifier Histories
An identifier match is only useful if it resolves the right object at the right time.
Vintage Macroeconomic Data and Release-Calendar Alignment
A macro series is not known when its reference period ends. It is known when the release becomes public, and it is revised when later vintages arrive.
XBRL Fundamentals in Practice
XBRL is not just tagged accounting data. It is the grammar that determines what a filing fact means, when it applies, and whether it can be compared across firms and time.
5
Synthetic Financial Data
2 primers
Bootstrap Methods for Dependent Financial Time Series
Bootstrap paths are useful only if they preserve the dependence structure your downstream metric actually cares about.
Stochastic Volatility, Jumps, and GARCH as Financial Simulation Baselines
A simulation baseline is useful when you know exactly which stylized facts it can generate and which ones it cannot.
8
Financial Feature Engineering
4 primers
Carry, Basis, and Roll Yield Across Futures and Perpetuals
Carry features measure the term-structure conditions under which holding or rolling exposure may be favorable or costly. In derivatives, those conditions appear through spot basis, calendar roll, and perpetual funding.
Point-in-Time Feature Construction and Data Vintages
A feature is only valid for trading if it was knowable at the decision timestamp, not merely true in hindsight.
Range-Based Volatility Estimators from OHLC Data
High and low prices reveal intrabar dispersion that the closing price alone cannot, but each OHLC estimator is better only when its assumptions match the bar structure you actually trade.
Residualization, Peer Sets, and Relative-Value Features
Neutralization is not cosmetic cleanup. It changes the hypothesis about what should count as an opportunity.
9
Model-Based Feature Extraction
8 primers
Autoregressive, Moving-Average, and ARIMA Foundations for Feature Engineering
ARIMA is rarely the star predictor in liquid markets, but it is still one of the cleanest ways to separate level, persistence, shock, and forecast uncertainty before downstream models take over.
Bayesian Inference and MCMC for Time Series
A Bayesian time-series model produces a posterior distribution, not just a fitted line, which is why posterior uncertainty can itself become a feature.
Fractional Differencing and Long Memory in Financial Features
Fractional differencing is easy to apply but hard to understand well. This primer covers the operator algebra, asymptotic weight decay, and the precise sense in which the transform preserves low-frequency dependence.
Path Signatures and Log-Signatures for Financial Sequences
Path signatures encode the ordered geometry of multivariate sequences through iterated integrals. This primer covers the algebra, Chen-style composition, and the embedding choices that decide whether the construction carries real information in finance.
State-Space Models and the Kalman Filter
Kalman filter outputs are widely used as trading features. This primer covers the deeper machinery underneath them: the innovation representation, Riccati recursion, and identification choices that determine what those features actually mean.
Structural Break Diagnostics and Time-Since-Break Features
A break test is not asking whether the series is "bad." It is asking whether one stable model is still a reasonable description of the whole sample.
Uncertainty as a Feature: Stochastic Volatility, Forecast Intervals, and Forecast Uncertainty
In trading, two models with the same point forecast are not equivalent if one is much less certain than the other.
Wavelets for Multi-Scale Diagnostics and Causal Feature Design
Wavelets are often best used to discover where the signal lives, then translated into safer causal proxies, rather than deployed naively as production features.
10
Text Feature Engineering
3 primers
Coverage-Aware Evaluation and Event-Time Alignment for Text Signals
A text model is not useful because it predicts labels accurately. It is useful only if its signal is available when you trade, on enough names, at the horizon that matters.
Long-Document Encoding for Filings and Transcripts
For long financial documents, the first design decision is not the model. It is how much context you can afford to preserve without mixing together information that arrives or matters at different times.
When Long-Context Encoders Are Worth the Cost
The decision between chunking and full-context encoding is a cost-accuracy tradeoff governed by document structure and task type, and those principles outlast any specific architecture generation.
11
The ML Pipeline
2 primers
Classical Statistical Tests as Linear Models: OLS, t-Tests, ANOVA, and Correlation
Many "different" statistical tests are the same linear-model object wearing different notation. Once you see the shared design-matrix view, the jump from classical inference to predictive regularization is much less mysterious.
Loss Functions, Error Metrics, and What They Hide
A model is trained to optimize one quantity, selected on another, and traded on a third. Most confusion in predictive modeling starts when those three layers are blurred together.
12
Advanced Models for Tabular Data
2 primers
Bayesian Hyperparameter Optimization Under Temporal Dependence
Hyperparameter search is part of the statistical design, not a software convenience layer.
Leakage-Safe Categorical Encoding for Financial ML
Categorical encoding becomes dangerous when a feature value quietly contains information from the target you are trying to predict.
13
Deep Learning for Time Series
3 primers
Making Transformers Time-Aware
A vanilla Transformer is good at flexible token interaction. Time-series forecasting needs more than that: it needs temporal and structural inductive bias.
State Space Models: From Kalman Intuition to Mamba
State space models compress the past into a latent state that is updated recursively, turning long-context sequence processing from a quadratic attention problem into a controlled linear dynamical system. Selective variants like Mamba let the model decide which inputs deserve to update that memory and which should be forgotten.
Uncertainty Estimation and Calibration for Deep Time-Series Models
A forecasting model is not uncertainty-aware because it emits a variance. It is uncertainty-aware only if that variance tracks future error under the validation protocol you actually trade.
14
Latent Factor Models
4 primers
CAPM, APT, and Fama-French: From Beta to Multifactor Pricing
Asset-pricing models all ask the same question: which systematic risks deserve expected return? CAPM gives one answer, APT opens the door to many, and Fama-French turns that logic into an empirical benchmark family.
Inelastic Markets Hypothesis and Flow-Driven Prices
If demand curves for risky assets slope downward rather than staying flat, flows can move prices in persistent ways. That turns "who has to trade?" into part of the asset-pricing problem.
Random Matrix Theory for PCA in Finance
PCA always returns components. The question is whether those components reflect latent economic structure or the noise geometry of a high-dimensional covariance estimate. Random matrix theory provides the benchmark for answering that question.
Stochastic Discount Factors, No-Arbitrage Moments, and HJ Distance
A stochastic discount factor is the object that prices everything at once. If it fails, the failure shows up as a portfolio the model misprices.
15
Causal Machine Learning
1 primer
16
Strategy Simulation
3 primers
Sharpe Ratio Under Autocorrelation and Non-Normal Returns
The Sharpe ratio is only easy to annualize and compare when returns behave far more cleanly than trading strategies usually do.
The Sharpe Ratio
The Sharpe ratio is the default language for comparing risk-adjusted performance, and most practitioners use it without understanding how noisy, fragile, and assumption-laden it really is.
White's Reality Check and Bootstrap Inference for Strategy Families
White's Reality Check asks a family-level question: after searching across many variants, is there evidence that any strategy truly beats the benchmark?
17
Portfolio Construction
4 primers
Benchmark-Relative Portfolio Evaluation: Tracking Error, Information Ratio, and Active Share
Once a benchmark exists, Sharpe ratio stops answering the whole question.
Covariance Shrinkage for Portfolio Allocation
Mean-variance portfolios fail less often when the covariance matrix is regularized before it is inverted.
Estimation Error and the Markowitz Curse
Mean-variance optimization is not fragile because the quadratic program is hard; it is fragile because the optimizer is asked to invert noisy beliefs about returns and covariances.
Kelly Criterion and Fractional Kelly for Multi-Asset Portfolios
Kelly sizing maximizes long-run log growth, but the full-Kelly solution is usually too sensitive to estimated inputs to be deployed without a haircut.
18
Transaction Costs
2 primers
Almgren-Chriss Optimal Execution
Almgren-Chriss matters because execution is never just a cost problem. It is always a cost versus risk problem.
Square-Root Market Impact and Participation-Based Cost Models
The square-root rule matters because market impact grows more slowly than linearly with size, but still fast enough to kill many strategies.
19
Risk Management
3 primers
Drift Detection and Trigger Design
A risk system that cannot detect when its own inputs have shifted is a system waiting to be surprised, and the hardest part is not detecting drift but deciding what to do about it.
Stress Testing and Reverse Stress Testing for Systematic Portfolios
Forward stress testing asks "how bad does it get in this scenario?" Reverse stress testing asks "what scenario breaks us?" The second question is usually more useful for a systematic portfolio.
Volatility Forecasting for Risk Control: EWMA, GARCH, QLIKE, and Proxy-Robust Evaluation
Returns are hard to forecast. Risk is not easy either, but volatility is one of the few market objects that is forecastable enough to run real controls on.
20
Strategy Synthesis
2 primers
From Model Scores to Portfolio Weights
Portfolio construction is the decision rule that maps model outputs, risk estimates, current holdings, and constraints into target weights. Every design choice in that mapping can amplify, dampen, or invert the signal's intended direction.
Instrument-Appropriate Transaction Cost Models
A single basis-point cost assumption applied uniformly across asset classes will either kill viable strategies or greenlight doomed ones. Cross-asset cost synthesis requires instrument-specific models.
21
Reinforcement Learning
2 primers
Distributional RL and Risk Measures
Distributional RL learns the full return distribution rather than its mean, enabling risk-sensitive policies that align with how execution and hedging desks actually measure performance.
Policy Gradient Theorem and Actor-Critic Architectures
Policy gradient methods optimize parameterized policies directly, enabling the continuous action spaces and stochastic behaviors that execution and hedging demand.
22
RAG for Financial Research
1 primer
23
Knowledge Graphs
2 primers
Graph Centrality Measures for Financial Risk and Feature Engineering
Degree, betweenness, and eigenvector centrality quantify structural importance in financial networks and serve as risk indicators and ML features that price-based data alone cannot provide.
Statistical Financial Networks and Filtered Correlation Graphs
The Mantegna pipeline converts a noisy correlation matrix into a distance metric and extracts a minimum spanning tree that reveals market structure, sector relationships, and crisis dynamics that sector labels alone do not capture.
24
Autonomous Agents
1 primer
25
Live Trading Systems
1 primer