Mark B. Garman and Michael J. Klass (1980) — The Journal of Business · 1472 citations
Introduces a set of volatility estimators using Open, High, Low, and Close prices that are up to 8 times more efficient than standard close-to-close variance calculations.
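As a rough illustration of the idea, the commonly quoted simplified Garman-Klass form combines the high-low range with the open-to-close return. The sketch below assumes that form, ignores overnight gaps, and uses a hypothetical function name and made-up bars; it is not the paper's full set of estimators.

```python
import math

def garman_klass_variance(bars):
    """Average per-bar variance from (open, high, low, close) tuples.

    Uses the widely quoted simplified form:
        0.5 * ln(H/L)^2 - (2 ln 2 - 1) * ln(C/O)^2
    """
    k = 2.0 * math.log(2.0) - 1.0
    total = 0.0
    for o, h, l, c in bars:
        hl = math.log(h / l)   # log high-low range
        co = math.log(c / o)   # log open-to-close return
        total += 0.5 * hl ** 2 - k * co ** 2
    return total / len(bars)

# Illustrative (made-up) daily bars.
bars = [(100.0, 102.0, 99.0, 101.0), (101.0, 103.5, 100.5, 102.0)]
gk_var = garman_klass_variance(bars)
```

The result is a per-bar variance; multiplying by the number of bars per year would annualize it.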
Robert F. Engle (1983) — Journal of Money, Credit and Banking · 503 citations
This paper applies the Autoregressive Conditional Heteroscedasticity (ARCH) model, introduced by Engle in 1982, to estimate the time-varying conditional variance of U.S. inflation, revealing that high inflation does not necessarily imply high unpredictability.
Tim Bollerslev (1986) — Journal of Econometrics · 23212 citations
This paper introduces the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, a significant extension of the ARCH model that allows for more flexible and parsimonious modeling of time-varying volatility by incorporating past conditional variances.
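The core of the GARCH(1,1) case is a one-line variance recursion, sigma2[t] = omega + alpha * eps[t-1]^2 + beta * sigma2[t-1]. The sketch below only runs that recursion for given parameters; the function name, the choice to start at the unconditional variance, and the parameter values are assumptions for illustration, and no estimation is attempted.

```python
def garch11_variances(residuals, omega, alpha, beta):
    """Conditional variances for a GARCH(1,1) process with known parameters.

    sigma2[t] is the variance of residuals[t] given information up to t-1.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    # Assumption: initialise at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2

# A single large shock raises the next period's variance, which then decays.
path = garch11_variances([2.0, 0.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
```

The beta term is what distinguishes this from ARCH: past conditional variances enter directly, so a long ARCH lag structure is not needed to capture persistence.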
Robert F. Engle and C. W. J. Granger (1987) — Econometrica · 31736 citations
Engle and Granger (1987) formalize cointegration and prove that cointegrated I(1) variables must admit an error-correction representation, then provide practical two-step estimation and simulation-based cointegration tests with empirical macro/finance examples.
James D. Hamilton (1989) — Econometrica · 9717 citations
Hamilton (1989) introduces a maximum-likelihood Markov-switching autoregressive framework and a nonlinear filter to infer unobserved regime changes, and shows that U.S. real GNP growth is well described by recurrent expansion/recession regimes, with recessions implying a roughly 3% permanent loss in the level of output.
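The filtering step can be sketched in a few lines. The version below is a simplification, not Hamilton's specification: it drops the autoregressive terms, treats each regime as a Gaussian with known mean and standard deviation, and takes the transition matrix as given rather than estimating it by maximum likelihood. All names and numbers are hypothetical.

```python
import math

def _normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def regime_filter(observations, means, stds, transition, initial):
    """Filtered regime probabilities P(state_t | y_1..y_t).

    transition[i][j] = P(next state j | current state i).
    """
    k = len(means)
    prior = list(initial)
    filtered = []
    for y in observations:
        # Update: weight each regime's prior by the likelihood of y.
        joint = [prior[j] * _normal_pdf(y, means[j], stds[j]) for j in range(k)]
        total = sum(joint)
        post = [v / total for v in joint]
        filtered.append(post)
        # Predict: push the posterior through the transition matrix.
        prior = [sum(post[i] * transition[i][j] for i in range(k)) for j in range(k)]
    return filtered

# State 0 = expansion, state 1 = recession (illustrative parameters).
probs = regime_filter(
    [1.0, 1.2, -0.5, -0.8],
    means=[1.0, -0.5], stds=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.25, 0.75]],
    initial=[0.5, 0.5],
)
```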
Søren Johansen and Katarina Juselius (1990) — Oxford Bulletin of Economics and Statistics · 11542 citations
This paper presents a maximum likelihood approach for estimating and testing cointegration relationships in vector autoregressive (VAR) models, with a focus on linear restrictions on cointegration vectors and weights, and illustrates the method using money demand data from Denmark and Finland.
Daniel B. Nelson (1991) — Econometrica · 10571 citations
This paper introduces Exponential GARCH (EGARCH) to address limitations of standard GARCH models, such as the inability to capture the negative correlation between returns and volatility, restrictive parameter constraints, and difficulties in interpreting volatility persistence.
Dennis Yang and Qiang Zhang (2000) — The Journal of Business · 470 citations
The authors introduce the 'Yang-Zhang' volatility estimator, which uses OHLC data to provide a minimum-variance estimate that is robust to both price trends (drift) and overnight gaps (opening jumps).
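A minimal sketch of the estimator as it is usually stated: overnight variance plus a weighted mix of open-to-close variance and the Rogers-Satchell range term, with weight k = 0.34 / (1.34 + (n + 1) / (n - 1)). Function names and the input layout are assumptions for illustration.

```python
import math

def _sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def yang_zhang_variance(bars):
    """Per-period variance from (open, high, low, close) tuples.

    The first bar only supplies the previous close for the overnight return.
    """
    overnight, open_close, rs_terms = [], [], []
    for (_, _, _, prev_close), (o, h, l, c) in zip(bars, bars[1:]):
        overnight.append(math.log(o / prev_close))   # close-to-open gap
        cc = math.log(c / o)                          # open-to-close return
        open_close.append(cc)
        u, d = math.log(h / o), math.log(l / o)
        rs_terms.append(u * (u - cc) + d * (d - cc))  # Rogers-Satchell term
    n = len(overnight)
    if n < 2:
        raise ValueError("need at least three bars")
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    rogers_satchell = sum(rs_terms) / n
    return (_sample_variance(overnight)
            + k * _sample_variance(open_close)
            + (1.0 - k) * rogers_satchell)

# Illustrative (made-up) bars with overnight gaps.
bars = [(100.0, 101.0, 99.5, 100.5), (101.0, 102.0, 100.0, 101.5),
        (101.0, 103.0, 100.5, 102.5), (103.0, 104.0, 102.0, 102.5)]
yz_var = yang_zhang_variance(bars)
```

The overnight term handles opening jumps, while the Rogers-Satchell term is what makes the intraday part insensitive to drift.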
International Asset Allocation With Regime Shifts
Andrew Ang and Geert Bekaert (2002) — Review of Financial Studies · 1567 citations
Despite correlations rising in bear markets, international diversification remains economically valuable, particularly when investors can switch into cash (risk-free assets) during high-volatility regimes.
This paper introduces the Heterogeneous Autoregressive model of Realized Volatility (HAR-RV), a simple additive cascade model using volatility components across different time horizons, which effectively replicates key empirical features of financial returns like long memory and fat tails, while also demonstrating strong forecasting performance.
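The cascade amounts to regressing next-day realized volatility on its daily value and its weekly and monthly averages. The sketch below only builds the regressors, assuming the conventional 5-day and 22-day windows; the function name is hypothetical, and fitting the coefficients (for example by ordinary least squares) is left out.

```python
def har_features(rv, week=5, month=22):
    """Build HAR design rows from a realized-volatility series.

    Each row is ((1, daily, weekly_avg, monthly_avg), next_day_rv).
    """
    rows = []
    for t in range(month - 1, len(rv) - 1):
        daily = rv[t]
        weekly = sum(rv[t - week + 1 : t + 1]) / week
        monthly = sum(rv[t - month + 1 : t + 1]) / month
        rows.append(((1.0, daily, weekly, monthly), rv[t + 1]))
    return rows

# Toy series 1, 2, ..., 24 to make the window arithmetic easy to check.
rows = har_features([float(x) for x in range(1, 25)])
```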
Matthew D. Hoffman and Andrew Gelman (2011) — arXiv:1111.4246 [cs, stat] · 4962 citations
This paper introduces NUTS, an extension of Hamiltonian Monte Carlo that automatically chooses trajectory length (and adaptively tunes step size), delivering HMC-level efficiency without hand-tuning.
Andrew Ang and Allan Timmermann (2011) · 454 citations
This paper reviews how regime-switching models (HMMs) capture the abrupt, persistent changes in financial data (volatility clustering, skewness) that linear models miss, demonstrating that optimal portfolios must dynamically adjust to 'bull' and 'bear' states.
This paper demonstrates that log-volatility behaves like a fractional Brownian motion with a Hurst exponent around 0.1, leading to the Rough FSV model, which aligns well with financial data and improves volatility forecasting.
Alan Moreira and Tyler Muir (2017) — The Journal of Finance · 381 citations
Scaling factor exposures each month by the inverse of last month’s realized variance produces large alphas and materially higher Sharpe ratios across many factors because volatility forecasts risk much more than it forecasts expected returns.
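The scaling rule itself is simple to write down: this month's managed return is the raw factor return multiplied by c divided by last month's realized variance. The sketch below assumes monthly inputs are already computed; the function name, the sample numbers, and the value of c are illustrative. In practice c is a normalizing constant, typically set so the managed and unmanaged series have the same unconditional volatility.

```python
def volatility_managed_returns(monthly_returns, realized_variances, c=1.0):
    """Scale each month's return by c / (previous month's realized variance)."""
    managed = []
    for t in range(1, len(monthly_returns)):
        weight = c / realized_variances[t - 1]  # uses only lagged information
        managed.append(weight * monthly_returns[t])
    return managed

# Made-up monthly factor returns and realized variances.
managed = volatility_managed_returns(
    [0.01, 0.02, -0.01], [0.04, 0.01, 0.02], c=0.01
)
```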
Advances in Financial Machine Learning
Marcos Lopez de Prado (2018) — John Wiley & Sons · 106 citations
Michael Betancourt (2018) — arXiv:1701.02434 [stat] · 1393 citations
This tutorial explains Hamiltonian Monte Carlo (HMC) through the geometry of the “typical set,” showing why gradient-informed, energy-conserving trajectories can explore high-dimensional posteriors far more efficiently—and how tuning/diagnostics (mass matrix, step size, trajectory length, divergences) make or break performance.
This paper introduces the Wasserstein k-means (WK-means) algorithm, a robust, non-parametric method for clustering financial time series into distinct market regimes by treating segments as probability distributions and using the p-Wasserstein distance, outperforming traditional moment-based k-means and HMMs, especially for non-Gaussian data.
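The distance at the heart of the method is easy to compute in one dimension: for two empirical samples of equal size, the p-Wasserstein distance reduces to comparing sorted values. The sketch below covers only that distance under the equal-size assumption, not the clustering loop, and the function name is hypothetical.

```python
def wasserstein_distance_1d(xs, ys, p=1):
    """p-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(xs), sorted(ys)
    # Optimal transport in 1-D pairs the i-th smallest of each sample.
    cost = sum(abs(x - y) ** p for x, y in zip(a, b)) / len(a)
    return cost ** (1.0 / p)

# Shifting every return by 1 gives a distance of 1, regardless of ordering.
dist = wasserstein_distance_1d([2.0, 0.0, 1.0], [3.0, 1.0, 2.0])
```

Because the whole sorted sample is compared, differences in tails and skewness contribute to the distance, which is the contrast with clustering on a few moments.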
A. Sinem Uysal and John M. Mulvey (2021) — The Journal of Financial Data Science · 20 citations
The paper uses supervised ML (especially random forests) to predict recessions and equity “crash” regimes from macro data and then uses these probabilities to improve risk parity portfolios via regime-aware covariance estimation and overlay trades.
Stephen Marra (2023) — The Journal of Portfolio Management
A practitioner-focused survey comparing common volatility forecasting models (historical, ARMA/GARCH, and option-implied) and showing why relatively simple, well-designed historical models can be robust inputs for volatility targeting and risk-parity allocation.
Yizhan Shu and John M. Mulvey (2025) — The Journal of Portfolio Management · 2 citations
A dynamic factor allocation strategy that uses Sparse Jump Models (SJM) to identify active-return regimes raises the information ratio relative to an equal-weighted benchmark from 0.05 to roughly 0.45.
This paper introduces Conformal Prediction for Time-series with Change points (CPTC), a novel algorithm that integrates a model to predict underlying states with online conformal prediction to provide uncertainty quantification for time series data with change points, demonstrating improved validity and adaptivity compared to state-of-the-art baselines.
This paper introduces the signature method, a way to transform time-ordered data into a set of features using iterated integrals, and discusses its theoretical properties and machine learning applications, including handwritten digit classification.
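For a piecewise-linear path, the first two signature levels can be accumulated segment by segment: level 1 is the total increment, and level 2 collects the iterated integrals of pairs of coordinates. The sketch below assumes a path given as a list of points and stops at level 2; the function name is hypothetical.

```python
def signature_level2(path):
    """First two signature levels of a piecewise-linear path.

    path: list of d-dimensional points. Returns (s1, s2), where s1[i] is the
    total increment in coordinate i and s2[i][j] is the iterated integral of
    coordinate i against coordinate j.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        dx = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one straight segment.
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1, s2

# Move right, then up: the order of moves shows up in the off-diagonal terms.
s1, s2 = signature_level2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
```

The antisymmetric part s2[0][1] - s2[1][0] equals twice the signed area between the path and its chord, which is why the features are sensitive to the order of events.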
Michael Parkinson — The Journal of Business · 1938 citations
This paper introduces the extreme value method for estimating the variance of the rate of return of a common stock, demonstrating its superior efficiency compared to the traditional method using closing prices.
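The estimator is usually written as the mean squared log high-low range divided by 4 ln 2. A minimal sketch under that form, with a hypothetical function name and made-up prices:

```python
import math

def parkinson_variance(highs, lows):
    """Per-period variance from high and low prices (range-based estimator)."""
    n = len(highs)
    squared_ranges = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))
    return squared_ranges / (4.0 * math.log(2.0) * n)

# Illustrative daily highs and lows.
pk_var = parkinson_variance([102.0, 103.5, 101.0], [99.0, 100.5, 99.5])
```

The estimator assumes zero drift and continuous trading, so it tends to understate variance when prices gap between sessions.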