Chapter 19: Risk Management

Volatility Forecasting for Risk Control: EWMA, GARCH, QLIKE, and Proxy-Robust Evaluation

Returns are hard to forecast. Risk is not easy either, but volatility is one of the few market objects that is forecastable enough to run real controls on.

The Intuition

Daily returns often look close to unpredictable in sign. Volatility does not. High-volatility periods tend to be followed by high-volatility periods, and calm periods tend to be followed by calm ones. That persistence is what makes volatility forecasting operationally useful.

Chapter 19 relies on this in several places:

  • volatility targeting
  • regime-aware tail estimates
  • exposure caps that tighten in stress
  • stress-aware cost and leverage controls

But "volatility is forecastable" is only half the story. The other half is evaluation. You never observe true latent volatility. You observe noisy proxies: squared returns, realized variance, or range-based estimators. A volatility model can therefore look good or bad depending on which proxy you use unless the loss function is chosen carefully.

That is why this primer is less about inventing yet another volatility model and more about a practical question Chapter 19 depends on:

if you build two reasonable volatility forecasts, how do you score them when true latent volatility is unobserved?

Why Volatility Is Forecastable

The core stylized fact is volatility clustering. Large absolute returns tend to arrive near other large absolute returns, and small moves near small moves. In notation:

$$ \operatorname{Corr}(r_t, r_{t-k}) \approx 0, \qquad \operatorname{Corr}(r_t^2, r_{t-k}^2) > 0. $$

So the return itself is close to unpredictable, but the conditional second moment is not. That is exactly what a risk-control layer needs. It does not need to know tomorrow's sign with certainty. It needs a reasonable estimate of how violent tomorrow may be.
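The stylized fact is easy to check numerically. A toy sketch: the simulated series below assumes GARCH-like dynamics purely for illustration, and `lag_corr` is a throwaway helper, not a library call.

```python
import numpy as np

def lag_corr(x, k):
    """Sample autocorrelation of series x at lag k."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[k:], x[:-k])[0, 1]

# Simulate a volatility-clustered series with GARCH(1,1)-style dynamics
# (parameters are illustrative): returns are sign-unpredictable, but
# squared returns inherit persistence from the variance process.
rng = np.random.default_rng(0)
n, var = 5000, 1.0
r = np.empty(n)
for t in range(n):
    r[t] = np.sqrt(var) * rng.standard_normal()
    var = 0.05 + 0.10 * r[t] ** 2 + 0.85 * var

# Corr(r_t, r_{t-1}) should sit near zero; Corr(r_t^2, r_{t-1}^2) should not.
```

Running this, the lag-1 autocorrelation of `r` hovers near zero while the lag-1 autocorrelation of `r ** 2` is clearly positive, mirroring the display above.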

EWMA

The exponential weighted moving average updates variance as

$$ \hat{\sigma}_t^2 = \lambda \hat{\sigma}_{t-1}^2 + (1-\lambda) r_{t-1}^2, $$

with $0 < \lambda < 1$.

EWMA is the minimal reactive forecast: simple, stable, and fast to update, but sometimes too jumpy when recent shocks are mostly noise.
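The update above is a one-line recursion. A minimal NumPy sketch (the function name and seeding choice are illustrative; $\lambda = 0.94$ is the classic RiskMetrics daily decay, a starting point rather than a tuned value):

```python
import numpy as np

def ewma_variance(returns, lam=0.94, init_var=None):
    """EWMA variance filter: sigma2_t = lam * sigma2_{t-1} + (1 - lam) * r_{t-1}^2.

    var[t] uses information up to t-1 only, so it is a valid
    one-step-ahead forecast for day t.
    """
    returns = np.asarray(returns, dtype=float)
    var = np.empty_like(returns)
    # Seed with the full-sample variance if no prior is supplied
    # (an assumption; any reasonable warm-up estimate works).
    var[0] = np.var(returns) if init_var is None else init_var
    for t in range(1, len(returns)):
        var[t] = lam * var[t - 1] + (1.0 - lam) * returns[t - 1] ** 2
    return var
```

Note the deliberate lag: `var[t]` depends on `returns[t - 1]`, which is what makes it usable for next-day scaling without leakage.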

GARCH(1,1)

GARCH(1,1) adds a mean-reverting anchor:

$$ \hat{\sigma}_t^2 = \omega + \alpha r_{t-1}^2 + \beta \hat{\sigma}_{t-1}^2. $$

Compared with EWMA, GARCH adds an explicit mean-reverting baseline. It still reacts to the latest shock, but it is less willing to abandon the underlying variance regime immediately.
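The GARCH(1,1) recursion can be run as a filter in the same shape as the EWMA one. A minimal sketch, assuming $(\omega, \alpha, \beta)$ come from an external fit such as maximum likelihood (the function name is illustrative):

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta, init_var=None):
    """GARCH(1,1) variance filter:
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}.

    Requires alpha + beta < 1 for the default seed to exist.
    """
    returns = np.asarray(returns, dtype=float)
    var = np.empty_like(returns)
    # Seed with the unconditional variance omega / (1 - alpha - beta),
    # the mean-reverting anchor the text describes.
    var[0] = omega / (1.0 - alpha - beta) if init_var is None else init_var
    for t in range(1, len(returns)):
        var[t] = omega + alpha * returns[t - 1] ** 2 + beta * var[t - 1]
    return var
```

Setting $\omega = 0$ and $\alpha = 1 - \beta$ recovers EWMA exactly, which makes the "EWMA plus a mean-reverting anchor" description literal.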

For Chapter 19 purposes, that is enough machinery:

  • EWMA is often favored when short-horizon responsiveness matters most
  • GARCH is often favored when you want more explicit persistence and mean reversion

Now the real problem starts: how do you decide which forecast is better?

The Proxy Problem

Suppose you build both forecasts on the same return series. What do you compare them to?

You do not observe true latent volatility. You choose a proxy such as:

  • squared daily return
  • realized variance from intraday data
  • a range-based estimator

These proxies are noisy and not equivalent. A model that looks better against squared returns may look worse against realized variance. If the loss function is naive, model ranking can flip for reasons that reflect proxy noise rather than forecast quality.
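Two of the listed proxies are cheap to compute from daily bars; a sketch (function names are illustrative, and realized variance from intraday data is omitted because it needs tick or bar data):

```python
import numpy as np

def squared_return_proxy(close):
    """Noisiest common proxy: squared close-to-close log return."""
    c = np.asarray(close, dtype=float)
    return np.diff(np.log(c)) ** 2

def parkinson_variance(high, low):
    """Parkinson range-based daily variance estimator:
    sigma2 = (ln(H/L))^2 / (4 * ln 2)."""
    h, l = np.asarray(high, dtype=float), np.asarray(low, dtype=float)
    return np.log(h / l) ** 2 / (4.0 * np.log(2.0))
```

Both target the same latent variance, but with very different noise levels, which is exactly why a model's ranking can move when you swap one for the other.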

That is why proxy-robust loss functions are not an appendix detail. They are part of the model selection problem itself.

QLIKE and Variance-MSE

Let $v_t^{proxy}$ denote a variance proxy, such as the squared return or realized variance. Two losses are especially useful.

Variance-MSE

$$ L_{\text{MSE}} = \left(\hat{\sigma}_t^2 - v_t^{proxy}\right)^2. $$

This is applied to variance, not volatility. That distinction matters. Volatility-MSE is not proxy-robust in the same way.

QLIKE

$$ L_{\text{QLIKE}} = \log(\hat{\sigma}_t^2) + \frac{v_t^{proxy}}{\hat{\sigma}_t^2}. $$

QLIKE is useful because the ratio term explodes as the forecast variance approaches zero, while the log term grows much more gently as the forecast gets too large. So underpredicting variance is penalized more severely than overpredicting by the same absolute amount. That asymmetry is not a special design choice for trading. It is a consequence of the same structure that makes QLIKE more stable across different consistent volatility proxies.

The practical lesson from Hansen and Lunde, and from Patton, is not that every other loss is wrong. It is that QLIKE and variance-MSE are the safest default choices when the target is noisy and you want model rankings that are less sensitive to which proxy you happened to use.
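Both losses are one-liners; the sketch below also makes the asymmetry concrete (function names are illustrative):

```python
import numpy as np

def variance_mse(forecast_var, proxy_var):
    """Squared error on the variance scale, not the volatility scale."""
    f = np.asarray(forecast_var, dtype=float)
    p = np.asarray(proxy_var, dtype=float)
    return np.mean((f - p) ** 2)

def qlike(forecast_var, proxy_var):
    """QLIKE loss: log(sigma2_hat) + proxy / sigma2_hat, averaged over time."""
    f = np.asarray(forecast_var, dtype=float)
    p = np.asarray(proxy_var, dtype=float)
    return np.mean(np.log(f) + p / f)

# With a proxy of 1.0, forecasting 0.5 and forecasting 1.5 are the same
# absolute error, but QLIKE penalizes the underprediction more:
#   qlike([0.5], [1.0]) > qlike([1.5], [1.0])
```

Under MSE those two forecasts would score identically, which is the asymmetry the text describes.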

Why Horizon Alignment Matters

A volatility forecast is only meaningful relative to a horizon.

If your control acts on next-day leverage, evaluate a next-day variance forecast. If you need a five-day control, build and score a five-day variance forecast. Under GARCH(1,1), that means working with a multi-step object such as

$$ \hat{\sigma}_{t+1:t+h}^2 = \sum_{j=1}^{h} \mathbb{E}_t[\sigma_{t+j}^2], $$

not comparing a one-step forecast to a five-day realized proxy after the fact.

For GARCH(1,1), those future conditional variances can be generated recursively, so this is a closed-form forecasting problem rather than a simulation requirement.
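A sketch of that recursion, assuming the GARCH(1,1) parameters and the one-step-ahead variance $\hat{\sigma}_{t+1}^2$ are already in hand: each further conditional variance decays toward the unconditional level $\omega/(1-\alpha-\beta)$ at rate $\alpha+\beta$, and the $h$-day object is the sum.

```python
def garch11_cumulative_forecast(sigma2_next, omega, alpha, beta, h):
    """Sum of E_t[sigma2_{t+j}] for j = 1..h under GARCH(1,1).

    sigma2_next is the one-step-ahead conditional variance; requires
    alpha + beta < 1 so the unconditional variance exists.
    """
    persistence = alpha + beta
    uncond = omega / (1.0 - persistence)
    total, step = 0.0, sigma2_next
    for _ in range(h):
        total += step
        # E_t[sigma2_{t+j+1}] = uncond + persistence * (E_t[sigma2_{t+j}] - uncond)
        step = uncond + persistence * (step - uncond)
    return total
```

This is the object to score against a five-day realized proxy when the control acts on a five-day horizon, rather than a recycled one-step forecast.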

The same issue appears in implementation:

  • a lagged daily forecast can be used for next-day scaling
  • it should not be treated as an intraday risk signal unless the model was built for that horizon

This is one of the easiest ways to create false confidence in a risk overlay.

A Worked Comparison

Imagine two one-step volatility models on the same return series:

  • Model A: fast EWMA, highly reactive
  • Model B: slower GARCH, smoother baseline

On a sudden volatility spike:

  • EWMA may react faster and reduce near-term underprediction
  • GARCH may track the subsequent decay more gracefully because its mean-reverting baseline is less noisy

If you score both models against squared returns only, the ranking may reward whichever model chases the noisiest one-day moves more aggressively. If you score against realized variance only, the ranking may favor the smoother model.

This is exactly why a sensible evaluation report should show:

  • average QLIKE
  • average variance-MSE
  • horizon used
  • proxy used

and explain how these choices relate to the control problem being solved.

Why This Matters For Risk Control

Volatility forecasts are not decorative analytics. They feed decisions such as

$$ w_t = w^* \frac{\sigma_{target}}{\hat{\sigma}_{t-1}}, $$

where $w^*$ is a baseline exposure and $\hat{\sigma}_{t-1}$ is a lagged volatility estimate.
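A minimal sketch of that sizing rule. The `w_cap` guard is an assumption, not part of the formula above; it simply bounds leverage when the forecast collapses toward zero.

```python
def vol_target_weight(w_star, sigma_target, sigma_hat, w_cap=None):
    """Volatility-targeted exposure: w_t = w_star * sigma_target / sigma_hat.

    sigma_hat must be the *lagged* volatility forecast so that sizing
    at time t uses only information available at t-1.
    """
    w = w_star * sigma_target / sigma_hat
    if w_cap is not None:
        # Assumed guard: cap leverage when the forecast is tiny.
        w = min(w, w_cap)
    return w
```

For example, with a 10% target and a 20% forecast, exposure halves; with a 1% forecast the uncapped rule would call for 10x the baseline, which is exactly the failure mode a low-biased forecast creates.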

If $\hat{\sigma}$ is biased low, the portfolio scales up precisely when it should be de-risking. If the forecast is too sluggish, the control reacts after the shock. If it is too twitchy, the control turns into turnover.

This is why a volatility model should be judged by the downstream control it supports, not just by abstract fit.

In Practice

Use the evaluation setup that matches the control:

  • compare EWMA and GARCH at the forecast horizon you actually use
  • evaluate with QLIKE and variance-MSE against a clearly stated proxy rather than relying on a single ad hoc fit metric
  • if the model ranking changes when you switch from squared returns to realized variance, treat that as a warning that your scoring rule is proxy-sensitive
  • inspect the downstream sizing or leverage rule, because the real question is whether the forecast keeps the control stable when volatility jumps

Common Mistakes

  • Treating return unpredictability as evidence that volatility is unpredictable too.
  • Comparing forecasts against the wrong horizon or an undefined proxy.
  • Ranking models with volatility-MSE and treating the result as proxy-robust.
  • Choosing the most reactive model without checking whether it creates unstable sizing.
  • Evaluating the forecast in isolation rather than through the risk overlay it drives.

Connections

This primer supports Chapter 19's adaptive controls and tail-risk machinery. It connects directly to range-based volatility estimators from Chapter 8, model-based volatility features from Chapter 9, allocator sizing in Chapter 17, and the implementation discipline required for leakage-safe risk controls.
