Chapter 17: Portfolio Construction

Estimation Error and the Markowitz Curse

Mean-variance optimization is not fragile because the quadratic program is hard; it is fragile because the optimizer is asked to invert noisy beliefs about returns and covariances.

The Intuition

Markowitz optimization looks clean on paper. Estimate expected returns $\mu$, estimate a covariance matrix $\Sigma$, solve for the weights $w$, and obtain the portfolio with the best trade-off between return and risk.

The catch is that the optimizer is most aggressive exactly where the inputs are least reliable.

If one asset's expected return estimate is a little too high, or one covariance entry is a little too low, the optimizer can respond with a large position change. In-sample this looks rational: weights move toward the apparently best risk-adjusted opportunity. Out-of-sample it often turns into concentration, instability, and turnover.

That is the Markowitz curse. Optimization does not average away estimation noise; it amplifies it.

The Math

For a fully invested mean-variance problem,

$$ \max_w \quad w^T \mu - \frac{\lambda}{2} w^T \Sigma w \quad\text{subject to}\quad \mathbf{1}^T w = 1. $$

Ignoring constraints for a moment, the optimizer has the closed-form shape

$$ w^* \propto \Sigma^{-1}\mu. $$

That formula exposes the problem immediately.

  • Errors in $\mu$ matter because the weights are linear in $\mu$.
  • Errors in $\Sigma$ matter even more because they pass through $\Sigma^{-1}$.

If $\Sigma$ has small, unstable eigenvalues, inversion blows them up. The optimizer then chases directions that look low-risk in sample but are mostly artifacts of estimation error.

This is especially severe when the number of assets p is not small relative to the sample size T. The covariance matrix has p(p+1)/2 free parameters. A 100-asset universe requires 5,050 covariance terms. A year of daily data gives only about 252 observations. Even if each entry is estimated without obvious bias, the matrix as a whole is noisy enough that the optimizer becomes a noise amplifier.
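The eigenvalue instability can be seen directly in simulation. This sketch (a hypothetical universe where the true covariance is the identity, so every true eigenvalue is exactly 1) estimates a 100-asset sample covariance from one year of daily data and inspects its spectrum:

```python
import numpy as np

rng = np.random.default_rng(0)
p, T = 100, 252  # assets vs. daily observations

# Draw T observations of p independent, unit-variance assets.
# The true covariance is the identity: all eigenvalues equal 1.
X = rng.standard_normal((T, p))
S = np.cov(X, rowvar=False)

# Sample eigenvalues spread far from 1; inversion amplifies
# the small ones, which is exactly where the optimizer leans.
eigs = np.linalg.eigvalsh(S)
print(f"smallest sample eigenvalue: {eigs[0]:.3f}")
print(f"largest sample eigenvalue:  {eigs[-1]:.3f}")
print(f"condition number:           {eigs[-1] / eigs[0]:.1f}")
```

Even with no true structure at all, the sample spectrum fans out well below and above 1, so $\Sigma^{-1}$ magnifies directions that are pure estimation noise.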

Why Means Are Worse Than They Look

Expected returns are much harder to estimate than covariances. Second moments at least benefit from the relative stability of volatility; first moments do not. As a result, expected-return estimates are weak, noisy, and typically small relative to their own standard errors.

That creates the classic failure mode of the tangency portfolio:

$$ w_{\text{tan}} \propto \Sigma^{-1}(\mu - r_f \mathbf{1}). $$

Tiny perturbations in the vector $\mu - r_f \mathbf{1}$ can rotate the solution dramatically. The optimizer is being asked to decide among assets whose true Sharpe differences are often too small to estimate reliably. It responds with conviction anyway.

A Five-Asset Thought Experiment

Suppose five assets have similar volatilities and moderately correlated returns. The estimated excess returns are

$$ \mu = (4.0,\ 4.4,\ 4.2,\ 4.1,\ 4.3)\%. $$

Now imagine the estimate for the second asset is off by only 40 basis points because of sample noise:

$$ \tilde{\mu} = (4.0,\ 4.8,\ 4.2,\ 4.1,\ 4.3)\%. $$

That change is economically small and statistically plausible. But after multiplying by $\Sigma^{-1}$, the optimizer may turn a moderate weight into the portfolio's dominant position while shorting nearby substitutes. The portfolio did not become better informed. It became more certain than the data justified.
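A minimal numerical sketch of this experiment, with an assumed equicorrelation covariance (15% volatilities, 0.6 pairwise correlation; these specifics are illustrative, not from the text):

```python
import numpy as np

# Assumed covariance: similar vols, moderate common correlation.
vol, corr = 0.15, 0.6
Sigma = vol**2 * (corr * np.ones((5, 5)) + (1 - corr) * np.eye(5))

mu = np.array([4.0, 4.4, 4.2, 4.1, 4.3]) / 100
mu_tilde = mu.copy()
mu_tilde[1] += 0.004  # 40 bp estimation error on asset 2

def mv_weights(mu, Sigma):
    """Unconstrained mean-variance weights, normalized to sum to 1."""
    raw = np.linalg.solve(Sigma, mu)
    return raw / raw.sum()

w, w_tilde = mv_weights(mu, Sigma), mv_weights(mu_tilde, Sigma)
print("weights before perturbation:", np.round(w, 3))
print("weights after perturbation: ", np.round(w_tilde, 3))
```

Under these assumptions, a sub-1% relative change in one input moves asset 2's weight by more than ten percentage points and makes it the clear dominant position: the optimizer converts sampling noise into conviction.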

Equal weight ignores that noisy ranking and therefore looks crude in sample. Out of sample, it often wins because it refuses to overreact to weak information.

Where the Curse Shows Up

The Markowitz curse is visible in downstream portfolio behavior:

  • extreme positive and negative weights
  • high sensitivity to rebalance date
  • unstable turnover
  • large changes in risk contribution from small input revisions

These are not separate problems. They are symptoms of the same underlying issue: input uncertainty is being mistaken for economic opportunity.

Why Shrinkage and Constraints Help

Every practical remedy works by reducing the optimizer's freedom to overfit noisy inputs.

Covariance shrinkage

Replace the raw sample covariance S with a regularized estimate:

$$ \hat{\Sigma} = (1-\alpha)S + \alpha F, $$

where F is a structured target. This stabilizes eigenvalues and makes inversion less explosive.
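A sketch of linear shrinkage, assuming the simplest structured target: a scaled identity with the average sample variance on the diagonal (constant-correlation or factor-model targets plug into the same formula):

```python
import numpy as np

def shrink_covariance(S, alpha):
    """Linear shrinkage (1 - alpha) * S + alpha * F toward a
    scaled-identity target F (an assumed choice of target)."""
    p = S.shape[0]
    F = (np.trace(S) / p) * np.eye(p)
    return (1 - alpha) * S + alpha * F

# A noisy sample covariance becomes better conditioned as alpha grows.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 30))
S = np.cov(X, rowvar=False)
for alpha in (0.0, 0.2, 0.5):
    print(f"alpha={alpha}: condition number = "
          f"{np.linalg.cond(shrink_covariance(S, alpha)):.1f}")
```

In practice the shrinkage intensity is chosen by a data-driven rule (Ledoit-Wolf style) rather than fixed by hand; the fixed values here are only to show the conditioning effect.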

Return shrinkage

Treat expected returns as weak signals and shrink them toward zero, a factor prior, or a benchmark. This reduces the optimizer's tendency to take huge active bets on noisy alpha estimates.
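As a sketch, shrinking halfway toward the cross-sectional mean (the target and intensity are illustrative choices; zero or a factor-model fit are common alternatives):

```python
import numpy as np

def shrink_returns(mu, target, alpha):
    """Pull noisy expected-return estimates toward a prior target."""
    return (1 - alpha) * mu + alpha * target

mu = np.array([4.0, 4.4, 4.2, 4.1, 4.3]) / 100
# Grand-mean prior: halves the cross-sectional spread the optimizer
# sees, without changing the average level.
mu_shrunk = shrink_returns(mu, mu.mean(), alpha=0.5)
print("shrunk returns (%):", np.round(mu_shrunk * 100, 2))
```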

Weight and leverage constraints

Constraints are not a hack. They encode the fact that the unconstrained optimum is often an artifact of estimation noise. Box constraints, gross exposure caps, and turnover penalties are forms of regularization in portfolio space.

Simpler allocators

Equal weight, inverse volatility, and risk parity often outperform naive tangency portfolios not because they are theoretically deeper, but because they demand less from noisy data.
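For concreteness, the two simplest of these allocators (the volatilities in the example are assumed):

```python
import numpy as np

def equal_weight(n):
    """1/n in every asset: uses no estimates at all."""
    return np.full(n, 1.0 / n)

def inverse_volatility(Sigma):
    """Weight each asset by 1/vol, normalized. Uses only the
    diagonal of Sigma and ignores correlations entirely."""
    inv_vol = 1.0 / np.sqrt(np.diag(Sigma))
    return inv_vol / inv_vol.sum()

# Assumed three-asset example with vols of 10%, 15%, and 25%.
Sigma = np.diag([0.10, 0.15, 0.25]) ** 2
print("equal weight:  ", np.round(equal_weight(3), 3))
print("inverse vol:   ", np.round(inverse_volatility(Sigma), 3))
```

Neither rule touches $\mu$ or the off-diagonal of $\Sigma$, which is precisely why they are hard to destabilize with estimation noise.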

In Practice

The right question is not "is MVO mathematically correct?" It is "how much signal survives after input uncertainty is admitted?"

A robust workflow is:

  1. estimate inputs conservatively
  2. regularize covariance and, if needed, expected returns
  3. impose realistic constraints
  4. evaluate portfolio stability, not just in-sample utility

Useful diagnostics include:

  • max weight and gross leverage
  • turnover under rolling re-estimation
  • sensitivity of weights to small perturbations in $\mu$
  • out-of-sample realized volatility and diversification
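The perturbation-sensitivity diagnostic can be sketched as a small Monte Carlo study; the function name and the five-asset inputs below are illustrative, not from the text:

```python
import numpy as np

def weight_sensitivity(mu, Sigma, eps=1e-3, trials=200, seed=0):
    """Mean L1 weight change under small random bumps to mu.

    A crude stability diagnostic: large values mean the optimizer
    is reading estimation noise as structure.
    """
    rng = np.random.default_rng(seed)
    base = np.linalg.solve(Sigma, mu)
    base = base / base.sum()
    changes = []
    for _ in range(trials):
        bumped = mu + eps * rng.standard_normal(len(mu))
        w = np.linalg.solve(Sigma, bumped)
        w = w / w.sum()
        changes.append(np.abs(w - base).sum())
    return float(np.mean(changes))

# Assumed inputs: the five-asset example with an equicorrelation Sigma.
mu = np.array([4.0, 4.4, 4.2, 4.1, 4.3]) / 100
Sigma = 0.15**2 * (0.6 * np.ones((5, 5)) + 0.4 * np.eye(5))
print(f"avg L1 weight change per 10 bp of mu noise: "
      f"{weight_sensitivity(mu, Sigma):.3f}")
```

Running the same diagnostic before and after shrinkage or constraints gives a direct measure of how much stability each remedy buys.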

If a portfolio changes drastically under small estimation changes, the problem is not that the optimizer is underpowered. It is that the optimizer is reading noise as structure.

Common Mistakes

  • Treating the unconstrained tangency portfolio as an economically meaningful benchmark.
  • Focusing on in-sample efficient frontiers rather than out-of-sample weight stability.
  • Blaming covariance estimation alone when expected-return error is often even more damaging.
  • Using more assets and more parameters as if that automatically improves diversification.
  • Treating constraints as arbitrary overrides instead of regularizers against estimation error.

Connections

This primer is the conceptual parent of covariance shrinkage, robust optimization, and fractional Kelly sizing. Chapter 17 uses those tools because naive MVO asks too much of the data. The same logic is a portfolio version of Chapter 11's regularization theme: a controlled amount of bias can produce a large gain when the unconstrained estimator is too noisy to trust.
