Chapter 17: Portfolio Construction

Kelly Criterion and Fractional Kelly for Multi-Asset Portfolios

Kelly sizing maximizes long-run log growth, but the full-Kelly solution is usually too fragile to estimated inputs to be deployed without a haircut.

The Intuition

Kelly starts from a different objective than mean-variance optimization. Instead of maximizing a single-period trade-off between expected return and variance, it maximizes expected growth of wealth over repeated bets.

That difference matters because repeated compounding punishes overbetting asymmetrically. If you size too small, you grow more slowly. If you size too large, drawdowns can become so deep that the strategy never recovers.

This is why Kelly is attractive and dangerous at the same time. It provides a principled sizing target, but it is highly sensitive to the quality of the estimated edge.

The Math

Scalar Kelly

Suppose a fraction $f$ of wealth is invested in a risky opportunity with excess return $r$. Kelly chooses $f$ to maximize expected log wealth:

$$ \max_f \; \mathbb{E}[\log(1 + fr)]. $$

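For small returns, a second-order Taylor expansion of the logarithm, with $\mathbb{E}[r^2]$ approximated by the variance because the squared mean is negligible at short horizons, gives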

$$ \mathbb{E}[\log(1 + fr)] \approx f\mu - \frac{1}{2} f^2 \sigma^2, $$

where $\mu = \mathbb{E}[r]$ and $\sigma^2 = \mathrm{Var}(r)$. Differentiating with respect to $f$ and setting the result to zero yields

$$ f^* \approx \frac{\mu}{\sigma^2}. $$

That is the scalar Kelly fraction.
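As a quick numerical check, the sketch below evaluates the quadratic approximation on a grid and confirms that it peaks at $\mu / \sigma^2$. The per-period inputs and the helper names `kelly_fraction` and `approx_log_growth` are illustrative assumptions, not part of the primer.

```python
import numpy as np

def kelly_fraction(mu, sigma):
    """Approximate Kelly fraction mu / sigma^2 from the quadratic expansion."""
    return mu / sigma**2

def approx_log_growth(f, mu, sigma):
    """Second-order approximation f*mu - 0.5*f^2*sigma^2 to E[log(1 + f*r)]."""
    return f * mu - 0.5 * f**2 * sigma**2

# Hypothetical per-period inputs, chosen only for illustration.
mu, sigma = 0.0005, 0.01

f_star = kelly_fraction(mu, sigma)

# Grid search over f to confirm the closed form maximizes the approximation.
grid = np.linspace(0.0, 2.0 * f_star, 2001)
f_grid = grid[np.argmax(approx_log_growth(grid, mu, sigma))]

# Approximate growth at exactly twice the Kelly fraction.
growth_at_double = approx_log_growth(2.0 * f_star, mu, sigma)

print(f_star, f_grid, growth_at_double)
```

Within this approximation, growth falls back to zero at twice the Kelly fraction and turns negative beyond it, which is the overbetting asymmetry described in the intuition section.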

Multi-asset Kelly

For a vector of excess returns $r$ with mean $\mu$ and covariance $\Sigma$, and portfolio weights $w$, the same quadratic approximation gives

$$ \max_w \; w^T \mu - \frac{1}{2} w^T \Sigma w, $$

so the unconstrained Kelly solution is

$$ w_K = \Sigma^{-1}\mu. $$

That should look familiar. Up to scaling and constraints, Kelly and mean-variance optimization point in the same direction. Kelly is therefore not an alien portfolio rule. It is a growth-optimal interpretation of the same risk-return geometry.
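The unconstrained solution can be computed with a linear solve rather than an explicit matrix inverse. The following is a minimal sketch; the three-asset volatilities, correlations, and expected excess returns are made-up inputs, and `kelly_weights` is a hypothetical helper name.

```python
import numpy as np

def kelly_weights(mu, Sigma):
    """Unconstrained Kelly weights w_K = Sigma^{-1} mu, via a linear solve."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    return np.linalg.solve(Sigma, mu)

# Hypothetical annualized inputs for three assets (illustrative only).
vols = np.array([0.15, 0.10, 0.20])
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])
Sigma = np.outer(vols, vols) * corr
mu = np.array([0.06, 0.03, 0.08])

w_K = kelly_weights(mu, Sigma)

# Gross leverage implied by the unconstrained solution.
gross_leverage = np.abs(w_K).sum()

print(w_K, gross_leverage)
```

The weights are not forced to sum to one. The total is the leverage implied by the inputs, and a tangency portfolio built from the same inputs would be these weights rescaled.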

Why Full Kelly Is Rarely Deployable

The formula $w_K = \Sigma^{-1}\mu$ inherits the same weaknesses as tangency-style optimization.

$\mu$ is noisy
$\Sigma$ is noisy
inversion magnifies the noise

So the full-Kelly portfolio can imply leverage and concentration that look mathematically consistent and operationally absurd. The problem is not with log utility. The problem is that the optimizer is treating estimated edge as known edge.
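A small Monte Carlo experiment makes the point concrete. The sketch below draws repeated samples from a known distribution, re-estimates $\Sigma^{-1}\mu$ on each sample, and measures how much the resulting weights scatter. Everything here is an assumption made for illustration: the monthly inputs, the multivariate normal return model, the sample lengths, and the helper name `simulate_kelly_estimates`.

```python
import numpy as np

def simulate_kelly_estimates(mu, Sigma, n_obs, n_trials=200, seed=0):
    """Re-estimate Sigma^{-1} mu on repeated simulated samples of length n_obs."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    weights = np.empty((n_trials, mu.size))
    for i in range(n_trials):
        sample = rng.multivariate_normal(mu, Sigma, size=n_obs)
        mu_hat = sample.mean(axis=0)              # sample mean
        Sigma_hat = np.cov(sample, rowvar=False)  # sample covariance
        weights[i] = np.linalg.solve(Sigma_hat, mu_hat)
    return weights

# Hypothetical monthly inputs for three assets (illustrative only).
vols = np.array([0.15, 0.10, 0.20]) / np.sqrt(12)
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])
Sigma = np.outer(vols, vols) * corr
mu = np.array([0.06, 0.03, 0.08]) / 12

w_true = np.linalg.solve(Sigma, mu)                      # weights under known inputs
w_5y = simulate_kelly_estimates(mu, Sigma, n_obs=60)     # 5 years of months
w_50y = simulate_kelly_estimates(mu, Sigma, n_obs=600)   # 50 years of months

# Cross-trial standard deviation of each estimated weight.
spread_5y = w_5y.std(axis=0)
spread_50y = w_50y.std(axis=0)

print(w_true, spread_5y, spread_50y)
```

Comparing `spread_5y` with `w_true` shows how far individual estimates can stray from the weights implied by the true inputs, and the longer sample shows how slowly that dispersion shrinks.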

This is why Kelly should usually be read as an upper bound on aggressiveness, not as a default deployment weight.

Fractional Kelly

Fractional Kelly scales the full-Kelly solution by a factor $c \in (0,1)$:

$$ w_{cK} = c \, w_K. $$

Common choices are half Kelly or quarter Kelly. This is not an arbitrary hack. It directly reduces the damage from estimation error and from model misspecification.

The intuition is simple:

full Kelly maximizes theoretical asymptotic growth under correct inputs
fractional Kelly sacrifices some idealized growth to gain much better drawdown behavior and robustness to error

In practice, that trade-off is usually worth it.
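Under the same quadratic approximation used earlier, the cost of the haircut can be written down directly. Substituting $w = c\,w_K$ with $w_K = \Sigma^{-1}\mu$ into the objective gives

$$ g(c) = c\,\mu^T \Sigma^{-1}\mu - \frac{1}{2} c^2\, \mu^T \Sigma^{-1}\mu = \left(c - \frac{c^2}{2}\right)\mu^T \Sigma^{-1}\mu, \qquad \frac{g(c)}{g(1)} = c\,(2 - c). $$

Portfolio volatility scales linearly in $c$, so half Kelly keeps about 75% of the approximate growth rate at half the volatility, and quarter Kelly keeps about 44% at a quarter of the volatility. These figures hold only within the approximation and assume the inputs are correct.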

A Practical Comparison

Suppose a multi-asset model estimates annualized excess return $\mu_p = 8\%$ and volatility $\sigma_p = 10\%$ for a portfolio sleeve. The scalar approximation implies

$$ f^* \approx \frac{0.08}{0.10^2} = 8. $$

That means 8x exposure to the sleeve. Mathematically coherent; operationally extreme.

Now suppose the true expected excess return is 5% rather than 8% because the alpha estimate was optimistic. The growth-optimal exposure under the true edge is $0.05 / 0.10^2 = 5$, so the 8x allocation sized on the false edge overbets by 60%, on the side of the optimum where the compounding penalty is worst.

Half Kelly and quarter Kelly still reflect the positive edge, but they are less likely to convert forecast error into intolerable drawdowns. That is why practitioners often say that full Kelly is a research concept and fractional Kelly is the investable version.
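The comparison can be tabulated with the quadratic growth approximation from the scalar section. This is a rough sketch: the approximation is crude at 8x leverage, and the variable names are illustrative. The 8%, 5%, and 10% figures are the ones used in the example above.

```python
def approx_log_growth(f, mu, sigma):
    """Second-order approximation f*mu - 0.5*f^2*sigma^2 to expected log growth."""
    return f * mu - 0.5 * f**2 * sigma**2

mu_est, mu_true, sigma = 0.08, 0.05, 0.10

# Full Kelly is sized on the (optimistic) estimate, not on the true edge.
f_full = mu_est / sigma**2

results = {}
for label, c in [("full", 1.0), ("half", 0.5), ("quarter", 0.25)]:
    f = c * f_full
    results[label] = {
        "leverage": f,
        "growth_if_estimate_right": approx_log_growth(f, mu_est, sigma),
        "growth_if_true_edge_is_5pct": approx_log_growth(f, mu_true, sigma),
    }

for label, row in results.items():
    print(label, {k: round(v, 4) for k, v in row.items()})
```

If the estimate is right, full Kelly grows fastest in this approximation (about 0.32 against 0.24 for half Kelly and 0.14 for quarter Kelly). If the true edge is 5%, half Kelly comes out ahead (about 0.12 against 0.08), and quarter Kelly matches full Kelly's growth at a quarter of the leverage.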

The Sharpe-Ratio Connection

Kelly also explains why high-Sharpe strategies attract leverage. In the scalar approximation,

$$ f^* \approx \frac{\mu}{\sigma^2} = \frac{\mu / \sigma}{\sigma} = \frac{\text{Sharpe}}{\sigma}. $$

So for a fixed volatility scale, higher Sharpe implies a larger Kelly fraction. But that identity is also a warning: the Sharpe and volatility must be measured on the same horizon, and both are estimated with error. A strategy that looks like Sharpe 2 in sample may not deserve Kelly-sized leverage once serial dependence, costs, or regime change are admitted.
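The horizon warning is easy to demonstrate. The sketch below assumes i.i.d. returns, so that mean and variance both scale linearly with time, and a 252-day year. Both are simplifying assumptions, and the variable names are illustrative.

```python
import math

TRADING_DAYS = 252  # assumed annualization convention

# Annualized inputs from the example above.
mu_ann, sigma_ann = 0.08, 0.10

# Daily equivalents under the i.i.d. assumption.
mu_day = mu_ann / TRADING_DAYS
sigma_day = sigma_ann / math.sqrt(TRADING_DAYS)

sharpe_ann = mu_ann / sigma_ann
sharpe_day = mu_day / sigma_day

# Consistent horizons: both give the same Kelly fraction.
f_annual = sharpe_ann / sigma_ann
f_daily = sharpe_day / sigma_day

# Mixed horizons: annual Sharpe divided by daily volatility.
f_mixed = sharpe_ann / sigma_day

print(f_annual, f_daily, f_mixed)
```

With matching horizons the fraction is 8 either way. Mixing an annualized Sharpe with a daily volatility inflates it by a factor of $\sqrt{252}$, to roughly 127.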

In Practice

Kelly is most useful as a sizing language.

It helps answer:

how aggressive is this portfolio relative to its estimated edge?
what leverage is implied by my beliefs?
how much haircut should I apply for estimation error and drawdown tolerance?

In real portfolio construction, Kelly should usually be combined with:

shrinkage or other regularization of $\mu$ and $\Sigma$
gross and net exposure constraints
turnover and liquidity limits
explicit drawdown tolerance
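Some of these ingredients can be combined in a few lines. The sketch below is one possible arrangement, not a recommended recipe: it shrinks the mean toward its cross-sectional average and the covariance toward its diagonal with fixed, hand-picked intensities, applies a Kelly fraction, and rescales to a gross exposure cap. The function name `deployable_kelly`, its parameters, and the default values are all hypothetical. Turnover, liquidity, net exposure, and drawdown limits are left out.

```python
import numpy as np

def deployable_kelly(mu, Sigma, kelly_fraction=0.5,
                     mu_shrink=0.5, cov_shrink=0.3, max_gross=2.0):
    """Fractional Kelly on shrunk inputs, rescaled to respect a gross cap."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)

    # Shrink expected returns toward their cross-sectional mean (assumed target).
    mu_s = (1.0 - mu_shrink) * mu + mu_shrink * mu.mean()

    # Shrink the covariance toward its diagonal (assumed target and intensity).
    Sigma_s = (1.0 - cov_shrink) * Sigma + cov_shrink * np.diag(np.diag(Sigma))

    # Fractional Kelly on the regularized inputs.
    w = kelly_fraction * np.linalg.solve(Sigma_s, mu_s)

    # Simple proportional rescaling if gross exposure exceeds the cap.
    gross = np.abs(w).sum()
    if gross > max_gross:
        w = w * (max_gross / gross)
    return w

# Hypothetical annualized inputs for three assets (illustrative only).
vols = np.array([0.15, 0.10, 0.20])
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])
Sigma = np.outer(vols, vols) * corr
mu = np.array([0.06, 0.03, 0.08])

w = deployable_kelly(mu, Sigma)
print(w, np.abs(w).sum())
```

Proportional rescaling is a heuristic. A constrained optimizer would generally choose different weights under the same cap, and a data-driven shrinkage estimator would replace the fixed intensities.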

This is why Chapter 17 links Kelly back to MVO and then moves quickly to fractional Kelly. The pure solution is informative, but the deployable solution acknowledges that inputs are uncertain and capital is finite.

Common Mistakes

Treating full Kelly as a default allocation rather than as a growth-optimal upper bound under perfect inputs.
Forgetting that the multi-asset solution inherits all the estimation-error problems of $\Sigma^{-1}\mu$.
Comparing Kelly fractions computed on one timescale with Sharpe or volatility measured on another.
Ignoring path risk: a theoretically growth-optimal strategy can still be operationally unacceptable because of drawdowns.
Calling a portfolio "Kelly" when it is really just leveraged without any explicit log-growth interpretation.

Connections

This primer connects Chapter 17's allocation rules to Chapter 19's drawdown and risk-control concerns. It also sits directly on top of the estimation-error primer: full Kelly is fragile for the same reason naive tangency portfolios are fragile. Fractional Kelly is the regularized version of a good idea.
