Chapter 5: Synthetic Financial Data

Bootstrap Methods for Dependent Financial Time Series

Bootstrap paths are useful only if they preserve the dependence structure your downstream metric actually cares about.

The Intuition

Bootstrap methods are the most interpretable synthetic-data baselines in Chapter 5 because they do not invent a new generative model. They simply recombine history. That is also why they fail in a predictable way: they can preserve only the structures that already existed in the observed sample.

An IID bootstrap resamples returns one day at a time. That preserves the one-period return distribution reasonably well, including skew and fat tails, but it destroys ordering. Calm days and volatile days are shuffled together, so volatility clustering disappears.

Block methods repair that by resampling contiguous stretches of the return series. If dependence matters over horizons of several days, a block can carry it into the resampled path. But that preservation is only local: dependence that does not fit inside sampled blocks, or regimes that never appeared in sample, will not be recreated automatically. The main trade-off is simple:

  • short blocks give more path variety but preserve less dependence
  • long blocks preserve more dependence but replay history more literally

The Core Methods

Let the observed return series be $r_1,\dots,r_T$.

IID bootstrap

Draw indices $i_1,\dots,i_T$ independently with replacement from $\{1,\dots,T\}$ and define

$$ r_t^* = r_{i_t}. $$

This preserves the marginal distribution of returns but wipes out serial dependence.
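A minimal sketch of the IID draw, assuming returns are held in a one-dimensional NumPy array. The function name `iid_bootstrap` and the `rng` argument are illustrative choices, not part of the text above.

```python
import numpy as np

def iid_bootstrap(returns, rng=None):
    """Resample returns one observation at a time, with replacement."""
    rng = np.random.default_rng(rng)          # accepts None, a seed, or a Generator
    r = np.asarray(returns, dtype=float)
    T = len(r)
    idx = rng.integers(0, T, size=T)          # i_1..i_T, uniform on 0..T-1 (zero-based)
    return r[idx]                             # r*_t = r_{i_t}
```

Every value in the output is an observed return, so the marginal distribution is reused, but neighbouring values in the output are unrelated in time.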

Moving block bootstrap

Choose a block length $L$ and form overlapping blocks

$$ (r_1,\dots,r_L), (r_2,\dots,r_{L+1}), \dots, (r_{T-L+1},\dots,r_T). $$

Sample blocks with replacement and concatenate them until you reach length $T$, truncating the last block if needed.

This preserves within-block dependence, but the joins between sampled blocks are artificial. A volatility burst can be cut in half and stitched to an unrelated calm period.
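The moving block procedure can be sketched as follows, again assuming a one-dimensional NumPy array of returns. The name `moving_block_bootstrap` and its arguments are hypothetical; block starts are drawn uniformly from the $T-L+1$ overlapping blocks.

```python
import numpy as np

def moving_block_bootstrap(returns, block_len, rng=None):
    """Concatenate randomly chosen overlapping blocks of fixed length."""
    rng = np.random.default_rng(rng)
    r = np.asarray(returns, dtype=float)
    T = len(r)
    if not 1 <= block_len <= T:
        raise ValueError("block_len must be between 1 and len(returns)")
    n_blocks = -(-T // block_len)                        # ceil(T / block_len)
    # Zero-based start of each sampled block; T - block_len + 1 candidates.
    starts = rng.integers(0, T - block_len + 1, size=n_blocks)
    # Expand each start into block_len consecutive indices, then truncate to T.
    idx = (starts[:, None] + np.arange(block_len)).ravel()[:T]
    return r[idx]
```

With `block_len=1` this reduces to the IID bootstrap, and with `block_len=len(returns)` it returns the original series unchanged, which are the two extremes of the trade-off described earlier.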

Stationary bootstrap

The stationary bootstrap randomizes block lengths. At each resampled step, continue the current block with probability $1-p$ and restart at a new, uniformly drawn position with probability $p$. Block lengths are therefore geometrically distributed, and the expected block length is

$$ \mathbb{E}[L] = \frac{1}{p}. $$

Because restart decisions are memoryless, the resampled series is stationary conditional on the observed data, whereas a fixed-length scheme makes the dependence at each step depend on its position within a block. The stationary bootstrap removes that rigid cadence, but it still creates artificial joins and preserves only local dependence.
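A sketch of the stationary bootstrap under one added assumption: when a block runs past the end of the sample it wraps around to the start, which is the usual circular convention but is not stated above. The name `stationary_bootstrap` is illustrative.

```python
import numpy as np

def stationary_bootstrap(returns, p, rng=None):
    """Resample with random block lengths; expected block length is 1/p."""
    rng = np.random.default_rng(rng)
    r = np.asarray(returns, dtype=float)
    T = len(r)
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    restart = rng.random(T) < p            # True -> start a new block at step t
    fresh = rng.integers(0, T, size=T)     # candidate new start positions
    idx = np.empty(T, dtype=int)
    idx[0] = fresh[0]                      # the first step always starts a block
    for t in range(1, T):
        # Continue the current block (wrapping at the end: an assumption),
        # or jump to a fresh, uniformly drawn position.
        idx[t] = fresh[t] if restart[t] else (idx[t - 1] + 1) % T
    return r[idx]
```

For an expected block length of 10, call it with `p=0.1`. Setting `p=1` restarts at every step and recovers the IID bootstrap.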

What Should Be Preserved?

That depends on the evaluation task.

If you only need a null baseline for one-step return distributions, IID resampling may be good enough. If you care about path-dependent quantities such as drawdown, turnover, stop-outs, or volatility targeting, destroying serial dependence is a serious distortion.

In finance, a common mistake is to look only at the autocorrelation of raw returns. Daily returns may have little linear autocorrelation while absolute or squared returns show strong dependence. That is volatility clustering, and it is often the structure the bootstrap must preserve.

A Diagnostic Example

Suppose an ETF return series has:

  • near-zero autocorrelation in raw returns
  • positive autocorrelation in squared returns out to roughly 10 lags

Now compare three synthetic paths:

  1. IID bootstrap
  2. moving block bootstrap with $L=10$
  3. stationary bootstrap with expected block length 10

All three paths will contain large moves because all three reuse observed returns. But their dependence diagnostics differ:

  • IID bootstrap usually drives the ACF of squared returns toward zero quickly.
  • Moving blocks preserve it better, but often introduce visible seam artifacts.
  • Stationary bootstrap usually preserves short-run dependence while looking less mechanically stitched.

That is why the right diagnostic is not just a histogram or QQ plot. For dependent data, compare at least:

  • ACF of raw returns
  • ACF of absolute or squared returns
  • drawdown depth and duration
  • volatility-burst persistence
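Two of these diagnostics can be sketched in a few lines. The helpers `acf` and `max_drawdown` below are hypothetical names; the ACF uses the standard biased sample estimator, and the drawdown is measured on a wealth path that starts at 1 and compounds simple returns.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation at lags 1..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded wealth path."""
    r = np.asarray(returns, dtype=float)
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + r)))   # start at 1
    peak = np.maximum.accumulate(wealth)                    # running maximum
    return float(np.max(1.0 - wealth / peak))
```

To check volatility clustering, compare `acf(r**2, 10)` on the observed series with the same quantity on each resampled path, and compare the distribution of `max_drawdown` across paths with the historical value.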

Choosing Block Length

Block length is the main tuning parameter because it sets the horizon of dependence you hope to retain.

Too short:

  • clustered volatility is broken apart
  • path-dependent metrics become too mild
  • the resampled series starts to look closer to IID

Too long:

  • path variety collapses
  • the synthetic sample becomes a lightly shuffled replay of history
  • a few historical episodes dominate the resampled distribution

There is no universal optimal $L$. Automatic selectors can provide a useful starting point, but the practical approach is still diagnostic: choose a candidate block length, resample, and test whether the dependence structures relevant to your application survive.
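One way to run that diagnostic loop is sketched below. It is an illustration rather than a selection rule: for each candidate block length it resamples with fixed-length moving blocks and measures how far the average ACF of squared returns falls from the historical one. The function names, the default of 200 paths, and the choice of gap metric are all assumptions.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation at lags 1..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

def block_resample(r, block_len, rng):
    """One moving-block path; block_len must not exceed len(r)."""
    T = len(r)
    n_blocks = -(-T // block_len)                        # ceil(T / block_len)
    starts = rng.integers(0, T - block_len + 1, size=n_blocks)
    return r[(starts[:, None] + np.arange(block_len)).ravel()[:T]]

def scan_block_lengths(returns, candidates, n_paths=200, max_lag=10, rng=None):
    """Mean absolute gap in squared-return ACF, per candidate block length."""
    rng = np.random.default_rng(rng)
    r = np.asarray(returns, dtype=float)
    target = acf(r ** 2, max_lag)                        # historical benchmark
    gaps = {}
    for L in candidates:
        sims = np.array([acf(block_resample(r, L, rng) ** 2, max_lag)
                         for _ in range(n_paths)])
        gaps[L] = float(np.mean(np.abs(sims.mean(axis=0) - target)))
    return gaps
```

A small gap on this one metric is not sufficient on its own. The same loop can be repeated with drawdown or any other path statistic that matters for the application.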

In Practice

Two implementation rules are more important than they first appear.

First, resample returns, not prices. A price path is built by compounding returns. If you resample price levels directly, you create nonsensical jumps and lose the economic interpretation of the series.
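The round trip from prices to returns and back can be sketched as follows, assuming simple (not log) returns. Both helper names are illustrative.

```python
import numpy as np

def prices_to_returns(prices):
    """Simple one-period returns from a price series."""
    p = np.asarray(prices, dtype=float)
    return p[1:] / p[:-1] - 1.0

def returns_to_prices(returns, start_price):
    """Rebuild a price path by compounding returns from start_price."""
    r = np.asarray(returns, dtype=float)
    return start_price * np.cumprod(1.0 + r)
```

The resampling step belongs between these two calls: convert prices to returns, resample the returns, then compound the resampled returns from the chosen starting price.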

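For multi-asset panels, resample the same time blocks across all assets if you need to preserve cross-sectional dependence, such as covariance spikes, as well as each series' serial dependence. Resampling each asset independently would break the co-movement between assets on any given day.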
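A sketch of the panel case, assuming returns are stored as a two-dimensional array with one row per date and one column per asset. The name `panel_block_bootstrap` is hypothetical, and fixed-length moving blocks are used only for brevity.

```python
import numpy as np

def panel_block_bootstrap(panel, block_len, rng=None):
    """Resample whole rows (dates) in blocks so assets stay aligned."""
    rng = np.random.default_rng(rng)
    X = np.asarray(panel, dtype=float)                   # shape (T, n_assets)
    T = X.shape[0]
    if not 1 <= block_len <= T:
        raise ValueError("block_len must be between 1 and the number of rows")
    n_blocks = -(-T // block_len)                        # ceil(T / block_len)
    starts = rng.integers(0, T - block_len + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_len)).ravel()[:T]
    return X[idx]                                        # same dates for every asset
```

Because a single index vector selects rows for every column, each resampled date keeps the cross-section of returns that was observed on that date.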

Second, remember that bootstrap expands sample size, not sample coverage. It can rearrange crises that happened. It cannot invent crisis dynamics, structural breaks, or market states that never appeared in the data.

That makes bootstrap a strong baseline but a weak stress-testing engine. It answers:

What if history had been realized in a different order, with local dependence partly preserved?

It does not answer:

What if the market enters a regime that never appeared in sample?

Common Mistakes

  • Using IID bootstrap for path-dependent evaluation and concluding a strategy is more robust than it really is.
  • Picking a block length mechanically without checking volatility-clustering diagnostics.
  • Looking only at marginal fit and ignoring path properties such as drawdown or burst persistence.
  • Treating bootstrap paths as if they were novel crisis scenarios.
  • Resampling prices directly instead of returns.

Connections

This primer supports Chapter 5's synthetic-data baselines and connects naturally to the stylized facts primer. Stylized facts tell you what realistic financial paths should look like; bootstrap methods tell you how much of that realism you can recover by recombining history alone.
