Model Families¶
ml4t-models is organized around four families. The split is deliberate: each family answers a different question about a return panel, and forcing them through one common interface would obscure that.
This page is the canonical map. Use it as a lookup; the user guide develops each family in depth and the academic references page collects the papers behind each implementation.
The Four Families At A Glance¶
| Family | Models | Native object | Predictive object |
|---|---|---|---|
| Latent factors | PCAModel, RPPCAModel, IPCAModel, CAEModel |
exposures + factor history | \(\beta \cdot \lambda\) after a factor-premium forecast |
| Stochastic discount factor | StochasticDiscountFactorModel |
tradable weight vector + SDF series | optional return projection through a beta head |
| Direct asset prediction | SAEModel |
asset-level signal | the signal itself |
| End-to-end portfolio learning | LinearFeaturePortfolioModel, LSTMPortfolioModel, DeepPortfolioModel |
sequence-to-allocation tensor | target weights |
Two principles run through the table:
- The native object and the predictive object are different in three of four families. Latent-factor models extract structure first; SDF models learn weights first; portfolio models learn weights from the start. Forecasting and weight extraction are kept separate so that downstream code (forecasters, mappers, postprocessors) can be swapped without retouching the structural estimator.
- The contract follows the economics. PCA needs a stable entity axis; IPCA tolerates ragged dated cross-sections; portfolio learning needs sequence batches with cost and turnover state. The library represents these distinctions with three batch types (
PersistentPanelBatch,CrossSectionBatch,PortfolioSequenceBatch) rather than collapsing them into one tensor.
Latent Factors¶
| Class | Contract | Loading structure | Estimator |
|---|---|---|---|
PCAModel |
PersistentPanelBatch |
static loadings | SVD on demeaned panel |
RPPCAModel |
PersistentPanelBatch |
static loadings, risk-premium-aware extraction | eigendecomposition of \(\Sigma + \gamma\, \bar{r}\bar{r}^{\top}\) |
IPCAModel |
CrossSectionBatch |
linear \(\beta_{i,t} = \Gamma^{\top} z_{i,t}\) | alternating least squares |
CAEModel |
CrossSectionBatch |
nonlinear \(\beta_{i,t} = g(z_{i,t}; W)\) | dual-network neural training, ensemble averaging |
Predictive pattern, shared across the family:
The pattern is enforced by LatentFactorForecastPipeline, which composes a structural model, a factor-premium forecaster, and a BetaLambdaMapper into a single object exposing fit and predict.
The split is not cosmetic. For IPCA and CAE, the in-sample fitted return uses realized factor returns; the implementable forecast must replace those with an ex-ante factor-premium estimate. Conflating the two is the most common chapter-to-production failure mode.
Stochastic Discount Factor¶
| Class | Contract | Estimator |
|---|---|---|
StochasticDiscountFactorModel |
CrossSectionBatch |
three-phase adversarial training (unconditional / moment / conditional) |
The native object is a portfolio-weight vector \(\hat\omega_{t,i}\) together with its induced SDF series \(\hat M_{t+1}\), not a \(\beta \cdot \lambda\) decomposition. Expected-return-style projections are handled by optional helpers:
LinearStochasticDiscountFactorReturnMapperStochasticDiscountFactorBetaNetworkHead
These are downstream transformations. They do not change the fact that the structural estimator is weight-native.
The model is phase-aware (n_epochs_unc, n_epochs_moment, n_epochs_cond) and checkpointed (checkpoint_epochs, default_checkpoint) so that long training horizons can be evaluated at sparse, named milestones rather than at dense intervals.
Direct Asset Prediction¶
| Class | Contract | Architecture |
|---|---|---|
SAEModel |
CrossSectionBatch |
encoder, decoder, auxiliary head, main predictive head |
SAEModel is the supervised autoencoder of the Jane Street competition lineage. It is treated as a direct cross-sectional predictor: predict(batch) returns AssetSignalResult, not AssetForecastResult. The model lives outside latent_factors because it does not factorize returns into \(\beta \cdot \lambda\), even though it shares the bottleneck-and-decoder shape with autoencoder factor models.
End-to-End Portfolio Learning¶
| Class | Contract | Style |
|---|---|---|
LinearFeaturePortfolioModel |
PortfolioSequenceBatch |
deterministic pooled-linear baseline |
LSTMPortfolioModel |
PortfolioSequenceBatch |
sequence baseline with variable selection and bounded head |
DeepPortfolioModel |
PortfolioSequenceBatch |
DeePM-style allocator with temporal and cross-sectional attention |
All three optimize a portfolio objective directly. The default training loop is differentiable Sharpe with turnover and cost penalties; it is not a "predict returns, then optimize" two-stage workflow. That is the design contract — if you want a forecast first, use the latent-factor pipeline.
Forecasters¶
Factor-premium forecasters consume LatentFactorState and produce FactorForecastResult:
| Class | Idea |
|---|---|
ExpandingMeanFactorForecaster |
sample-mean baseline; the standard premium estimate |
AR1FactorForecaster |
per-factor AR(1) with shrinkage to mean |
EWMABaseFactorForecaster |
EWMA with configurable half-life |
The expanding-mean forecaster is the default for two reasons: it requires no tuning, and it makes the failure modes of a richer forecaster easier to diagnose. Switch to AR(1) or EWMA only when there is direct evidence of premium time-variation in the in-sample factor returns.
Mappers¶
| Class | Idea |
|---|---|
BetaLambdaMapper |
computes \(\hat r_{i,t+1} = \beta_{i,t} \cdot \hat\lambda_{t+1}\) from the structural state and the factor-premium forecast |
This is the only mapper in v0.1. Anything more elaborate (state-conditional premia, regime-switching mappers) is roadmap.
Configs¶
Every estimator takes a frozen dataclass config. The conventions are:
model_name— used in metadata and run logs.n_factors— for latent-factor models.checkpoint_*— for neural models, controls which epochs are kept.default_checkpoint— the epoch returned byextract/predictwhen called without an explicitcheckpoint=argument.
Configs are immutable; instantiate a new one to change any field.
Integration¶
Boundary helpers live in ml4t.models.integration and are documented in the Integration user guide. The role of integration code is to translate model results into the long-format frames ml4t-backtest and ml4t-diagnostic consume, and to translate long-format frames into the typed batches each model expects. It does not perform evaluation, signal aggregation, or execution simulation.