ML4T Models
Build finance-native latent-factor, stochastic discount factor, direct signal, and portfolio-learning models without collapsing everything into one generic trainer.
ml4t.models is the modeling layer in the ML4T stack. It packages model families that matter in empirical asset pricing and portfolio construction while keeping the contracts explicit:
- what kind of data each model expects
- what object it estimates
- what must still happen before you have an implementable forecast or tradable weight vector
If you are new to the library, start with the Quickstart. If you are coming from Machine Learning for Trading, the Book Guide maps the chapter implementations to the production API.
- Latent Factors Done Explicitly
  Structural extraction, factor-premium forecasting, and asset mapping are separate stages. This keeps PCA, RP-PCA, IPCA, and CAE conceptually clean. See: Latent-Factor Pipelines
- No-Arbitrage SDF Modeling
  The stochastic discount factor family is weight-native and phase-aware. It is not forced into the same `beta × lambda` contract as latent-factor models. See: Stochastic Discount Factor
- End-To-End Portfolio Learning
  Learn allocations directly with deterministic, LSTM, and DeePM-style portfolio models. Keep allocation objectives separate from return forecasting logic. See: Portfolio Learning
- Built For The ML4T Stack
  Emit prediction and weight frames for `ml4t-backtest` and `ml4t-diagnostic` without duplicating evaluation logic inside the model library. See: Integration
Architecture At A Glance
Why This Library Exists
Many finance models look similar at the tensor level but behave very differently conceptually:
- `PCA` and `RP-PCA` estimate persistent-panel latent factors
- `IPCA` and `CAE` estimate conditional exposures from dated cross-sections
- `StochasticDiscountFactorModel` learns a no-arbitrage pricing object through weight-native training
- `SAEModel` is a direct supervised predictor
- portfolio models learn allocations directly rather than first forecasting returns
The library reflects those differences instead of hiding them behind one catch-all fit/predict story.
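To make the contrast concrete, here is a minimal NumPy sketch that feeds the same simulated return panel to two different estimation problems: PCA extracts latent factors, while a softmax-parameterized allocation is learned directly by gradient ascent on a mean-variance utility. None of this uses the `ml4t.models` API; the data, the utility function, the risk-aversion value, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_periods, n_assets, n_factors = 120, 30, 3

# Simulated stable-entity panel: the same 30 assets observed in every period.
returns = rng.normal(0.0005, 0.02, size=(n_periods, n_assets))

# (a) Latent-factor view: PCA via SVD of the demeaned panel.
# The estimated objects are loadings and factor realizations.
demeaned = returns - returns.mean(axis=0)
_, _, vt = np.linalg.svd(demeaned, full_matrices=False)
loadings = vt[:n_factors].T        # (n_assets, n_factors), orthonormal columns
factors = demeaned @ loadings      # (n_periods, n_factors)

# (b) Allocation view: the estimated object is a weight vector.
# Illustrative objective: mean-variance utility of a long-only portfolio.
mu = returns.mean(axis=0)
cov = np.cov(returns, rowvar=False)
risk_aversion = 5.0                # assumed value, not a library default


def utility(w: np.ndarray) -> float:
    return float(mu @ w - 0.5 * risk_aversion * w @ cov @ w)


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()


logits = np.zeros(n_assets)        # start from equal weights
initial_utility = utility(softmax(logits))
for _ in range(500):
    w = softmax(logits)
    grad_w = mu - risk_aversion * cov @ w
    # Chain rule through softmax: J^T g = w * (g - w . g)
    logits += 50.0 * w * (grad_w - w @ grad_w)

weights = softmax(logits)
final_utility = utility(weights)

print(loadings.shape, factors.shape, weights.shape)
```

Both halves consume an identical `(n_periods, n_assets)` array, yet the first produces a factor structure that still needs a premium forecast before it says anything about expected returns, while the second produces weights with no return forecast at all.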
Quick Example
```python
import numpy as np

from ml4t.models import (
    BetaLambdaMapper,
    CrossSectionBatch,
    ExpandingMeanFactorForecaster,
    IPCAConfig,
    IPCAModel,
    LatentFactorForecastPipeline,
)

# Synthetic data: 24 dated cross-sections, 150 assets, 10 characteristics.
batch = CrossSectionBatch(
    characteristics=np.random.randn(24, 150, 10),
    returns=np.random.randn(24, 150),
    timestamps=tuple(range(24)),
)

# Extraction, premium forecasting, and asset mapping are separate components.
pipeline = LatentFactorForecastPipeline(
    model=IPCAModel(IPCAConfig(n_factors=3)),
    forecaster=ExpandingMeanFactorForecaster(),
    mapper=BetaLambdaMapper(),
)

pipeline.fit(batch)
prediction = pipeline.predict(batch)

print(prediction.state.asset_betas.shape)
print(prediction.asset_forecast.expected_returns.shape)
```
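The three pipeline stages can also be sketched in plain NumPy to show what each one contributes. This is not the library's implementation: the betas and factor returns are simulated rather than estimated, and the sketch assumes that an expanding-mean forecaster averages all factor returns observed so far and that a beta-lambda mapping multiplies asset betas by the forecast premia, as the component names suggest.

```python
import numpy as np

rng = np.random.default_rng(0)
n_periods, n_assets, n_factors = 24, 150, 3

# Stage 1 (structural extraction): stand-ins for what a fitted latent-factor
# model would produce. Both arrays are simulated here, not estimated.
factor_returns = rng.normal(0.01, 0.05, size=(n_periods, n_factors))
asset_betas = rng.normal(size=(n_assets, n_factors))

# Stage 2 (factor-premium forecasting): an expanding mean uses every factor
# return seen so far; row t is the premium estimate after period t.
counts = np.arange(1, n_periods + 1)[:, None]
expanding_premia = np.cumsum(factor_returns, axis=0) / counts
lam = expanding_premia[-1]             # latest forecast, shape (n_factors,)

# Stage 3 (asset mapping): expected return of asset i is beta_i . lambda.
expected_returns = asset_betas @ lam   # shape (n_assets,)

print(asset_betas.shape)
print(expected_returns.shape)
```

Keeping the stages apart means the premium forecaster can be swapped without touching the extraction step, which is the separation the pipeline object makes explicit.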
Three Core Contracts
| Contract | Used by | What it represents |
|---|---|---|
| `PersistentPanelBatch` | `PCAModel`, `RPPCAModel` | stable-entity return panel |
| `CrossSectionBatch` | `IPCAModel`, `CAEModel`, `SAEModel`, `StochasticDiscountFactorModel` | dated observed cross-sections, ragged by construction |
| `PortfolioSequenceBatch` | `LinearFeaturePortfolioModel`, `LSTMPortfolioModel`, `DeepPortfolioModel` | sequence-to-allocation learning |
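The difference between the three data shapes in the table can be illustrated without the library. In the sketch below, `DatedCrossSection` is a hypothetical stand-in defined only for this example; it does not describe how `CrossSectionBatch` or the other batch classes store their data, and the dimensions and lookback length are arbitrary.

```python
from dataclasses import dataclass

import numpy as np

rng = np.random.default_rng(2)

# 1) Stable-entity panel: one dense matrix, the same entities in every row.
panel_returns = rng.normal(size=(12, 5))   # (n_periods, n_entities)


# 2) Dated cross-sections: the asset count can change from date to date,
#    so each date carries its own arrays (hypothetical container).
@dataclass(frozen=True)
class DatedCrossSection:
    timestamp: int
    characteristics: np.ndarray   # (n_assets_t, n_characteristics)
    returns: np.ndarray           # (n_assets_t,)

    def __post_init__(self) -> None:
        if self.characteristics.shape[0] != self.returns.shape[0]:
            raise ValueError("characteristics and returns must cover the same assets")


asset_counts = [3, 5, 4]
cross_sections = [
    DatedCrossSection(t, rng.normal(size=(n, 2)), rng.normal(size=n))
    for t, n in enumerate(asset_counts)
]

# 3) Sequence-to-allocation: each sample is a lookback window of features,
#    paired with the returns realized in the period right after the window.
lookback = 4
features = rng.normal(size=(10, 5, 2))     # (n_periods, n_assets, n_features)
asset_returns = rng.normal(size=(10, 5))   # (n_periods, n_assets)
windows = np.stack(
    [features[t - lookback:t] for t in range(lookback, len(features))]
)
targets = asset_returns[lookback:]

print(panel_returns.shape)
print([cs.returns.shape[0] for cs in cross_sections])
print(windows.shape, targets.shape)
```

The panel is a single rectangular array, the cross-sections form a list whose elements differ in length, and the sequence samples add a lookback axis, which is why one generic batch type would blur what each model family actually consumes.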
Model Families
latent_factors
├── PCAModel
├── RPPCAModel
├── IPCAModel
└── CAEModel
stochastic_discount_factor
└── StochasticDiscountFactorModel
asset_prediction
└── SAEModel
portfolio
├── LinearFeaturePortfolioModel
├── LSTMPortfolioModel
└── DeepPortfolioModel