ML4T Diagnostic
ML4T Diagnostic Documentation
Feature validation, strategy diagnostics, and Deflated Sharpe Ratio

Feature Diagnostics

Analyze feature quality, importance, and interactions before modeling.

See it in the book

In Ch08, code/08_feature_engineering/06_robustness_sensitivity.py shows HAC-aware robustness checks. In Ch09, code/09_model_based_features/01_visual_diagnostics.py, 07_arima_features.py, 08_garch_volatility.py, and 12_wasserstein_regimes.py connect the diagnostic primitives described here to model-based features and regime analysis. The case-study evaluation notebooks under code/case_studies/*/05_evaluation.py apply the same ideas to real cross-sectional datasets.

Quick Start

from ml4t.diagnostic.evaluation import FeatureDiagnostics
from ml4t.diagnostic.config import DiagnosticConfig

config = DiagnosticConfig.for_research()
fd = FeatureDiagnostics(config=config)
result = fd.run_diagnostics(features_df["feature_1"], name="feature_1")

Information Coefficient (IC)

Measure predictive power via rank correlation:

from ml4t.diagnostic.evaluation.metrics import compute_ic_series

ic_result = compute_ic_series(
    predictions=pred_df,          # date, symbol, prediction
    returns=ret_df,               # date, symbol, forward_return
    pred_col="prediction",
    ret_col="forward_return",
    date_col="date",
    entity_col="symbol",
    method="spearman",
)

print(ic_result.head())
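To make "rank correlation" concrete, the following is a minimal standard-library sketch of a per-date Spearman IC: rank predictions and forward returns within each date, then take the Pearson correlation of the ranks. The helper names (`average_ranks`, `pearson`, `spearman_ic_by_date`) and the sample rows are illustrative assumptions, not part of the ml4t API, and the sketch does not handle dates where either column is constant.

```python
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman_ic_by_date(rows):
    """rows: iterable of (date, symbol, prediction, forward_return) tuples."""
    by_date = defaultdict(list)
    for date, _symbol, pred, ret in rows:
        by_date[date].append((pred, ret))
    ic = {}
    for date, pairs in sorted(by_date.items()):
        preds, rets = zip(*pairs)
        ic[date] = pearson(average_ranks(preds), average_ranks(rets))
    return ic


# Hypothetical toy data: perfect ranking on the first date, inverted on the second.
rows = [
    ("2024-01-02", "A", 0.9, 0.03),
    ("2024-01-02", "B", 0.5, 0.01),
    ("2024-01-02", "C", 0.1, -0.02),
    ("2024-01-03", "A", 0.2, 0.04),
    ("2024-01-03", "B", 0.6, 0.00),
    ("2024-01-03", "C", 0.8, -0.01),
]
ic_by_date = spearman_ic_by_date(rows)
```

Each date contributes one IC value, so the result is a time series that the metrics in the next section summarize.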

IC Metrics

Metric    | Formula         | Interpretation
----------|-----------------|--------------------------
IC Mean   | mean(IC)        | Average predictive power
IC Std    | std(IC)         | Consistency
IC IR     | mean/std        | Risk-adjusted IC
IC t-stat | mean / (std/√n) | Statistical significance
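The formulas in the table can be applied directly to a list of per-period IC values. The sketch below assumes the sample standard deviation (ddof=1) and a hypothetical helper name `summarize_ic`; which convention the library uses is not stated here. The plain t-statistic also assumes serially uncorrelated ICs, which is why HAC-aware variants exist.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_ic(ic_values):
    """Summarize a series of per-period IC values using the table's formulas."""
    n = len(ic_values)
    ic_mean = mean(ic_values)
    ic_std = stdev(ic_values)  # sample standard deviation (assumption: ddof=1)
    return {
        "ic_mean": ic_mean,
        "ic_std": ic_std,
        "ic_ir": ic_mean / ic_std,
        "ic_t_stat": ic_mean / (ic_std / sqrt(n)),
    }


# Hypothetical IC values for eight periods.
summary = summarize_ic([0.02, 0.05, -0.01, 0.04, 0.03, 0.00, 0.06, 0.01])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Note that the t-statistic is just the IC IR scaled by √n, so a modest IR can still be significant over a long enough sample.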

Feature Importance

Seven methods with consensus ranking; three of them (MDI, PFI, and SHAP) are shown here.

Mean Decrease Impurity (MDI)

from ml4t.diagnostic.evaluation.metrics import compute_mdi_importance

importance = compute_mdi_importance(
    model=trained_tree_model,
    feature_names=feature_names
)

Permutation Feature Importance (PFI)

from ml4t.diagnostic.evaluation.metrics import compute_permutation_importance

importance = compute_permutation_importance(
    model=model,
    X=X_test,
    y=y_test,
    n_repeats=10
)
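For intuition about what permutation importance measures, here is a minimal standard-library sketch: shuffle one feature column at a time and record how much the error increases relative to the unshuffled baseline. The function `permutation_importance`, the MSE scoring, and the toy model are assumptions for illustration; they are not the signature or internals of `compute_permutation_importance`.

```python
import random


def mse(model, X, y):
    """Mean squared error of a callable model over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model that uses feature 0 and ignores feature 1."""
    return 2.0 * row[0]


data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(toy_model, X, y)
```

Shuffling the ignored feature leaves the error unchanged, so its importance is zero, while shuffling the feature the model relies on raises the error sharply.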

SHAP Importance

from ml4t.diagnostic.evaluation.metrics import compute_shap_importance

importance = compute_shap_importance(
    model=model,
    X=X_background,
    n_samples=100
)

Consensus Ranking

Run a combined tear-sheet style comparison:

from ml4t.diagnostic.evaluation.metrics import analyze_ml_importance

analysis = analyze_ml_importance(
    model=model,
    X=X_train,
    y=y_train,
    methods=["mdi", "pfi", "shap"],
)

print(analysis["top_features_consensus"])
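One simple way to build a consensus across methods is to rank features within each method and average the ranks. The sketch below illustrates that idea only; how `analyze_ml_importance` actually forms `top_features_consensus` is not described here, and the function name, scores, and feature names are hypothetical. It assumes every method scores the same set of features.

```python
def consensus_ranking(importances_by_method):
    """Order features by mean rank across methods (rank 1 = most important).

    importances_by_method: {method_name: {feature_name: score}}, higher = better.
    """
    rank_sums = {}
    for scores in importances_by_method.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, feature in enumerate(ordered, start=1):
            rank_sums[feature] = rank_sums.get(feature, 0) + rank
    n_methods = len(importances_by_method)
    mean_rank = {f: total / n_methods for f, total in rank_sums.items()}
    # Break ties alphabetically so the output is deterministic.
    return sorted(mean_rank, key=lambda f: (mean_rank[f], f))


# Hypothetical scores from three methods on three features.
scores = {
    "mdi": {"momentum": 0.50, "volatility": 0.30, "volume": 0.20},
    "pfi": {"momentum": 0.10, "volatility": 0.15, "volume": 0.01},
    "shap": {"momentum": 0.40, "volatility": 0.35, "volume": 0.05},
}
consensus = consensus_ranking(scores)
```

Averaging ranks rather than raw scores avoids mixing scales, since MDI, PFI, and SHAP values are not directly comparable in magnitude.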

Feature Interactions

Detect non-linear interactions between a pair of features using the H-statistic:

from ml4t.diagnostic.evaluation.metrics import compute_h_statistic

h_stat = compute_h_statistic(
    model=model,
    X=X,
    features=['momentum', 'volatility']
)

print(f"Interaction strength: {h_stat:.3f}")
# Rule of thumb: values above 0.1 suggest a meaningful interaction
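The pairwise H-statistic compares the joint partial dependence of two features with the sum of their individual partial dependences: if the model is additive in the pair, the two agree and the statistic is zero. The following is a brute-force standard-library sketch of Friedman and Popescu's definition. The function `h_statistic` and the toy models are assumptions; the sketch returns H (the square root of H²), and whether `compute_h_statistic` returns H or H² is not stated here.

```python
from math import sqrt


def h_statistic(model, X, j, k):
    """Pairwise H-statistic for features j and k of a callable model."""
    n = len(X)

    def replaced(row, updates):
        new_row = list(row)
        for idx, value in updates.items():
            new_row[idx] = value
        return new_row

    def partial_dependence(updates):
        # Average prediction with the chosen features fixed for every row.
        return sum(model(replaced(row, updates)) for row in X) / n

    def centered(values):
        m = sum(values) / len(values)
        return [v - m for v in values]

    pd_j = centered([partial_dependence({j: row[j]}) for row in X])
    pd_k = centered([partial_dependence({k: row[k]}) for row in X])
    pd_jk = centered([partial_dependence({j: row[j], k: row[k]}) for row in X])

    numerator = sum((ab - a - b) ** 2 for ab, a, b in zip(pd_jk, pd_j, pd_k))
    denominator = sum(ab ** 2 for ab in pd_jk)
    return sqrt(numerator / denominator)


# Hypothetical data: a symmetric 5 x 5 grid over two features.
levels = [-2.0, -1.0, 0.0, 1.0, 2.0]
X = [[a, b] for a in levels for b in levels]

h_additive = h_statistic(lambda row: row[0] + row[1], X, 0, 1)  # no interaction
h_product = h_statistic(lambda row: row[0] * row[1], X, 0, 1)   # pure interaction
```

This version needs on the order of n² model evaluations, so practical implementations usually subsample the rows first.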

Stationarity Tests

Check whether features are stationary using the ADF, KPSS, and Phillips-Perron (PP) tests:

from ml4t.diagnostic.evaluation.stationarity import analyze_stationarity

result = analyze_stationarity(
    series=feature_series,
    tests=['adf', 'kpss', 'pp']
)

print(f"ADF p-value: {result.adf_pvalue:.4f}")
print(f"Is stationary: {result.is_stationary}")
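When reading the results, keep in mind that ADF and KPSS have opposite null hypotheses: ADF's null is a unit root (non-stationary), while KPSS's null is stationarity. The sketch below shows one common way to reconcile the two p-values. The helper `stationarity_verdict` and the 0.05 threshold are hypothetical; how the library derives `is_stationary` is not described here.

```python
def stationarity_verdict(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF and KPSS p-values into a single verdict.

    ADF null: unit root (non-stationary). KPSS null: stationary.
    """
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    # The two tests disagree; inspect the series (trend, breaks) before deciding.
    return "inconclusive"


# Hypothetical p-values for three features.
verdicts = {
    "returns": stationarity_verdict(adf_pvalue=0.001, kpss_pvalue=0.40),
    "price_level": stationarity_verdict(adf_pvalue=0.62, kpss_pvalue=0.01),
    "trend_feature": stationarity_verdict(adf_pvalue=0.30, kpss_pvalue=0.20),
}
```

Treating disagreement as inconclusive is a conservative choice; a feature in that bucket is often differenced or detrended and then retested.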

Distribution Analysis

Check for heavy tails and normality:

from ml4t.diagnostic.evaluation.distribution import analyze_distribution

result = analyze_distribution(feature_series)

print(f"Skewness: {result.moments_result.skewness:.2f}")
print(f"Excess Kurtosis: {result.moments_result.excess_kurtosis:.2f}")  # Fisher convention (normal=0)
print(f"Jarque-Bera p-value: {result.jarque_bera_result.p_value:.4f}")
print(f"Is normal: {result.is_normal}")
print(f"Recommendation: {result.recommended_distribution}")
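The quantities printed above are related by a short formula: the Jarque-Bera statistic is JB = n/6 · (S² + K²/4), where S is the skewness and K the excess kurtosis, and under normality it is asymptotically chi-square with 2 degrees of freedom. The standard-library sketch below uses population (biased) moments and the hypothetical helper `jarque_bera`; the library's own estimators may differ, and the asymptotic p-value is unreliable for small samples.

```python
from math import exp


def jarque_bera(values):
    """Return (skewness, excess_kurtosis, jb_statistic, p_value)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n  # population variance
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # Fisher convention (normal = 0)
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    p_value = exp(-jb / 2.0)  # survival function of chi-square with 2 dof
    return skewness, excess_kurtosis, jb, p_value


# Hypothetical samples: a flat symmetric one and one with two large outliers.
flat = [-2.0, -1.0, 0.0, 1.0, 2.0]
heavy_tailed = [0.0] * 18 + [5.0, -5.0]

flat_result = jarque_bera(flat)
heavy_result = jarque_bera(heavy_tailed)
print(f"flat: excess kurtosis {flat_result[1]:.2f}, p-value {flat_result[3]:.4f}")
print(f"heavy: excess kurtosis {heavy_result[1]:.2f}, p-value {heavy_result[3]:.2e}")
```

Both samples are symmetric, so the skewness term vanishes and the difference in the statistic comes entirely from the tails.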

Drift Detection

Monitor feature distribution changes:

from ml4t.diagnostic.evaluation.drift import analyze_drift

result = analyze_drift(
    train_features=X_train,
    test_features=X_test
)

print(f"PSI: {result.psi:.4f}")
# Rule of thumb: PSI above 0.25 indicates significant drift
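The Population Stability Index compares binned proportions of a feature between two samples: PSI = Σ (aᵢ − eᵢ) · ln(aᵢ / eᵢ), where eᵢ and aᵢ are the shares of training and test observations in bin i. The sketch below is a standard-library illustration for a single feature. The function name `psi`, the quantile binning on the training sample, and the `eps` floor for empty bins are assumptions, not the internals of `analyze_drift`.

```python
from bisect import bisect_right
from math import log


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index of `actual` relative to `expected`."""
    sorted_exp = sorted(expected)
    # Interior bin edges at the quantiles of the expected (training) sample.
    edges = [sorted_exp[int(len(sorted_exp) * q / n_bins)] for q in range(1, n_bins)]

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor empty bins at eps so the logarithm stays finite.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: the test sample is the training sample shifted by 3.
train = [i / 100 for i in range(1000)]
test_shifted = [v + 3.0 for v in train]

psi_same = psi(train, train)
psi_shifted = psi(train, test_shifted)
print(f"PSI (identical): {psi_same:.4f}")
print(f"PSI (shifted):   {psi_shifted:.4f}")
```

Identical samples give a PSI of zero, while the shifted sample empties the lowest bins and lands well above the 0.25 threshold.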

Complete Workflow

from ml4t.diagnostic.evaluation import FeatureDiagnostics

fd = FeatureDiagnostics()
result = fd.run_diagnostics(features_df["feature_1"], name="feature_1")

# Review all diagnostics
print(result.summary())

# Get warnings
for warning in result.warnings:
    print(f"⚠️ {warning}")

# Export report
result.to_html("feature_diagnostics.html")