Chapter 12: Advanced Models for Tabular Data

Leakage-Safe Categorical Encoding for Financial ML (Intermediate)

Categorical encoding becomes dangerous when a feature value quietly contains information from the target you are trying to predict.
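
One common way this happens is target (mean) encoding, where each category is replaced by the average target observed for it. The sketch below is a minimal illustration, not a definitive implementation: the function names, the toy data, and the fixed `prior` value are all assumptions made up for this example. It contrasts a naive encoder that averages over every row, including the row being encoded, with an ordered encoder that assumes rows are sorted by time and uses only earlier rows.

```python
from collections import defaultdict


def naive_target_encode(categories, targets):
    """Mean target per category over ALL rows.

    Leaky: each row's encoding includes that row's own label.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    return [sums[c] / counts[c] for c in categories]


def ordered_target_encode(categories, targets, prior, prior_weight=1.0):
    """Encode each row from rows strictly before it.

    Assumes rows are already sorted by time. `prior` is a hypothetical
    fallback value, blended in with weight `prior_weight`, so that a
    category with no history still gets a finite encoding.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    encoded = []
    for c, y in zip(categories, targets):
        # Use only statistics accumulated from earlier rows.
        encoded.append((sums[c] + prior * prior_weight) / (counts[c] + prior_weight))
        # Update the running statistics only after encoding the row.
        sums[c] += y
        counts[c] += 1
    return encoded


# Toy, made-up data: every category appears once, so the feature
# carries no genuine signal about the target.
categories = ["A", "B", "C", "D"]
targets = [1, 0, 1, 0]

naive = naive_target_encode(categories, targets)
safe = ordered_target_encode(categories, targets, prior=0.5)

print(naive)  # [1.0, 0.0, 1.0, 0.0], identical to the targets
print(safe)   # [0.5, 0.5, 0.5, 0.5], no label information
```

The naive encoding reproduces the target exactly, so a model trained on it would look perfect in-sample while having learned nothing. The ordered version gives up that spurious signal. In practice the prior would itself need to come from training data only; a constant is used here to keep the sketch short.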
