Four options for handling missing values (NaNs):
1. Drop rows containing NaNs
2. Drop columns containing NaNs
3. Fill NaNs with imputed values
4. Use a model that natively handles NaNs (NEW!)
Note: Beginning in scikit-learn 1.0, HistGradientBoostingClassifier and HistGradientBoostingRegressor are considered stable (rather than experimental), so you no longer have to enable them explicitly.
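If the same code has to run on both older and newer releases, one possible approach is to enable the experimental flag only when needed. This is a sketch rather than an official recipe: the version check on the major number is an assumption made here for illustration, and it relies on the sklearn.experimental.enable_hist_gradient_boosting module that pre-1.0 releases used for this purpose.

```python
import sklearn

# Compare only the major version: 0.x releases need the experimental flag
major = int(sklearn.__version__.split(".")[0])
if major < 1:
    # Importing this module makes the estimators importable on 0.x releases
    from sklearn.experimental import enable_hist_gradient_boosting  # noqa: F401

from sklearn.ensemble import (
    HistGradientBoostingClassifier,
    HistGradientBoostingRegressor,
)
```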
Watch all tips: https://www.youtube.com/playlist?list=PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6
Code for all tips: https://github.com/justmarkham/scikit-learn-tips
Get tips via email: https://scikit-learn.tips