Reasons to use scikit-learn (not pandas) for ML preprocessing:
1. You can cross-validate the entire workflow
2. You can grid search model & preprocessing hyperparameters
3. Avoids adding new columns to the source DataFrame
4. pandas lacks separate fit/transform steps to prevent data leakage

New tips every TUESDAY and THURSDAY!

Watch all tips: https://www.youtube.com/playlist?list=PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6
Code for all tips: https://github.com/justmarkham/scikit-learn-tips
Get tips via email: https://scikit-learn.tips


=== WANT TO GET BETTER AT MACHINE LEARNING? ===

1) WATCH my video series: https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A

2) ENROLL in my courses: https://www.dataschool.io/ml-courses/

3) LET'S CONNECT!
- Newsletter: https://www.dataschool.io/subscribe/
- Twitter: https://twitter.com/justmarkham
- Facebook: https://www.facebook.com/DataScienceSchool/
- LinkedIn: https://www.linkedin.com/in/justmarkham/