Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text into data that is usable by machine learning models, you drastically increase the amount of data that your models can learn from. In this tutorial, we'll build and evaluate predictive models from real-world text using scikit-learn. (Presented at PyCon on May 28, 2016.)
GitHub repository: https://github.com/justmarkham/pycon-2016-tutorial
Enroll in my online course: https://courses.dataschool.io/machine-learning-with-text-in-python
== OTHER RESOURCES ==
My scikit-learn video series: https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A
My pandas video series: https://www.youtube.com/playlist?list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y
== LET'S CONNECT! ==
Newsletter: https://www.dataschool.io/subscribe/
Twitter: https://twitter.com/justmarkham
Facebook: https://www.facebook.com/DataScienceSchool/
LinkedIn: https://www.linkedin.com/in/justmarkham/
YouTube: https://www.youtube.com/user/dataschool?sub_confirmation=1
JOIN the "Data School Insiders" community and receive exclusive rewards:
https://www.patreon.com/dataschool