dplyr is a new R package for data manipulation. Using a series of examples on a dataset you can download, this tutorial covers the five basic dplyr "verbs" as well as a dozen other dplyr functions.
Watch the follow-up tutorial: http://youtu.be/2mh1PqfsXVI
View the R Markdown document: http://rpubs.com/justmarkham/dplyr-tutorial
Download the source document: https://github.com/justmarkham/dplyr-tutorial
Read about why I love dplyr: https://www.dataschool.io/dplyr-tutorial-for-faster-data-manipulation-in-r/
== TUTORIAL CONTENTS ==
1. Introduction to dplyr (starts at 0:00)
2. Loading dplyr and the example dataset (starts at 2:29)
3. Understanding "local data frames" (starts at 3:23)
4. Verb #1: `filter` (starts at 5:17)
5. Verb #2: `select`, plus `contains`, `starts_with`, `ends_with`, `matches` (starts at 7:54)
6. Using chaining syntax for more readable code (starts at 9:34)
7. Verb #3: `arrange` (starts at 12:53)
8. Verb #4: `mutate` (starts at 13:55)
9. Verb #5: `summarise`, plus `group_by`, `summarise_each`, `n`, `n_distinct`, `tally` (starts at 15:31)
10. Window functions: `min_rank`, `top_n`, `lag` (starts at 26:47)
11. Convenience functions: `sample_n`, `sample_frac`, `glimpse` (starts at 32:44)
12. Connecting to databases (starts at 34:21)
== RESOURCES ==
Reference manual and vignettes: http://cran.r-project.org/web/packages/dplyr/index.html
July 2014 webinar: http://pages.rstudio.net/Webinar-Series-Recording-Essential-Tools-for-R.html
July 2014 webinar code: https://github.com/rstudio/webinars/tree/master/2014-01
Tutorial by Hadley Wickham: https://www.dropbox.com/sh/i8qnluwmuieicxc/AAAgt9tIKoIm7WZKIyK25lh6a
GitHub repo: https://github.com/hadley/dplyr
List of releases: https://github.com/hadley/dplyr/releases