
Top Downloaded & Most Discussed R Packages

Guest blog post.

Vozag downloaded CRAN data from the R project to find the most downloaded packages and the ones that generate the most discussion. Given below is a list of the top 20 packages by number of downloads in a single day. The full list of the top 100 most downloaded R packages is here.

Rank  Package        No. of Downloads
1     Rcpp           1960
2     ggplot2        1785
3     digest         1709
4     reshape2       1651
5     plyr           1634
6     rJava          1577
7     stringr        1549
8     RColorBrewer   1497
9     colorspace     1372
10    manipulate     1363
11    scales         1347
12    labeling       1320
13    proto          1301
14    munsell        1291
15    gtable         1290
16    dichromat      1289
17    RCurl          1144
18    zoo            1085
19    mime           1038
20    RcppEigen      1033
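
A ranking like the one above can be produced by counting rows in a download log. The post does not say which log file or tooling Vozag used, so the following is only a minimal sketch in Python. It assumes a CSV with one row per download and a `package` column; the function name `top_packages` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def top_packages(log_file, n=20):
    """Return the n most-downloaded packages as (package, count) pairs.

    Assumes a CSV with a header row and one row per download,
    including a `package` column (an assumption, not from the post).
    """
    reader = csv.DictReader(log_file)
    counts = Counter(row["package"] for row in reader)
    return counts.most_common(n)


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE_LOG = """date,package
2015-01-01,Rcpp
2015-01-01,ggplot2
2015-01-01,Rcpp
2015-01-01,digest
2015-01-01,ggplot2
2015-01-01,Rcpp
"""

for rank, (package, downloads) in enumerate(
    top_packages(io.StringIO(SAMPLE_LOG), n=3), start=1
):
    print(rank, package, downloads)
```

With a real log, the same function could be given an open file handle instead of the in-memory sample.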

 

We then analyzed Stack Overflow data to find the most discussed packages, looking at which ones had the most questions and the most unanswered questions.

ggplot2 ranked first with about 7,200 questions, followed by data.table (2,135), plyr (1,213) and knitr (1,136). Every other package had fewer than 1,000 questions. We also looked at unanswered questions: the packages with the highest percentage of unanswered questions were knitr, lattice and igraph, at 24.2%, 23.5% and 19.8% respectively.

Comparing the most downloaded packages with the most discussed ones shows little correlation between them. For instance, ggplot2 has the most questions asked and is the second most downloaded package, but data.table (second in questions asked) is not even among the top 100 downloaded packages. knitr is another example: it is in the top 5 for questions asked but ranks only 27th in downloads.
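
To make the mismatch concrete, the figures quoted in this post can be put side by side. This is a small sketch in Python, not the analysis Vozag ran: the names `QUESTIONS`, `DOWNLOAD_RANK` and `compare_ranks` are hypothetical, the ggplot2 question count is the approximate figure given above, and `None` stands for "not in the top 100 downloads".

```python
# Question counts and download ranks as quoted in the post.
# The ggplot2 count is approximate (~7200).
QUESTIONS = {"ggplot2": 7200, "data.table": 2135, "plyr": 1213, "knitr": 1136}
# None means the package is not in the top 100 downloads.
DOWNLOAD_RANK = {"ggplot2": 2, "data.table": None, "plyr": 5, "knitr": 27}


def compare_ranks(questions, download_rank):
    """Pair each package's rank by question count with its download rank."""
    by_questions = sorted(questions, key=questions.get, reverse=True)
    return [
        (pkg, q_rank, download_rank.get(pkg))
        for q_rank, pkg in enumerate(by_questions, start=1)
    ]


for pkg, q_rank, d_rank in compare_ranks(QUESTIONS, DOWNLOAD_RANK):
    shown = d_rank if d_rank is not None else "outside top 100"
    print(f"{pkg:<12} question rank {q_rank}, download rank {shown}")
```

Four packages are too few to compute a meaningful correlation, so the sketch only lines the two rankings up for inspection.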

So, should the R community focus on resolving issues in the packages that draw the most questions, rather than on the ones with the most downloads?

===

Guest Post by Vozag



Comment by Peter Tomany on April 8, 2015 at 7:38am

Part of the reason could be that the Data.Table and Knitr packages are used in some popular MOOCs (I know that they are both used in the Coursera Data Science MOOC, for instance) and that this results in a disproportionately high number of questions related to these packages from relative beginners in R.
