Extensions to the R statistical programming language
R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network).[1][2] The large number of packages available for R, and the ease of installing and using them, have been cited as major factors driving the widespread adoption of the language in data science.[3][4][5][6]
Compared to libraries in other programming languages, R packages must conform to a relatively strict specification.[3] The Writing R Extensions manual[7] specifies a standard directory structure for R source code, data, documentation, and package metadata, which enables them to be installed and loaded using R's in-built package management tools.[3] Packages distributed on CRAN must meet additional standards.[3][8] According to John Chambers, whilst these requirements "impose considerable demands" on package developers, they improve the usability and long-term stability of packages for end users.[3]
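The package metadata mentioned above lives in a plain-text file named DESCRIPTION, made up of "Field: value" records. The following is a minimal sketch, in Python rather than R, of how such a file could be read and checked for a few of the fields that the Writing R Extensions manual lists as mandatory. The package name examplepkg and the helper functions are hypothetical, and R's own tools perform far more thorough checks than this.

```python
import re

# A subset of the fields that "Writing R Extensions" lists as mandatory.
# Author and Maintainer are omitted here because they can instead be
# derived from an Authors@R field.
REQUIRED_FIELDS = ("Package", "Version", "License", "Title", "Description")


def parse_description(text):
    """Parse 'Field: value' records; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and current is not None:
            fields[current] += " " + line.strip()
            continue
        match = re.match(r"^([A-Za-z0-9@/_.-]+):\s*(.*)$", line)
        if match is None:
            raise ValueError(f"not a 'Field: value' line: {line!r}")
        current = match.group(1)
        fields[current] = match.group(2).strip()
    return fields


def missing_fields(fields):
    """Return the required field names that are absent, in a fixed order."""
    return [name for name in REQUIRED_FIELDS if name not in fields]


# Hypothetical metadata for an illustrative package.
EXAMPLE = """\
Package: examplepkg
Version: 0.1.0
Title: A Hypothetical Example Package
Description: Shows the layout of package metadata.
    This line continues the Description field.
License: GPL-3
Imports: stats, utils
"""

fields = parse_description(EXAMPLE)
print(fields["Package"], fields["Version"])
print("missing:", missing_fields(fields))
```

Because every package exposes its metadata in the same layout, generic tools can install, load, and check any package without package-specific logic.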
The "Task Views" page (subject list) on the CRAN website[16] lists a wide range of tasks (in fields such as finance, genetics, high performance computing, machine learning, medical imaging, meta-analysis, social sciences and spatial statistics) for which R packages are available. Another way to browse CRAN packages is provided by Metacran,[17] which also maintains lists of featured, most downloaded, trending or most depended upon packages.
The number of CRAN packages has grown exponentially for many years,[18] and as of 2018 an average of 21 submissions of new or updated packages were made every day.[6] Each submission is manually reviewed by a small team of CRAN maintainers, many of whom, according to R core developer Peter Dalgaard, are "approaching pensionable age", and there is concern that this system is not sustainable in the long term.[6] The growth of CRAN has also exposed limitations of its dependency management infrastructure. In particular, it assumes that dependencies always refer to the latest version of a package, so new releases of CRAN packages must always be backwards compatible,[19] and CRAN packages cannot have dependencies that are not on CRAN.[20] The growth has also led to concerns about declining quality of packages.[21]
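The "latest version only" assumption can be illustrated with a toy model. The sketch below is written in Python and is not how CRAN is implemented; the class LatestOnlyIndex and the package name helperpkg are hypothetical. It shows an index that keeps a single current version per package, so a dependent package has no way to ask for an older release once a new one is published.

```python
import re


def parse_version(text):
    """Turn '1.4.0' or '1.4-2' into a tuple of integers for comparison."""
    return tuple(int(part) for part in re.split(r"[.-]", text))


class LatestOnlyIndex:
    """Toy repository index that exposes one (the newest) version per package."""

    def __init__(self):
        self._latest = {}

    def publish(self, name, version):
        # A newer release replaces the previous one; older releases are ignored.
        current = self._latest.get(name)
        if current is None or parse_version(version) > parse_version(current):
            self._latest[name] = version

    def resolve(self, name):
        # Dependents always receive whatever version is current.
        return self._latest[name]


index = LatestOnlyIndex()
index.publish("helperpkg", "1.4.0")
before = index.resolve("helperpkg")  # what a dependent was developed against

index.publish("helperpkg", "2.0.0")
after = index.resolve("helperpkg")   # what a fresh install now receives

print(before, "->", after)
```

In this model a breaking change in the 2.0.0 release would immediately affect every dependent package, which is why backwards compatibility becomes a requirement rather than a courtesy.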
The Microsoft R Application Network (MRAN) was a mirror of CRAN maintained by Microsoft and based on the company's downstream distribution of R, Microsoft R Open (formerly Revolution R Open).[22] It also included an archive of daily CRAN snapshots, branded as the "CRAN Time Machine", which enabled users of MRAN to bypass the dependency versioning limitations of CRAN by installing a fixed set of R package versions via the checkpoint package.[23][24] In January 2023 Microsoft announced that MRAN was being retired, and the associated websites and repositories became unavailable in July 2023.[25]
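The idea behind a dated snapshot can be sketched as follows. This is a toy model in Python, not the implementation used by MRAN or the checkpoint package, and the release history and package names are invented for illustration. Given a snapshot date, it selects for each package the newest version released on or before that date, which yields a fixed, reproducible set of versions.

```python
from datetime import date

# Hypothetical release history: (package, version, release date).
RELEASES = [
    ("pkgA", "1.0.0", date(2020, 1, 10)),
    ("pkgA", "1.1.0", date(2020, 6, 1)),
    ("pkgA", "2.0.0", date(2021, 3, 15)),
    ("pkgB", "0.9.0", date(2020, 2, 20)),
    ("pkgB", "1.0.0", date(2021, 1, 5)),
]


def snapshot(releases, as_of):
    """Return {package: version}, using the newest release on or before as_of."""
    chosen = {}
    # Walk the history in date order so later releases overwrite earlier ones.
    for name, version, released in sorted(releases, key=lambda r: r[2]):
        if released <= as_of:
            chosen[name] = version
    return chosen


pinned = snapshot(RELEASES, date(2020, 12, 31))
print(pinned)
```

Two users who choose the same snapshot date obtain the same versions regardless of when they install, which is the property that a latest-only repository cannot offer.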
The Posit Package Manager (formerly RStudio Package Manager) is a similar tool produced by the developers of RStudio which, in addition to CRAN snapshots, includes an archive of R packages from Bioconductor and Python packages from the Python Package Index.[26] It also distributes pre-compiled binary packages for Linux (only Windows and macOS binaries are included on CRAN).[27]
R-Forge[29] is a central platform for the collaborative development of R packages, R-related software, and projects. R-Forge also hosts many unpublished beta packages and development versions of CRAN packages.
R is distributed with fifteen "base packages": base, compiler, datasets, grDevices, graphics, grid, methods, parallel, splines, stats, stats4, tcltk, tools, translations, and utils.[30]
In addition, there are fifteen "recommended packages" from CRAN which are included with binary distributions of R: KernSmooth, MASS, Matrix, boot, class, cluster, codetools, foreign, lattice, mgcv, nlme, nnet, rpart, spatial, and survival.[30]
A group of packages called the tidyverse, which can be considered a "dialect of the R language", is increasingly popular in the R ecosystem. As of 2020-06-13, Metacran[17] listed 7 of the 8 core packages of the tidyverse in the list of most downloaded R packages. The group of packages strives to provide a cohesive collection of functions to deal with common data science tasks, including data import, cleaning, transformation and visualisation (notably with the ggplot2 package).
The R Infrastructure packages[31] support coding and the development of R packages. As of 2021-05-04, Metacran[17] listed 16 of these packages among the 25 most downloaded packages.
Hornik, Kurt (2020-02-20). "Frequently Asked Questions on R". The Comprehensive R Archive Network. 7.29: What is the difference between package and library? Archived from the original on 2011-07-09. Retrieved 2 November 2020.
Wickham, Hadley; Bryan, Jennifer. "Introduction". R Packages (2nd ed.). Archived from the original on 2022-06-29. Retrieved 2020-11-02.
CRAN Repository Maintainers. "CRAN Repository Policy". The Comprehensive R Archive Network. R Project. Archived from the original on 11 November 2020. Retrieved 20 November 2020.
Hornik, Kurt (2020-02-20). "Frequently Asked Questions on R". The Comprehensive R Archive Network. 2.1: What is CRAN? R Project. Archived from the original on 2011-07-09. Retrieved 20 November 2020.
CRAN Repository Maintainers. "CRAN - Contributed Packages". The Comprehensive R Archive Network. CRAN. Archived from the original on 24 November 2020. Retrieved 20 November 2020.
Hornik, Kurt (2020-02-20). "Frequently Asked Questions on R". The Comprehensive R Archive Network. 5.1: Which add-on packages exist for R? Archived from the original on 2011-07-09. Retrieved 2 November 2020.