Technical Note - Installing Your Own R Packages

Summary

The R language is widely used across the ECS and SMS schools.

Many users who wish to use add-on R packages have to wait for the package to be installed in the "R system" area before they can use it.

R allows users to install personal copies of add-on packages that co-exist with its "system" packages.

This Tech Note describes a "School standard" way in which users can install and try out packages for themselves before asking that a package be installed system-wide.

Background: $R_LIBS

Nomenclature: An R library is a collection of R packages. Packages must first be installed into a library, then they may be loaded from that library into memory once R is running. The library() command inside R is used either to list the packages contained within the available libraries, or to load an individual package into memory from a library.

The "R Installation and Administration" manual, normally found at http://cran.r-project.org/doc/manuals/r-release/R-admin.html, has a number of sections dedicated to this topic. One of them describes the environment variable R_LIBS_USER which, on the Schools' ArchLinux systems, returns the following mouthful:
> Sys.getenv("R_LIBS_USER")
[1] "~/R/x86_64-unknown-linux-gnu-library/3.2"

However, given the homogeneity of the Schools' ArchLinux systems, the long folder name is not required, so it is suggested that the simpler location
~/Rlibs

be used for personal packages. A further advantage of the short name is that an installed package remains available after subsequent upgrades of R. Note, however, that an installed package may occasionally be incompatible with a newer version of R; R will warn you if the package needs to be upgraded. There is another advantage, noted later in this document.

On the other hand, one advantage of using the default package location ($R_LIBS_USER) is that R will automatically use it if it exists, and will offer to create it when you try to install your first package.

How R finds installed packages

On the Schools' Linux systems, the environment variable R_LIBS is set to /vol/R/x86_64linux/library/ in R's Renviron.site file. This adds that directory to R's standard library search path; it is the path below which extra packages available across the whole system are installed. We'll call these school-wide packages.

To make use of packages installed elsewhere, you merely need to add the extra path to that environment variable before invoking R, e.g.

Bourne shell:
$ R_LIBS=~/Rlibs:/vol/R/x86_64linux/library/
$ export R_LIBS
$ R

C shell:
% setenv R_LIBS ~/Rlibs:/vol/R/x86_64linux/library/
% R

or in your personal ~/.Renviron file (note that > overwrites any existing ~/.Renviron; use >> to append to it instead):
% echo "R_LIBS=~/Rlibs:/vol/R/x86_64linux/library/" > ~/.Renviron
% R
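
If you already keep other settings in ~/.Renviron, you may prefer to replace only the R_LIBS line. The following is a minimal Bourne shell sketch of one way to do that, not an official procedure; set_renviron_var is a hypothetical helper name, and it assumes the mktemp command is available.

```shell
# set_renviron_var FILE NAME VALUE   (hypothetical helper, not part of R)
# Replace any existing NAME=... line in FILE with NAME=VALUE and keep
# every other line. FILE is created if it does not exist yet.
set_renviron_var() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # copy everything except an earlier definition of NAME
        grep -v "^${name}=" "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$name" "$value" >> "$tmp"
    # write back in place so the file keeps its permissions
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (not run here):
#   set_renviron_var "$HOME/.Renviron" R_LIBS '~/Rlibs:/vol/R/x86_64linux/library/'
```

The value is passed in single quotes so that the shell leaves the ~ alone and R expands it at start-up, as in the echo example above.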

Note that in general you want your own folder location before the system ones so that you have the option to 'override' the already installed version of a package.
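
The ordering rule above can be applied mechanically when building the variable. The following is a minimal Bourne shell sketch under stated assumptions, not part of R or of the School setup; prepend_path is a hypothetical helper name. It places one folder first in a colon-separated list and drops any later copy of the same folder.

```shell
# prepend_path DIR LIST   (hypothetical helper, not part of R)
# Print LIST with DIR placed first. Empty entries and any existing
# copy of DIR are dropped so that DIR is not searched twice.
# The body runs in a subshell, so the IFS change does not leak out.
prepend_path() (
    dir=$1
    rest=
    IFS=:
    set -f    # no filename globbing while splitting LIST
    for entry in $2; do
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            rest="${rest:+$rest:}$entry"
        fi
    done
    printf '%s\n' "$dir${rest:+:$rest}"
)

# Example (not run here): personal folder ahead of the school-wide one.
#   R_LIBS=$(prepend_path "$HOME/Rlibs" "/vol/R/x86_64linux/library/")
#   export R_LIBS
```

The comparison is a plain string match, so ~/Rlibs and $HOME/Rlibs would be treated as different entries; use one spelling consistently.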

You can also do this inside R itself, using the .libPaths() function.

When you first start R on a School workstation, calling the function with no arguments shows the current set of library search paths known to R:
$ R
>  .libPaths()
[1] "/am/embassy/vol/x6/stats/R/x86_64linux/library"           
[2] "/am/roxy/home/fred/R/x86_64-unknown-linux-gnu-library/3.2"
[3] "/usr/pkg/lib/R/library"                                   
>

It may be instructive to look at the last of those paths to see what the system provides:
$ ls /usr/pkg/lib/R/library
KernSmooth/  cluster/    graphics/  nnet/         splines/   translations/
MASS/        codetools/  grid/      parallel/     stats/     utils/
Matrix/      compiler/   lattice/   rkward/       stats4/
base/        datasets/   methods/   rkwardtests/  survival/
boot/        foreign/    mgcv/      rpart/        tcltk/
class/       grDevices/  nlme/      spatial/      tools/

Be aware that R doesn't recognise an automounted 'alias', so the path

/am/embassy/vol/x6/stats/R/x86_64linux/library

is really the path we know as

/vol/R/x86_64linux/library

mentioned above.
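
To check from the shell that two such names refer to the same folder, the physical path of each can be compared. This is a minimal sketch which assumes that the 'alias' is implemented as a symbolic link (an assumption about the automounter setup, not something stated in this note); real_dir is a hypothetical helper name.

```shell
# real_dir DIR   (hypothetical helper)
# Print the physical path of DIR with symbolic links resolved.
# Assumption: the automounted 'alias' is a symbolic link.
# The body runs in a subshell, so the caller's directory is unchanged.
real_dir() (
    cd "$1" 2>/dev/null && pwd -P
)

# Example (not run here): the two names should print the same path.
#   real_dir /vol/R/x86_64linux/library
#   real_dir /am/embassy/vol/x6/stats/R/x86_64linux/library
```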

If you alter your environment using the personal customisation above (for example, in Bourne shell), you'll see:
$ R
> .libPaths()
[1] "/am/roxy/home/fred/Rlibs"                     
[2] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[3] "/am/roxy/home/fred/R/x86_64-unknown-linux-gnu-library/3.2"
[4] "/usr/pkg/lib/R/library"           

Once again, though, the nature of automounted file systems makes it less clear than it might be that ~/Rlibs has been added as a library search path.

Finally, here's an example of how the .libPaths() function, when supplied with an argument, can manipulate the library search path from inside R:
$ R
>  .libPaths()
[1] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[2] "/usr/pkg/lib/R/library"                        
>

Then specify our personal path:
> .libPaths("~/Rlibs")
> .libPaths()
[1] "/am/roxy/home/fred/Rlibs" "/usr/pkg/lib/R/library"   
>

Note, however, that this has removed the path that Renviron.site added. To avoid this, use:
> .libPaths(c("~/Rlibs", .libPaths()))
> .libPaths()
[1] "/am/roxy/home/fred/Rlibs"
[2] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[3] "/usr/pkg/lib/R/library"  
>

Note: If you wish to use a 'personal' or updated version of a package in preference to the school-wide installed version, while keeping the other school-wide packages available, you cannot use the default location given by $R_LIBS_USER, because the school-wide path provided by $R_LIBS takes precedence over the $R_LIBS_USER path. This is the other advantage of using a non-default personal folder name and specifying it either via the R_LIBS environment variable or via .libPaths(), as above.
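
The effect of search order can be checked from the shell. The following is a minimal sketch, not something R itself provides; which_lib is a hypothetical helper name. It relies on the fact that an installed package is a folder containing a DESCRIPTION file, and it reports the first folder in a colon-separated list that holds the named package.

```shell
# which_lib PKG LIST   (hypothetical helper, not part of R)
# Print the first directory in the colon-separated LIST that contains
# an installed package PKG, detected by its DESCRIPTION file.
# Returns non-zero if PKG is not found in any directory.
which_lib() (
    pkg=$1
    IFS=:
    set -f    # no filename globbing while splitting LIST
    for lib in $2; do
        if [ -f "$lib/$pkg/DESCRIPTION" ]; then
            printf '%s\n' "$lib"
            exit 0
        fi
    done
    exit 1
)

# Example (not run here): list the folders in the order .libPaths() shows.
#   which_lib MASS "$HOME/Rlibs:/vol/R/x86_64linux/library:/usr/pkg/lib/R/library"
```

Write the personal folder as $HOME/Rlibs rather than ~/Rlibs here, because the shell does not expand ~ in the middle of a quoted list.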

How to install R packages

There are two basic ways to install packages:
  1. install from a package file from outside of R
  2. use R to download and install the package

The first method is required if one has modified an existing package or has downloaded a package from somebody's personal web site. The second method is preferable, since it will always install the most recent update of a package for whichever version of R is running, and will also install any package dependencies (i.e. other packages that the primary package requires).

Installing from a downloaded file

On a Linux system, R packages are always installed from source. A source package for Linux has a name of the form pkg_1.23-45.tar.gz. Note that although, at the Linux level, a file named pkg_1.23-45.tgz would be expected to be equivalent, the .tgz suffix is typically used for Macintosh binary package files for R.
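
The file name encodes the package name and version, separated by the first underscore (R package names cannot contain an underscore). As a small illustration, the following Bourne shell sketch splits such a name; split_pkg_file is a hypothetical helper name.

```shell
# split_pkg_file FILE   (hypothetical helper, not part of R)
# Print "name version" for a source package file such as
# pkg_1.23-45.tar.gz.
split_pkg_file() {
    base=${1##*/}          # drop any leading directory
    base=${base%.tar.gz}   # drop the source-package suffix
    # name is everything before the first "_", version everything after it
    printf '%s %s\n' "${base%%_*}" "${base#*_}"
}

split_pkg_file pkg_1.23-45.tar.gz    # prints: pkg 1.23-45
```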

The procedure for installing such a package (in a C shell) is:
% setenv R_LIBS ~/Rlibs:/vol/R/x86_64linux/library/
% R CMD INSTALL pkg_1.23-45.tar.gz

This will check that required packages ("dependencies") are already installed, then compile (if necessary) and install the named package into the user's private area.

Note that if this is the first package you are installing in the named library, the folder (~/Rlibs in this case) must already exist.
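
One way to satisfy this requirement is to create the folder before the first install. The following is a minimal Bourne shell sketch; ensure_lib is a hypothetical helper name, not an R command.

```shell
# ensure_lib DIR   (hypothetical helper, not part of R)
# Create DIR (and any missing parent folders) if needed, then confirm
# that it is writable. Succeeds without change if DIR already exists.
ensure_lib() {
    mkdir -p "$1" && [ -w "$1" ]
}

# Example (not run here): create the personal library, then install.
#   ensure_lib "$HOME/Rlibs" && R CMD INSTALL pkg_1.23-45.tar.gz
```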

An alternative one-line method is as follows:
% R CMD INSTALL --library="~/Rlibs" pkg_1.23-45.tar.gz

Note that with this method, all dependencies would have to be available either in the system library or the user's private library; in particular the school-wide library is not available.

Installing from within R

R can easily be set up to install or update packages found on a CRAN website. It is possible to use locations other than CRAN or its mirrors, but the information required to set that up is beyond the scope of this technical note.
% R
> Sys.setenv(http_proxy=
  "http://www-cache.ecs.vuw.ac.nz:8080/")
> options(repos="http://cran.stat.auckland.ac.nz/")
> install.packages("<primary package name>",
  lib="~/Rlibs")
>

Once again, the "R Installation and Administration" manual, normally found at http://cran.r-project.org/doc/manuals/r-release/R-admin.html, may be of use if you wish to go beyond the scope of this technical note.