Technical Note - Installing Your Own R Packages
Summary
The R language is widely used across the ECS and SMS schools.
Many users who wish to make use of add-on R packages have to wait for the
package to be installed in the "R system" area before they can make use of it.
R allows users to install personal copies of add-on packages that co-exist with
its "system" packages.
This Tech Note describes a "School standard" way in which users can install
and try out packages for themselves, before asking that a package be installed
system-wide.
Note
This is a technical note. If you do not want the text within it wrapped and/or have
no interest in navigating away via the space-wasting but University-standard
look-and-feel sidebars, you can get rid of them by adding
?skin=vuwecsclean2,vuwecs,pattern
to the URL.
Background: $R_LIBS
Nomenclature: An R library is a collection of R packages. Packages must first be installed into a library; they may then be loaded from that library into memory once R is running. The library() command inside R is used either to list the packages contained within the available libraries, or to load an individual package into memory from a library.
The "R Installation and Administration" manual, normally found at
http://cran.r-project.org/doc/manuals/r-release/R-admin.html
has a number of sections dedicated to this topic, one of which notes an environment variable, R_LIBS_USER, which on the Schools' ArchLinux systems returns the following mouthful:
> Sys.getenv("R_LIBS_USER")
[1] "~/R/x86_64-unknown-linux-gnu-library/3.2"
However, given the homogeneity of the Schools' ArchLinux systems, the long folder name is not required, and so it is suggested that the simpler location
~/Rlibs
be used for personal packages. A further advantage of using the short name is that, because it does not include the R version number, an installed package remains available after subsequent upgrades of R. Note, however, that an installed package may occasionally be incompatible with a newer version of R; R will warn if the package needs to be upgraded. Another advantage is noted later in this document.
One advantage of using the default package location is that R will automatically use it if it exists, and will offer to create it when you try to install your first package.
How R finds installed packages
On the Schools' Linux systems, the environment variable R_LIBS is set to the following value
/vol/R/x86_64linux/library/
in R's Renviron.site file. This adds that directory to R's standard library search path, because it is the path below which extra packages available across the whole system get installed. We'll call these school-wide packages.
In order to make use of packages installed elsewhere, you merely need to
add the extra path to that environment variable before invoking R, e.g.
Bourne shell:
$ R_LIBS=~/Rlibs:/vol/R/x86_64linux/library/
$ export R_LIBS
$ R
C shell:
% setenv R_LIBS ~/Rlibs:/vol/R/x86_64linux/library/
% R
or in your personal ~/.Renviron file (using >> so that any settings already in the file are kept):
% echo "R_LIBS=~/Rlibs:/vol/R/x86_64linux/library/" >> ~/.Renviron
% R
Note that in general you want your own folder location before the system ones so that you have the option to 'override' the already installed version of a package.
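If ~/.Renviron may already hold an R_LIBS line, it can be worth adding the setting only when it is missing. The following is a minimal Bourne-shell sketch of that idea, not a School-provided tool: the helper name add_r_libs is hypothetical, and the demonstration writes to a scratch file rather than to the real ~/.Renviron.

```shell
#!/bin/sh
# Sketch: add an R_LIBS line to an Renviron-style file without clobbering
# or duplicating existing settings. `add_r_libs` is a hypothetical helper.
add_r_libs() {
    renviron=$1    # file to update, e.g. "$HOME/.Renviron"
    value=$2       # colon-separated list of library folders
    # Create the file if it is missing so that grep has something to read.
    [ -f "$renviron" ] || : > "$renviron"
    if grep -q '^R_LIBS=' "$renviron"; then
        echo "R_LIBS already set in $renviron; leaving it alone" >&2
        return 0
    fi
    printf 'R_LIBS=%s\n' "$value" >> "$renviron"
}

# Demonstrate on a scratch file rather than the real ~/.Renviron.
demo=$(mktemp)
echo 'LANG=en_NZ.UTF-8' > "$demo"
add_r_libs "$demo" '~/Rlibs:/vol/R/x86_64linux/library/'
add_r_libs "$demo" '~/Rlibs:/vol/R/x86_64linux/library/'   # second call changes nothing
cat "$demo"
```

In real use the call would be add_r_libs "$HOME/.Renviron" followed by the same path list. The tilde is deliberately left unexpanded inside the quotes, as in the echo command above.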
You can also do this inside R itself, using the .libPaths() function.
When you initially fire up R on a School workstation, the bare function shows you the current set of library search paths known to R:
$ R
> .libPaths()
[1] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[2] "/am/roxy/home/fred/R/x86_64-unknown-linux-gnu-library/3.2"
[3] "/usr/pkg/lib/R/library"
>
and it might be instructive to take a look at the last of those paths to see what's provided by the system
$ ls /usr/pkg/lib/R/library
KernSmooth/ cluster/ graphics/ nnet/ splines/ translations/
MASS/ codetools/ grid/ parallel/ stats/ utils/
Matrix/ compiler/ lattice/ rkward/ stats4/
base/ datasets/ methods/ rkwardtests/ survival/
boot/ foreign/ mgcv/ rpart/ tcltk/
class/ grDevices/ nlme/ spatial/ tools/
Be aware that R doesn't recognise an automounted 'alias', and so
/am/embassy/vol/x6/stats/R/x86_64linux/library
is really the path we know as
/vol/R/x86_64linux/library
mentioned above.
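To check from the shell which physical directory lies behind such an alias, something like the following sketch may help. It assumes the alias is implemented as a symbolic link, which this note does not confirm; the helper name physical_path is hypothetical, and the demonstration uses a scratch symlink in place of the real /vol path.

```shell
#!/bin/sh
# Sketch: print the physical directory behind a path, following symbolic links.
# `physical_path` is a hypothetical helper name.
physical_path() {
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$1" 2>/dev/null && pwd -P )
}

# Demonstrate with a scratch symlink standing in for an automounter alias.
scratch=$(mktemp -d)
mkdir -p "$scratch/am/embassy/library"
ln -s "$scratch/am/embassy/library" "$scratch/alias"
physical_path "$scratch/alias"
```

On a School workstation the equivalent call would be physical_path /vol/R/x86_64linux/library, which can then be compared with the entries that .libPaths() reports.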
If you alter your environment using the personal customisation described above (e.g. in Bourne shell), you'll see
$ R
> .libPaths()
[1] "/am/roxy/home/fred/Rlibs"
[2] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[3] "/am/roxy/home/fred/R/x86_64-unknown-linux-gnu-library/3.2"
[4] "/usr/pkg/lib/R/library"
although, once again, the nature of automounted file systems makes it less clear than it might be that ~/Rlibs has been added as a library search path.
Finally, here's an example of how the .libPaths() function, when supplied with a parameter, can manipulate the library search path from inside R.
$ R
> .libPaths()
[1] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[2] "/usr/pkg/lib/R/library"
>
then specify our personal path
> .libPaths("~/Rlibs")
> .libPaths()
[1] "/am/roxy/home/fred/Rlibs" "/usr/pkg/lib/R/library"
>
but note that this has removed the path that the Renviron.site file added, so to avoid this, use
> .libPaths(c("~/Rlibs", .libPaths()))
> .libPaths()
[1] "/am/roxy/home/fred/Rlibs"
[2] "/am/embassy/vol/x6/stats/R/x86_64linux/library"
[3] "/usr/pkg/lib/R/library"
>
Note: If one wishes to use a 'personal' or updated version of a package in preference to the school-wide installed package, but still have most school-wide packages available, one cannot use the default location of $R_LIBS_USER, because the school-wide path provided by $R_LIBS takes precedence over the $R_LIBS_USER path. This is another advantage of using a non-default personal folder name and specifying it either via the R_LIBS environment variable, or via .libPaths() as above.
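As an illustration of the R_LIBS route, here is a minimal Bourne-shell sketch that puts a personal folder first in R_LIBS and keeps whatever was already listed after it. The helper name prepend_r_lib is hypothetical, and the starting value of R_LIBS is set by hand purely for the demonstration.

```shell
#!/bin/sh
# Sketch: put a personal library first in R_LIBS, keeping the existing entries
# after it and avoiding a duplicate entry. `prepend_r_lib` is a hypothetical helper.
prepend_r_lib() {
    dir=$1
    case ":${R_LIBS:-}:" in
        *":$dir:"*) ;;                              # already listed; order left unchanged
        *) R_LIBS="$dir${R_LIBS:+:$R_LIBS}" ;;      # personal entry goes first
    esac
    export R_LIBS
}

# For the demonstration, start from the school-wide path mentioned above.
R_LIBS=/vol/R/x86_64linux/library/
prepend_r_lib "$HOME/Rlibs"
prepend_r_lib "$HOME/Rlibs"      # no change the second time
echo "$R_LIBS"
```

Because the personal folder comes first, a package installed there would be found before a school-wide package of the same name.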
How to install R packages
There are two basic ways to install packages:
- install from a package file from outside of R
- use R to download and install the package
The first method is required if one has modified an existing package or has downloaded a package from somebody's personal web site. The second method is preferable, since it will always install the most recent update of a package for whichever version of R is running, and will also install any package dependencies (i.e. other packages that the primary package requires).
Installing from a downloaded file
On a Linux system, R packages are always installed from source. A source package for Linux will have a name of the form pkg_1.23-45.tar.gz. Note that while at the Linux level a file named pkg_1.23-45.tgz would be expected to be identical, that latter form is typically used for the name of a Macintosh binary package file for R.
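The naming convention can be taken apart with ordinary shell parameter expansion, which is handy when checking what a downloaded file claims to contain. This is only a sketch: the helper names pkg_name and pkg_version are hypothetical, and it relies on the name and version being separated by the first underscore in the file name.

```shell
#!/bin/sh
# Sketch: split a source package file name into package name and version.
# `pkg_name` and `pkg_version` are hypothetical helpers.
pkg_name() {
    base=${1##*/}          # drop any leading directories
    echo "${base%%_*}"     # text before the first underscore
}
pkg_version() {
    base=${1##*/}
    base=${base%.tar.gz}   # drop the source-package suffix
    echo "${base#*_}"      # text after the first underscore
}

pkg_name    pkg_1.23-45.tar.gz     # prints: pkg
pkg_version pkg_1.23-45.tar.gz     # prints: 1.23-45
```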
The procedure for installing such a package (in a C shell) is:
% setenv R_LIBS ~/Rlibs:/vol/R/x86_64linux/library/
% R CMD INSTALL pkg_1.23-45.tar.gz
This will check that required packages ("dependencies") are already installed and then compile (if necessary) and install the named package into the user's private area.
Note that if this is the first package you are installing in the named library, the folder (~/Rlibs in this case) must already exist.
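A small Bourne-shell sketch of that preparatory step follows. The helper name ensure_r_lib is hypothetical, and the demonstration creates the folder under a scratch directory so that nothing in your home directory is touched.

```shell
#!/bin/sh
# Sketch: make sure the personal library folder exists before installing into it.
# `ensure_r_lib` is a hypothetical helper; it defaults to ~/Rlibs.
ensure_r_lib() {
    lib=${1:-$HOME/Rlibs}
    if [ -d "$lib" ]; then
        echo "library folder already exists: $lib"
    else
        mkdir -p "$lib" && echo "created library folder: $lib"
    fi
}

# Demonstrate on a scratch location.
scratch=$(mktemp -d)
ensure_r_lib "$scratch/Rlibs"
ensure_r_lib "$scratch/Rlibs"
```

In real use, calling ensure_r_lib with no argument before the R CMD INSTALL step would create ~/Rlibs if it is missing.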
An alternative one-line method is as follows:
% R CMD INSTALL --library="~/Rlibs" pkg_1.23-45.tar.gz
Note that with this method, all dependencies would have to be available either in the system library or the user's private library; in particular the school-wide library is not available.
Installing from within R
R can be set up very easily to install or update packages that can be found on a CRAN website. It is theoretically possible to use locations other than CRAN or its mirrors, but the information required to set that up is beyond the brief of this technical note.
% R
> Sys.setenv(http_proxy="http://www-cache.ecs.vuw.ac.nz:8080/")
> options(repos="http://cran.stat.auckland.ac.nz/")
> install.packages("<primary package name>", lib="~/Rlibs")
>
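The same install can, in principle, be driven from the shell without an interactive session. The sketch below only builds and prints the R expression, so it is safe to run anywhere. The helper name r_install_expr is hypothetical, MASS is used purely as an example package name, and the mirror is the one shown above.

```shell
#!/bin/sh
# Sketch: build the R expression for a non-interactive install into a personal library.
# `r_install_expr` is a hypothetical helper; the default mirror is the one used in the text.
r_install_expr() {
    pkg=$1
    lib=${2:-$HOME/Rlibs}
    repo=${3:-http://cran.stat.auckland.ac.nz/}
    printf 'install.packages("%s", lib="%s", repos="%s")\n' "$pkg" "$lib" "$repo"
}

r_expr=$(r_install_expr MASS)
echo "$r_expr"
# To run it for real (needs R on the PATH, network access and, on School
# systems, the http_proxy setting shown above):
#   Rscript -e "$r_expr"
```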
Once again, the "R Installation and Administration" manual, normally found at
http://cran.r-project.org/doc/manuals/r-release/R-admin.html
may be of use, if you wish to go beyond the brief of this technical note.