Dependency management with Packrat

Packrat is a dependency management system for R developed by RStudio. It provides an isolated, portable and reproducible R environment for each project. Packrat is included by default in the notebooks offered within DataLabs (Jupyter and Zeppelin). Below is a quick summary of how to use Packrat in the DataLabs environment; further information about the package can be found here.

Quick-start guide

Initialising a new project

To use Packrat to manage the R libraries for a project, first set the working directory and then run the packrat::init command. This initialisation step only needs to be run once; it sets up a private library to store the packages required for the project.

setwd('/data/example_project')
packrat::init()
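
To confirm that initialisation worked, it can help to check which Packrat files now exist in the project directory. The sketch below uses a hypothetical helper, packrat_files, which is not part of Packrat; the file names listed are the ones packrat::init normally creates, but treat the list as an assumption rather than a guarantee for every Packrat version.

```r
# Hypothetical helper (not part of Packrat): report which of the files that
# packrat::init normally creates are present in a project directory.
packrat_files <- function(project = getwd()) {
  expected <- c(
    "packrat/packrat.lock",  # lockfile recording package sources and versions
    "packrat/packrat.opts",  # project-level Packrat options
    "packrat/init.R",        # bootstrap script run when the project starts
    ".Rprofile"              # sources packrat/init.R on start-up
  )
  present <- file.exists(file.path(project, expected))
  setNames(present, expected)
}

# Named logical vector; every entry is FALSE if the project was never initialised.
packrat_files('/data/example_project')
```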

Opening a Packrat managed project

Once initialised, a project can be opened using the packrat::on function. This sets the private project library as the location for installing and loading packages. The default global library can be restored by running the packrat::off command. When using Spark, pass the clean.search.path = FALSE argument to the on function; this prevents the SparkR library from being unloaded (see here for more information).

setwd('/data/example_project')
packrat::on()
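
The Spark case described above can be wrapped in a small function. This is a minimal sketch: open_project and its spark argument are hypothetical names introduced for illustration, and only packrat::on and its clean.search.path argument come from Packrat itself.

```r
# Hypothetical helper: switch the session to a project's private library.
open_project <- function(project, spark = FALSE) {
  setwd(project)
  # Under Spark, leave the search path untouched so SparkR stays attached.
  packrat::on(clean.search.path = !spark)
  # Return the active library paths so the switch can be confirmed.
  .libPaths()
}
```

Calling open_project('/data/example_project', spark = TRUE) from a Spark notebook should return library paths that start with the project's private library; packrat::off() switches back to the global library.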

Installing a package

An R package can be installed in the private project library using the base R install.packages function. The project lockfile (used to restore libraries) can then be updated by running packrat::snapshot.

install.packages('fortunes')
packrat::snapshot()

Installing a specific package version

Packrat can additionally manage packages installed via the devtools package. This allows specific package versions to be installed from CRAN and development versions to be installed from GitHub.

# From CRAN
devtools::install_version('zoo', version='1.7-14')
packrat::snapshot()

# From GitHub - userName/repoName@version
devtools::install_github('IRkernel/IRkernel@0.8.8')
packrat::snapshot()
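
After pinning a version, it is worth checking that the expected version is the one actually installed before taking a snapshot. The helper below, has_version, is a hypothetical name used for illustration; it relies only on the standard packageVersion function and returns FALSE rather than failing when the package is missing.

```r
# Hypothetical helper: TRUE when the installed version of `pkg` matches `version`,
# FALSE when it differs or when the package is not installed at all.
has_version <- function(pkg, version) {
  tryCatch(
    packageVersion(pkg) == package_version(version),
    error = function(e) FALSE
  )
}

# Check the version requested from CRAN in the example above.
has_version('zoo', '1.7-14')
```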

Removing a package

Packages that are no longer required can be deleted using the remove.packages function, which removes the package from the private project library. The project lockfile can then be updated using the packrat::snapshot command.

remove.packages('fortunes')
packrat::snapshot()
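
To verify that a package really has gone from the private library, the library contents can be queried directly. This sketch uses a hypothetical helper, is_installed, and assumes that the first entry of .libPaths() is the private project library, which is normally the case while Packrat mode is on.

```r
# Hypothetical helper: is `pkg` present in the given library directory?
# By default this looks only in the first library path, which is assumed
# to be the Packrat private library while Packrat mode is on.
is_installed <- function(pkg, lib = .libPaths()[1]) {
  pkg %in% rownames(installed.packages(lib.loc = lib))
}

# Expected to be FALSE once remove.packages('fortunes') has been run.
is_installed('fortunes')
```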

Restoring project packages

The R dependencies managed by Packrat are recorded in a lockfile, which includes details of the source and version of each installed package. Calling packrat::restore updates the project library to match the lockfile. This can be used to maintain an exact copy of your working R set-up, and the lockfile can be included in version control.

This functionality is especially useful within DataLabs when using a project from a different notebook type or when a library is needed within Spark. For more information see here.

packrat::restore()
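
Before restoring, it can be useful to see which packages and versions the lockfile records. The sketch below assumes the lockfile sits at packrat/packrat.lock and uses a DCF-style layout (a header record followed by one record per package), which base R's read.dcf can parse; read_lockfile_packages is a hypothetical helper, not a Packrat function.

```r
# Hypothetical helper: list the packages recorded in a Packrat lockfile.
# Assumes a DCF-style lockfile with Package, Source and Version fields.
read_lockfile_packages <- function(lockfile = 'packrat/packrat.lock') {
  fields <- c('Package', 'Source', 'Version')
  records <- read.dcf(lockfile, fields = fields)
  # The header record has no Package field, so drop rows where it is missing.
  records <- records[!is.na(records[, 'Package']), , drop = FALSE]
  as.data.frame(records, stringsAsFactors = FALSE)
}

# Only read the lockfile if the working directory is a Packrat project.
if (file.exists('packrat/packrat.lock')) {
  print(read_lockfile_packages())
}
```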