This package grew out of a series of meetings to enhance our understanding of R as a programming language, and to connect with other people interested in R software development. It is suitable for people who already know R and are eager to develop a deeper understanding of the language and of ‘best practices’ for tackling larger projects.

A first package

The main objectives of this session are to write an R package, and to use GitHub for version control.

R packages

An R package is nothing more than an on-disk folder containing specific files and folders. A minimal package contains

  1. A text DESCRIPTION file with metadata (e.g., title, author, description, license, package dependencies).
  2. A text NAMESPACE file. The NAMESPACE file describes the packages, functions, methods, and classes imported (used) by the R code in the package. The NAMESPACE file also specifies which functions, classes, methods, and data defined in the package are visible to the user. A minimal NAMESPACE file contains nothing – no symbols are used from other packages, and no symbols defined in the package are visible to the user.
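
To make the two-file layout concrete, here is a minimal sketch in base R that builds such a folder by hand in a temporary directory. The package name Minimal, the variable min_pkg, and the field values are hypothetical placeholders introduced for illustration; nothing here relies on devtools.

```r
## A hand-made package skeleton; 'Minimal' is a hypothetical package name.
min_pkg <- file.path(tempdir(), "Minimal")
dir.create(min_pkg, showWarnings = FALSE)

## DESCRIPTION: plain-text metadata in 'Field: value' format.
writeLines(c(
    "Package: Minimal",
    "Title: A Hand-Made Minimal Package",
    "Version: 0.0.1",
    "Description: Placeholder text describing the package.",
    "License: Artistic-2.0"
), file.path(min_pkg, "DESCRIPTION"))

## NAMESPACE: an empty file imports nothing and exports nothing.
file.create(file.path(min_pkg, "NAMESPACE"))

## List the folder content, then parse the metadata with read.dcf().
dir(min_pkg)
read.dcf(file.path(min_pkg, "DESCRIPTION"))[, c("Package", "Version")]
```

read.dcf() returns a one-row character matrix with one column per field, which is a convenient way to inspect a DESCRIPTION file from within R.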

Exercise

Create a nearly minimal R package using the devtools function create(). We’ll create the package in a temporary directory, but you would normally do this in a permanent location.

pkg <- file.path(tempdir(), "Elbo")
devtools::create(pkg, description=list(Version="0.0.1"))
## Creating package 'Elbo' in '/tmp/RtmpFFzU5a'
## No DESCRIPTION found. Creating with values:
## Package: Elbo
## Title: What the Package Does (one line, title case)
## Version: 0.0.1
## Authors@R: person("First", "Last", email = "first.last@example.com", role = c("aut", "cre"))
## Description: What the package does (one paragraph).
## Depends: R (>= 3.4.0)
## License: What license is it under?
## Encoding: UTF-8
## LazyData: true
## * Creating `Elbo.Rproj` from template.
## * Adding `.Rproj.user`, `.Rhistory`, `.RData` to ./.gitignore

Check out the content of the directory

dir(pkg, recursive=TRUE)
## [1] "DESCRIPTION" "Elbo.Rproj"  "NAMESPACE"

The Elbo.Rproj file is for use with RStudio, and is optional. The content of the DESCRIPTION file is

cat(strwrap({
    readLines(file.path(pkg, "DESCRIPTION"))
}, exdent=4), sep="\n")
## Package: Elbo
## Title: What the Package Does (one line, title case)
## Version: 0.0.1
## Authors@R: person("First", "Last", email =
##     "first.last@example.com", role = c("aut", "cre"))
## Description: What the package does (one paragraph).
## Depends: R (>= 3.4.0)
## License: What license is it under?
## Encoding: UTF-8
## LazyData: true

Usually, one would edit the file:

  • The Title: and Description: fields to describe the package.
  • The Authors@R: field provides a facility for enumerating one or several authors.
  • The Version: field is meant to be incremented with each change to the package; it allows users and developers to know precisely which version of the package is in use. The Bioconductor convention is to use versions with format x.y.z, e.g., 0.0.1, 0.99.0, 1.0.0, 1.0.1.
  • The License: field describes the conditions under which the package is made available; one often chooses a standard license, e.g., ‘Artistic-2.0’ or ‘GPL-3’.
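
As a rough illustration of why the x.y.z format is useful, the sketch below uses base R's package_version() to compare versions component by component rather than as text, and then increments the last component of a Version: field. The helper bump_patch() and the scratch file desc_file are hypothetical names introduced here; they are not part of R or devtools.

```r
## Versions compare component by component, not as strings.
package_version("0.99.0") < package_version("1.0.0")
## [1] TRUE
package_version("1.0.10") > package_version("1.0.9")
## [1] TRUE

## Hypothetical helper: increment the last component of an x.y.z version.
bump_patch <- function(version) {
    parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
    parts[length(parts)] <- parts[length(parts)] + 1L
    paste(parts, collapse = ".")
}
bump_patch("0.0.1")
## [1] "0.0.2"

## Apply the helper to a scratch DESCRIPTION-style file.
desc_file <- tempfile()
writeLines(c("Package: Elbo", "Version: 0.0.1"), desc_file)
desc <- read.dcf(desc_file)                    # one-row character matrix
desc[1, "Version"] <- bump_patch(desc[1, "Version"])
write.dcf(desc, desc_file)
readLines(desc_file)
## [1] "Package: Elbo"  "Version: 0.0.2"
```

In practice many people simply edit the Version: field in a text editor; the point of the sketch is only that the field is structured data that tools can read, compare, and update.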

The package can actually be installed and loaded, e.g., using devtools

devtools::install(pkg)
## Installing Elbo
## '/home/mtmorgan/bin/R-devel/bin/R' --no-site-file --no-environ --no-save  \
##   --no-restore --quiet CMD INSTALL '/tmp/RtmpFFzU5a/Elbo'  \
##   --library='/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.4-Bioc-3.5'  \
##   --install-tests
## 
library(Elbo)

The package is now on the search path (at the second position)

head(search())
## [1] ".GlobalEnv"            "package:Elbo"          "package:roxygen2"     
## [4] "package:devtools"      "package:BiocInstaller" "package:stats"

but of course has no symbols available

ls(pos=2)
## character(0)

Version control

During software development, one wants to be able to make changes to a package in a way that allows one to easily record what one has done, to revert back to a previous ‘working’ version if one ends up going down the wrong path, and to share the package with colleagues.

Version control enables each of these objectives. git is one version control system; it is made especially useful through the GitHub website.

Each package we create will be managed as a GitHub repository.

There are two important steps taken when adding your package to the repository.

  1. Commit to the local repository, git commit .... One often makes a number of commits to the local repository, perhaps after every meaningful bit of code is produced – several times an hour, for instance.
  2. Push a series of commits to the GitHub repository, git push .... This is often associated with the completion of a conceptual feature, or perhaps the end of the day.

Package development: a first function

R packages typically consist of R functions. To add a function to your package, first create an R directory inside the package and, within it, a file to hold the function:

dir.create(file.path(pkg, "R"))
## Warning in dir.create(file.path(pkg, "R")): '/tmp/RtmpFFzU5a/Elbo/R'
## already exists
file.create(file.path(pkg, "R", "hi.R"))
## [1] TRUE

We’ll add the following function to the hi.R file:

hi <- function(who) {
    paste("hello", who, "you have", nchar(who), "letters in your name!")
}
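
Before packaging the function, it can help to try it out on its own. The sketch below writes the same definition to a scratch file (hi_file, a hypothetical stand-in for R/hi.R inside the package), sources it, and calls it. Note that nchar() counts every character, so a space in the name is included among the ‘letters’.

```r
## Scratch location; a hypothetical stand-in for file.path(pkg, "R", "hi.R").
hi_file <- file.path(tempdir(), "hi.R")
writeLines(c(
    'hi <- function(who) {',
    '    paste("hello", who, "you have", nchar(who), "letters in your name!")',
    '}'
), hi_file)

## Evaluate the file, which defines hi() in the current session.
source(hi_file)

hi("Martin")
## [1] "hello Martin you have 6 letters in your name!"

## nchar() counts the space as well as the letters.
nchar("Martin Morgan")
## [1] 13
```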

A common practice uses the roxygen2 package to help document and manage functions in packages. The idea is to include a few lines of text above the function, using tags such as @param to indicate different parts of the documentation. Update the hi.R file, using a text editor of your choice, to read as follows:

#' Help start conversations
#'
#' This function generates some helpful text that can be used to start
#' conversations in all kinds of awkward social situations.
#'
#' @param who character(1) The name of the person you wish to start a
#'     conversation with.
#'
#' @return character(1) A line of text to be used when starting conversations.
#'
#' @examples
#' hi("Martin Morgan")
#'
#' @export
hi <- function(who) {
    paste("hello", who, "you have", nchar(who), "letters in your name!")
}

Lines starting with #' are recognized by roxygen2. The first line in this block becomes the help page title. The next paragraph is the description found on all help pages.

We’re now ready to compile the documentation

devtools::document(pkg)

and install the updated package

devtools::install(pkg)

Load (if necessary) the newly installed package and check out the functionality

library(Elbo)
head(search())
ls(pos=2)
hi("Martin")

Also check out our documentation

?hi

Version control!

We’ve made some changes, so…

  • Update the version number, e.g., to 0.0.2
  • Add the new files to git
  • Review changes
  • Commit all changed files, including the updated DESCRIPTION file
  • Push the changes to GitHub.

## cd Elbo
## update Version in DESCRIPTION
git add R/hi.R
git diff
git commit -a
git push

These operations are also available via the RStudio user interface.

Exercise

Add an argument how that will shout (upper case), whisper (lower case), or say (no change) the user name. Implement each mode as a separate function. Use match.arg() to select the mode of communication, and switch() to return the function to be applied when creating the return sentence. Update the documentation and add examples to the man page.

My solution:

I implemented the code as

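hi <- function(who, how=c("say", "shout", "whisper")) {
    ## validate 'how'; the first choice, "say", is the default
    how <- match.arg(how)
    ## select the helper function implementing the chosen mode
    fun <- switch(how, say=say, shout=shout, whisper=whisper)
    paste("hello", fun(who), "you have", nchar(who), "letters in your name!")
}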

say <- function(who) {
    who
}

shout <- function(who) {
    toupper(who)
}

whisper <- function(who) {
    tolower(who)
}
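
The two building blocks of this solution can be explored in isolation. The sketch below is illustrative only: pick_mode() is a hypothetical helper, and the base functions identity(), toupper(), and tolower() stand in for say(), shout(), and whisper().

```r
pick_mode <- function(how = c("say", "shout", "whisper")) {
    ## match.arg() returns the first choice when 'how' is not supplied,
    ## accepts unambiguous abbreviations, and signals an error otherwise.
    how <- match.arg(how)
    ## switch() returns the value matching 'how'; here that value is a function.
    switch(how, say = identity, shout = toupper, whisper = tolower)
}

pick_mode()("Martin")           # default mode, "say"
## [1] "Martin"
pick_mode("shout")("Martin")
## [1] "MARTIN"
pick_mode("wh")("Martin")       # partial match for "whisper"
## [1] "martin"

## An unknown mode is rejected; capture the error rather than stopping.
inherits(try(pick_mode("yell"), silent = TRUE), "try-error")
## [1] TRUE
```

Because match.arg() does the validation, a misspelled mode fails early with an informative message instead of silently falling through switch().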

I updated the documentation by adding a @param

#' @param how character(1) How to greet the conversant. Either "say"
#'     (default, no change), "shout" (upper-case), or "whisper" (lower-case).

…and updating the @examples

#' @examples
#' hi("Martin Morgan")
#' hi("Martin Morgan", "shout")
#' hi("Martin Morgan", "whisper")

Update and install the package, and test it

devtools::document(pkg)
devtools::install(pkg)
example(hi)

Review and commit the changes, then push to GitHub

## update Version in DESCRIPTION
git diff
git commit -a
git push

Exercise

Since functions are objects in R, implement the how argument so that it takes a function rather than character(1). What are the strengths and weaknesses of the two approaches?
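
One possible sketch of the function-valued version follows; it is an illustration rather than a reference solution. It uses match.fun() from base R, so how may be a function or the name of one, and it assumes identity() as the default so that the name is left unchanged.

```r
hi <- function(who, how = identity) {
    ## match.fun() accepts a function or a character(1) function name
    how <- match.fun(how)
    paste("hello", how(who), "you have", nchar(who), "letters in your name!")
}

hi("Martin")
## [1] "hello Martin you have 6 letters in your name!"
hi("Martin", toupper)
## [1] "hello MARTIN you have 6 letters in your name!"

## Any function of one argument works, including one written on the spot.
hi("Martin", function(x) paste(rev(strsplit(x, "")[[1]]), collapse = ""))
## [1] "hello nitraM you have 6 letters in your name!"
```

As a starting point for the comparison: the function form lets callers supply behaviours the author never anticipated, but the valid values can no longer be enumerated in the documentation or checked by match.arg().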

Unit tests and other programming best practices