1 Packages

1.1 What and Why?

What?

  • A simple directory structure with text files.
  • DESCRIPTION: title, author, version, license, etc.
  • NAMESPACE: functions used by and made available by your package
  • R/: function definitions
  • man/: help pages
  • vignettes/: vignettes
  • tests/: code to test your package

Why?

  • Organize an analysis.
  • Share reproducible code with lab mates, colleagues, …

Minimal

$ tree MyPackage
MyPackage
└── DESCRIPTION

0 directories, 1 file
$ cat MyPackage/DESCRIPTION 
Package: MyPackage
Type: Package
Version: 0.0.1
Author: Martin Morgan
Maintainer: Martin Morgan <[email protected]>
Title: A Minimal Package
Description: An abstract-like description of the package.
License: Artistic-2.0
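
As a rough sketch, the same minimal package can be created from within R using only base functions. The temporary location and the author name and e-mail address below are placeholders chosen for illustration, not values from this document.

```r
# Create a throwaway package directory under a temporary path; the location
# and the author details below are illustrative placeholders.
pkg <- file.path(tempfile("pkgdemo"), "MyPackage")
dir.create(pkg, recursive = TRUE)

description <- c(
    "Package: MyPackage",
    "Type: Package",
    "Version: 0.0.1",
    "Author: A. N. Author",
    "Maintainer: A. N. Author <author@example.org>",
    "Title: A Minimal Package",
    "Description: An abstract-like description of the package.",
    "License: Artistic-2.0"
)
writeLines(description, file.path(pkg, "DESCRIPTION"))

# DESCRIPTION uses Debian control file (DCF) syntax, so read.dcf() parses it
# into a one-row character matrix with one column per field.
fields <- read.dcf(file.path(pkg, "DESCRIPTION"))
fields[, c("Package", "Version", "License")]
```

Reading the file back with read.dcf() is a quick way to confirm that every field is spelled and formatted as intended before running R CMD build.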

Typical

$ tree
.
└── MyPackage
    ├── DESCRIPTION
    ├── man
    │   └── hi.Rd
    ├── NAMESPACE
    ├── R
    │   └── hi.R
    ├── tests
    │   ├── testthat
    │   │   └── test_hi.R
    │   └── testthat.R
    └── vignettes
        └── MyPackage.Rmd

6 directories, 7 files

1.2 Working with packages

Build

$ R CMD build MyPackage
* checking for file 'MyPackage/DESCRIPTION' ... OK
* preparing 'MyPackage':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* creating default NAMESPACE file
* building 'MyPackage_0.0.1.tar.gz'

Check

$ R CMD check MyPackage_0.0.1.tar.gz 
* using log directory '/home/mtmorgan/a/BiocIntro/vignettes/MyPackage.Rcheck'
* using R version 3.4.2 Patched (2017-10-12 r73550)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file 'MyPackage/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'MyPackage' version '0.0.1'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package 'MyPackage' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking examples ... NONE
* checking PDF version of manual ... OK
* DONE

Status: OK

Install

$ R CMD INSTALL MyPackage_0.0.1.tar.gz 
* installing to library '/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.4-Bioc-3.6'
* installing *source* package 'MyPackage' ...
** help
No man pages found in package  'MyPackage' 
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (MyPackage)

2 Package Development

2.1 Olde School

  • Add new functions in files R/foo.R
  • Update NAMESPACE to import the functions or packages used by your functions, and to export the functions that users will want to use.
  • Create man pages by hand
  • Write vignettes in LaTeX Sweave
  • Key reference: Writing R Extensions, RShowDoc("R-exts")
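
A hand-written NAMESPACE might look like the two directives written out below. This is a minimal sketch: the temporary directory is hypothetical, and the imported functions are just the rnorm / runif pair used as an example elsewhere in this document. parseNamespaceFile() is the base R function that reads these directives.

```r
# Hypothetical package directory under a temporary library path.
lib <- tempfile("libdemo")
pkg <- file.path(lib, "MyPackage")
dir.create(pkg, recursive = TRUE)

# A hand-written NAMESPACE: import what the code uses, export what users call.
writeLines(c(
    "importFrom(stats, rnorm, runif)",
    "export(hi)"
), file.path(pkg, "NAMESPACE"))

# Parse the directives the way R does when it loads the package.
ns <- parseNamespaceFile("MyPackage", lib)
ns$exports        # exported symbols
ns$imports[[1]]   # package name, followed by the imported symbols
```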

2.2 New School

  • Use devtools::create() to create a package skeleton. The generated DESCRIPTION uses the more flexible Authors@R field instead of separate Author: / Maintainer: fields.
  • Use ‘roxygen’ to document functions

    • Lines starting with #' are documentation lines
    • @details, @param, @return, @examples document the function
    • @export indicates that the function should be visible to the user
    • @import and @importFrom indicate (non-base) functions that are used by this function, e.g., @importFrom stats rnorm runif
    • Run devtools::document() to update the man pages and NAMESPACE.

    #' Title, e.g., Say 'hi' to friends.
    #'
    #' Short description of this help page. `hi("Martin")` returns a greeting.
    #'
    #' @details A more extensive description of the functions or other objects
    #'    documented on this help page. Use `how=` to determine the nature of
    #'    the greeting.
    #'
    #' @param who character() The name(s) of the person / people to greet.
    #'
    #' @param how character(1) Whether to shout (uppercase) or whisper 
    #'     (lowercase) the greeting.
    #'
    #' @return character() of greetings, with length equal to `who`.
    #'
    #' @examples
    #' hi(c("Martin", "Jenny"), "whisper")
    #'
    #' @export
    hi <- function(who, how = c("asis", "shout", "whisper")) {
        stopifnot(
            is.character(who),
            is.character(how), missing(how) || length(how) == 1
        )
        transform <- switch(
            match.arg(how),
            asis = identity, shout = toupper, whisper = tolower
        )
    
        greet <- paste("hi", who)
        transform(greet)
    }
  • Build / check / install using devtools from an R session inside the package folder.

    getwd()  # e.g., MyPackage/
    devtools::build()
    devtools::check()
    devtools::install()
  • During development, short-circuit the full build / check / install round-trip with devtools::load_all()

  • Write vignettes in R Markdown, e.g., vignettes/MyPackage.Rmd; usethis::use_vignette() creates a template (see also BiocStyle).

    ---
    title: "Introduction to MyPackage"
    author:
    - name: Martin Morgan
      affiliation: Roswell Park Cancer Institute, Buffalo, NY
    vignette: |
      %\VignetteIndexEntry{Introduction to MyPackage}
      %\VignetteEngine{knitr::rmarkdown}
      %\VignetteEncoding{UTF-8}
    ---
    
    # Introduction
    
    This package provides an ice-breaker for getting to know
    people. It has tips for shouting or whispering to them, or just
    speaking in a normal voice. The latter is usually best for making
    friends.
    
    # Use
    
    To use this package, load it
    
    ```{r}
    library(MyPackage)
    ```
    
    and greet one or more friends.
    
    ```{r}
    hi(c("Martin", "Jenny"))
    ```
    
    Shout if your friends are hard of hearing or seem to be ignoring
    you
    
    ```{r}
    hi(c("Martin", "Jenny"), "shout")
    ```
    
    Whisper in more intimate situations or to avoid bothering others.
    
    ```{r}
    hi("Martin", "whisper")
    ```
    
    # Session Info
    
    ```{r}
    sessionInfo()
    ```
  • Write unit tests that validate the correctness of your functions; usethis::use_testthat() sets up the tests/ directory.

    $ tree
    .
    ...
    ├── tests
    │   ├── testthat
    │   │   └── test_hi.R
    │   └── testthat.R
    ...

    Content of test_hi.R:

    test_that("hi() says hi", {
        expect_true(startsWith(hi("X"), "hi "))
        expect_true(all(startsWith(hi(LETTERS), "hi ")))
    })
    
    test_that("hi() length of 'who' equals length of output", {
        x <- "X"
        expect_equal(length(x), length(hi(x)))
    
        x <- c("X", "Y")
        expect_equal(length(x), length(hi(x)))
    
        x <- character()
        expect_equal(length(x), length(hi(x)))
    })
    
    test_that("hi() obeys 'how='", {
        expect_equal("HI X", hi("x", "shout"))
        expect_equal("hi x", hi("X", "whisper"))
        expect_equal("hi Xx", hi("Xx"))
        expect_equal("hi Xx", hi("Xx", "asis"))
    })
    
    test_that("hi() checks inputs", {
        expect_error(hi(123), "is.character\\(who\\) is not TRUE")
    
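        # escape '|' so that it is matched literally; an unescaped '||' is
        # regex alternation with an empty pattern, which matches any message
        expect_error(
            hi("X", character()),
            "missing\\(how\\) \\|\\| length\\(how\\) == 1 is not TRUE"
        )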
    
        # match only the stable prefix: the way match.arg() quotes the
        # choices varies across R versions and locales
        expect_error(
            hi("X", "Funky"),
            "'arg' should be one of"
        )
    })
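
Returning to the Authors@R field mentioned in the first bullet of this section: its value is an R expression built with utils::person(). The sketch below is illustrative only. The e-mail address is a placeholder, and the roles "aut" (author) and "cre" (maintainer) are the usual choice for a single-author package rather than something this document specifies.

```r
# person() builds the object that Authors@R stores; the e-mail address is a
# placeholder, "aut" marks an author and "cre" the maintainer.
authors <- person(
    given = "Martin", family = "Morgan",
    email = "maintainer@example.org",
    role = c("aut", "cre")
)

authors$given
authors$family
format(authors)   # a one-line summary of the person

# In DESCRIPTION, the same call is written as the field value:
#
# Authors@R: person("Martin", "Morgan", email = "maintainer@example.org",
#     role = c("aut", "cre"))
```

One field then replaces both Author: and Maintainer:, and further contributors can be added by wrapping several person() calls in c().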

3 Contributing to Bioconductor

Process

Manual review

  • Open process
  • Basic code review

Common comments

  • Use portable code, e.g., tempfile() rather than a hard-coded path.
  • Use robust code, e.g., seq_len(n) rather than 1:n.
  • ‘Vectorize’ instead of iterate, e.g., sqrt(x) instead of sapply(x, sqrt).
  • Re-use, e.g., rtracklayer::import.bed(), rather than re-invent.
  • Inter-operate, e.g., use SummarizedExperiment() rather than matrix.
  • Avoid high ‘cyclomatic complexity’

    • Only one or a few paths through a function.
    • Assertions about inputs at the start of the function, not part-way through.
    • Choose and write functions that are vectorized, and that handle the edge cases, e.g., length 0 arguments or NA values, correctly (compare sapply(integer(), sqrt), which returns list(), with vapply(integer(), sqrt, numeric(1)), which returns numeric(0)).
  • Functions ‘fit in your head’, literally.

    • Refactor to allow re-use rather than repetition of common code.
    • Refactor to isolate logically consistent operations – testable inputs and outputs.
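
Several of the comments above can be checked directly at the R prompt. The snippet below is a small illustration using base R only; the file contents and the example vector are arbitrary.

```r
# Portable: let R choose a writable location instead of hard-coding a path.
path <- tempfile(fileext = ".txt")
writeLines("hello", path)

# Robust: 1:n counts down when n is 0, whereas seq_len(n) is empty.
n <- 0
1:n           # 1 0
seq_len(n)    # integer(0)

# Vectorized: sqrt() already operates on the whole vector.
x <- c(1, 4, 9)
identical(sqrt(x), sapply(x, sqrt))   # TRUE

# Edge cases: sapply() changes its return type on zero-length input;
# vapply() always returns the type declared by FUN.VALUE.
sapply(integer(), sqrt)               # list()
vapply(integer(), sqrt, numeric(1))   # numeric(0)
```

A loop written as for (i in 1:n) would run twice when n is 0, which is why seq_len(n) is the safer idiom.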

4 Best Practices

4.1 man pages

  • Document functions on man pages that have the same names as corresponding .R files.
  • Specify parameters (input arguments) using standard R idioms (e.g., character(1)).
  • Specify return value.
  • Easy-to-run, illustrative examples.

4.2 Vignettes

  • Overall use and interoperation with other packages / stages in the work flow.
  • ‘Toy’ data for easy reproducibility, but realistic enough to illustrate nuances.

4.3 Unit tests

  • Use testthat or other packages.
  • Avoid using examples or vignettes to test edge cases; this just confuses the user.

4.4 Version control

  • Use Git for local version control; consider GitHub for sharing.