8 Documentation

Package documentation is important for users to understand how to work with your code.

Bioconductor requires:

  • a vignette with executable code that demonstrates how to use the package to accomplish a task,
  • man pages for all exported functions, with runnable examples,
  • well-documented data structures, especially if they are not pre-existing classes,
  • well-documented datasets for data provided in data/ and in inst/extdata/.

References to the methods used, as well as to other similar or related projects and packages, are also expected.

If data structures differ from similar packages, Bioconductor reviewers will expect some justification as to why. Keep in mind it is always possible to extend existing classes.

8.1 Vignettes

A vignette demonstrates how to accomplish non-trivial tasks embodying the core functionality of your package. There are two common types of vignettes.

  • A Sweave vignette is an .Rnw file that contains \(\LaTeX\) and chunks of code. The code chunk starts with a line <<>>=, and ends with @. Each chunk is evaluated during R CMD build, prior to \(\LaTeX\) compilation to a PDF document.
  • An R markdown vignette is similar to a Sweave vignette, but uses markdown instead of \(\LaTeX\) for structuring text sections and results in HTML output. The knitr package can process most Sweave and all R markdown vignettes, producing pleasing output. Refer to Writing package vignettes for technical details. See the BiocStyle package for a convenient way to use common macros and a standard style.

A vignette provides reproducibility: the vignette produces the same results as copying the corresponding commands into an R session. It is therefore essential that the vignette embed executed code. Shortcuts (e.g., using a \(\LaTeX\) verbatim environment, the Sweave eval=FALSE flag, or equivalent tricks in markdown) undermine the benefit of vignettes and are generally not allowed; exceptions can be made with proper justification and are at the discretion of Bioconductor reviewers.

All packages are required to have at least one vignette. Vignettes go in the vignettes/ directory of the package. Vignettes are often used as standalone documents, so best practices are to include an informative title, the primary author of the vignette, the last modification date of the vignette, and a link to the package landing page. We encourage the use of BiocStyle for formatting.
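As a minimal sketch of the standalone-document practices above, an R markdown vignette header could look like the following. The package name examplePkg, the author, the date, and the file name vignettes/examplePkg.Rmd are hypothetical placeholders; the output format assumes BiocStyle is used as encouraged above.

```yaml
---
# vignettes/examplePkg.Rmd (hypothetical file name)
title: "Introduction to examplePkg"        # informative title
author: "Primary Author"                   # placeholder: primary vignette author
date: "Last modified: YYYY-MM-DD"          # placeholder: last modification date
package: examplePkg                        # hypothetical package name
output: BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{Introduction to examplePkg}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```

The link to the package landing page can then be given in the body text, for example with BiocStyle's Biocpkg("examplePkg") helper in inline R code.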

Some best practices for writing Bioconductor vignettes are detailed in the following sections.

8.1.1 Introduction

Add an “Introduction” section that serves as an abstract, introducing the objective, models, unique functions, key points, etc., that distinguish the package from other packages in the same area.

8.1.2 Installation

Add an “Installation” section that shows users how to install and load the package from Bioconductor.

These and any other installation instructions should be in an eval=FALSE code chunk. Nowhere in the package (R code, man pages, vignettes, Rmd files) should code attempt to install or download system dependencies, applications, packages, etc. Developers can provide instructions to follow in unevaluated code chunks, and should assume all necessary dependencies, applications, or packages are already set up on a user’s system.
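A typical body for such an unevaluated chunk is sketched below. The package name examplePkg is a hypothetical placeholder, and the code is meant to sit in a chunk whose options set eval=FALSE so that it is displayed but never run during the build.

```r
## Displayed only: put this in a chunk with eval=FALSE.
## "examplePkg" is a hypothetical package name.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("examplePkg")

## Loading the package is the only step a user repeats in every session.
library(examplePkg)
```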

8.1.3 Table of contents

If appropriate, we strongly encourage including a table of contents.
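For an R markdown vignette, one way to request a table of contents is through the output options in the YAML header. This is a sketch that assumes the BiocStyle HTML format; other output formats expose similar options.

```yaml
output:
  BiocStyle::html_document:
    toc: true         # build a table of contents from the section headings
    toc_float: true   # keep it visible in a side panel while scrolling
```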

8.1.4 Evaluated code chunks

Non-trivial executable code is a must!

Static vignettes are not acceptable.

8.1.5 Session information

Include a section that shows the output of sessionInfo().

8.1.6 vignettes/ directory and intermediate files

Only the source vignette file (.Rnw or .Rmd) and any necessary static images should be in the vignettes/ directory. No intermediate files should be present.

8.1.7 References

Remember to include any relevant references to methods.

8.2 Man pages

See the Writing R Extensions section on man pages for detailed instructions and format information for documenting a package, functions, classes, and data sets.

All help pages should be comprehensive.

8.2.1 Functions and classes

All exported functions and classes must have a man page. Man pages describing new classes must be very detailed on the structure and the type of information that is stored.

8.2.2 Package-level documentation

Bioconductor encourages having a package man page with an overview of the package and links to the main functions.

8.2.3 Data

Data man pages must include provenance information and data structure information.
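As an illustration, a data man page could be generated from roxygen2 comments like the ones below. Using roxygen2 is an assumption here (the man page can equally be written by hand as an .Rd file), and the dataset name exampleCounts and all descriptive text are hypothetical placeholders to be replaced with the real details.

```r
#' Example count matrix (hypothetical dataset name)
#'
#' @format A numeric matrix with genes in rows and samples in columns.
#'   State the dimensions, the row and column names, and the units here.
#' @source State where the raw data came from (URL, accession, version,
#'   download date) and name the inst/script/ file that processed it.
"exampleCounts"
```

The @format field carries the data structure information and the @source field carries the provenance.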

8.2.4 Examples

All man pages should have runnable examples.

The use of donttest and dontrun is discouraged and generally not allowed; exceptions can be made with proper justification and are at the discretion of Bioconductor reviewers.

If either option is used, donttest is preferred over dontrun; donttest requires valid code while dontrun does not.
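A function man page with a runnable example might be produced from roxygen2 comments as sketched below. The use of roxygen2 and the function gcCount() are assumptions made for illustration; neither is prescribed by these guidelines.

```r
#' Count G and C bases in DNA strings
#'
#' @param x A character vector of DNA sequences.
#' @return An integer vector with one GC count per element of `x`.
#' @examples
#' gcCount(c("GATTACA", "GGCC"))
#' @export
gcCount <- function(x) {
    stopifnot(is.character(x))
    ## Split each sequence into single bases and count the G/C ones.
    vapply(strsplit(toupper(x), ""),
           function(bases) sum(bases %in% c("G", "C")),
           integer(1))
}

## The example from the man page runs as is, with no dontrun or donttest.
gcCount(c("GATTACA", "GGCC"))
```

Because the example needs no network access, large downloads, or long computation, it can be run in full by R CMD check.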

8.3 The inst/script/ directory

The scripts in this directory can vary.

Most importantly, if data was included in the inst/extdata/ directory, a related script must be present in this directory documenting very clearly how the data was generated.

It should include source URLs and any key information regarding filtering or processing.

It can be executable code, pseudocode, or a text description.

Users should be able to download the source data and roughly reproduce the file or object that is present as data.
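One possible shape for such a script is sketched below. The file names, the function make_example_counts(), and the filtering rule are hypothetical, and the source URL and download date are left as placeholders for the author to fill in. The demonstration at the end uses simulated input so that the sketch runs without network access.

```r
## Hypothetical script: inst/script/make-example-counts.R
## Records how inst/extdata/example_counts.csv was produced.
##
## Source URL : <fill in the URL the raw file was downloaded from>
## Downloaded : <fill in the download date>

make_example_counts <- function(raw_file, out_file, min_total = 10) {
    raw <- read.csv(raw_file, row.names = 1, check.names = FALSE)
    ## Filtering step: keep genes whose total count reaches min_total.
    keep <- rowSums(raw) >= min_total
    write.csv(raw[keep, , drop = FALSE], out_file)
    invisible(out_file)
}

## Demonstration on simulated input (not real data).
raw_file <- tempfile(fileext = ".csv")
toy <- matrix(c(0, 1, 20, 30, 2, 3), nrow = 3, byrow = TRUE,
              dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
write.csv(toy, raw_file)

out_file <- make_example_counts(raw_file, tempfile(fileext = ".csv"))
read.csv(out_file, row.names = 1)
```

Keeping the source and every filtering step in one commented script lets a reader retrace the path from the raw download to the shipped file.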