There are also some other key points, detailed in the following sections.
Only contain code that can be distributed under the license specified (see also The DESCRIPTION file).
Some of the more prominent offenders:
- Use vapply() instead of sapply(), and use the various apply functions instead of for loops.
- Use of numeric indices (rather than robust named indices).
- Do not use set.seed() in any internal code.
- Do not use browser() in any internal code.
- Avoid the use of <<-.
- Avoid direct slot access with @ or slot(). Accessor methods should be created and used instead.
- Use the packages ExperimentHub and AnnotationHub instead of downloading external data from unsanctioned providers such as GitHub, Dropbox, etc.
- Use <- instead of = for assigning variables.
- Function names should be camelCase or utilize the underscore _ and not have a dot . (which indicates S3 dispatch).
- Use dev.new() to start a graphics device if necessary. Avoid using X11(), because it can only be called on machines that have access to an X server.
- Use the functions message(), warning() and stop(), instead of the cat() function (except for customized show() methods). paste0() should generally not be used in these functions except for collapsing multiple values from a variable.
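As an illustration of several of the points above, here is a minimal sketch in base R. The data and the function name reportExpressed() are hypothetical and chosen only for this example; the sketch uses vapply() with a declared return type, a named index, camelCase naming, and message(), warning() and stop() in place of cat().

```r
# Hypothetical example data: a named numeric vector.
scores <- c(geneA = 2.5, geneB = 0, geneC = 4.1)

# vapply() declares the expected return type (one logical per element),
# so an unexpected result fails early instead of silently changing shape.
isExpressed <- vapply(scores, function(s) s > 0, logical(1))

# A named index keeps working if the order of elements changes;
# scores[3] would not.
geneCScore <- scores[["geneC"]]

# Hypothetical helper: conditions are signalled with stop(), warning()
# and message() rather than printed with cat(). Message parts are passed
# as separate arguments, so paste0() is not needed.
reportExpressed <- function(x) {
    if (!is.numeric(x)) {
        stop("'x' must be numeric, not ", class(x)[1])
    }
    nExpressed <- sum(x > 0)
    if (nExpressed == 0L) {
        warning("no expressed features found")
    }
    message("expressed features: ", nExpressed)
    invisible(nExpressed)
}
```

Because the output goes through message(), a caller can silence it with suppressMessages(reportExpressed(scores)), which is not possible with cat().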
Avoid re-implementing functionality or classes (see also The DESCRIPTION file). Make use of appropriate existing packages (e.g., biomaRt, AnnotationDbi, Biostrings, GenomicRanges) and classes (e.g., SummarizedExperiment, AnnotatedDataFrame, GRanges, DNAStringSet) to avoid duplication of functionality available in other Bioconductor packages. See also Common Bioconductor Methods and Classes.
This encourages interoperability and simplifies your own package development. If a new representation is needed, see the Essential S4 interface section of Robust and Efficient Code. In general, Bioconductor will insist on interoperability with Common Classes for acceptance.
Developers should make an effort to re-use generics that fit the generic
contract for the proposed class-method pair i.e., the behavior of the method
aligns with the originally proposed behavior of the generic. Specifically,
the behavior can be one where the return value is of the same class across
methods. The method behavior can also be a performant conceptual transformation
or procedure across classes as described by the generic.
BiocGenerics lists commonly used generics in
Bioconductor. One example of a generic and method implementation is that of the
rowSums generic and the corresponding method within the
DelayedArray package. This generic contract returns a
numeric vector of the same length as the rows and is adhered to across classes
including the DelayedMatrix class. Re-using generics reduces the number of
new generics by consolidating existing operations and avoids the mistake of
introducing a “new” generic with the same name. Generic name collisions may
mask or be masked by previous definitions in ways that are hard to diagnose.
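The idea of re-using a generic can be sketched with base R and the methods package alone. The class CountSet below is hypothetical and exists only for this illustration. Instead of introducing a new generic, the sketch adds a method to the existing length() generic and keeps its contract of returning a single non-negative integer.

```r
library(methods)

# Hypothetical class, defined in "our" package for illustration only.
setClass("CountSet", representation(counts = "integer"))

# Re-use the existing length() generic rather than creating a new one.
# The method honours the generic's contract: it returns one integer.
# Slot access with @ is confined to this method of the class's own package.
setMethod("length", "CountSet", function(x) length(x@counts))

cs <- new("CountSet", counts = c(a = 3L, b = 0L, c = 7L))
nCounts <- length(cs)
```

The method is defined for a class that the package itself owns, which also matches the guidance on not adding methods to external classes.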
We encourage maintainers to only create new methods for classes exported within
their packages. We discourage the generation of methods for external classes,
i.e., classes outside of the package
NAMESPACE. This can
potentially cause method name collisions (i.e., two methods defined for
the same class in different packages) and pollute the methods environment
for those external classes. New methods for established classes can also
confuse users, given that the new method and the class definition are in
different packages.
Avoid large chunks of repeated code. If code is being repeated this is generally a good indication a helper function could be implemented.
Excessively long functions should also be avoided. Write small functions.
Ideally, each function has only one job and does that job in as few lines of code as is practical. If a function extends for more than a screen, take a moment to split it into smaller helper functions.
Smaller functions are easier to read, debug and to reuse.
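A small sketch of the helper-function idea follows. The names .rescale() and normalizeSamples() are hypothetical. The rescaling logic is written once and called twice, rather than being copied for each input. The sketch assumes each input has at least two distinct values; otherwise the range is zero and the division is undefined.

```r
# Hypothetical internal helper: rescale a numeric vector to [0, 1].
# Assumes 'x' contains at least two distinct non-missing values.
.rescale <- function(x) {
    rng <- range(x, na.rm = TRUE)
    (x - rng[1]) / (rng[2] - rng[1])
}

# The exported function stays short because the shared step lives
# in the helper instead of being repeated for each argument.
normalizeSamples <- function(control, treated) {
    list(control = .rescale(control), treated = .rescale(treated))
}

normalized <- normalizeSamples(c(1, 2, 3), c(10, 20))
```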
Argument names to functions should be descriptive and well documented. Arguments should generally have default values. Validate argument values before using them.
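One possible way to combine descriptive arguments, defaults, and validity checks is sketched below. The function trimValues() and its arguments are hypothetical. The checks use match.arg() and stopifnot() from base R; a package may prefer more specific error messages via stop().

```r
# Hypothetical function: restrict values to the interval [lower, upper].
# Every argument except 'x' has a default, and all are checked up front.
trimValues <- function(x, lower = 0, upper = 1, method = c("clip", "drop")) {
    method <- match.arg(method)  # errors on anything but "clip" or "drop"
    stopifnot(
        is.numeric(x),
        is.numeric(lower), length(lower) == 1L,
        is.numeric(upper), length(upper) == 1L,
        lower <= upper
    )
    if (method == "clip") {
        pmin(pmax(x, lower), upper)       # move outliers to the bounds
    } else {
        x[x >= lower & x <= upper]        # discard outliers
    }
}

clipped <- trimValues(c(-1, 0.5, 2))
dropped <- trimValues(c(-1, 0.5, 2), method = "drop")
```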
Many R operations are performed on the whole object, not just the elements of the object (e.g., sum(x) instead of x[1] + x[2] + x[3] + ...). In particular, relatively few situations require an explicit for loop.
See the Vectorize section of Robust and Efficient Code for additional detail.
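The contrast between element-wise and whole-object operations can be sketched as follows. The data are arbitrary example values. The loop and the vectorized calls compute the same results, but the vectorized forms are shorter and leave the iteration to R.

```r
x <- c(4, 8, 15, 16, 23, 42)

# Element-by-element accumulation with an explicit loop.
loopTotal <- 0
for (i in seq_along(x)) {
    loopTotal <- loopTotal + x[i]
}

# The same result from a single call on the whole vector.
vectorTotal <- sum(x)

# Arithmetic operators are vectorized too: no loop is needed
# to square every element.
squares <- x^2
```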
Follow guiding principles on Querying Web Resources, if applicable.
A minimal number of cores (1 or 2) should be set as a default.
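A conservative default for the number of cores might look like the sketch below. The function countLetters() and its cores argument are hypothetical, and parallel::mclapply() is used here only as an assumed backend; the guideline itself does not prescribe one. Note that mclapply() relies on forking, so values of cores above 1 are only supported on Unix-alikes.

```r
# Hypothetical function: the default of one core means the user must
# opt in to heavier resource use explicitly.
countLetters <- function(words, cores = 1L) {
    stopifnot(is.numeric(cores), length(cores) == 1L, cores >= 1)
    res <- parallel::mclapply(words, nchar, mc.cores = cores)
    unlist(res)
}

letterCounts <- countLetters(c("a", "bb", "ccc"))
```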
Files downloaded should be cached.
Please use BiocFileCache.
If a maintainer creates their own caching directory, it should use the standard
location given by tools::R_user_dir(package, which="cache"). It is not
allowed to download or write any files to a user's home directory or working
directory. Files should be cached as stated above, or written with
tempdir()/tempfile() if they should not be persistent.
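The two locations mentioned above can be sketched with base R only. The package name "myPackage" and the function getCacheDir() are placeholders for this example. tools::R_user_dir() is available from R 4.0.0. The sketch only computes the persistent cache path and does not create it unless asked; the non-persistent file goes to the session's temporary directory.

```r
# Hypothetical helper: return the standard per-user cache directory for
# a package. "myPackage" is a placeholder name.
getCacheDir <- function(package = "myPackage", create = FALSE) {
    path <- tools::R_user_dir(package, which = "cache")
    if (create && !dir.exists(path)) {
        dir.create(path, recursive = TRUE)
    }
    path
}

cacheDir <- getCacheDir()

# Non-persistent output: write to a temporary file, never to the
# working directory or the home directory.
tmpFile <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3), tmpFile, row.names = FALSE)
```

For downloaded files, BiocFileCache remains the recommended tool; this sketch only covers the fallback locations.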
Do NOT install anything on a user's system.
System dependencies, applications, and additionally needed packages should be assumed already present on the user’s system.
If necessary, package maintainers should provide instructions for download and setup, but should not execute those instructions on behalf of a user.
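One way to follow this is to check for a dependency and stop with instructions when it is missing, without attempting any installation. The function requireTool() below is hypothetical; it relies on Sys.which() from base R, which returns an empty string when an executable is not found on the PATH.

```r
# Hypothetical helper: locate an external command-line tool, or fail
# with a message telling the user what to do. Nothing is installed.
requireTool <- function(tool) {
    path <- Sys.which(tool)
    if (!nzchar(path)) {
        stop(
            "'", tool, "' was not found on the PATH. ",
            "Please install it and try again; see the package vignette ",
            "for setup instructions.",
            call. = FALSE
        )
    }
    unname(path)
}

# A tool name that is assumed not to exist, to show the failure path.
missingResult <- tryCatch(
    requireTool("surely-not-an-installed-tool-12345"),
    error = function(e) conditionMessage(e)
)
```

The same pattern applies to optional R packages, where requireNamespace(pkg, quietly = TRUE) can be used for the check in place of Sys.which().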