The R Ecosystem

Where the packages live

Repository     Curation   Type           VCS   Installer
CRAN           curated    Distribution   SVN   install.packages
GitHub         open       Development    git   devtools::install_github
Bioconductor   curated    Both           git   BiocManager::install
R-Forge        open?      Development?   SVN   install.packages with repos set
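
The installer column can be read as the following sketch. The helper install_from() and its source labels are hypothetical and only illustrate the mapping; the installer functions are the ones named in the table, and devtools and BiocManager are assumed to be installed already.

```r
# Hypothetical helper: dispatch to the installer listed for each repository.
# Nothing is installed until the function is called with a valid source.
install_from <- function(pkg,
                         source = c("cran", "github", "bioconductor", "rforge")) {
  source <- match.arg(source)  # unknown sources raise an error
  switch(source,
    cran         = install.packages(pkg),
    github       = devtools::install_github(pkg),  # pkg given as "user/repo"
    bioconductor = BiocManager::install(pkg),
    rforge       = install.packages(pkg, repos = "http://R-Forge.R-project.org")
  )
}
```

With this helper, install_from("lattice") would go through install.packages, while install_from("user/repo", "github") would go through devtools::install_github.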

Comprehensive R Archive Network

GitHub

  • Get an account (Help)
  • Start a repository
  • Set up an R package (using devtools) and set origin
  • Done

Bioconductor | R for bioinformatics

  • Read the Package guidelines and Coding style

  • Start a repository on GitHub

  • Write & test your package

  • Submit your package repo to the Bioconductor Contributions repo (Submission process)

  • Maintain:

    • A development branch (master)
    • Current release branch (RELEASE_x_y, currently RELEASE_3_10)
  • Focused on high-throughput genomic analysis

  • Tightly coupled package versions with two releases per year

    • Use their own installer, package BiocManager (on CRAN)
    • Regular, often massive re-installations
  • Run their own git server, but recommend keeping a synchronized copy on GitHub (two remotes)

  • Nice developer resources

R-Forge

  • Run by r-project.org

  • Installation via

install.packages("mypackage",
                 repos = c("http://R-Forge.R-project.org",
                           "http://your.nearest.CRAN.mirror.org"),
                 dependencies = TRUE)

  • Workflow: unclear; little information is visible (without an account?)

Working with other packages

Can mean:

  • importing functionality from installed packages

  • adding code, modifications to existing packages which are not yours

  • including code from other packages into your own

Technical, legal, social, philosophical implications

General advice:

  • Don’t worry too much (DWTM)
  • Rapid prototyping, iterative improvement
  • “Quality of life, quality of research (code)”

Depends, Imports, Suggests, Enhances: use one

Your package uses another installed package to do something:

  • Your package: mypkg
  • Other package: otherpkg

Depends:

  • otherpkg necessary for installing mypkg
  • After library(mypkg), otherpkg will be in the search path for the user
  • Use case: mypkg really only makes sense in the context of otherpkg, e.g. latticeExtra and lattice
  • Recommend: use rarely

Imports:

  • otherpkg necessary for installing mypkg
  • Functions etc. in otherpkg will be used internally by mypkg
  • Details can be controlled in NAMESPACE
  • Recommend: use this for necessary packages

Suggests:

  • otherpkg NOT necessary for installing mypkg
  • For packages only used in examples, vignettes, tests
  • Typical use case: example data from another package
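
Because a suggested package may be missing on the user's machine, code that needs it usually checks for it at run time. A minimal sketch, in which otherpkg is the placeholder package from above and example_data() and some_dataset() are hypothetical names:

```r
# Hypothetical function in mypkg that relies on a package listed under Suggests.
example_data <- function() {
  if (!requireNamespace("otherpkg", quietly = TRUE)) {
    stop("Package 'otherpkg' is needed for this example; please install it.",
         call. = FALSE)
  }
  otherpkg::some_dataset()  # hypothetical accessor in otherpkg
}
```

If otherpkg is not installed, calling example_data() stops with an informative message instead of failing somewhere inside the function.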

Enhances:

  • otherpkg NOT necessary for installing mypkg
  • When mypkg adds functionality to otherpkg
  • Use case: mypkg has extra methods for classes defined in otherpkg

Reference: Writing R Extensions, Section 1.1.3
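
The four fields are declared in the DESCRIPTION file of mypkg. As a rough sketch, the snippet below writes such a file to a temporary directory with base R's write.dcf() and reads the dependency fields back with read.dcf(). All package names and version numbers are placeholders, and each placeholder package appears in exactly one field.

```r
# Placeholder metadata for mypkg; only the dependency fields matter here.
desc <- data.frame(
  Package  = "mypkg",
  Version  = "0.0.1",
  Depends  = "R (>= 3.5.0)",   # Depends can also state a minimum R version
  Imports  = "otherpkg",
  Suggests = "datapkg",
  Enhances = "bigpkg",
  stringsAsFactors = FALSE
)

# Write to a temporary location, not into a real package directory.
path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = path)

# Read the dependency fields back as a character matrix (one row per record).
deps <- read.dcf(path, fields = c("Depends", "Imports", "Suggests", "Enhances"))
print(deps)
```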

NAMESPACE specification

Reference: Writing R Extensions, Section 1.5

Exports

  • Only exported functions are visible after library(mypkg) or via mypkg::

  • Unexported functions are available as mypkg:::

  • Suggestion:

    • start with the default (export everything: exportPattern("^[^\\.]"))
    • add @export tags (roxygen2) before release
    • don’t export hacks or functions you don’t document
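
The difference between exported and unexported objects can be inspected on any installed package. The sketch below uses the base package stats for that purpose; fit_model() and .helper() are hypothetical functions that only show where an @export tag would go, and the roxygen2 comments have no effect until roxygen2 processes the package.

```r
# Exported objects of an installed package, here the base package stats.
exported <- getNamespaceExports("stats")
"median" %in% exported   # exported, so stats::median() works

# Objects that live in the namespace but are not exported need `:::`.
internal <- setdiff(ls(asNamespace("stats"), all.names = TRUE), exported)
length(internal) > 0

# The default pattern exports every name that does not start with a dot.
grepl("^[^\\.]", c("fit_model", ".helper"))   # TRUE FALSE

#' Fit a model (hypothetical documented function, exported via roxygen2)
#' @export
fit_model <- function(x) x

# Undocumented helper, deliberately left without an @export tag.
.helper <- function(x) x
```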

Imports

  • import(otherpkg) in NAMESPACE corresponds to library(otherpkg) in a script

  • Qualified names otherpkg:: will work without import/importFrom

  • Suggestion:

    • @import important packages at the start (replaces library)
    • use qualified names during development
    • switch to @import when you get tired of otherpkg::
    • before release, check whether to replace @import with @importFrom
    • use qualified names when in doubt
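
The two styles can be compared side by side. A minimal sketch, using stats::median as a stand-in for a function from otherpkg; the function names center_imported() and center_qualified() are made up for illustration. In a package, roxygen2 would turn the tag into importFrom(stats, median) in NAMESPACE; in a plain script the first version works only because stats is attached by default.

```r
# Style 1: selective import, declared with a roxygen2 tag.
#' @importFrom stats median
center_imported <- function(x) x - median(x)

# Style 2: qualified name, no import directive needed.
center_qualified <- function(x) x - stats::median(x)

center_qualified(c(1, 2, 6))   # -1 0 4
```

Both functions return the same result; the qualified form makes the origin of median() visible at the call site, which is why it is the safer choice when in doubt.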

Contributing to other packages

  • Report issues informatively and politely (with a minimal reproducible example)

  • Add fixes via forking:

    • create your own copy of the package (a fork; dead easy on GitHub)

    • make modifications

    • offer modifications to package owner (pull request)

    • Alternative: if the licence allows, go rogue with your own (forked) version

      • Nuclear option
      • Give credit
      • Change names, trademarks
  • Play nice: e.g. ggplot2 code of conduct

Resources:

Using material from other packages in your own

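  • Legal under an open source licence, subject to the conditions of that licence

    • E.g. the “viral” (copyleft) property of the GPL: work derived from GPL code must be distributed under the GPL as well
  • For legal & ethical reasons: give credit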

We have to talk about licenses

Software licensing matters in research, just as data protection and citation rules do.

Licensing R: not a legal document, but

  • overview of current practice
  • overview of open source (vs proprietary) licenses

Do I have to share?

  • No.

  • But you should think about it:

    • private vs public package: who to share with?
    • open vs proprietary licence
    • open access and open source

Have fun