Packaging data analytical work reproducibly

Marwick, B., Boettiger, C., & Mullen, L. 2017 Packaging data analytical work reproducibly using R (and friends). The American Statistician. DOI: 10.1080/00031305.2017.1375986

Computers are a central tool in the research process, enabling complex and large scale data analysis. As computer-based research has increased in complexity, so have the challenges of ensuring that this research is reproducible. To address this challenge, we review the concept of the research compendium as a solution for providing a standard and easily recognisable way for organising the digital materials of a research project to enable other researchers to inspect, reproduce, and extend the research. We investigate how the structure and tooling of software packages of the R programming language are being used to produce research compendia in a variety of disciplines. We also describe how software engineering tools and services are being used by researchers to streamline working with research compendia. Using real-world examples, we show how researchers can improve the reproducibility of their work using research compendia based on R packages and related tools.

Keywords: Reproducible research, Open source software, Computational science, Data science

People Involved: 
Status of Research or Work: 
Completed/published
Research Type: