Reproducible vs. non-reproducible analysis

Written Apr 17, 2017

Below are some characteristics of a reproducible analysis, in contrast to a non-reproducible analysis.

Reproducible:

  • You can rerun the entire analysis, from data download to full output/report generation, by clicking one “button” (running one command)
  • If data sources or other upstream inputs change, you only need to change a few lines of code or configuration, not rework the whole analysis
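
As a rough illustration of both points, here is a minimal sketch of a single entry point, assuming a Python-based analysis. The file name `run_all.py`, the paths, the column name, and the three step functions are all hypothetical, and the "download" step only writes a tiny sample file so the sketch runs on its own.

```python
"""run_all.py: hypothetical single entry point for an analysis."""
import csv
import statistics
from pathlib import Path

# Everything likely to change upstream lives in one place (all values hypothetical).
CONFIG = {
    "raw_path": Path("data/raw.csv"),
    "value_column": "value",
    "report_path": Path("output/report.txt"),
}


def fetch_data(config):
    """Stand-in for the data download step; writes a three-row sample file."""
    config["raw_path"].parent.mkdir(parents=True, exist_ok=True)
    with config["raw_path"].open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([config["value_column"]])
        writer.writerows([[1], [2], [3]])


def analyze(config):
    """Read the raw file and compute the mean of the configured column."""
    with config["raw_path"].open(newline="") as f:
        values = [float(row[config["value_column"]]) for row in csv.DictReader(f)]
    return statistics.mean(values)


def write_report(config, mean_value):
    """Write the final output; here just a one-line text report."""
    config["report_path"].parent.mkdir(parents=True, exist_ok=True)
    config["report_path"].write_text(
        f"mean of {config['value_column']}: {mean_value}\n"
    )


def main(config=CONFIG):
    """Run every step in order, from data to report."""
    fetch_data(config)
    mean_value = analyze(config)
    write_report(config, mean_value)
    return mean_value


if __name__ == "__main__":
    main()
```

With this layout, `python run_all.py` is the one "button", and a renamed column or a moved data file would mean editing `CONFIG` only.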

Non-reproducible:

  • Drag-and-drop interfaces (or at least, these are very hard to keep reproducible)
  • Naked program files
    • That is, R, SAS, Python, or other program files with no accompanying shell script or Makefile to specify the order in which the programs run
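
A shell script or Makefile is the usual fix for naked program files. The same idea can be sketched in Python, under the assumption that each program can be launched from the command line. The script names in `STEPS` are made up for illustration, and `Rscript` is assumed to be installed if an R step is included.

```python
"""run_order.py: hypothetical driver that writes down the program run order."""
import subprocess
import sys

# Ordered list of commands; the file names are illustrative only.
STEPS = [
    [sys.executable, "01_clean.py"],
    ["Rscript", "02_model.R"],
    [sys.executable, "03_report.py"],
]


def run_steps(steps):
    """Run each command in order and stop at the first failure."""
    for command in steps:
        print("running:", " ".join(command))
        # check=True raises CalledProcessError on a nonzero exit code,
        # so later steps never run on top of a failed earlier step.
        subprocess.run(command, check=True)
```

Calling `run_steps(STEPS)` at the bottom of such a file would make the run order explicit and turn the whole sequence into one command.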

This post may be updated regularly!