Reproducible vs. non-reproducible analysis
Below are some characteristics of a reproducible analysis, in contrast to a non-reproducible analysis.
- You can rerun the entire analysis, from data download to full output/report generation, by clicking one “button” (running one command)
- If data sources or other upstream inputs change, you only need to adjust a few settings rather than rework the whole analysis
A non-reproducible analysis, by contrast, tends to involve:
- Drag-and-drop interfaces (at the very least, these are hard to keep reproducible)
- Naked program files
  - That is, R, SAS, Python, or other program files with no accompanying shell script or Makefile to define the order in which the programs run
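As a rough illustration of the "one command" idea, and of spelling out run order instead of leaving naked program files, here is a minimal sketch of a driver script in Python. The file name `run_all.py`, the step script names, and the `run_pipeline` helper are all hypothetical, made up for this example; a shell script or Makefile would do the same job.

```python
"""run_all.py: hypothetical single entry point for an analysis."""
import subprocess
import sys

# Hypothetical step scripts, listed in the order they must run.
# The names are placeholders; substitute the programs in your own project.
STEPS = [
    [sys.executable, "01_download_data.py"],
    [sys.executable, "02_clean_data.py"],
    ["Rscript", "03_fit_model.R"],
    [sys.executable, "04_build_report.py"],
]


def run_pipeline(steps, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    completed = []
    for cmd in steps:
        print("Running:", " ".join(cmd))
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later steps never run on partial output.
        runner(cmd, check=True)
        completed.append(cmd)
    return completed


# To rerun everything with one command, call run_pipeline(STEPS) at the
# bottom of this file and run: python run_all.py
```

The point of the sketch is that the run order lives in one place (`STEPS`) rather than in someone's memory, and that a failed step halts the run instead of letting later steps use stale files.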
This post may be updated regularly!