August 30, 2017
Brian Nosek of University of Virginia and colleagues sought out to replicate 100 different studies that all were published in 2008. The project pulled these studies from three different [psychology] journals… to see if they could get the same results as the initial findings. […] Only 36.1% of the studies [were] replicated.
Reproducibility Project Wikipedia entry
In May 2015, Science retracted a study of how canvassers can sway people's opinions about gay marriage, just five months after publishing it.
Two Berkeley grad students who attempted to replicate the study quickly discovered that the data must have been faked.
Methods we'll discuss today can't prevent this, but they can make it easier to discover issues.
The article has been retracted at the request of the authors. After carefully re-examining the data presented in the article, they identified that data of two different hospitals got terribly mixed. The published results cannot be reproduced in accordance with scientific and clinical correctness.
Authors of "Low Dose Lidocaine for Refractory Seizures in Preterm Neonates"
The authors informed the journal that the merge of lab results and other survey data used in the paper resulted in an error regarding the identification codes. Results of the analyses were based on the data set in which this error occurred. Further analyses established the results reported in this manuscript and interpretation of the data are not correct.
Original conclusion: "Lower levels of CSF IL-6 were associated with current depression and with future depression […]".
Revised conclusion: "Higher levels of CSF IL-6 and IL-8 were associated with current depression […]".
Convince researchers to adopt a reproducible research workflow
Train new researchers who don't have any other workflow
Scriptability → R
Literate programming → R Markdown
Version control → Git / GitHub
Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
Donald Knuth in Literate Programming (1983)
2 + 2
factorial(20)
x <- 2
x * 3
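The console expressions above could also be saved in a script, which is what makes an analysis repeatable. The following is a minimal sketch rather than part of the original slides: the name `y` and the `stopifnot()` checks are additions made here for illustration.

```r
# Evaluate an expression; at the console, R prints the result.
2 + 2

# factorial(20) is 20!, the product of the integers 1 through 20.
factorial(20)

# Assign a value to a name with `<-`, then reuse it in a later expression.
x <- 2
y <- x * 3  # `y` is a hypothetical name introduced for this sketch

# Checks that stop the script with an error if a result is not as expected.
stopifnot(2 + 2 == 4)
stopifnot(y == 6)
stopifnot(isTRUE(all.equal(factorial(20), prod(1:20))))
```

Because the checks live in the script, anyone who reruns it will find out right away if a result has changed.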
Go to RStudio
Note for the future: Each course component you work on (an application exercise, a homework assignment, project, etc.) should be its own repository, and should be fully contained in a folder inside the folder cds101.
Slides adapted from these course notes by Mine Çetinkaya-Rundel.