This week the Guardian’s Science Weekly podcast focuses on statistical malpractice and fraud in science. Michèle talks about the role of statcheck in detecting statistical inconsistencies, and discusses the causes and implications of seemingly innocent rounding errors.
This podcast also offers fascinating insights from consultant anaesthetist John Carlisle about the detection of data fabrication, and from David Spiegelhalter, president of the Royal Statistical Society, about the dangers of statistical malpractice.
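To make the idea of a statistical inconsistency concrete, here is a minimal Python sketch of the kind of check involved: recompute the p-value from a reported test statistic and see whether the reported p-value could be a correctly rounded version of it. This is an illustration only, not statcheck's actual code (statcheck is an R package). The function names are hypothetical, and the sketch handles only a two-sided Z test to stay within the standard library.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def is_consistent(z_reported: float, p_reported: float,
                  z_decimals: int = 2, p_decimals: int = 2) -> bool:
    """Check whether a reported p-value matches a reported Z statistic.

    Both numbers are assumed to have been rounded to the given number of
    decimals, so the check allows for rounding in either of them.
    """
    # The unrounded statistic lies within half a unit of the last decimal.
    half_unit = 0.5 * 10 ** (-z_decimals)
    z_abs = abs(z_reported)
    p_high = two_sided_p_from_z(max(z_abs - half_unit, 0.0))
    p_low = two_sided_p_from_z(z_abs + half_unit)

    # The unrounded p-value lies within half a unit of its last decimal too.
    p_tol = 0.5 * 10 ** (-p_decimals)
    return p_low - p_tol <= p_reported <= p_high + p_tol


if __name__ == "__main__":
    print(is_consistent(2.20, 0.03))  # reported "Z = 2.20, p = .03"
    print(is_consistent(2.20, 0.01))  # reported "Z = 2.20, p = .01"
```

Allowing for rounding in both numbers matters here: a statistic reported as 2.20 could have been anything from 2.195 to 2.205 before rounding, so a small mismatch in the last decimal is not necessarily an error.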
We are honored to announce that Michèle Nuijten was awarded a $20,000 methods grant from the Campbell Collaboration, together with meta-analysis expert Joshua R. Polanin. They were awarded the grant for the project “Verifying the Accuracy of Statistical Significance Testing in Campbell Collaboration Systematic Reviews Through the Use of the R Package statcheck”.
The grant is part of the Campbell Collaboration’s program to support innovative methods development in order to improve the quality of systematic reviews. For more information about the grant and the three other recipients, see their website here.
Our team members Robbie van Aert and Marcel van Assen have just published a paper in PLoS ONE. The abstract:
The vast majority of published results in the literature is statistically significant, which raises concerns about their reliability. The Reproducibility Project Psychology (RPP) and Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. The original study and replication were statistically significant in 36.1% in RPP and 68.8% in EE-RP suggesting many null effects among the replicated studies. However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method called snapshot hybrid that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant. We first analytically approximate the methods performance, and demonstrate the necessity to control for the original study’s significance to enable the accumulation of evidence for a true zero effect. Then we applied the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the included studies in EE-RP are generally larger than in RPP, but that the sample sizes of especially the included studies in RPP are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of the replication akin to power analysis in null hypothesis significance testing and present an easy to use web application (https://rvanaert.shinyapps.io/snapshot/) and R code for applying the method.
The paper is available here: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0175302
The long-read in the Guardian today by Stephen Buranyi featured our work at the Meta-Research Center. Specifically, it focuses on the work of Chris Hartgerink and Marcel van Assen on detecting fabricated data, and how the development and use of statcheck played a role in their research.
Read the full article here.
Jelte Wicherts has been awarded a prestigious 2 million euro Consolidator Grant from the European Research Council (ERC). With the money the meta-research group will be expanded with two postdocs and two PhD students.
The project is entitled IMPROVE: Innovative Methods for Psychology: Reproducible, Open, Valid, and Efficient and will start in the second half of 2017.
A new manuscript by the Meta-Research group, with Coosje Veldkamp as primary author, is in press at Accountability in Research. The final paper will be available Open Access; in the meantime, the abstract is below and the postprint is on PsyArXiv.
Do lay people and scientists themselves recognize that scientists are human and therefore prone to human fallibilities such as error, bias, and even dishonesty? In a series of three experimental studies and one correlational study (total N = 3,278) we found that the ‘storybook image of the scientist’ is pervasive: American lay people and scientists from over 60 countries attributed considerably more objectivity, rationality, open-mindedness, intelligence, integrity, and communality to scientists than other highly-educated people. Moreover, scientists perceived even larger differences than lay people did. Some groups of scientists also differentiated between different categories of scientists: established scientists attributed higher levels of the scientific traits to established scientists than to early-career scientists and PhD students, and higher levels to PhD students than to early-career scientists. Female scientists attributed considerably higher levels of the scientific traits to female scientists than to male scientists. A strong belief in the storybook image and the (human) tendency to attribute higher levels of desirable traits to people in one’s own group than to people in other groups may decrease scientists’ willingness to adopt recently proposed practices to reduce error, bias and dishonesty in science.
Michèle Nuijten and Sacha Epskamp are two of the nine winners of the 2016 Leamer-Rosenthal Prize for Open Social Science for their work on statcheck. The prize is an initiative of the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and comes with an award of $10,000.
They will receive their prize at the 2016 BITSS annual meeting, along with seven other researchers and educators.
Announcing two of nine winners of the 2016 Leamer-Rosenthal Prizes for Open Social Science: Michèle Nuijten and Sacha Epskamp!
Lately statcheck has received quite a bit of media attention. In a piece in Nature, Monya Baker has written a thorough and nuanced overview of statcheck and of Chris Hartgerink’s PubPeer project, in which he scanned 50,000 papers and posted the statcheck results on the online forum PubPeer. A Nature editorial discusses this type of post-publication peer review.
Some other interesting coverage of statcheck can be found here:
Buranyi, S. (2016). Scientists are worried about ‘peer review by algorithm’. Motherboard (VICE). URL
Resnick, B. (2016). A bot crawled thousands of studies looking for simple math errors. The results are concerning. Vox. URL
Kershner, K. (2016). Statcheck: when bots ‘correct’ academics. How Stuff Works. URL
Keulemans, M. (2016). Worden sociale wetenschappen geterroriseerd door jonge onderzoekers?: Oorlog onder psychologen. De Volkskrant. URL
Our team member Marjan Bakker has just published a paper in Psychological Science, together with Chris Hartgerink, Jelte Wicherts and Han van der Maas. The abstract:
Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers’ experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies.
The paper is available here (Open Access).
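As a rough illustration of the formal power analysis the abstract recommends, the Python sketch below approximates the power of a two-sided, two-group comparison and the sample size needed for a target power. It uses a normal approximation rather than the exact t-based calculation, the function names are hypothetical, and the inputs (a small effect of d = 0.2 and 20 participants per cell) are illustrative assumptions, not figures taken from the survey.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    d is the standardized mean difference (Cohen's d). The normal
    approximation slightly overstates power for small samples.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = d * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return (_STD_NORMAL.cdf(noncentrality - z_crit)
            + _STD_NORMAL.cdf(-noncentrality - z_crit))


def approx_n_per_group(d: float, power: float = 0.80,
                       alpha: float = 0.05) -> int:
    """Approximate per-group sample size needed to reach the target power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    print(f"Power with d = 0.2, n = 20 per group: {approx_power(0.2, 20):.2f}")
    print(f"n per group for .80 power at d = 0.2: {approx_n_per_group(0.2)}")
```

Under these assumptions, 20 participants per group gives power of only about .10 for a small effect, while reaching .80 power takes roughly 390 participants per group. An exact t-based calculation gives slightly larger sample sizes.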
The Meta-Research group was well represented at the APS conference in Chicago. As a recap, we have shared all our slides. Feel free to view them and let us know if you have any questions or suggestions! Where applicable, Open Science Framework links are included, which makes the presentations citable and preserves them.
The Psychology of Statistics and the Statistics of Psychology
Estimating the reproducibility of psychological science: accounting for the statistical significance of the original study
Marcel van Assen (https://osf.io/58xqt/)
Flawed Intuitions about power
Marjan Bakker (https://osf.io/dztjs)
Honesty and Trust in Psychology Research
The Storybook Image of the Scientist
Why do so many researchers misreport p-values?
How do researchers fabricate data and how to detect fabrication?
Chris Hartgerink (https://osf.io/ucfpv/)
How to Deal with Publication Bias in Psychology? Illustrations and Recommendations
To be added
Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods
Publication Bias in IQ Research
Conducting meta-analyses based on p-values: Reservations and recommendations for applying p-uniform and p-curve
Robbie van Aert (https://osf.io/8rtmz/)