New Paper by Robbie van Aert

Title: Multistep estimators of the between‐study variance: The relationship with the Paule‐Mandel estimator

Authors: Robbie C. M. van Aert & Dan Jackson

Published in: Statistics in Medicine


A wide variety of estimators of the between‐study variance are available in random‐effects meta‐analysis. Many, but not all, of these estimators are based on the method of moments. The DerSimonian‐Laird estimator is widely used in applications, but the Paule‐Mandel estimator is an alternative that is now recommended. Recently, DerSimonian and Kacker have developed two‐step moment‐based estimators of the between‐study variance. We extend these two‐step estimators so that multiple (more than two) steps are used. We establish the surprising result that the multistep estimator tends towards the Paule‐Mandel estimator as the number of steps becomes large. Hence, the iterative scheme underlying our new multistep estimator provides a hitherto unknown relationship between two‐step estimators and the Paule‐Mandel estimator. Our analysis suggests that two‐step estimators are not necessarily distinct estimators in their own right; instead, they are quantities that are closely related to the usual iterative scheme that is used to calculate the Paule‐Mandel estimate. The relationship that we establish between the multistep and Paule‐Mandel estimators is another justification for the use of the latter estimator. Two‐step and multistep estimators are perhaps best conceptualized as approximate Paule‐Mandel estimators.
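To make the relationship in the abstract concrete, here is a minimal Python sketch. It is not the authors' code. It assumes the general method-of-moments form used for two-step estimators, with weights 1/(v_i + tau2) taken from the previous step and a start at tau2 = 0, so that one step reproduces DerSimonian-Laird. The function names and the toy data `Y` and `V` are made up for illustration.

```python
def generalized_q(y, v, tau2):
    """Weighted sum of squares Q with weights a_i = 1 / (v_i + tau2)."""
    a = [1.0 / (vi + tau2) for vi in v]
    ybar = sum(ai * yi for ai, yi in zip(a, y)) / sum(a)
    q = sum(ai * (yi - ybar) ** 2 for ai, yi in zip(a, y))
    return q, a


def moment_step(y, v, tau2_prev):
    """One method-of-moments update using weights based on tau2_prev."""
    q, a = generalized_q(y, v, tau2_prev)
    sum_a = sum(a)
    # Expected value of Q when the between-study variance is zero.
    offset = sum(ai * vi for ai, vi in zip(a, v)) \
        - sum(ai ** 2 * vi for ai, vi in zip(a, v)) / sum_a
    scale = sum_a - sum(ai ** 2 for ai in a) / sum_a
    return max(0.0, (q - offset) / scale)


def multistep_tau2(y, v, steps):
    """steps=1 gives DerSimonian-Laird, steps=2 a two-step estimator, and so on."""
    tau2 = 0.0
    for _ in range(steps):
        tau2 = moment_step(y, v, tau2)
    return tau2


def paule_mandel_tau2(y, v, iterations=200):
    """Paule-Mandel: solve Q(tau2) = k - 1 by bisection (Q decreases in tau2)."""
    target = len(y) - 1
    if generalized_q(y, v, 0.0)[0] <= target:
        return 0.0
    lo, hi = 0.0, 1.0
    while generalized_q(y, v, hi)[0] > target:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if generalized_q(y, v, mid)[0] > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical effect sizes and within-study variances, for illustration only.
Y = [0.10, 0.30, 0.35, 0.65, 0.45, 0.15, -0.20, 0.80]
V = [0.03, 0.03, 0.05, 0.01, 0.05, 0.02, 0.04, 0.02]

for steps in (1, 2, 3, 10):
    print(steps, multistep_tau2(Y, V, steps))
print("Paule-Mandel", paule_mandel_tau2(Y, V))
```

On this toy data the printed values should settle on the Paule-Mandel estimate after a handful of steps, which is the behaviour the paper describes.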


"statcheck" in the Guardian's Weekly Science Podcast

This week the Guardian's Science Weekly podcast focuses on statistical malpractice and fraud in science. Michèle talks about the role of statcheck in detecting statistical inconsistencies, and discusses the causes and implications of seemingly innocent rounding errors. This podcast also offers fascinating insights from consultant anaesthetist John Carlisle about the detection of data fabrication, and president of the Royal Statistical Society David Spiegelhalter about the dangers of statistical malpractice.
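For readers who wonder what such a consistency check looks like, below is a rough Python sketch of the idea. It is not statcheck itself, which is an R package that extracts and checks several kinds of test statistics from papers. The sketch only handles a two-sided z test, and the function names and the way rounding is handled are our own simplifying assumptions.

```python
from statistics import NormalDist

_STD = NormalDist()


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2.0 * (1.0 - _STD.cdf(abs(z)))


def check_reported_result(z_reported, p_reported, z_decimals=2,
                          p_decimals=2, alpha=0.05):
    """Compare a reported p-value with the one recomputed from z.

    The reported z is itself rounded, so the true z lies in a small
    interval around it; the result counts as consistent when some p in
    the matching interval rounds to the reported p.
    """
    z_half = 0.5 * 10 ** (-z_decimals)
    p_half = 0.5 * 10 ** (-p_decimals)
    p_high = two_sided_p(max(abs(z_reported) - z_half, 0.0))
    p_low = two_sided_p(abs(z_reported) + z_half)
    consistent = (p_low - p_half) <= p_reported <= (p_high + p_half)
    p_recomputed = two_sided_p(z_reported)
    # A mismatch matters most when it flips the significance decision.
    decision_error = (not consistent) and (
        (p_reported < alpha) != (p_recomputed < alpha)
    )
    return {
        "recomputed_p": p_recomputed,
        "consistent": consistent,
        "decision_error": decision_error,
    }


# Hypothetical reported results: (z, p).
for z, p in [(1.96, 0.05), (2.20, 0.01), (1.70, 0.04)]:
    print(z, p, check_reported_result(z, p))
```

The third example shows why rounding errors are not always innocent: the reported p-value is below .05 while the recomputed one is not.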

Awarded a Campbell Methods Grant


We are honored to announce that Michèle Nuijten was awarded a $20,000 methods grant from the Campbell Collaboration, together with meta-analysis expert Joshua R. Polanin. They were awarded the grant for the project “Verifying the Accuracy of Statistical Significance Testing in Campbell Collaboration Systematic Reviews Through the Use of the R Package statcheck”. The grant is part of the Campbell Collaboration’s program to support innovative methods development in order to improve the quality of systematic reviews. For more information about the grant and the three other recipients, see their website here.

New Paper on “Bayesian evaluation of effect size after replicating an original study”


Title: Bayesian evaluation of effect size after replicating an original study

Authors: Robbie C. M. van Aert & Marcel A. L. M. van Assen

Published in: PLOS One


The vast majority of published results in the literature is statistically significant, which raises concerns about their reliability. The Reproducibility Project Psychology (RPP) and Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. The original study and replication were both statistically significant in 36.1% of the cases in RPP and 68.8% in EE-RP, suggesting many null effects among the replicated studies.

However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method called snapshot hybrid that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant.

We first analytically approximate the method’s performance, and demonstrate the necessity to control for the original study’s significance to enable the accumulation of evidence for a true zero effect. Then we applied the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the included studies in EE-RP are generally larger than in RPP, but that the sample sizes of especially the included studies in RPP are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of the replication akin to power analysis in null hypothesis significance testing and present an easy-to-use web application and R code for applying the method.
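The core idea of the abstract can be sketched in a few lines of Python. This is a simplified illustration and not the authors' implementation: it assumes correlations analysed on the Fisher z scale, equal prior probabilities for the four effect sizes, the conventional values 0, 0.1, 0.3 and 0.5 for zero, small, medium and large, and an original study that was significant in a one-tailed test with alpha = .025. All names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

# Assumed effect sizes (correlations) for the four hypotheses.
EFFECTS = {"zero": 0.0, "small": 0.1, "medium": 0.3, "large": 0.5}


def snapshot_sketch(r_orig, n_orig, r_rep, n_rep, alpha=0.025, adjust=True):
    """Posterior probabilities of four effect sizes, equal priors.

    With adjust=True the original study's likelihood is divided by the
    probability of a significant result, which conditions on the
    original study having been statistically significant.
    """
    z_o, z_r = atanh(r_orig), atanh(r_rep)
    se_o, se_r = 1.0 / sqrt(n_orig - 3), 1.0 / sqrt(n_rep - 3)
    critical = NormalDist().inv_cdf(1.0 - alpha) * se_o
    if adjust and z_o <= critical:
        raise ValueError("the original study must be statistically significant")

    likelihoods = {}
    for label, rho in EFFECTS.items():
        theta = atanh(rho)
        original = NormalDist(theta, se_o)
        density_o = original.pdf(z_o)
        if adjust:
            density_o /= 1.0 - original.cdf(critical)  # truncated density
        density_r = NormalDist(theta, se_r).pdf(z_r)
        likelihoods[label] = density_o * density_r

    total = sum(likelihoods.values())
    return {label: value / total for label, value in likelihoods.items()}


# Hypothetical example: significant original (r = .30, N = 50),
# larger replication that finds nothing (r = .00, N = 200).
print(snapshot_sketch(0.30, 50, 0.00, 200, adjust=True))
print(snapshot_sketch(0.30, 50, 0.00, 200, adjust=False))
```

In this made-up example the probability of a zero effect is clearly higher with the adjustment than without it, which illustrates why controlling for the original study's significance matters for accumulating evidence for a true zero effect.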


ERC Consolidator Grant for Jelte Wicherts


Jelte Wicherts has been awarded a prestigious 2 million euro Consolidator Grant from the European Research Council (ERC). With the money the meta-research group will be expanded with two postdocs and two PhD students.

The project is entitled IMPROVE: Innovative Methods for Psychology: Reproducible, Open, Valid, and Efficient and will start in the second half of 2017.

Postprint "Who Believes in the Storybook Image of the Scientist?"

A new manuscript by the Meta-Research group is in press at Accountability in Research; the primary author is Coosje Veldkamp. The final paper will be available Open Access, but in the meantime find the abstract below and the postprint on PsyArXiv. Abstract:

Do lay people and scientists themselves recognize that scientists are human and therefore prone to human fallibilities such as error, bias, and even dishonesty? In a series of three experimental studies and one correlational study (total N = 3,278) we found that the ‘storybook image of the scientist’ is pervasive: American lay people and scientists from over 60 countries attributed considerably more objectivity, rationality, open-mindedness, intelligence, integrity, and communality to scientists than other highly-educated people. Moreover, scientists perceived even larger differences than lay people did. Some groups of scientists also differentiated between different categories of scientists: established scientists attributed higher levels of the scientific traits to established scientists than to early-career scientists and PhD students, and higher levels to PhD students than to early-career scientists. Female scientists attributed considerably higher levels of the scientific traits to female scientists than to male scientists. A strong belief in the storybook image and the (human) tendency to attribute higher levels of desirable traits to people in one’s own group than to people in other groups may decrease scientists’ willingness to adopt recently proposed practices to reduce error, bias and dishonesty in science.

Leamer-Rosenthal Prize for statcheck

Michèle Nuijten and Sacha Epskamp are two of the nine winners of the 2016 Leamer-Rosenthal prize for Open Social Science for their work on statcheck. This prize is an initiative of the Berkeley Initiative for Transparency in the Social Sciences (BITSS), and comes with a prize of $10,000. They will receive their prize at the 2016 BITSS annual meeting, along with seven other researchers and educators.

Read more here.


Media Attention for `statcheck`

Lately there has been quite some media attention for statcheck. In a piece in Nature, Monya Baker has written a thorough and nuanced overview of statcheck and the PubPeer project of Chris Hartgerink, in which he scanned 50,000 papers and posted the statcheck results on the online forum PubPeer. In the Nature editorial this type of post-publication peer review is discussed. Some other interesting coverage of statcheck can be found here:

  • Buranyi, S. (2016). Scientists are worried about ‘peer review by algorithm’. Motherboard (VICE). URL
  • Resnick, B. (2016). A bot crawled thousands of studies looking for simple math errors. The results are concerning. Vox. URL
  • Kershner, K. (2016). Statcheck: when bots ‘correct’ academics. How Stuff Works. URL
  • Keulemans, M. (2016). Worden sociale wetenschappen geterroriseerd door jonge onderzoekers?: Oorlog onder psychologen. De Volkskrant. URL

New Paper on Researchers' Intuitions About Statistical Power

Our team member Marjan Bakker has just published a paper in Psychological Science, together with Chris Hartgerink, Jelte Wicherts and Han van der Maas. The abstract: Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers’ experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies.

The paper is available here (Open Access).
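To give a feel for the numbers behind a formal power analysis, here is a small Python sketch. It is our own illustration and not taken from the paper. It assumes a two-sided comparison of two independent groups of equal size and uses the normal approximation, so results can differ by a participant or two from an exact t-test calculation. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD = NormalDist()


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power to detect standardized mean difference d.

    Normal approximation for a two-sided test; the negligible chance of
    a significant result in the wrong direction is ignored.
    """
    z_crit = _STD.inv_cdf(1.0 - alpha / 2.0)
    return _STD.cdf(abs(d) * sqrt(n_per_group / 2.0) - z_crit)


def required_n_per_group(d, power=0.80, alpha=0.05):
    """Smallest per-group sample size that reaches the requested power."""
    z_crit = _STD.inv_cdf(1.0 - alpha / 2.0)
    z_power = _STD.inv_cdf(power)
    return ceil(2.0 * ((z_crit + z_power) / abs(d)) ** 2)


# Small, medium and large effects in Cohen's conventional terms.
for d in (0.2, 0.5, 0.8):
    print(d, required_n_per_group(d))

# Power of a design with 20 participants per group and a small effect.
print(approx_power(0.2, 20))
```

Under these assumptions a small effect (d = 0.2) needs close to 400 participants per group for .80 power, while 20 per group gives power of roughly .09.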

APS 2016 Chicago Presentations

The Meta-Research group was well represented at the APS conference in Chicago. As a recap, we have shared all our slides. Feel free to view them and let us know if you have any questions or suggestions! Where applicable, Open Science Framework links are included, which makes the presentations citable and preserves them.

The Psychology of Statistics and the Statistics of Psychology

Honesty and Trust in Psychology Research

How to Deal with Publication Bias in Psychology? Illustrations and Recommendations

To be added: presentation by Paulette Flore

Preprint of a New Paper Comparing p-curve and p-uniform

Our team member Robbie van Aert recently got his paper accepted for publication in Perspectives on Psychological Science, together with Jelte Wicherts and Marcel van Assen. The abstract: Because evidence of publication bias in psychology is overwhelming, it is important to develop techniques that correct meta-analytic estimates for publication bias. Van Assen, Van Aert, and Wicherts (2015) and Simonsohn, Nelson, and Simmons (2014a) developed p-uniform and p-curve, respectively. The methodology on which these methods are based has great promise for providing accurate meta-analytic estimates in the presence of publication bias. However, we show that in some situations p-curve behaves erratically while p-uniform may yield implausible negative effect size estimates. Moreover, we show that (and explain why) p-curve and p-uniform overestimate effect size under moderate to large heterogeneity, and may yield unpredictable bias when researchers employ p-hacking. We offer hands-on recommendations on applying and interpreting results of meta-analysis in general and p-uniform and p-curve in particular. Both methods as well as traditional methods are applied to a meta-analysis on the effect of weight on judgments of importance. We offer guidance for applying p-uniform or p-curve using R and a user-friendly web application for applying p-uniform.

An interesting read for anyone using these methods or interested in applying these methods! The paper will be published in a special issue on Methods and Practices.

Download the preprint here.