The blog post describes the problem of combining the results of a replication study with those of the originally published experiment, and explains van Aert and van Assen's hybrid meta-analysis method, which addresses this problem.
On Saturday, June 30, Michèle was interviewed about her dissertation for the Dutch radio show “Dr Kelder & Co” on NPO Radio 1. The main takeaways: scientists are also just people, psychology is heading in the right direction, and trains don’t always do what you want.
Listen to the whole interview (in Dutch) here.
Recently, our three team members Amir M. Abdol, Leonie van Grootel, and Michèle Nuijten successfully defended their dissertations and obtained a PhD.
On May 8th, 2018, Amir defended his thesis "Reverse Engineering Spatiotemporal Gene Regulatory Networks of the Nematostella vectensis". In his thesis, he analyzed different types of gene expression datasets using machine learning algorithms, and studied and developed new models of gene regulatory networks by combining stochastic and deterministic processes.
On May 25th, Leonie defended her thesis “Where No Reviewer Has Gone Before: Exploring the Potential of Mixed Studies Reviewing”. In her thesis, she aimed to further develop ways to include both quantitative and qualitative evidence in a mixed studies review. She recommends abandoning the assumed distinction between quantitative and qualitative evidence in reviewing, so that the type of model being analyzed, rather than the type of data, determines the choice of review method.
On May 30th, Michèle defended her thesis “Research on Research: A Meta-Scientific Study of Problems and Solutions in Psychological Science”. Her thesis consisted of two related research lines. In the first part, she investigated statistical reporting inconsistencies: p-values that don’t match their accompanying test statistic and degrees of freedom. In the second part, she focused on publication bias and related problems, and how these can lead to overestimated effects. Her suggested solution for the observed problems is to invest more in meta-science.
Title: Multistep estimators of the between‐study variance: The relationship with the Paule‐Mandel estimator
Authors: Robbie C. M. van Aert & Dan Jackson
Published in: Statistics in Medicine
A wide variety of estimators of the between‐study variance are available in random‐effects meta‐analysis. Many, but not all, of these estimators are based on the method of moments. The DerSimonian‐Laird estimator is widely used in applications, but the Paule‐Mandel estimator is an alternative that is now recommended. Recently, DerSimonian and Kacker have developed two‐step moment‐based estimators of the between‐study variance. We extend these two‐step estimators so that multiple (more than two) steps are used. We establish the surprising result that the multistep estimator tends towards the Paule‐Mandel estimator as the number of steps becomes large. Hence, the iterative scheme underlying our new multistep estimator provides a hitherto unknown relationship between two‐step estimators and the Paule‐Mandel estimator. Our analysis suggests that two‐step estimators are not necessarily distinct estimators in their own right; instead, they are quantities that are closely related to the usual iterative scheme that is used to calculate the Paule‐Mandel estimate. The relationship that we establish between the multistep and Paule‐Mandel estimators is another justification for the use of the latter estimator. Two‐step and multistep estimators are perhaps best conceptualized as approximate Paule‐Mandel estimators.
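To make the relationship concrete, here is a minimal numerical sketch (not the authors' code): iterating the DerSimonian-Kacker moment update, starting from the DerSimonian-Laird value, approaches the Paule-Mandel estimate, which solves Q(τ²) = k − 1. The effect estimates and within-study variances below are hypothetical.

```python
def weighted_mean(y, w):
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def q_stat(y, v, tau2):
    """Generalized Q statistic with weights 1 / (v_i + tau2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    m = weighted_mean(y, w)
    return sum(wi * (yi - m) ** 2 for wi, yi in zip(w, y))

def moment_update(y, v, tau2):
    """One DerSimonian-Kacker step: a method-of-moments estimate of the
    between-study variance computed with weights 1 / (v_i + tau2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    s1 = sum(w)
    s2 = sum(wi ** 2 for wi in w)
    a = (sum(wi * vi for wi, vi in zip(w, v))
         - sum(wi ** 2 * vi for wi, vi in zip(w, v)) / s1)
    return max(0.0, (q_stat(y, v, tau2) - a) / (s1 - s2 / s1))

def multistep(y, v, steps):
    """Start at tau2 = 0, so step 1 is the DerSimonian-Laird estimate."""
    tau2 = 0.0
    for _ in range(steps):
        tau2 = moment_update(y, v, tau2)
    return tau2

def paule_mandel(y, v, lo=0.0, hi=10.0, tol=1e-10):
    """Solve Q(tau2) = k - 1 by bisection (Q is decreasing in tau2)."""
    k = len(y)
    if q_stat(y, v, 0.0) <= k - 1:
        return 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if q_stat(y, v, mid) > k - 1 else (lo, mid)
    return (lo + hi) / 2

y = [0.10, 0.30, 0.50, 0.80, 0.20]  # hypothetical study effect estimates
v = [0.04, 0.05, 0.03, 0.06, 0.02]  # hypothetical within-study variances
print(multistep(y, v, 1))   # DerSimonian-Laird
print(multistep(y, v, 2))   # two-step estimator
print(multistep(y, v, 50))  # multistep: close to the Paule-Mandel value
print(paule_mandel(y, v))
```

Running this shows the multistep values settling onto the Paule-Mandel estimate after a handful of steps, which is exactly the convergence result the abstract describes.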
This week the Guardian's Science Weekly podcast focuses on statistical malpractice and fraud in science. Michèle talks about the role of statcheck in detecting statistical inconsistencies, and discusses the causes and implications of seemingly innocent rounding errors. The podcast also offers fascinating insights from consultant anaesthetist John Carlisle about the detection of data fabrication, and from president of the Royal Statistical Society David Spiegelhalter about the dangers of statistical malpractice.
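For readers curious how such inconsistencies are caught, the check that statcheck automates can be sketched in a few lines: recompute the p-value from the reported test statistic and compare it with the reported p-value. This simplified example handles only z-tests (the actual package also parses t, F, χ², and r results from article text), and the reported numbers are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_from_z(z):
    """Two-sided p-value for a reported z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def check_report(z, reported_p, tol=0.005):
    """Flag a reported result as inconsistent when the recomputed p-value
    differs from the reported one by more than rounding could explain."""
    recomputed = two_sided_p_from_z(z)
    return abs(recomputed - reported_p) <= tol, recomputed

# A paper reports "z = 2.20, p = .03": recomputation gives p ~ .028,
# consistent with rounding. Reporting "p = .01" would be flagged.
print(check_report(2.20, 0.03))
print(check_report(2.20, 0.01))
```

A real rounding check is a little subtler (it compares against the rounding interval of the reported value), but the core idea is just this recomputation.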
We are honored to announce that Michèle Nuijten was awarded a $20,000 methods grant from the Campbell Collaboration, together with meta-analysis expert Joshua R. Polanin. They were awarded the grant for the project “Verifying the Accuracy of Statistical Significance Testing in Campbell Collaboration Systematic Reviews Through the Use of the R Package statcheck”. The grant is part of the Campbell Collaboration’s program to support innovative methods development in order to improve the quality of systematic reviews. For more information about the grant and the three other recipients, see their website here.
Title: Bayesian evaluation of effect size after replicating an original study
Authors: Robbie C. M. van Aert & Marcel A. L. M. van Assen
Published in: PLOS One
The vast majority of published results in the literature is statistically significant, which raises concerns about their reliability. The Reproducibility Project Psychology (RPP) and Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. Both the original study and its replication were statistically significant in only 36.1% of cases in RPP and 68.8% in EE-RP, suggesting many null effects among the replicated studies.
However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method called snapshot hybrid that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant.
We first analytically approximate the method's performance and demonstrate the necessity of controlling for the original study's significance to enable the accumulation of evidence for a true zero effect. We then apply the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the included studies in EE-RP are generally larger than in RPP, but that the sample sizes, especially of the studies included in RPP, are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of the replication, akin to power analysis in null hypothesis significance testing, and present an easy-to-use web application (https://rvanaert.shinyapps.io/snapshot/) and R code for applying the method.
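The "snapshot" part of the idea can be sketched in a few lines. This is a deliberately simplified illustration, not the published method: it computes posterior model probabilities for four fixed effect sizes from a single replication estimate, assuming equal prior probabilities and normal likelihoods, and it omits the adjustment for the original study's significance that the full hybrid method applies. The estimate and standard error are hypothetical.

```python
from statistics import NormalDist

def snapshot_probs(estimate, se, effects=(0.0, 0.2, 0.5, 0.8)):
    """Posterior model probabilities for a zero, small, medium, and large
    effect, given one unbiased estimate, equal priors, and normal likelihoods."""
    likes = [NormalDist(d, se).pdf(estimate) for d in effects]
    total = sum(likes)
    return {d: like / total for d, like in zip(effects, likes)}

# Hypothetical replication: estimated effect 0.25 with standard error 0.10.
probs = snapshot_probs(0.25, 0.10)
for d, p in probs.items():
    print(f"effect {d}: posterior probability {p:.3f}")
```

With these numbers, most of the posterior mass lands on the small effect, which is the kind of "snapshot" of the evidence the method is designed to give; the full method additionally corrects the original study's likelihood for selection on significance.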
Today's long read in the Guardian, by Stephen Buranyi, features our work at the Meta-Research Center. Specifically, it focuses on the work of Chris Hartgerink and Marcel van Assen on detecting fabricated data, and on how the development and use of statcheck played a role in their research. Read the full article here.
Jelte Wicherts has been awarded a prestigious 2 million euro Consolidator Grant from the European Research Council (ERC). With this funding, the meta-research group will be expanded with two postdocs and two PhD students.
The project is entitled “IMPROVE: Innovative Methods for Psychology: Reproducible, Open, Valid, and Efficient” and will start in the second half of 2017.
A new manuscript by the Meta-Research group, with Coosje Veldkamp as primary author, is in press at Accountability in Research. The final paper will be available Open Access; in the meantime, find the abstract below and the postprint on PsyArXiv. Abstract:
Do lay people and scientists themselves recognize that scientists are human and therefore prone to human fallibilities such as error, bias, and even dishonesty? In a series of three experimental studies and one correlational study (total N = 3,278) we found that the ‘storybook image of the scientist’ is pervasive: American lay people and scientists from over 60 countries attributed considerably more objectivity, rationality, open-mindedness, intelligence, integrity, and communality to scientists than other highly-educated people. Moreover, scientists perceived even larger differences than lay people did. Some groups of scientists also differentiated between different categories of scientists: established scientists attributed higher levels of the scientific traits to established scientists than to early-career scientists and PhD students, and higher levels to PhD students than to early-career scientists. Female scientists attributed considerably higher levels of the scientific traits to female scientists than to male scientists. A strong belief in the storybook image and the (human) tendency to attribute higher levels of desirable traits to people in one’s own group than to people in other groups may decrease scientists’ willingness to adopt recently proposed practices to reduce error, bias and dishonesty in science.
Michèle Nuijten and Sacha Epskamp are two of the nine winners of the 2016 Leamer-Rosenthal Prize for Open Social Science for their work on statcheck. This prize is an initiative of the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and comes with $10,000. They will receive their prize at the 2016 BITSS annual meeting, along with seven other researchers and educators.
Read more here.
Lately there has been considerable media attention for statcheck. In a piece in Nature, Monya Baker has written a thorough and nuanced overview of statcheck and the PubPeer project of Chris Hartgerink, in which he scanned 50,000 papers and posted the statcheck results on the online forum PubPeer. The Nature editorial discusses this type of post-publication peer review. Some other interesting coverage of statcheck can be found here:
- Buranyi, S. (2016). Scientists are worried about `peer review by algorithm’. Motherboard (VICE). URL
- Resnick, B. (2016). A bot crawled thousands of studies looking for simple math errors. The results are concerning. Vox. URL
- Kershner, K. (2016). Statcheck: when bots `correct’ academics. How Stuff Works. URL
- Keulemans, M. (2016). Worden sociale wetenschappen geterroriseerd door jonge onderzoekers?: Oorlog onder psychologen [Are the social sciences being terrorized by young researchers? War among psychologists]. De Volkskrant. URL
Our team member Marjan Bakker has just published a paper in Psychological Science, together with Chris Hartgerink, Jelte Wicherts and Han van der Maas. The abstract: Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers’ experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies.
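The gap between intuition and formal power analysis that the paper documents is easy to demonstrate. The sketch below is not the authors' analysis code; it uses the standard normal approximation for a two-sided two-sample test (rather than the exact noncentral-t calculation), and the cell size of 20 is a hypothetical "typical" value.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test for effect size d,
    via the normal approximation (the negligible opposite tail is ignored)."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return 1 - Z.cdf(z_crit - d * sqrt(n_per_group / 2))

def n_for_power(d, power=0.80, alpha=0.05):
    """Sample size per group needed for a given power (normal approximation)."""
    z_a, z_b = Z.inv_cdf(1 - alpha / 2), Z.inv_cdf(power)
    return ceil(2 * ((z_a + z_b) / d) ** 2)

# With a typical cell size of 20, a small effect (d = 0.2) yields power
# far below the conventional .80 target ...
print(round(power_two_sample(0.2, 20), 2))
# ... and reaching .80 power requires roughly 400 subjects per group.
print(n_for_power(0.2))
```

Numbers like these are exactly why the authors recommend conducting and reporting a formal power analysis rather than relying on rules of thumb.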
The paper is available here (Open Access).
The Meta-Research group was well represented at the APS conference in Chicago. As a recap, we have shared all our slides. Feel free to view them and let us know if you have any questions or suggestions! Where applicable, Open Science Framework links are included, which makes the presentations citable and preserves them.
The Psychology of Statistics and the Statistics of Psychology
- Estimating the reproducibility of psychological science: accounting for the statistical significance of the original study Marcel van Assen (https://osf.io/58xqt/)
- Flawed Intuitions about power Marjan Bakker (https://osf.io/dztjs)
Honesty and Trust in Psychology Research
- The Storybook Image of the Scientist Coosje Veldkamp
- Why do so many researchers misreport p-values? Jelte Wicherts
- How do researchers fabricate data and how to detect fabrication? Chris Hartgerink (https://osf.io/ucfpv/)
How to Deal with Publication Bias in Psychology? Illustrations and Recommendations
- (title to be added) Paulette Flore
- Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods Hilde Augusteijn
- Publication Bias in IQ Research Michèle Nuijten
- Conducting meta-analyses based on p-values: Reservations and recommendations for applying p-uniform and p-curve Robbie van Aert (https://osf.io/8rtmz/)
Our team member Robbie van Aert recently got his paper accepted for publication in Perspectives on Psychological Science, together with Jelte Wicherts and Marcel van Assen. The abstract: Because evidence of publication bias in psychology is overwhelming, it is important to develop techniques that correct meta-analytic estimates for publication bias. Van Assen, Van Aert, and Wicherts (2015) and Simonsohn, Nelson, and Simmons (2014a) developed p-uniform and p-curve, respectively. The methodology on which these methods are based has great promise for providing accurate meta-analytic estimates in the presence of publication bias. However, we show that in some situations p-curve behaves erratically while p-uniform may yield implausible negative effect size estimates. Moreover, we show that (and explain why) p-curve and p-uniform overestimate effect size under moderate to large heterogeneity, and may yield unpredictable bias when researchers employ p-hacking. We offer hands-on recommendations on applying and interpreting results of meta-analysis in general and p-uniform and p-curve in particular. Both methods as well as traditional methods are applied to a meta-analysis on the effect of weight on judgments of importance. We offer guidance for applying p-uniform or p-curve using R and a user-friendly web application for applying p-uniform (https://rvanaert.shinyapps.io/p-uniform).
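The principle behind p-uniform — that p-values conditional on statistical significance are uniformly distributed at the true effect size — can be sketched numerically. The sketch below is not the published estimator: it assumes a common (fixed) effect, uses a simple moment condition (the conditional probabilities should average 1/2) in place of the estimators examined in the paper, and runs on hypothetical data.

```python
from statistics import NormalDist

Z = NormalDist()
Z_CRIT = Z.inv_cdf(0.975)  # significance cutoff, alpha = .05 two-sided

def conditional_probs(estimates, ses, d):
    """For studies selected on significance (estimate / se > Z_CRIT), the
    probability of an estimate at least as large, conditional on being
    significant, if the common true effect is d. At the true effect these
    probabilities are uniformly distributed on (0, 1)."""
    return [(1 - Z.cdf((y - d) / s)) / (1 - Z.cdf(Z_CRIT - d / s))
            for y, s in zip(estimates, ses)]

def p_uniform_estimate(estimates, ses, lo=-1.0, hi=2.0, tol=1e-8):
    """Bisect for the effect size at which the conditional probabilities
    average 1/2 (the average is increasing in d)."""
    def mean_q(d):
        qs = conditional_probs(estimates, ses, d)
        return sum(qs) / len(qs)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mean_q(mid) < 0.5 else (lo, mid)
    return (lo + hi) / 2

# Hypothetical significant studies: effect estimates and standard errors.
est = [0.55, 0.45, 0.62, 0.50]
se = [0.20, 0.21, 0.25, 0.22]
print(round(p_uniform_estimate(est, se), 2))  # selection-corrected estimate
```

Note that the corrected estimate comes out well below the naive average of the observed (all-significant) estimates, which is the selection-bias correction these methods are designed to make; the paper's warnings about heterogeneity and p-hacking apply to this conditional-uniformity logic itself.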
An interesting read for anyone using these methods or interested in applying these methods! The paper will be published in a special issue on Methods and Practices.