Multi-Stage Registration: A Better* Way to Pre-Register

This blogpost was written by Cas Goos. Cas's PhD project focuses on studying and enhancing various interventions for improving scientific robustness at the journal level, and is supervised by Michèle Nuijten and Jelte Wicherts.

Pre-registration is a well-known way for researchers to limit their ability to opportunistically analyze their data. I believe an alternative way to register can prove more versatile and realistic: multi-stage registration, where initial registrations can be updated transparently with minimal risk of bias.

The story of Reggie
While registration has been a great step in the right direction, in practice things don’t always go as planned, and the researcher might by necessity have to deviate from the initial registration. But with each deviation, the pre-registration’s guarantee that data were not analyzed opportunistically becomes less and less convincing.

To illustrate, let’s say a researcher Reggie wants to test if:

Mindfulness-based therapy’s effectiveness at reducing burnout is moderated by trait mindfulness.

Wanting to do things right, he pre-registers his hypotheses, experimental design, sample size of 100, and the moderation analysis. However, after Reggie completes his study and submits his article for review, a diligent reviewer notices some key differences from his pre-registration:

  1. The sample size was larger than planned.

  2. Several participants were removed.

  3. Simple effects comparisons were added to further examine the statistically significant interaction effect.

Are the deviations cause for alarm? Will they ruin Reggie’s study as a confirmatory test? Well, that depends. First, on the extent of the deviation. Was the new sample size 104 or 140? Were 4 or 40 participants removed afterwards? Are the simple effects treated as confirmatory or not?

But perhaps even more important is when and why the deviation occurred. Were participants added before or after initial results were observed to be non-significant? Were participants removed because of invalid data patterns spotted before any analyses, or as “outliers” after initial results were not as expected? And were the simple-effects analyses included before or after Reggie knew the moderation was statistically significant?

Reggie, wanting to do good, reports all his deviations. Still, he can kiss the label “confirmatory test” goodbye. After all, if anyone could deviate as much as desired and declare it afterwards, pre-registrations wouldn’t limit opportunistic data manipulation or analyses. In Reggie’s case, though, both the additional data collection and the data removal happened before any analyses, and he would argue that neither was opportunistic. Unfortunately, people can only take his word for it, as standard pre-registration offers no way to transparently document the when, why, and how of deviations. Multi-stage registration does.

What is multi-stage registration?
Multi-stage registration, just like pre-registration, starts with a registration before the study is conducted. The researcher may leave parts open if those are challenging to predict without seeing the data first. The researcher then conducts their research in stages: for example, starting with data collection, then data cleaning, then hypothesis testing, and finally exploratory analyses. After completing a stage, the researcher may revise the registration of subsequent stages based on what they can and should do with their data.

Multi-stage registration is comparable to incremental pre-registration as proposed elsewhere (Lindsay, Simons, & Lilienfeld, 2016; Waldron & Allen, 2022; Section 2.13 of the PCI Guide for Authors). However, the focus of incremental pre-registration lies on follow-up studies to an initially pre-registered study. Furthermore, to my knowledge, incremental pre-registration has few guiding principles on how to conduct it.

Regardless, you may still have concerns about multi-stage registration, after all…

Adjusting your analyses based on what you find in the data, that’s exactly what pre-registration is supposed to prevent!

That is a realistic risk, which is why the following guidelines are important:

  1. Register the project’s content (writings, code, (meta)data, supplementary materials), before you start a stage.

  2. Document each deviation from the registration in a publicly available logbook as you complete each stage.

  3. If an analysis or test is meant as confirmatory, then the information available before registering the final version of that analysis or test should not be enough to predict the outcome reliably. (This is unfortunately a subjective judgement call, so it is important to explain your reasoning to your readers, so that they can judge whether they agree.)

Hence, in our example, Reggie can start registration just as before. Then, after collecting 40 more participants than planned and removing 4 participants, he can record these deviations from his initial registration, documenting how, why, and when, all before registering his analysis for the final stage. The moderation analysis in that stage remains confirmatory, as long as all deviations from the initial registration up to that point were documented before results were known. The simple-effects analyses, however, were added after the moderation results were known and should therefore, as with pre-registration, be reported as exploratory.
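To make the logic of Reggie’s example concrete, here is a minimal sketch of what a logbook could look like as a data structure. Everything in it is an assumption for illustration: the names `LogEntry` and `label_analysis`, the dates, and the stated reasons are all hypothetical, and the simple rule it applies is no substitute for the judgement call described in guideline 3.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class LogEntry:
    """One logbook entry: the what, why, and when of a deviation."""
    stage: str            # stage during which the deviation was made
    what: str             # what changed relative to the registration
    why: str              # reason for the change
    when: date            # date the deviation was logged
    affects: str          # analysis the deviation bears on
    results_known: bool   # were relevant results already seen?


def label_analysis(analysis: str, logbook: List[LogEntry]) -> str:
    """Label an analysis 'exploratory' if any logged deviation affecting it
    was made after relevant results were known, else 'confirmatory'."""
    for entry in logbook:
        if entry.affects == analysis and entry.results_known:
            return "exploratory"
    return "confirmatory"


# Reggie's logbook. Dates and reasons are invented for illustration only.
logbook = [
    LogEntry("data collection", "collected 40 more participants than planned",
             "hypothetical reason: more sign-ups than expected",
             date(2025, 1, 10), affects="moderation", results_known=False),
    LogEntry("data cleaning", "removed 4 participants",
             "hypothetical reason: invalid response patterns",
             date(2025, 1, 20), affects="moderation", results_known=False),
    LogEntry("hypothesis testing", "added simple-effects comparisons",
             "hypothetical reason: follow-up on the interaction",
             date(2025, 2, 5), affects="simple effects", results_known=True),
]

for name in ("moderation", "simple effects"):
    print(name, "->", label_analysis(name, logbook))
```

In this sketch the moderation analysis stays confirmatory because both deviations that affect it were logged before results were known, while the simple-effects comparisons are labelled exploratory.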

Still, a critical reader could point out that someone can claim results were not known when they deviated, even if that is not true. However, I don’t believe that any registration’s goal is to make it impossible to lie or make a mistake. Instead, more realistic aims for registration are to make failing to declare deviations run directly counter to the chosen workflow, to sufficiently address the consequences of these deviations, and to allow others to check this. If no deviations occur, pre-registration and multi-stage registration work essentially the same. When deviations do occur, however, pre-registration offers no systematic way to document and manage them. By maintaining a logbook of the when and why of every deviation, multi-stage registration makes deviating at the correct stage an integral part of your workflow, and the logbook allows others to check your work.

I am game, so how do I start?
If you are interested, you may still be wondering how to conduct multi-stage registration, especially in a world where pre-registration (if any registration at all) is what journals, reviewers, and readers expect.

My advice: when in Rome, do as the Romans do.

If people expect pre-registration as part of a transparent research process, then pre-registration is what they will get. In practice this means that your first-stage registration will function as your pre-registration too. Therefore, you should also make sure to register the first stage on its own on a trusted third-party platform for pre-registration like the OSF, and make sure that it meets relevant pre-registration standards. You can then share your logbook alongside this registration as a supplement. This way you also showcase multi-stage registration within your pre-registration to those interested.

That covers satisfying the pre-registration demand, but how do you set up a multi-stage registration workflow and reap its unique benefits, when registries like the OSF, Zenodo, and clinicaltrials.gov consider updates to the registration an exception rather than the rule?

Currently, I believe we can use version control systems (VCSs) like Git for multi-stage registration. VCSs are often used in software development to maintain multiple (earlier) versions of software, track changes across these versions, document each version, and more. But VCSs are not limited to software; they can work for research projects too. To make this more intuitive, think of collaboratively writing an article. Instead of fumbling through documents like article_final_version.docx, article_final_final_version.docx, article_definitely_final_version.docx, and so on, a VCS provides an automated system that stores all the versions of every file in a research project, with the changes across versions marked (like Track Changes in Word). VCSs are especially suitable for multi-stage registration because:

  1. Old versions of project files can be stored. Different versions can even be labelled as “releases”, which we can use to label stages.

  2. Changes across versions are tracked.

  3. On platforms like GitHub, researchers can publicly share all their project files in a single repository.

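  4. Finally, and most importantly, each update to a repository is timestamped and kept in the project history, so in normal use you cannot quietly go back and change what you had registered. Rewriting that history is technically possible, but it takes deliberate effort and tends to be noticeable to anyone who already holds a copy of the repository.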

Below is a diagram overview of the VCS registration workflow:

Figure 1. Diagram of a multi-stage registration implemented through a VCS. After each contribution to the project, the changed files are uploaded to the VCS servers with all changes tracked, timestamped, and accompanied by a logbook for explanation. The start of each stage is indicated by a release label.
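As a rough illustration of the workflow in Figure 1, the sketch below lists the Git commands one might run at the start of a stage: snapshot all project files (including the logbook), label the snapshot, and upload it. It is a sketch under assumptions rather than a prescribed procedure. The function names, the stage label, and the file name LOGBOOK.md are hypothetical, and the stage is labelled with an annotated Git tag, which is what a GitHub release is built on. By default the script only prints the commands; setting `dry_run=False` would run them in the current Git repository.

```python
import shlex
import subprocess
from typing import List


def stage_commands(stage_label: str, log_message: str) -> List[List[str]]:
    """Build the Git commands that snapshot the project and label a stage."""
    return [
        ["git", "add", "--all"],                               # stage every changed file
        ["git", "commit", "-m", log_message],                  # timestamped snapshot
        ["git", "tag", "-a", stage_label, "-m", log_message],  # annotated stage label
        ["git", "push", "--follow-tags"],                      # upload commits and the tag
    ]


def register_stage(stage_label: str, log_message: str,
                   dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run) or execute them in the current repository."""
    commands = stage_commands(stage_label, log_message)
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing command
    return commands


# Hypothetical stage label and message; LOGBOOK.md is an assumed file name.
commands = register_stage(
    "stage-2-data-cleaning",
    "Start of stage 2: data cleaning (deviations documented in LOGBOOK.md)",
)
```

Keeping the command list separate from its execution makes it easy to inspect what would be registered before anything is uploaded.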

If you want to know how to implement Git in your workflow, you can look at the following page: https://happygitwithr.com/, or use the WORCS R Package, created by a member of our group: https://cjvanlissa.github.io/worcs/ to make reproducible R projects maintained with Git.

Finally, just like pre-registration, multi-stage registration takes time to learn and implement. But in the long run, it will not only make your research more transparent: tracked changes and a logbook can also be a lifesaver when you come back to a project.

The asterisk in the title
Throughout this blogpost, I have proposed multi-stage registration as a “better” alternative to pre-registration. But these are uncharted waters. To my knowledge, there is little to no evidence about the effectiveness of multi-stage registration or anything similar. I am, however, confident that deviations are almost inevitable in practice and that pre-registration provides little aid in navigating deviations within confirmatory research. It remains to be seen whether multi-stage registration will be more helpful without succumbing to a host of new problems. We will only find out, though, if enough of us try multi-stage registration. I know I will.

Acknowledgements
I want to give my thanks to Marcel van Assen and Ben Kretzler for their comments on the initial draft of this blogpost, and to my supervisors Michèle Nuijten and Jelte Wicherts for eliciting my interest in the topic.

References

Lindsay, D. S., Simons, D. J., & Lilienfeld, S. O. (2016). Research Preregistration 101. APS Observer, 29. https://www.psychologicalscience.org/observer/research-preregistration-101

Peer Community In. (n.d.). Guide for Authors. Retrieved April 8, 2025, from https://rr.peercommunityin.org/PCIRegisteredReports/help/guide_for_authors#h_22556996329061613309583773

Waldron, S., & Allen, C. (2022). Not all pre-registrations are equal. Neuropsychopharmacology, 47, 2181–2183. https://doi.org/10.1038/s41386-022-01418-x