Saturday, 16 November 2013

Bringing study pre-registration home to roost

Earlier this year I committed my research group to pre-registering all studies in our recent BBSRC grant, which includes fMRI, TMS and TMS-fMRI studies of human cognitive control. We will also publicly share our raw data and analysis scripts, consistent with the principles of open science. As part of this commitment I’m glad to report that we have just published our first pre-registered study protocol at the Open Science Framework.

For those unfamiliar with study pre-registration, the rationale is simply this: to prevent different forms of human bias from creeping into hypothesis testing, we need to decide before starting our research what our hypotheses are and how we plan to test them. The best way to achieve this is to publicly state the research questions, hypotheses, outcome measures, and planned analyses in advance, accepting that anything we add or change after inspecting our data is by definition exploratory rather than pre-planned.

To many scientists (and non-scientists) this may seem like the bleeding obvious, but the truth is that the life sciences are suffering a crisis in which research that is purely exploratory and non-hypothesis-driven masquerades as hypothetico-deductive. That’s not to say that confirmatory (hypothesis-driven) research is necessarily worth any more than exploratory (non-hypothesis-driven) research. The point is that we need to be able to distinguish one from the other, otherwise we build a false certainty in the theories we produce. Psychology and cognitive neuroscience are woeful at making this distinction clear, in part because they ascribe such a low priority to purely exploratory research.

Pre-registration helps solve a number of specific problems inherent in our publishing culture, including p-hacking (mining data covertly for statistical significance) and HARKing (reinventing hypotheses to predict unexpected results). These practices are common in psychology because it is difficult to publish anything in ‘top journals’ where the main outcome is p > .05 or the study isn’t based on a clear hypothesis.

Evidence of such practices can be found in the literature and all around us. Just last week at the Society for Neuroscience conference in San Diego, I had at least three conversations where presenters at posters would say something like: “Look at this cool effect. We tested 8 subjects and it looked interesting so we added another 8 and it became significant”. Violation of stopping rules is just one example of how we think like Bayesians while being tied to frequentist statistical methods that don’t allow us to do so. This bad marriage between thought and action endangers our ability to draw unbiased inferences and, without appropriate Type I correction, elevates the rate of false discoveries.
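The cost of this kind of data peeking is easy to demonstrate with a short simulation (a hypothetical sketch, not from the post: the sample sizes mirror the poster anecdote, and a simple known-variance z-test stands in for whatever test the presenters actually used). Even when no true effect exists, testing after 8 subjects and then again after adding 8 more pushes the false positive rate well above the nominal 5%:

```python
# Hypothetical illustration: two-stage "peek, then add subjects" testing
# inflates the false positive rate even when the null hypothesis is true.
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a z statistic (known-variance normal test)."""
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))  # normal CDF
    return 2.0 * (1.0 - phi)

def false_positive_rate(n_sims=20000, n1=8, n2=16, alpha=0.05, seed=1):
    """Simulate null data; test at n1 subjects, and if non-significant,
    add more subjects and test again at n2."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        data = [rng.gauss(0.0, 1.0) for _ in range(n2)]  # true effect is zero
        z1 = sum(data[:n1]) / math.sqrt(n1)              # z = mean * sqrt(n)
        if two_sided_p(z1) < alpha:
            rejections += 1                              # "significant" at n1
            continue
        z2 = sum(data) / math.sqrt(n2)                   # peek again at n2
        if two_sided_p(z2) < alpha:
            rejections += 1
    return rejections / n_sims

if __name__ == "__main__":
    rate = false_positive_rate()
    print(f"Nominal alpha: 0.05; realised false positive rate: {rate:.3f}")
```

With these settings the realised rate comes out at roughly 8% rather than the nominal 5%, which is why stopping rules have to be fixed in advance under frequentist testing.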

In May, the journal Cortex launched a new format of article that attempts to solve these problems by incentivising pre-registration. Unlike conventional publishing models, Registered Reports are peer reviewed before authors conduct their experiments and the journal offers provisional acceptance of final papers based solely on the proposed protocol. The model at Cortex not only prevents p-hacking and HARKing – it also solves problems caused by low statistical power, lack of data transparency, and publication bias. Similar initiatives have been launched or approved by several other journals, including Perspectives on Psychological Science, Attention, Perception, & Psychophysics, and Experimental Psychology. I’m glad to say that 10 other journals are currently considering similar formats, and so far no journal to my knowledge has decided against offering pre-registration.

In June, I wrote an open letter to the Guardian with Marcus Munafò and >80 of our colleagues who sit on editorial boards. Together we called for all journals in the life sciences to offer pre-registered article formats. The response to the article was overall neutral or positive, but as expected not everyone agreed. One of the most striking features of the negative responses to pre-registration was how the critics targeted a version of pre-registration we did not propose. For instance, some felt that the Cortex model would prevent publication of serendipitous findings or exploratory analyses (it doesn’t), that authors would be “locked” into publishing with Cortex (they aren’t), or that the model we proposed was suggested as mandatory or universal (it is explicitly neither). I would ask those who responded negatively to reconsider the details of the Cortex initiative because we don’t disagree nearly as much as it seems. In regular seminars I give on Registered Reports at Cortex I include a 19-point list of FAQs and responses to these points, which you can read here. I will regularly update this link as new FAQs are added.

I believe we are in the early stages of a revolution in the way we do research – one not driven by pre-registration per se, and certainly not by me, but by the combination of converging future-oriented approaches, including emphasis on replication (and replicability), open science, open access publishing, and pre-registration. The pace of evolution in scientific practices has shifted up a gear. Clause 35 of the revised Declaration of Helsinki now explicitly requires some form of study pre-registration for medical research involving human participants. Although much work in psychology and cognitive neuroscience isn’t classed as ‘medical’, many of the major journals that publish basic research also ask authors to adhere to the Declaration, including the Journal of Neuroscience, Cerebral Cortex, and Psychological Science.

The revised Declaration of Helsinki has caused some concern among psychologists, and I should make it clear that those of us promoting pre-registration as a new option for journals had no role in formulating these revised ethical guidelines. However, we shouldn’t necessarily see them as a problem. There are many simple and non-bureaucratic ways to pre-register research (such as the OSF), even if the journal-based route is the only one to reward authors with advance publication.

One valid point that has been made in this debate is that those of us who are promoting pre-registration should practice what we preach, even when there is no journal option currently available (and for me there isn’t another option because Cortex – where I am section editor – is so far the only cognitive neuroscience journal offering pre-registered articles). Some researchers, such as Marcus Munafò, already pre-register on a routine basis and have done for some time. For my group it is a newer venture, and here is our first attempt. Our protocol describes an fMRI experiment of response inhibition and action updating that forms the jumping-off point for several upcoming studies involving TMS and concurrent TMS-fMRI. We are registering this protocol prior to data collection. All comments and criticisms are welcome.

Writing a protocol for an fMRI experiment was challenging because it required us to nail down in advance our decisions and contingencies at all stages of the analysis. The sheer number of seemingly arbitrary decisions also reinforced my belief that many, if not most, fMRI studies are contaminated by bias (whether conscious or unconscious) and undisclosed analytic flexibility. I found pre-registration rewarding because it helped us refine exactly how we would go about answering our research questions. There is much to be said for taking the time to prepare science carefully, and time spent now will be time saved when it comes to the analysis phase.

Most of the work in our first pre-registration was undertaken by two extremely talented young scientists in my team: PhD student Leah Maizey and post-doctoral researcher Chris Allen. Leah and Chris deserve much praise for having the courage and conviction to take on this initiative while many of our senior colleagues 'wait and see'.

Pre-registration is now a normal part of the culture in my lab and I hope you’ll consider making it a part of yours too. Embracing the hypothetico-deductive method helps protect the outcome of hypothesis-driven research from our inherent weaknesses as human practitioners. It also prompts us to consider deeper questions. As a community we need to reflect on what sort of scientific culture we want future generations to inherit. And when we look at the status quo of questionable research practices, it leads us to ask one simple question: Who are we serving, us or them?


  1. Chris. Thanks for writing this and thanks to you, Marcus and others for pushing this forward. The discussions around pre-registration have been a real eye-opener for me.

    The difficulty that I and I think others have with preregistration is that we're often not sure *exactly* what it is we're looking for before we start.

    And this I think is your point.

    It means that anything we do find should be considered exploratory, even if it was generally in line with predictions.

    It also means that the results would need to be replicated before we could have confidence in them. But at least the second time around we'd be able to pre-register the report, saying exactly what we were looking for.

    If nothing else, the movement towards preregistration should hopefully give greater status to replication attempts. Because these would often be the first time the study was conducted under preregistration conditions.

    But this is going to require a culture change, particularly given the importance (career-wise) attached to publishing in the right journals - ie those that prioritise novelty above anything else. The Cortex initiative is a really important step in this direction.

    Hopefully we get to a situation where, instead of asking "but was it peer reviewed" before we believe something, we ask "but was it preregistered?"

    1. Thanks Jon. I was chatting with Tom Johnstone (Reading) at SFN and he argued that one beneficial side effect of Registered Reports could be that it pushes the community to place higher value on overt exploratory research. I found myself nodding in agreement -- non-hypothesis-driven research has great value in pushing back the frontiers. The problem is that our community has a pre-existing bias against it, which forces researchers to pretend their research is confirmatory when it isn't, and to appear more certain of their results than they really are (or should be).

      This in turn gets me wondering whether we should also be launching an "Exploratory Reports" format: articles with only general questions (and with no hypotheses). They would report potential new phenomena or findings of interest and would provide the perfect material for later Registered Reports. Unless I'm mistaken, this is quite similar to your original idea of Experiments and Observations?

    2. Yes, pretty much. I was thinking of Observations more for things that you find that you weren't setting out to find. But purely exploratory work would also fit in that category.

  2. One thing I noticed when working with Prof. Dienes was that fundamentally, the problem with stopping rules isn't Bayesian vs. Frequentist, but Parameter Estimation vs. Hypothesis Testing. If I get it right, a badly-formulated H0 will be sensitive to "researcher DoF stopping" with certain Bayesian hypothesis tests, too; but a sensible stopping rule in a frequentist framework won't be. If your H0 is "exactly 0" and you compare it to "any other value than 0", you're, fundamentally, not doing sensible science (note that barely any Bayesian out there right now would use such a badly-designed H0). In contrast, a stopping rule such as "once the CI is narrow enough to fit inside a certain window" is not biased towards rejecting H0. The problem is one of circularity - if your stopping rule implicitly favours one outcome over the other (for example, in your standard frequentist test, because you may only ever reach outcome 1, never outcome 2).

    In sum, as a Popperian, I think parameter estimation over hypothesis testing is the more important battle, rather than Bayes over Frequentist. Yes, a Credible Interval may be somewhat more intuitive than a Confidence Interval, but either is better than any hypothesis test in most cases.

    1. Hi Jona! As you say, with both Bayesian and frequentist methods one can use "inference by intervals": divide the possible DV values into two regions, a null region and an alternative region; if the confidence/credibility interval lies entirely in one of the regions, accept the corresponding hypothesis (null-region hypothesis or alternative hypothesis). Normal two-tailed significance testing is a degenerate version of this where the null region is a point. In this case, the alternative hypothesis (which consists of all values except an infinitesimally small point) is unfalsifiable, so by Popperian standards is not even science. Things get better when the null point is extended to a region. BUT in general one still has to strictly respect stopping rules in order to avoid large error probabilities, and this is true for both Bayesian and orthodox intervals. One can add a further restriction: make sure the confidence/credibility interval is not much larger than the null region; then one can check after every data point if one likes and still have good error probabilities.
      Bayes has more up its sleeve than inference by intervals. One can test a point null using Bayes factors. In this case, a point null becomes falsifiable (given a convention like: reject the null if B > 3). Point nulls may never be literally true, but they can be so close to the truth as to be a fine approximation. (Baguley in his 2012 book on p 369 gives the example of an ESP experiment with 28,000 subjects yielding a confidence interval [.496, .502], with a nominal point chance baseline of .5.) Thus, point nulls, tested with Bayes factors, have their place too. Cheers! Zoltan
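
The "inference by intervals" procedure described in this exchange can be sketched in code. This is a minimal hypothetical illustration, assuming a known-variance normal model; the function name, the decision strings, and the null-region bounds in the usage note are my own, not from the comments:

```python
# Sketch of "inference by intervals": accept a hypothesis only when the
# whole interval falls inside the null region or entirely outside it.
import math

def interval_decision(data, null_lo, null_hi, z=1.96, sigma=1.0):
    """Known-variance 95% CI for the mean; decide only if the CI clears
    the null region [null_lo, null_hi] on one side or the other."""
    n = len(data)
    mean = sum(data) / n
    half = z * sigma / math.sqrt(n)      # half-width shrinks as n grows
    lo, hi = mean - half, mean + half
    if null_lo <= lo and hi <= null_hi:
        return "accept null-region hypothesis"
    if hi < null_lo or lo > null_hi:
        return "accept alternative hypothesis"
    return "withhold judgement (interval straddles the boundary)"
```

For example, with an illustrative null region of [-0.5, 0.5]: a tight interval around 0 accepts the null-region hypothesis, an interval sitting wholly above 0.5 accepts the alternative, and a wide interval crossing the boundary yields no decision. This also shows the further restriction mentioned above: only when the interval is narrow relative to the null region can a decision be reached at all, which is what makes checking after every data point tolerable.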