
Data from: Are replication rates the same across academic fields? community forecasts from the DARPA SCORE program

Cite this dataset

Gordon, Michael et al. (2020). Data from: Are replication rates the same across academic fields? community forecasts from the DARPA SCORE program [Dataset]. Dryad.


The DARPA program “Systematizing Confidence in Open Research and Evidence” (SCORE) aims to generate confidence scores for a large number of research claims from empirical studies in the social and behavioral sciences. The confidence scores will provide a quantitative assessment of how likely a claim is to hold up in an independent replication. To create the scores, we follow earlier approaches and use prediction markets and surveys to forecast replication outcomes. Based on an initial set of forecasts for the overall replication rate in SCORE and its dependence on the academic discipline and the time of publication, we show that participants expect replication rates to increase over time. Moreover, they expect replication rates to differ between fields, with the highest replication rate in economics (average survey response of 58%) and the lowest in psychology and in education (average survey response of 42% for both fields). These results reveal insights into the academic community’s views of the replication crisis, including for research fields for which no large-scale replication studies have been undertaken yet.


This dataset was collected via an online platform over the course of two weeks in August 2019. The participants of the surveys and prediction markets were recruited through blog posts, Twitter and mailing lists, primarily aimed at academics; however, anyone was free to join. The data were exported from the online platforms and anonymised before processing and analysis.

The data tables were processed into formats more suited to analysis, and column names were changed to be more intuitive. New columns providing additional information were also appended; for instance, the survey question wording was included alongside the question code. All ‘test’ surveys (or surveys completed by admins) were removed before analysis.

Usage notes

In prediction markets, we used the logarithmic scoring rule (Hanson 2005) with base 2 and a liquidity parameter of b = 100. Participants received an initial endowment of 100 points to trade with.
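As an illustration of what these parameters imply, the following is a minimal sketch of a logarithmic market scoring rule market maker with base 2 and b = 100. It assumes the standard cost-function form of Hanson's rule and a binary market (claim replicates or does not); the function names and the binary setup are assumptions made for this example and are not taken from the dataset or the platform's implementation.

```python
import math

B = 100.0  # liquidity parameter b, as stated in the usage notes


def cost(quantities, b=B):
    """Assumed LMSR cost function with base 2: C(q) = b * log2(sum_i 2^(q_i / b))."""
    return b * math.log2(sum(2 ** (q / b) for q in quantities))


def prices(quantities, b=B):
    """Current price of each outcome; the prices sum to 1 and can be read as probabilities."""
    weights = [2 ** (q / b) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(quantities, outcome, shares, b=B):
    """Points a trader pays to buy `shares` of `outcome` given outstanding `quantities`."""
    after = list(quantities)
    after[outcome] += shares
    return cost(after, b) - cost(quantities, b)


# Hypothetical binary market with no shares outstanding: both outcomes start at 0.5.
start = [0.0, 0.0]
print(prices(start))
print(trade_cost(start, 0, 100))
```

Under these assumptions, buying 100 shares of one outcome in a fresh binary market costs about 58.5 points and moves its price from 0.5 to 2/3, which gives a sense of how far a participant's 100-point endowment can move a market.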

Within the demography survey, empty values are indicated with NA values. The fields of interest question asks about ‘sociology’ and ‘criminology’ separately, whereas in the other surveys and prediction markets these fields are combined. Combining these two fields of interest will therefore make them consistent with the rest of the project.
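The recoding described above could be sketched as follows. The field labels used here ("Sociology", "Criminology", "Sociology and Criminology") are hypothetical placeholders; the actual values and column names should be taken from the codebook.

```python
# Hypothetical labels; check the codebook for the values actually used in the data.
SEPARATE = {"Sociology", "Criminology"}
COMBINED = "Sociology and Criminology"


def combine_fields(fields):
    """Replace separate sociology/criminology entries with one combined label.

    `fields` is the list of fields of interest selected by one participant.
    Order is preserved and the combined label appears at most once.
    """
    merged = []
    for field in fields:
        label = COMBINED if field in SEPARATE else field
        if label not in merged:
            merged.append(label)
    return merged


print(combine_fields(["Economics", "Sociology", "Criminology"]))
```

A participant who selected either or both of the separate fields ends up with a single combined entry, matching the categories used in the other surveys and the prediction markets.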

The user_id values provided in the prediction market and survey data are consistent across all datasets.

A codebook is provided with explanations of the column names.



Defense Advanced Research Projects Agency, Award: N66001-19-C-4014