Dryad

Leveraging metrics to drive data sharing at the Science journals

Data files

Dec 24, 2025 version files 863.58 KB


Abstract

For the scientific research published in Science to be accessible, it is important that the data, methods, results, and code are transparently reported and openly shared. Science has policies on material, data, and code sharing that support our goals of transparency and openness. These policies require that all papers have a data availability statement and that all data and code be available in the paper or deposited in a permanent public repository. Exceptions are made, for example, if there are security concerns or to protect personal privacy. In 2024, Science partnered with the company DataSeer to determine the extent to which our research papers share data and code. DataSeer uses natural language processing (NLP) to measure a number of Open Science Indicators in published articles. The dataset “AAAS Open Science Metrics data 2021 to 2024” provides article-level data for 2680 Science papers published between 2021 and 2024. This includes article metadata, such as the DOI and publication date, as well as the first listed country for the first author (obtained from OpenAlex), data on whether data and code were generated and whether and how they were shared, and data on whether the paper was preprinted (based on fuzzy matching of the article title and authors against articles from major preprint servers). We used this dataset to calculate aggregate figures for data and code sharing, as well as for whether data were shared in a repository. All papers had a data availability statement; 69% of papers shared data in a repository, online, or in a supplementary data table (6% of papers did not generate or share data); and 23% of papers shared code (46% of papers did not generate or share code). We compared the Science aggregate data with publicly available data from the Public Library of Science (PLOS) and the academic publisher Taylor and Francis.
Summary data are provided in the file “Summary data Science and comparators.” This file gives the total number of publications for each source, the number sharing data overall (in a repository, online, or in the supplementary material), the number sharing data in a repository, and the number generating and sharing code. Overall data sharing was 69% for Science, 74% for PLOS, and 24% for Taylor and Francis, whereas data sharing in a repository was 56% for Science, 26% for PLOS, and 11% for Taylor and Francis. Among the papers that generated code, code sharing was 41% for Science, 29% for PLOS, and 8% for Taylor and Francis. These figures provide a baseline as we implement policies and processes to further improve data and code sharing.
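
For readers who want to recompute aggregate figures from the article-level file, the following is a minimal Python sketch of one possible calculation, not the procedure actually used for the summary file. The field names (`data_shared`, `data_in_repository`, `code_generated`, `code_shared`) and the four toy records are hypothetical and do not reflect the real column headers or values in the dataset. The sketch assumes that overall and repository data sharing are expressed as a share of all papers, and that code sharing is expressed as a share of the papers that generated code.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # All field names are hypothetical; the real dataset's columns may differ.
    doi: str                  # placeholder identifier, not a real DOI
    data_shared: bool         # data shared in a repository, online, or a supplement
    data_in_repository: bool  # data shared specifically in a repository
    code_generated: bool
    code_shared: bool


def pct(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place (0.0 if the denominator is zero)."""
    return round(100 * numerator / denominator, 1) if denominator else 0.0


def summarize(articles: list[Article]) -> dict[str, float]:
    """Aggregate article-level flags into summary percentages."""
    total = len(articles)
    # Code sharing is measured only among papers that generated code.
    code_generators = [a for a in articles if a.code_generated]
    return {
        "total": total,
        "data_shared_pct": pct(sum(a.data_shared for a in articles), total),
        "data_in_repository_pct": pct(sum(a.data_in_repository for a in articles), total),
        "code_shared_of_generated_pct": pct(
            sum(a.code_shared for a in code_generators), len(code_generators)
        ),
    }


# Toy records for illustration only; these are not rows from the dataset.
toy = [
    Article("example-1", True, True, True, True),
    Article("example-2", True, False, True, False),
    Article("example-3", False, False, False, False),
    Article("example-4", False, False, False, False),
]

summary = summarize(toy)
print(summary)
```

Keeping the two denominators separate matters when comparing sources: a source where few papers generate code can show a low overall code-sharing rate even when most of its code-generating papers share their code.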