Dryad

CZ Software Mentions: A large dataset of software mentions in the biomedical literature - Expanded 2024

Data files

Nov 12, 2024 version, files: 9.86 GB

Abstract

We release a dataset of software mentions in open access biomedical papers published in bioRxiv, medRxiv, or stored in Europe PMC. The mentions are extracted with a trained BERT model. The dataset provides sources, context, metadata, software, and links.

This is a continuation of our previous dataset, https://datadryad.org/stash/dataset/doi:10.5061/dryad.6wwpzgn2c, and is based on an expanded set of papers.