Dryad

Randomized controlled oncology trials with tumor stage inclusion criteria

Data files

Jun 22, 2024 version files 4.56 MB
Dec 04, 2024 version files 4.58 MB

Abstract

Background:

Extracting inclusion and exclusion criteria in a structured, automated fashion remains a challenge for developing better search functionality and for automating systematic reviews of randomized controlled trials in oncology. The question “Did this trial enroll patients with localized disease, metastatic disease, or both?” could be used to narrow down the number of potentially relevant trials when conducting a search.

Dataset collection:

A total of 600 randomized controlled trials from high-impact medical journals were classified according to whether they allowed the inclusion of patients with localized and/or metastatic disease. The dataset was randomly split into a training/validation set of 500 trials and a test set of 100 trials. The two sets can be merged to allow for different splits.
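For readers who want a partition other than 500/100, a minimal sketch of merging and re-splitting is shown here. It assumes the two sets have already been loaded as lists of rows; the function name `resplit` and its parameters are hypothetical and not part of the dataset.

```python
import random


def resplit(train_rows, test_rows, test_size=100, seed=0):
    """Merge two sets of rows and draw a new random train/test split.

    Hypothetical helper: the rows can be any objects (e.g. dicts, one per trial).
    """
    merged = list(train_rows) + list(test_rows)
    if not 0 < test_size < len(merged):
        raise ValueError("test_size must be between 1 and len(merged) - 1")
    # A seeded generator keeps the new split reproducible.
    rng = random.Random(seed)
    rng.shuffle(merged)
    # The first test_size rows become the new test set, the rest the training set.
    return merged[test_size:], merged[:test_size]
```

Fixing the seed makes the split reproducible across runs, which matters when results on the new test set are compared between models.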

Data properties:

Each trial is a row in the CSV file. For each trial there is a DOI, a publication date, a title, an abstract, the abstract sections (introduction, methods, results, conclusion), several tags associated with the annotation process (text, _input_hash, _task_hash, options, _view_id, config, accept, answer, _timestamp, _annotator_id, _session_id), and the assigned labels (answer).
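As a rough illustration of the one-row-per-trial layout, the sketch below reads a CSV into a list of dictionaries and tallies the values in one column. The exact header names in the published file are an assumption here (the sample uses `doi`, `title`, and `answer`), and the in-memory sample rows are made up placeholders, not data from the dataset.

```python
import csv
import io
from collections import Counter


def load_trials(handle):
    """Read trials from an open CSV file handle, returning one dict per row."""
    return list(csv.DictReader(handle))


def count_values(trials, column):
    """Count how often each value occurs in the given column."""
    return Counter(row[column] for row in trials)


# Made-up placeholder rows standing in for the real file (column names assumed).
SAMPLE = io.StringIO(
    "doi,title,answer\n"
    "10.0000/example.1,Hypothetical trial A,placeholder_x\n"
    "10.0000/example.2,Hypothetical trial B,placeholder_y\n"
    "10.0000/example.3,Hypothetical trial C,placeholder_x\n"
)

trials = load_trials(SAMPLE)
label_counts = count_values(trials, "answer")
```

With the real file, the same functions could be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points at the downloaded CSV.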