Dryad

Data from: An automated approach to identifying search terms for systematic reviews using keyword co-occurrence networks

Data files

Jul 24, 2019 version files 73.54 KB

Abstract

1. Systematic review, meta-analysis, and other forms of evidence synthesis are critical to strengthen the evidence base concerning conservation issues and to answer ecological and evolutionary questions. Synthesis lags behind the pace of scientific publishing, however, due to time and resource costs which partial automation of evidence synthesis tasks could reduce. Additionally, current methods of retrieving evidence for synthesis are susceptible to bias towards studies with which researchers are familiar. In fields that lack standardized terminology encoded in an ontology, including ecology and evolution, research teams can unintentionally exclude articles from the review by omitting synonymous phrases in their search terms.

2. To combat these problems, we developed a quick, objective, reproducible method for generating search strategies that uses text mining and keyword co-occurrence networks to identify the most important terms for a review. The method reduces bias in search strategy development because it does not rely on a predetermined set of articles and can improve search recall by identifying synonymous terms that research teams might otherwise omit.

3. When tested against the search strategies used in published environmental systematic reviews, our method performs as well as the published searches and retrieves gold-standard hits that replicated versions of the original searches do not. Because the method is quasi-automated, the amount of time required to develop a search strategy, conduct searches, and assemble results is reduced from approximately 17-34 hours to under 2 hours.

4. To facilitate use of the method for environmental evidence synthesis, we implemented the method in the R package litsearchr, which also contains a suite of functions to improve efficiency of systematic reviews by automatically deduplicating and assembling results from separate databases.
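The general idea of ranking candidate search terms with a keyword co-occurrence network can be illustrated with a small sketch. This is not the litsearchr implementation, and it does not use the litsearchr API. It is a simplified Python illustration under stated assumptions: each article is represented by a list of already-extracted keywords, an edge weight counts how many articles contain both keywords, and a term's importance is taken to be its node strength (the sum of its edge weights). The function names `cooccurrence_strength` and `top_terms` and the example keywords are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_strength(docs_keywords):
    """Return node strength per keyword in a co-occurrence network.

    Assumption: two keywords are linked once for every article that
    contains both, and strength is the sum of a keyword's edge weights.
    """
    edge_weights = Counter()
    for keywords in docs_keywords:
        # Normalise and deduplicate keywords within a single article.
        unique = sorted({k.lower().strip() for k in keywords})
        for a, b in combinations(unique, 2):
            edge_weights[(a, b)] += 1

    strength = Counter()
    for (a, b), weight in edge_weights.items():
        strength[a] += weight
        strength[b] += weight
    return strength


def top_terms(docs_keywords, n=3):
    """Return the n keywords with the highest node strength.

    Ties are broken alphabetically so the result is deterministic.
    """
    strength = cooccurrence_strength(docs_keywords)
    ranked = sorted(strength.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Hypothetical keyword lists, one per article returned by a naive search.
docs = [
    ["fire", "woodpecker", "snag"],
    ["fire", "woodpecker", "nest"],
    ["fire", "snag", "salvage logging"],
    ["woodpecker", "nest"],
]

print(top_terms(docs, n=3))
```

In a workflow of this kind, the highest-ranked terms would be reviewed by the research team and grouped into concept categories before being combined into a Boolean search string. The cutoff `n` is an arbitrary choice in this sketch.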