Data from: An automated approach to identifying search terms for systematic reviews using keyword co-occurrence networks
Grames, Eliza M.
Stillman, Andrew N.
Tingley, Morgan W.
Elphick, Chris S.
Published Jul 24, 2019 on Dryad.
Cite this dataset
Grames, Eliza M.; Stillman, Andrew N.; Tingley, Morgan W.; Elphick, Chris S. (2019). Data from: An automated approach to identifying search terms for systematic reviews using keyword co-occurrence networks [Dataset]. Dryad. https://doi.org/10.5061/dryad.n1kv40m
1. Systematic review, meta-analysis, and other forms of evidence synthesis are critical to strengthen the evidence base concerning conservation issues and to answer ecological and evolutionary questions. Synthesis lags behind the pace of scientific publishing, however, due to time and resource costs which partial automation of evidence synthesis tasks could reduce. Additionally, current methods of retrieving evidence for synthesis are susceptible to bias towards studies with which researchers are familiar. In fields that lack standardized terminology encoded in an ontology, including ecology and evolution, research teams can unintentionally exclude articles from the review by omitting synonymous phrases in their search terms.
2. To combat these problems, we developed a quick, objective, reproducible method for generating search strategies that uses text mining and keyword co-occurrence networks to identify the most important terms for a review. The method reduces bias in search strategy development because it does not rely on a predetermined set of articles and can improve search recall by identifying synonymous terms that research teams might otherwise omit.
3. When tested against the search strategies used in published environmental systematic reviews, our method performs as well as the published searches and retrieves gold-standard hits that replicated versions of the original searches do not. Because the method is quasi-automated, the amount of time required to develop a search strategy, conduct searches, and assemble results is reduced from approximately 17-34 hours to under 2 hours.
4. To facilitate use of the method for environmental evidence synthesis, we implemented the method in the R package litsearchr, which also contains a suite of functions to improve efficiency of systematic reviews by automatically deduplicating and assembling results from separate databases.
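As a rough illustration of the keyword co-occurrence idea in point 2, the sketch below builds a small co-occurrence network and ranks terms by node strength (the sum of a term's co-occurrence counts). It is written in Python for illustration only and is not the litsearchr implementation. The function name, the example keywords, and the choice of node strength as the importance measure are all assumptions; the abstract does not say which measure the authors use.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_strength(docs_keywords):
    """Return node strength per keyword in a co-occurrence network.

    Two keywords share an edge whenever they appear in the same article;
    the edge weight is the number of articles in which they co-occur.
    Node strength is the sum of the weights of a keyword's edges.
    """
    edge_weights = Counter()
    for keywords in docs_keywords:
        # Count each keyword pair at most once per article.
        for a, b in combinations(sorted(set(keywords)), 2):
            edge_weights[(a, b)] += 1

    strength = Counter()
    for (a, b), weight in edge_weights.items():
        strength[a] += weight
        strength[b] += weight
    return strength


# Hypothetical keyword lists, one per article (not taken from the dataset).
articles = [
    ["fire", "woodpecker", "nesting"],
    ["fire", "woodpecker", "occupancy"],
    ["fire", "burn severity", "woodpecker"],
    ["nesting", "cavity"],
]

strength = cooccurrence_strength(articles)
# Terms ordered from most to least strongly connected.
ranked_terms = [term for term, _ in strength.most_common()]
```

In a sketch like this, the highest-ranked terms would be candidates for a search string, and a cutoff on strength would decide how many to keep; how that cutoff is chosen is left open here.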
Comparison of litsearchr performance to other methods
Performance of search strategies developed with quasi-automated methods in comparison to conventional methods. Precision and recall data come from the original publications for Sarol (2018), Belter et al. (2018), and Hausner et al. (2015). Data for litsearchr performance come from the separate file litsearchr precision and recall.xlsx.
litsearchr precision and recall
Results retrieved per database per search, used to measure the precision and recall of original, replicated, naive, and litsearchr searches for six reviews from Environmental Evidence.
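For readers unfamiliar with the two metrics, the sketch below computes precision and recall for a single search using the standard information-retrieval definitions. It assumes each search is reduced to a set of retrieved record identifiers and each review to a set of gold-standard identifiers. The function name and the identifiers are hypothetical, and the exact calculation used for this dataset may differ.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one search.

    precision = relevant records retrieved / all records retrieved
    recall    = relevant records retrieved / all relevant records
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers (not taken from the dataset).
retrieved_ids = ["a1", "a2", "a3", "a4", "a5"]
gold_standard_ids = ["a2", "a4", "a9", "a10"]

precision, recall = precision_recall(retrieved_ids, gold_standard_ids)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Using sets means a record returned by more than one database is counted once, which presupposes that results have already been deduplicated across databases.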