Dryad

EcoCleanR: Enhancing data quality of biogeographic ranges with application for marine invertebrates

Data files

Jan 22, 2026 version files 726.47 KB

Abstract

Published distribution data, while invaluable for understanding species' biogeography, often suffer from limitations such as dated and static representations of ranges, a bias toward latitudinal information, and a lack of resolution in sampling frequency and variation in abundance throughout a species' distribution. Extensive open-source biodiversity data now allow us to construct biogeographic ranges with more modern observations, which can be useful in conservation, evolutionary, and ecological studies. However, data quality remains a persistent challenge, hampering data reliability and usability. We introduce EcoCleanR, an R package that integrates existing tools with new functionalities to address data integration and quality assessment through a systematic, step-by-step approach for marine occurrence data. This package streamlines the process of identifying and resolving common issues in biodiversity data, including taxonomy and georeferencing errors. It provides: (a) example scripts to guide users, (b) functionalities to flag problematic occurrence records from multiple databases, and (c) outputs that include species-specific distribution ranges and their corresponding environmental conditions, to facilitate accurate biogeographic and ecological analyses.