Dryad

Data from: Predicting invasion success of cultivated naturalized plants in China

Data files

Jan 03, 2025 version files 94.49 KB

Abstract

Plant invasions pose significant threats to native ecosystems, human health, and global economies. However, the complex and multidimensional nature of the factors influencing plant invasions makes it challenging to predict and interpret invasion success accurately. Using a robust machine learning algorithm, random forest, and an extensive suite of characteristics related to environmental niches, species traits, and propagule pressure, we developed a classification model to predict the invasion success of naturalized cultivated plants in China. Based on the final optimal model, we evaluated the relative importance of individual and grouped variables and their prediction performance. Our study identified key individual variables within each of three groupings: climatic suitability and native range size (environmental niches), phylogenetic distance to the closest native taxon and vegetative propagation mode (species traits), and the number of botanical gardens and provinces where species were cultivated (propagule pressure). Remarkably, when variables were evaluated as groups, their relative importance increased dramatically, by 13.5 to 17.7 times, compared to the cumulative importance of the individual variables within the same category. However, the relative importance of a category depended primarily on the number of variables it contained rather than on its inherent characteristics.
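
The distinction between individual and grouped variable importance described above can be illustrated with a minimal sketch. It is not the study's analysis: it assumes a permutation-based importance measure, a hand-written majority-vote rule standing in for a fitted random forest, and three hypothetical, fully redundant variables (`niche_a`, `niche_b`, `niche_c`). Individual importance shuffles one variable at a time; grouped importance shuffles the whole group as a block.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(predict, rows, labels, columns, n_repeats=20, seed=0):
    """Mean drop in accuracy when `columns` are shuffled together across rows.

    One column gives an individual importance; several columns give a grouped
    importance, because the group is permuted as a single block.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        order = list(range(len(rows)))
        rng.shuffle(order)
        shuffled = []
        for i, row in enumerate(rows):
            new_row = dict(row)
            for column in columns:
                # Take this column's value from a randomly chosen donor row.
                new_row[column] = rows[order[i]][column]
            shuffled.append(new_row)
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return sum(drops) / n_repeats


# Hypothetical toy data: three redundant copies of the same binary signal.
labels = [0, 1] * 100
rows = [{"niche_a": s, "niche_b": s, "niche_c": s} for s in labels]


def predict(row):
    """Stand-in classifier (an assumption): majority vote of the three variables."""
    return int(row["niche_a"] + row["niche_b"] + row["niche_c"] >= 2)


individual = permutation_importance(predict, rows, labels, ["niche_a"])
grouped = permutation_importance(
    predict, rows, labels, ["niche_a", "niche_b", "niche_c"]
)
print(f"individual: {individual:.3f}, grouped: {grouped:.3f}")
```

In this toy setup the individual importance is zero, because the two untouched copies still outvote the shuffled one, while the grouped importance is large. Redundancy among variables is only one way in which grouped importance can exceed the sum of individual importances; the sketch does not claim that it explains the ratios reported here.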

Synthesis and applications. Our findings emphasize the necessity of developing data-driven predictive tools for effective invasion risk assessment using large datasets. We also highlight the importance of grouped variables in enhancing model interpretability. For practical application in China, we recommend prioritizing surveillance of alien plant species with large native ranges and high climatic suitability. Implementing a tiered risk assessment system based on our random forest model can allow for a more effective allocation of resources for monitoring and managing invasive species. Ultimately, interdisciplinary collaboration is crucial for implementing and applying these predictive tools, thereby protecting biodiversity, ecosystem services, and economic interests.
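
A tiered risk assessment of the kind recommended above could, as a rough sketch, map each species' predicted invasion probability to a surveillance tier. The function name `risk_tier`, the thresholds (0.7 and 0.4), and the species labels are all hypothetical and are not taken from the study; in practice the cut-offs would need to be calibrated against the model's performance and the resources available.

```python
def risk_tier(probability, high=0.7, medium=0.4):
    """Map a predicted invasion probability to a surveillance tier.

    The default thresholds are illustrative assumptions, not published values.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability >= high:
        return "high"
    if probability >= medium:
        return "medium"
    return "low"


# Hypothetical model outputs for three placeholder species.
predicted = {"species A": 0.85, "species B": 0.55, "species C": 0.10}
tiers = {name: risk_tier(p) for name, p in predicted.items()}
print(tiers)
```

Keeping the thresholds as parameters lets managers tighten or relax the tiers without retraining the underlying model.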