Data from: The interaction between the Linnean and Darwinian shortfalls affects our understanding of the evolutionary dynamics driving diversity patterns of New World coralsnakes
Data files
Sep 30, 2024 version files 35.36 MB
- coralsnakes_data_ITU.xlsx (17.06 KB)
- Creating_metadata.R (5.53 KB)
- data_maps.xlsx (15.77 KB)
- file.gitignore (40 B)
- ITU.R (2.29 KB)
- Linnean-and-Darwinian-Shortfalls.Rproj (205 B)
- Maps.R (3.46 KB)
- metadata_ITU.xml (6.99 KB)
- metadata_MAP.xml (4.84 KB)
- metadata_output.xml (13.26 KB)
- metadata_vetor.xml (5.81 KB)
- Micrurus.tree.tre (3.41 KB)
- README.md (9.11 KB)
- Script_micurus_tese_novo_26_02_2024_10_22.R (6.34 KB)
- Simulation_analysis.R (5.10 KB)
- Simulation_results.csv (31.88 MB)
- sunplin_Micrurus.txt (3.36 MB)
- Vetor.xlsx (18.49 KB)
Abstract
Aim: In this study, we sought to understand how the Linnean shortfall (i.e., the lack of knowledge about species taxonomy) interacts with the Darwinian shortfall (i.e., the lack of knowledge about phylogenetic relationships among species), which potentially jeopardizes geographical patterns in estimates of speciation rates.
Location: New World
Taxon: Coralsnakes (Serpentes: Elapidae)
Methods: We created an index of taxonomic uncertainty (ITU) that measures the likelihood of current species being split after undergoing future taxonomic revisions. The ITU was used in simulations where species with higher taxonomic uncertainty had a higher likelihood of having their phylogenetic branches split, generating new hypothetical species along their geographic ranges. We estimated the speciation rates before and after the split of taxonomically uncertain species.
Results: We found that a high number of coralsnake species display substantial taxonomic uncertainty, positively correlated with the latitude of the species' geographical range centroid. The estimated speciation rates based on currently available data have a weak relationship with latitude. However, after incorporating taxonomic uncertainty into the phylogeny, we detect a higher positive correlation between speciation rate and latitude.
Main conclusions: The observed change in speciation rates following the incorporation of taxonomic uncertainty highlights how such uncertainty can undermine the empirical evaluation of geographical patterns in speciation rates, revealing an interaction between the latitudinal taxonomic gradient and the latitudinal diversity gradient. Given that taxonomic changes can alter the number of species recognized as valid over time, our study highlights the need to incorporate taxonomic uncertainty into macroecological and macroevolutionary studies, enhancing the robustness of patterns inferred from these data.
README
This repository contains all data, scripts, and functions related to the manuscript titled 'The interaction between the Linnean and Darwinian shortfalls affects our understanding of the evolutionary dynamics driving diversity patterns of New World coralsnakes' published in the Journal of Biogeography.
R scripts: all R scripts used to run the analyses and generate the figures.
(1) ITU.R: R script used to perform the PGLS regression, in which eight macroecological and taxonomic variables were used to model taxonomists’ responses.
(2) Maps.R: R script used to generate Figure 2.
(3) Script_micurus_tese_novo_26_02_2024_10_22.R: R script used to perform the simulations in which species with a higher ITU had a higher likelihood of having their branches split in the phylogenetic trees.
(4) Simulation_analysis.R: R script used to analyze the results of simulations and to generate Figure 3.
(5) Creating_metadata.R: R script for creating metadata for the output of simulations.
Raw data: contains the raw data used to create the Index of Taxonomic Uncertainty, perform the simulations, and generate the figures.
(1) coralsnakes_data_ITU.xlsx: data used to perform the PGLS regression analysis and to generate the Index of Taxonomic Uncertainty (ITU).
Species: list of the 85 New World coralsnake species.
Description date: year in which each New World coralsnake species was first described, multiplied by -1.
Latitude: latitude of the centroid of each New World coralsnake species in decimal degrees taken as the midpoint of the geographic range.
Longitude: longitude of the centroid of each New World coralsnake species in decimal degrees taken as the midpoint of the geographic range.
Range size: geographic range size of each New World coralsnake species in square kilometers calculated from minimum convex polygons.
Synonyms: number of synonyms for each New World coralsnake species.
Papers: number of scientific papers for each New World coralsnake species in which the species is the focal taxon.
Records: number of occurrence records retrieved from GBIF, scientific papers, and natural history museums for each New World coralsnake species.
DR: the estimated DR statistic for each New World coralsnake species.
Body size: total body length of each New World coralsnake species (from snout to tail tip) in millimeters.
Splitting: average response of taxonomists regarding the likelihood of an accepted New World coralsnake species being split after undergoing a taxonomic revision.
(2) data_maps.xlsx: data used to generate Figure 2
Species: list of the 85 New World coralsnake species.
Latitude: latitude of the centroid of each New World coralsnake species in decimal degrees.
Longitude: longitude of the centroid of each New World coralsnake species in decimal degrees.
ITU: Index of Taxonomic Uncertainty for the 85 New World coralsnakes based on taxonomists’ responses modeled for macroecological and taxonomic variables. It measures the likelihood of an accepted coralsnake species being split into two or more species after undergoing a taxonomic revision.
ITU_intervals: ITU classified in intervals of 0.2.
DR: the DR statistic estimated for each New World coralsnake species.
DR_intervals: DR statistic classified in intervals of 0.2.
(3) Vetor.xlsx: contains data used to perform simulations
Species: list of the 85 New World coralsnake species.
ITU: Index of Taxonomic Uncertainty for the 85 New World coralsnakes based on taxonomists’ responses modeled for macroecological and taxonomic variables. It measures the likelihood of an accepted coralsnake species being split into two or more species after undergoing a taxonomic revision.
LatMin: minimum latitude of the geographic range for each New World coralsnake species.
LongMin: minimum longitude of the geographic range for each New World coralsnake species.
LatMax: maximum latitude of the geographic range for each New World coralsnake species.
LongMax: maximum longitude of the geographic range for each New World coralsnake species.
LatC: latitude of the centroid of each New World coralsnake species in decimal degrees.
LongC: longitude of the centroid of each New World coralsnake species in decimal degrees.
(4) Micrurus.tree.tre: consensus phylogenetic tree used to perform the PGLS regression.
(5) sunplin_Micrurus.txt: 1000 phylogenetic trees used in the simulations.
Processed data: contains the output of the simulation.
(1) Simulation_results.csv: results of the simulations.
R2_lat_antes: Adjusted R² for the linear model (formula: DR ~ latitude) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
lambda_lat_antes: Pagel’s λ for the linear model (formula: DR ~ latitude) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
alpha_lat_antes: α for the linear model (formula: DR ~ latitude) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
beta_lat_antes: β for the linear model (formula: DR ~ latitude) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
R2_quad_lat_antes: Adjusted R² for the quadratic model (formula: DR ~ latitude + latitude²) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
lambda_quad_lat_antes: Pagel’s λ for the quadratic model (formula: DR ~ latitude + latitude²) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
alpha_quad_lat_antes: α for the quadratic model (formula: DR ~ latitude + latitude²) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
beta1_quad_lat_antes: β for the linear term of the quadratic model (formula: DR ~ latitude + latitude²) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
beta2_quad_lat_antes: β for the quadratic term of the quadratic model (formula: DR ~ latitude + latitude²) before incorporating the Index of Taxonomic Uncertainty in the simulations and before splitting the species in the phylogeny.
R2_lat_depois: Adjusted R² for the linear model (formula: DR ~ latitude) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
lambda_lat_depois: Pagel’s λ for the linear model (formula: DR ~ latitude) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
alpha_lat_depois: α for the linear model (formula: DR ~ latitude) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
beta_lat_depois: β for the linear model (formula: DR ~ latitude) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
R2_quad_lat_depois: Adjusted R² for the quadratic model (formula: DR ~ latitude + latitude²) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
lambda_quad_lat_depois: Pagel’s λ for the quadratic model (formula: DR ~ latitude + latitude²) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
alpha_quad_lat_depois: α for the quadratic model (formula: DR ~ latitude + latitude²) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
beta1_quad_lat_depois: β for the linear term of the quadratic model (formula: DR ~ latitude + latitude²) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
beta2_quad_lat_depois: β for the quadratic term of the quadratic model (formula: DR ~ latitude + latitude²) after incorporating the Index of Taxonomic Uncertainty in the simulations and after splitting the species in the phylogeny.
split: the number of splits in each simulation.
index_run: identifier of the simulation run. For each of the 1000 phylogenetic trees, 100 runs were performed.
index_tree: identifier of the phylogenetic tree used in the run. There are 1000 phylogenetic trees.
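As a rough illustration of how these columns can be compared, the sketch below averages the change in adjusted R² of the linear model between the "antes" (Portuguese for "before") and "depois" ("after") columns. It is written in Python rather than R, it is not part of the deposited scripts, and the two sample rows are made-up values that only mimic the column layout described above.

```python
import csv
import io
from statistics import mean

# Two invented rows standing in for Simulation_results.csv.
# The numbers are illustrative only and are NOT taken from the real file.
SAMPLE = """index_tree,index_run,split,R2_lat_antes,R2_lat_depois
1,1,12,0.02,0.10
1,2,9,0.02,0.07
"""


def mean_r2_change(handle):
    """Mean of (R2 after - R2 before) for the DR ~ latitude model across rows."""
    rows = csv.DictReader(handle)
    return mean(float(r["R2_lat_depois"]) - float(r["R2_lat_antes"]) for r in rows)


change = mean_r2_change(io.StringIO(SAMPLE))
print(round(change, 3))
```

To run it on the real output, the `io.StringIO(SAMPLE)` argument could be swapped for `open("Simulation_results.csv", newline="")`, assuming the file is comma-separated with the header names listed above.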
Authors
Lívia Estéfane Fernandes Frateles, Guilherme Rogie Gonçalves Tavares, Gabriel Nakamura, Nelson Jorge da Silva Jr., Levi Carina Terribile, José Alexandre F. Diniz-Filho
Methods
Data
List of accepted species and their description years: New World coralsnake accepted species and description years were obtained from Silva Jr. et al. (2021a).
Maximum body size: The maximum body size was obtained from Feldman et al. (2015), Silva Jr. et al. (2021b), and the specialized literature.
Number of synonyms: The number of synonyms was obtained from the Reptile Database (Uetz et al., 2022) and Silva Jr. et al. (2021b).
Published papers: The number of published papers for each species was obtained from searches in Web of Science and Scopus using the accepted species name and its synonyms.
Taxonomic uncertainty: We distributed a questionnaire to taxonomists specializing in coralsnakes and asked them how likely each species was to be split. Taxonomists could answer with “very unlikely,” “unlikely,” “likely,” or “very likely.” We assigned values from 1 (very unlikely) to 4 (very likely) and averaged the taxonomists’ answers for each species.
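The scoring step can be sketched as follows. This is a minimal Python illustration of the 1 to 4 coding described above, not the authors' code; the function name `splitting_score` and the example answers are hypothetical.

```python
from statistics import mean

# Coding of the four questionnaire answers, as described in the text.
SCORES = {"very unlikely": 1, "unlikely": 2, "likely": 3, "very likely": 4}


def splitting_score(answers):
    """Average coded answer of several taxonomists for one species."""
    return mean(SCORES[a.strip().lower()] for a in answers)


# Hypothetical answers from three taxonomists for a single species.
example = splitting_score(["likely", "very likely", "unlikely"])
print(example)
```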
Macroecological variables: Occurrence records were gathered from the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/), scientific literature, and vouchers from museums and biological collections. Records retrieved from GBIF underwent a cleaning process. We calculated the number of occurrence records for each species. We used the occurrence records to generate the geographic range for each species using minimum convex polygons and calculated the range size in square kilometers. We also obtained the centroid for each species.
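A minimal sketch of the minimum convex polygon step is given below, in Python. It assumes the occurrence records have already been projected to planar coordinates in kilometers (the projection used by the authors is not stated here), and it reads "midpoint of the geographic range" as the midpoint between the minimum and maximum coordinates, which is also an assumption. The function names and the sample points are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; area is in squared units of the input coordinates."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def range_midpoint(points):
    """Midpoint between the minimum and maximum coordinate on each axis."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


# Hypothetical records in projected kilometers: four corners plus an interior point.
records = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5)]
hull = convex_hull(records)
area_km2 = polygon_area(hull)
midpoint = range_midpoint(records)
print(len(hull), area_km2, midpoint)
```

Applying the shoelace formula directly to longitude and latitude in degrees would not give square kilometers, which is why the sketch assumes projected input.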
Phylogenetic tree: The phylogeny was obtained from Tonini et al. (2016). Species in Tonini’s phylogeny that were not considered valid by our list of accepted species were removed, and accepted species not present in Tonini’s phylogeny were included based on taxonomy. We explored uncertainty in branch lengths by obtaining one thousand phylogenetic trees.
Data analysis
We calculated the speciation rate using the DR statistic proposed by Jetz et al. (2012) for each species.
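For reference, the DR statistic of Jetz et al. (2012) for a tip is the inverse of a weighted sum of the edge lengths on the path from that tip to the root, with each edge further from the tip down-weighted by a further factor of 1/2. The Python sketch below shows that calculation for a single tip; it is not the authors' R code, and the edge lengths and the halfway split point in the second example are hypothetical.

```python
def dr_statistic(edge_lengths_tip_to_root):
    """DR for one tip: 1 / sum_j( l_j / 2**(j - 1) ), with j = 1 the terminal edge."""
    weighted_sum = sum(
        length / 2 ** j for j, length in enumerate(edge_lengths_tip_to_root)
    )
    return 1.0 / weighted_sum


# Hypothetical tip whose path to the root has edges of length 2, 4 and 8.
dr_before = dr_statistic([2.0, 4.0, 8.0])

# Same tip after its terminal edge is split halfway into two sister tips:
# the path gains one node, so the terminal edge of 2 becomes edges of 1 and 1.
dr_after = dr_statistic([1.0, 1.0, 4.0, 8.0])

print(round(dr_before, 4), round(dr_after, 4))
```

In this toy case, splitting the terminal branch raises the DR of the resulting tips, which is the mechanism through which taxonomic splits can change estimated speciation rates.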
We fitted a phylogenetic generalized least squares (PGLS) regression in which the eight variables mentioned above (year of description, number of occurrence records, number of papers, DR, number of synonyms, latitude of the centroid, body size, and range size) were used to model the taxonomists’ responses. The fitted values of the model were rescaled to range from 0 to 1 and used as our Index of Taxonomic Uncertainty (ITU). Values closer to 1 indicate high taxonomic uncertainty (i.e., a high likelihood of a species being split), while values closer to 0 indicate low taxonomic uncertainty (i.e., a low likelihood of a species being split).
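One common way to rescale values to the 0 to 1 interval is min-max scaling, sketched below in Python. Whether the authors used exactly this transformation is an assumption, and the input values are hypothetical model predictions on the 1 to 4 questionnaire scale.

```python
def min_max_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical fitted values for three species.
itu = min_max_scale([1.5, 2.0, 3.5])
print(itu)
```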
The ITU was used in simulations in which species with a higher ITU had a higher likelihood of being split, generating new sister species in the phylogenetic trees. For each of the 1,000 phylogenetic trees, 100 new phylogenies were generated (100,000 simulated phylogenies in total). To estimate the effect of latitude on speciation patterns, we fitted two PGLS models. The first model considered the DR statistic as a function of latitude (formula: DR ~ latitude), while the second model incorporated a quadratic term to account for non-linear trends in speciation patterns (formula: DR ~ latitude + latitude²).
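The selection of species to split can be illustrated with the Python sketch below. It assumes that each species is split independently with probability equal to its ITU in every run; the exact sampling rule is implemented in Script_micurus_tese_novo_26_02_2024_10_22.R and may differ. The species names, ITU values, and seed are hypothetical, and the sketch does not modify any tree or fit a PGLS model.

```python
import random


def draw_splits(itu_by_species, rng):
    """Return the species selected for splitting in one simulated run.

    Assumption: each species is split with probability equal to its ITU.
    """
    return [name for name, itu in itu_by_species.items() if rng.random() < itu]


# Hypothetical ITU values for two species.
itu_values = {"species_A": 0.9, "species_B": 0.1}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
counts = {name: 0 for name in itu_values}
for _ in range(1000):
    for name in draw_splits(itu_values, rng):
        counts[name] += 1

print(counts)
```

Across many runs the species with the higher ITU is selected far more often, which is the behavior the simulations rely on.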