Data from: Ecological and spatial overlap indicate interspecific competition during North American canid radiation
Data files
Mar 14, 2026 version files 8.08 MB
-
01_PBDB_data_manipulation.R
17.88 KB
-
02_Generate_PyRate_input.R
2.14 KB
-
03_Dataset_spatial_prep.R
1.88 KB
-
04_TSTE_tables.R
1.53 KB
-
05_Gap_filling.R
13.14 KB
-
06_Temporal_coexistence.R
3.54 KB
-
07_Spatial_coexistence.R
12.19 KB
-
08_Time_space.R
14.77 KB
-
09_Trait_imputation.R
3.13 KB
-
10_Linear_Discriminant_Analysis.R
2.50 KB
-
11_Temperature.R
895 B
-
12_Competition_metrics.R
17 KB
-
13_Time_series_plot.R
17.20 KB
-
14_PyRateContinuous_results.R
4.82 KB
-
15_PyRateContinuous_plots.R
18.96 KB
-
16_PyRate_Commands.txt
1.18 KB
-
Canidae_trees.tree
4.43 MB
-
canids_body_mass.csv
5.89 KB
-
eco_morph.csv
11.46 KB
-
longevities.RData
125.94 KB
-
nalmas.txt
61 B
-
pbdb_data_Canidae_NA_full.csv
3.35 MB
-
README.md
9.02 KB
-
Training_set.csv
2.82 KB
Abstract
Understanding biodiversity patterns and the processes that generate them are key goals in macroevolutionary studies. Diversity-dependent models of diversification have been used to indirectly infer the relevance of interspecific competition on diversification dynamics. In this study, we develop a new approach that more explicitly incorporates spatial and eco-morphological overlap among species to test how interspecific competition may affect diversification dynamics in deep time. We build different metrics that capture temporal and spatial coexistence, and ecological overlap to test the hypothesis that an increase in the intensity of competition would result in a decrease in speciation and an increase in extinction rate. We test our predictions using the fossil record of North American canids, a group that has been extensively studied and well characterized both ecologically and from a paleontological point of view. We find that interspecific competition only affected diversification dynamics during the early stages of the radiation of canids, resulting in the suppression of speciation rate at the time the clade was expanding in diversity. We find no association between the intensity of the competition and extinction dynamics, nor an association between changes in diversification dynamics and temperature changes. We discuss the relevance of different factors in driving diversification dynamics changes over time and how evaluating the role of interspecific competition using different metrics that better capture the intensity of competition (as opposed to diversity-dependent models) might be a way forward to investigate the role of biotic interactions at deep time.
https://doi.org/10.1093/evolut/qpaf113
Description of the data and file structure
We obtained fossil occurrence data for canids from the Paleobiology Database (https://paleobiodb.org/#/), and craniodental measurements and phylogenetic data from published papers archived on Dryad (see links below). The analyses, conducted in R, create the following directory structure:
- MAIN: base directory storing data and analyses;
- PBDB: curated Paleobiology Database data;
- PyRate: PyRate version directory;
- PyRate_analyses: inputs and results from PyRate baseline analyses;
- Spatio_temporal_analyses: inputs and results from coexistence metrics;
- Diversity_curves: matrix data of standing diversity;
- Time_coex: matrix data of time coexistence;
- Spatial_coex: matrix data of spatial coexistence at different scales;
- Time_space: time series of coexistence through time and space at different scales;
- Ecomorphological data: craniodental measurements and ecomorphospace analyses;
- morphospace: body mass, diet and inputs and results for trait and LDA analyses;
- competition: time series of competition in different scales;
- Continuous: results from PyRateContinuous analyses;
- time_series: paste all time series to this directory;
- log_files: paste all .log files to this directory.
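As a rough illustration, the layout above could be created up front with base R. This is only a sketch: the nesting of subdirectories is a guess based on the order of the list (the scripts themselves define the real hierarchy), and the root is placed in a temporary directory so that nothing is written to a working project.

```r
# Root of the hypothetical layout; tempdir() is used only to keep the sketch harmless
main <- file.path(tempdir(), "MAIN")

# ASSUMPTION: the parent/child relationships below are inferred from list order
dirs <- c(
  "PBDB",
  "PyRate",
  "PyRate_analyses",
  "Spatio_temporal_analyses/Diversity_curves",
  "Spatio_temporal_analyses/Time_coex",
  "Spatio_temporal_analyses/Spatial_coex",
  "Spatio_temporal_analyses/Time_space",
  "Ecomorphological data/morphospace",
  "Ecomorphological data/competition",
  "Continuous/time_series",
  "Continuous/log_files"
)

# recursive = TRUE also creates the intermediate parent directories
for (d in dirs) {
  dir.create(file.path(main, d), recursive = TRUE, showWarnings = FALSE)
}
```

Running the provided scripts in order should make this step unnecessary; it is mainly useful for checking where outputs are expected to land.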
Input data
The input data are provided as the separate files listed below.
Occurrence Data
- pbdb_data_Canidae_NA_full.csv: Full Canidae occurrence dataset downloaded from Paleobiology Database (https://paleobiodb.org/#/) in August 2024.
- An occurrence is a fossil specimen or species record found at a specific location and geological layer.
- This dataset from the Paleobiology Database records fossil occurrences of animals from the dog family (Canidae) found in North America. Each row in the file represents one fossil occurrence, while the columns store detailed information about that fossil. The dataset includes the fossil’s taxonomic identification (such as genus or species), the geological time period when the organism lived (for example, the Miocene epoch), the estimated age in millions of years, and the geographic location where the fossil was discovered, including latitude and longitude. It also contains geological details about the rock layers or formations where the fossil was found and references to the scientific sources that reported the discovery. Overall, the dataset helps researchers study the evolution, geographic distribution, and extinction patterns of ancient dog-like animals over millions of years.
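To make the row-per-occurrence structure concrete, here is a minimal base R sketch that summarises occurrences per species. It uses a small made-up table rather than the real file, and the column names (`accepted_name`, `max_ma`, `min_ma`, `lat`, `lng`) are assumed to follow the usual Paleobiology Database download format; check them against the header of pbdb_data_Canidae_NA_full.csv before reusing the code.

```r
# Toy stand-in for the occurrence file: species names and values are invented
occ <- read.csv(text = "accepted_name,max_ma,min_ma,lat,lng
Sp_A,37.2,33.9,42.8,-103.5
Sp_A,33.9,30.8,43.1,-102.9
Sp_B,10.3,4.9,39.5,-99.3
Sp_C,0.126,0.0117,34.1,-118.4", stringsAsFactors = FALSE)

# Midpoint of the age range of each occurrence, in millions of years
occ$mid_ma <- (occ$max_ma + occ$min_ma) / 2

# Per-species occurrence count and observed stratigraphic range
# (tapply orders groups alphabetically, so the three vectors line up)
n_occ    <- tapply(occ$mid_ma, occ$accepted_name, length)
first_ma <- tapply(occ$max_ma, occ$accepted_name, max)
last_ma  <- tapply(occ$min_ma, occ$accepted_name, min)

summary_tab <- data.frame(
  species  = names(n_occ),
  n_occ    = as.vector(n_occ),
  first_ma = as.vector(first_ma),
  last_ma  = as.vector(last_ma)
)
```

The observed range computed here is only the raw first and last appearance; the true times of speciation and extinction used in the study come from the PyRate analysis.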
Ecomorphological Data
- Canidae_trees.tree: 1000 phylogenetic trees for Canidae, pruned from Carnivora trees of Faurby et al. (2024; https://doi.org/10.1098/rspb.2024.0473), used for data imputation on species with missing craniodental variables (available at doi.org/10.5061/dryad.76hdr7t48);
- eco_morph.csv: Includes original ecomorphological data for 6 craniodental variables from Slater (2015; https://doi.org/10.1073/pnas.1403666111), Faurby et al. (2021; https://doi.org/10.1111/geb.13369) and Juhn et al. (2024; https://doi.org/10.1017/pab.2024.27), featuring missing data as NA. Used for data imputation (available at doi.org/10.5061/dryad.9qd51, doi.org/10.5061/dryad.fttdz08t5 and doi.org/10.5061/dryad.2fqz612xw);
- canids_body_mass.csv: Log transformed body mass for all 133 species from Faurby et al. (2021; https://doi.org/10.1111/geb.13369) (available at doi.org/10.5061/dryad.fttdz08t5);
- Training_set.csv: Contains ecomorphological data for 6 craniodental variables used in Linear Discriminant Analysis. Comprises 25 extant canid species, nine procyonids, and Ailurus fulgens. Used to classify canid species into dietary categories. Original data taken from Slater (2015; https://doi.org/10.1073/pnas.1403666111) (available at doi.org/10.5061/dryad.9qd51);
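The classification step that the training set supports can be sketched with `MASS::lda`. The example below is not the authors' script: it uses simulated data with two hypothetical traits and three placeholder category labels (`diet_A`, `diet_B`, `diet_C`) instead of the six craniodental variables, simply to show how a model fitted on labelled species assigns categories and discriminant scores to unlabelled ones.

```r
library(MASS)  # provides lda(); ships with standard R installations

set.seed(1)
n <- 20  # simulated species per category (hypothetical)

# Simulated training set: labelled species with two made-up trait columns
train <- data.frame(
  diet   = factor(rep(c("diet_A", "diet_B", "diet_C"), each = n)),
  trait1 = c(rnorm(n, 2), rnorm(n, 0), rnorm(n, -2)),
  trait2 = c(rnorm(n, 1), rnorm(n, 0), rnorm(n, -1))
)

# Fit the discriminant functions on the labelled species
fit <- lda(diet ~ trait1 + trait2, data = train)

# Unlabelled (e.g. fossil) species to classify; values are invented
new_species <- data.frame(trait1 = c(2.1, -1.8), trait2 = c(0.9, -1.2))
pred <- predict(fit, newdata = new_species)

pred$class  # inferred category for each new species
pred$x      # scores on the discriminant axes (LD1, LD2)
```

With three categories there are at most two discriminant axes, which matches the LD1 and LD2 columns reported in the output files.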
R and PyRate scripts, output data: scripts provided to reproduce the analyses and each output
- 01_PBDB_data_manipulation.R: Curatorial work for occurrence data;
- PBDB_NEW_Canidae_NA_simpler_PyRate_high_resol.csv: curated dataset;
- PBDB_NEW_Canidae_NA_simpler_PyRate_high_resol_Species_List.csv: list of species names;
- 02_Generate_PyRate_input.R: Creates input for PyRate from cleaned occurrence data;
- Canidae_N_america.txt: occurrence data as PyRate input data;
- Canidae_N_america_PyRate.py: PyRate replicated randomized datasets;
- Canidae_N_america_TaxonList.txt: species names and status;
- nalmas.txt: text file with the intervals defined for PyRate qShift notation (provided);
- 03_Dataset_spatial_prep.R: Assigns occurrence data to NALMA;
- df_can.Rdata: occurrence data with nalmas;
- 04_TSTE_tables.R: Compile true times of speciation and extinction;
- longevities.RData: object with the true times of speciation and extinction for all replicas (provided; this is the result of the PyRate analysis);
- 05_Gap_filling.R: Estimate spatial distribution for time intervals in which species were not sampled;
- df_can_tot.Rdata: final occurrence dataset with spatial data;
- 06_Temporal_coexistence.R: Create the temporal coexistence matrices and time series;
- mat_time_replicas.Rdata: compilation of temporal matrices of coexistence;
- long.alive.Rdata: compilation of sampled species names in each time frame;
- div_curves.Rdata: compilation of sampled diversity in each time frame;
- 07_Spatial_coexistence.R: Create the spatial coexistence matrices and time series;
- pwdist.Rdata: compilation of pairwise spatial distance between species in kilometers;
- mat_reach_tot.Rdata: compilation of spatial coexistence matrices at reach scale;
- mat_site_tot.Rdata: compilation of spatial coexistence matrices at site scale;
- 08_Time_space.R: Combine temporal and spatial coexistence matrices and create time series;
- site_final.Rdata: compilation of spatiotemporal coexistence data on site scale;
- reach_final.Rdata: compilation of spatiotemporal coexistence data on reach scale;
- 09_Trait_imputation.R: Phylogenetic imputation of craniodental measurements for species with missing data;
- input_values.csv: results for all species;
- fossil_data.csv: results for extinct species;
- 10_Linear_Discriminant_Analysis.R: Perform LDA following Slater (2015);
- LDA_results.csv: LD1, LD2 and inferred diet for all species;
- final_data.csv: LD1, LD2, inferred diet and log transformed body mass for all species;
- 11_Temperature.R: Extract temperature time series from the RPANDA R package (Morlon et al. 2016);
- temperature.txt: paleotemperature curve adjusted to 0.1 million year time scale;
- 12_Competition_metrics.R: Create competition metrics time series;
- reach_mnnd.Rdata: compiled mnnd for reach scale;
- reach_mpd.Rdata: compiled mpd for reach scale;
- site_mnnd.Rdata: compiled mnnd for site scale;
- site_mpd.Rdata: compiled mpd for site scale;
- regional_mnnd.Rdata: compiled mnnd for regional scale;
- regional_mpd.Rdata: compiled mpd for regional scale;
- 13_Time_series_plot.R: Compile and plot time series data;
- 14_PyRateContinuous_results.R: Compile results from PyRateContinuous;
- results_table.Rdata: compiled correlation parameters from speciation and extinction rates for all scales;
- This script can be customized to the time window used in the PyRateContinuous analyses. We discuss the epoch time scale in the main text and the "cenozoic" scale in the ESM;
- 15_PyRateContinuous_plots.R: Create figures of PyRateContinuous results;
- 16_PyRate_Commands.txt: Contains commands used in PyRate (https://github.com/dsilvestro/PyRate) and PyRateContinuous analyses. Detailed instructions for PyRate setup and analysis can be found in the PyRate GitHub tutorials (https://github.com/dsilvestro/PyRate/blob/master/tutorials/README.md).
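The core idea behind the coexistence and competition scripts (06 and 12) can be sketched in base R. The example is a simplified illustration, not the authors' code: it assumes that `mnnd` and `mpd` stand for mean nearest-neighbour distance and mean pairwise distance in morphospace, uses invented species, ages and trait values, works on a single replicate, and ignores the spatial filtering applied at site and reach scales.

```r
# Hypothetical speciation (ts) and extinction (te) times, in Ma
long <- data.frame(
  species = c("sp1", "sp2", "sp3", "sp4"),
  ts = c(30, 25, 22, 8),
  te = c(20, 10, 5, 0)
)

# Hypothetical positions in a 2-D morphospace (e.g. a discriminant axis and log body mass)
traits <- matrix(c(0, 0,
                   1, 0,
                   0, 2,
                   3, 4),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(long$species, c("LD1", "mass")))

# Pairwise temporal coexistence: two lineages overlap if each originates
# before the other goes extinct
idx <- seq_len(nrow(long))
overlap <- outer(idx, idx, function(i, j) {
  long$ts[i] > long$te[j] & long$ts[j] > long$te[i]
})
dimnames(overlap) <- list(long$species, long$species)
diag(overlap) <- FALSE

# Species alive at a given time: originated at or before it, extinct after it
alive_at <- function(time_ma) {
  long$species[long$ts >= time_ma & long$te < time_ma]
}

# Mean nearest-neighbour distance and mean pairwise distance among a set of species
mnnd_mpd <- function(sp) {
  if (length(sp) < 2) return(c(mnnd = NA_real_, mpd = NA_real_))
  d <- as.matrix(dist(traits[sp, , drop = FALSE]))
  diag(d) <- NA  # exclude self-distances from the nearest-neighbour search
  c(mnnd = mean(apply(d, 1, min, na.rm = TRUE)),
    mpd  = mean(d[upper.tri(d)]))
}

# Time series of both metrics in 1-Myr steps from 30 Ma to the present
times <- seq(30, 0, by = -1)
series <- t(sapply(times, function(time_ma) mnnd_mpd(alive_at(time_ma))))
```

Smaller distances mean that coexisting species are packed more closely in morphospace, which is the sense in which such metrics are read as a proxy for competition intensity. Time steps with fewer than two species alive return `NA`.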
Code/Software
All analyses were conducted in R version 4.2.2 and PyRate version 3.0 (available at https://github.com/dsilvestro/PyRate).
