Environmental heterogeneity, rather than stability, explains spider assemblage differences between ecosystems
Data files
Aug 16, 2024 version, files: 493.90 KB
- Climatic_data.xlsx (23.69 KB)
- input_GDM.csv (413.88 KB)
- Nst_data.csv (19.43 KB)
- README.md (7.22 KB)
- Species_matrix.csv (29.69 KB)
Abstract
Open ecosystems (e.g., grasslands, prairies, shrublands) tend to be ecologically less stable than closed ones (i.e., forests) and encompass higher spatial heterogeneity in terms of environmental diversity. Such differences are expected to differentially constrain the diversity and structure of the communities that inhabit each of them, but identifying the specific processes driving contrasting biodiversity patterns between open and closed systems is challenging. In order to understand how environmental variability might structure spider assemblages, both between and within open and closed ecosystems, we implement a high throughput multiplex barcode sequencing approach to generate a dataset for 8585 specimens representing 168 species, across open ecosystems within the Canary Islands. Combining these with spider sequences from closed ecosystems within the same islands, we show that spider communities in open ecosystems show higher species richness, higher beta diversity, and higher proportions of rare species but proportionately lower numbers of endemic species than communities in closed ecosystems. We furthermore assess if environmental heterogeneity and habitat stability are the major drivers of such differences by assessing spatial genetic structuring and the influence of bioclimatic variables. Our results point to environmental heterogeneity rather than stability as a major driver of spatial patterns between open and closed ecosystems.
README: Environmental heterogeneity, rather than stability, explains spider assemblage differences between ecosystems
This README.txt file was generated on 2024-08-21 by Daniel Suárez
GENERAL INFORMATION
- Author Information
- Corresponding Investigator Name: Dr. Brent Charles Emerson Institution: IPNA-CSIC, San Cristóbal de La Laguna, Canary Islands, Spain Email: bemerson@ipna.csic.es
- Co-investigator 1 Name: Daniel Suárez Institution: IPNA-CSIC, San Cristóbal de La Laguna, Canary Islands, Spain
- Co-investigator 2 Name: Paula Arribas Institution: IPNA-CSIC, San Cristóbal de La Laguna, Canary Islands, Spain
- Co-investigator 4 Name: Amrita Srivathsan Institution: Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Centre for Integrative Biodiversity Discovery, Berlin, Germany
- Co-investigator 5 Name: Rudolf Meier Institution: Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Centre for Integrative Biodiversity Discovery, Berlin, Germany
- Date of data collection: 2012-2021
- Geographic location of data collection: Canary Islands, Spain
- Recommended citation for this dataset: Suárez et al. (2024). Data from: Environmental heterogeneity, rather than stability, explains spider assemblage differences between ecosystems. Dryad Data Repository.
DATA & FILE OVERVIEW
These datasets were generated to investigate how environmental variability might structure spider assemblages, both between and within open and closed ecosystems.
File List:
File 1 Name: Climatic_data.xlsx File Description: a database of bioclimatic data for the selected sites.
File 2 Name: input_GDM.csv File Description: database with bioclimatic and faunistic data needed for Generalised Dissimilarity Models.
File 3 Name: Nst_data.csv File Description: a database including fixation indexes (Nst) for the studied spider species.
File 4 Name: Species_matrix.csv File Description: a database including presence-absence data for the spider species.
DATA-SPECIFIC INFORMATION FOR: Climatic_data.xlsx
- Number of variables: 24
- Number of cases/rows: 57
- Variables List:
- island: code of each island (TF - Tenerife; LG - La Gomera; LP - La Palma; EH - El Hierro)
- ecosystem: type of each ecosystem (closed or open)
- site: code of each site
- lon: longitude (geographical coordinates)
- lat: latitude (geographical coordinates)
- BIO19: precipitation of the coldest quarter (kg/m2)
- BIO18: precipitation of the warmest quarter (kg/m2)
- BIO17: precipitation of the driest quarter (kg/m2)
- BIO16: precipitation of the wettest quarter (kg/m2)
- BIO15: precipitation seasonality (coefficient of variation) (kg/m2)
- BIO14: precipitation of the driest month (kg/m2)
- BIO13: precipitation of the wettest month (kg/m2)
- BIO12: annual precipitation (kg/m2)
- BIO11: mean temperature of the coldest quarter (ºC)
- BIO10: mean temperature of the warmest quarter (ºC)
- BIO9: mean temperature of the driest quarter (ºC)
- BIO8: mean temperature of the wettest quarter (ºC)
- BIO7: temperature annual range (BIO5-BIO6) (ºC)
- BIO6: min temperature of the coldest month (ºC)
- BIO5: max temperature of the warmest month (ºC)
- BIO4: temperature seasonality (standard deviation * 100) (ºC/100)
- BIO3: isothermality ((BIO2/BIO7)*100) (ºC)
- BIO2: mean diurnal range (mean of monthly (max temp - min temp)) (ºC)
- BIO1: annual mean temperature (ºC)
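Two of the variables above are defined in terms of others: BIO7 = BIO5 - BIO6 and BIO3 = (BIO2/BIO7)*100. The following is a minimal Python sketch of those two formulas, useful as a consistency check on a row of the table. The function names and the site values are hypothetical and are not taken from Climatic_data.xlsx; values stored in the file may also be rounded, so an exact match should not be assumed.

```python
def annual_range(bio5, bio6):
    """BIO7: max temperature of the warmest month minus min temperature of the coldest month."""
    return bio5 - bio6


def isothermality(bio2, bio7):
    """BIO3: mean diurnal range as a percentage of the annual range, (BIO2 / BIO7) * 100."""
    if bio7 == 0:
        raise ValueError("BIO7 is zero; isothermality is undefined")
    return bio2 / bio7 * 100.0


# Hypothetical values for one site (invented for illustration only).
site = {"BIO2": 6.0, "BIO5": 27.0, "BIO6": 12.0}

bio7 = annual_range(site["BIO5"], site["BIO6"])
bio3 = isothermality(site["BIO2"], bio7)
print(f"BIO7 = {bio7:.1f}, BIO3 = {bio3:.1f}")
```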
DATA-SPECIFIC INFORMATION FOR: input_GDM.csv
- Number of variables: 23
- Number of cases/rows: 1960
- Variables List:
- site: code of each site
- PBS: probable biological species (PBS) code
- lon: longitude (geographical coordinates)
- lat: latitude (geographical coordinates)
- BIO19: precipitation of the coldest quarter (kg/m2)
- BIO18: precipitation of the warmest quarter (kg/m2)
- BIO17: precipitation of the driest quarter (kg/m2)
- BIO16: precipitation of the wettest quarter (kg/m2)
- BIO15: precipitation seasonality (coefficient of variation) (kg/m2)
- BIO14: precipitation of the driest month (kg/m2)
- BIO13: precipitation of the wettest month (kg/m2)
- BIO12: annual precipitation (kg/m2)
- BIO11: mean temperature of the coldest quarter (ºC)
- BIO10: mean temperature of the warmest quarter (ºC)
- BIO9: mean temperature of the driest quarter (ºC)
- BIO8: mean temperature of the wettest quarter (ºC)
- BIO7: temperature annual range (BIO5-BIO6) (ºC)
- BIO6: min temperature of the coldest month (ºC)
- BIO5: max temperature of the warmest month (ºC)
- BIO4: temperature seasonality (standard deviation * 100) (ºC/100)
- BIO3: isothermality ((BIO2/BIO7)*100) (ºC)
- BIO2: mean diurnal range (mean of monthly (max temp - min temp)) (ºC)
- BIO1: annual mean temperature (ºC)
DATA-SPECIFIC INFORMATION FOR: Nst_data.csv
- Number of variables: 25
- Number of cases/rows: 149
- Variables List:
- Ecosystem: type of each ecosystem (closed or open)
- PBS: Probable Biological Species (PBS) code
- Taxonomy: PBS taxonomical classification
- Dispersal: dispersal ability (NoBallooning: poor dispersal ability; Ballooning: good dispersal ability)
- Nseq_AR: number of sequences at the archipelago scale
- Gst_AR: Gst value at the archipelago scale
- Gst_pval_AR: p-value for Gst value at the archipelago scale
- Nst_AR: Nst value at the archipelago scale
- Nst_pval_AR: p-value for Nst value at the archipelago scale
- Gst_TF: Gst value at the within-island (Tenerife) scale
- Gst_pval_TF: p-value for Gst value at the within-island (Tenerife) scale
- Nst_TF: Nst value at the within-island (Tenerife) scale
- Nst_pval_TF: p-value for Nst value at the within-island (Tenerife) scale
- Gst_LG: Gst value at the within-island (La Gomera) scale
- Gst_pval_LG: p-value for Gst value at the within-island (La Gomera) scale
- Nst_LG: Nst value at the within-island (La Gomera) scale
- Nst_pval_LG: p-value for Nst value at the within-island (La Gomera) scale
- Gst_LP: Gst value at the within-island (La Palma) scale
- Gst_pval_LP: p-value for Gst value at the within-island (La Palma) scale
- Nst_LP: Nst value at the within-island (La Palma) scale
- Nst_pval_LP: p-value for Nst value at the within-island (La Palma) scale
- Gst_EH: Gst value at the within-island (El Hierro) scale
- Gst_pval_EH: p-value for Gst value at the within-island (El Hierro) scale
- Nst_EH: Nst value at the within-island (El Hierro) scale
- Nst_pval_EH: p-value for Nst value at the within-island (El Hierro) scale
Nst and Gst values range from 0 (no geographic structuring) to 1 (complete geographic structuring). NA values indicate that Nst and/or Gst were not calculated for a given PBS.
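Because uncalculated values are stored as NA, any summary of the Nst or Gst columns has to skip them explicitly. The sketch below shows one way to do that in Python using only the standard library. The column names follow the variable list above, but the rows are invented toy values and the helper names are hypothetical.

```python
import csv
import io
from statistics import mean

# Toy rows mimicking a few Nst_data.csv columns; all values are invented.
TOY_CSV = """Ecosystem,PBS,Nst_AR,Nst_pval_AR
open,PBS001,0.75,0.01
open,PBS002,NA,NA
closed,PBS003,0.25,0.30
closed,PBS004,0.75,0.04
"""


def parse_value(text):
    """Return a float, or None when the field holds the string 'NA'."""
    return None if text.strip() == "NA" else float(text)


def mean_by_ecosystem(handle, column="Nst_AR"):
    """Average one fixation-index column per ecosystem, skipping NA entries."""
    groups = {}
    for row in csv.DictReader(handle):
        value = parse_value(row[column])
        if value is not None:
            groups.setdefault(row["Ecosystem"], []).append(value)
    return {ecosystem: mean(values) for ecosystem, values in groups.items()}


result = mean_by_ecosystem(io.StringIO(TOY_CSV))
print(result)
```

To run this against the real file, replace `io.StringIO(TOY_CSV)` with an open file handle for Nst_data.csv, assuming its header matches the variable names listed above.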
DATA-SPECIFIC INFORMATION FOR: Species_matrix.csv
- Number of variables: 248
- Number of cases/rows: 57
- Variables List:
- site: code of each site
- PBS001-247: the remaining columns (PBS001 to PBS247) give the presence (1) or absence (0) of each PBS at a given site (rows)
Methods
A total of 25 sites were sampled within open shrublands characterised by the presence of species of Euphorbia across the four western islands of the Canary Archipelago. These data were complemented with an existing dataset including 31 closed laurel forest sites across the same islands. Sampled individuals were examined under a stereomicroscope, and provisional taxonomic assignments were made at the species, genus, or family level. Subsequently, all specimens were DNA-barcoded using the multiplex barcoding approach. Community dissimilarity matrices were estimated at the species level across all sites. Matrices were generated using total β diversity (Sørensen index, βsor) and its additive turnover (Simpson index, βsim) and nestedness (βsne) components. To disentangle the contribution of different factors to βsor within ecosystems, generalised dissimilarity models (GDM) were applied. A matrix of dissimilarity across sites was used as the response variable, and geographical distance and bioclimatic variables from the sampling sites were used as predictors. Environmental variables (the same 19 bioclimatic variables defined in WorldClim; Fick and Hijmans 2017) were downloaded at the finest resolution available (100 m) from CanaryClim (Patiño et al. 2023). Correlations between predictor variables were assessed using Pearson's correlation coefficient r, establishing |r| < 0.7 as a threshold to avoid multicollinearity among the bioclimatic variables used as predictors. Models were then fitted with the gdm function (gdm package); 50 permutations were used to evaluate model and variable significance and to estimate variable importance (gdm.varImp function, gdm package), and deviance partitioning was performed with the gdm.partition.deviance function (gdm package). Models were run for each ecosystem separately, both across and within islands.
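The partition of βsor into turnover and nestedness can be illustrated with a short Python sketch. It assumes the widely used pairwise definitions, where a is the number of species shared by two sites and b and c are the numbers unique to each: βsor = (b + c)/(2a + b + c), βsim = min(b, c)/(a + min(b, c)), and βsne = βsor - βsim. The text above does not name the software used for this step, so this is an illustration of the indices and not a reproduction of the original analysis; the site contents are invented.

```python
def beta_partition(site_a, site_b):
    """Return (beta_sor, beta_sim, beta_sne) for two sets of species codes."""
    if not site_a or not site_b:
        raise ValueError("both sites must contain at least one species")
    a = len(site_a & site_b)   # species shared by both sites
    b = len(site_a - site_b)   # species found only in the first site
    c = len(site_b - site_a)   # species found only in the second site
    beta_sor = (b + c) / (2 * a + b + c)
    beta_sim = min(b, c) / (a + min(b, c))
    beta_sne = beta_sor - beta_sim
    return beta_sor, beta_sim, beta_sne


# Toy assemblages (invented): the second site is a strict subset of the first,
# while the third site replaces one species of the second.
site_1 = {"PBS001", "PBS002", "PBS003", "PBS004"}
site_2 = {"PBS001", "PBS002"}
site_3 = {"PBS001", "PBS005"}

nested = beta_partition(site_1, site_2)
turnover = beta_partition(site_2, site_3)
print("nested pair:  ", nested)
print("turnover pair:", turnover)
```

In the first pair all dissimilarity is attributed to nestedness (βsim is zero), and in the second pair all of it is attributed to turnover (βsne is zero), which is the distinction the two components are meant to capture.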
Finally, the overall environmental heterogeneity and the spatial structure of this environmental heterogeneity were compared between open and closed ecosystems. First, principal component analyses (PCA) (dudi.pca function, ade4 package; Dray and Dufour 2007) were conducted at both the archipelago and the within-island scale using the 19 bioclimatic variables from Patiño et al. (2023) described above. The betadisper function (vegan package; Oksanen et al. 2022) was used to evaluate the multivariate homogeneity of group dispersions (mean distances to centroid) between ecosystems. To test for differences in temporal stability between ecosystems, generalised linear models (glm function, stats package) were fitted with a Gaussian error structure, using as dependent variables the following bioclimatic variables related to climatic variation: BIO2 (mean diurnal temperature range), BIO7 (temperature annual range), BIO3 (isothermality), and BIO15 (precipitation seasonality). In addition, multiple regression on distance matrices (MRM function, ecodist package; Goslee and Urban 2007) was implemented to assess the correlation between the geographical and environmental (Euclidean) distance matrices for each ecosystem at both archipelago and island scales. To test for the geographic structuring of genetic variation within PBS, the fixation index NST was calculated using SPAGeDi 1.5. NST was calculated both among islands (each island is considered a population) and within individual islands (each sampling site is considered a population).
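The relationship between geographical and environmental distance can be sketched in plain Python as follows. This is a simplified stand-in, not the MRM procedure itself: it computes great-circle distances between sites, Euclidean distances between their environmental vectors, and the Pearson correlation between the two sets of pairwise distances, and it omits the permutation test that MRM uses for significance. Site codes, coordinates, and environmental values are all invented, and the environmental values are assumed to be already standardised.

```python
import math
from itertools import combinations


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def pairwise_distances(sites):
    """Return parallel lists of geographic and environmental distances for all site pairs."""
    geo, env = [], []
    for (_, a), (_, b) in combinations(sorted(sites.items()), 2):
        geo.append(haversine_km(a["lon"], a["lat"], b["lon"], b["lat"]))
        env.append(euclidean(a["env"], b["env"]))
    return geo, env


# Hypothetical sites: coordinates and standardised environmental values are invented.
sites = {
    "S1": {"lon": -16.50, "lat": 28.30, "env": [0.1, -0.4]},
    "S2": {"lon": -16.40, "lat": 28.35, "env": [0.3, -0.2]},
    "S3": {"lon": -17.20, "lat": 28.10, "env": [-0.9, 0.8]},
    "S4": {"lon": -17.90, "lat": 28.70, "env": [1.2, 0.5]},
}

geo, env = pairwise_distances(sites)
r = pearson_r(geo, env)
print(f"{len(geo)} site pairs, r = {r:.3f}")
```

Because pairwise distances are not independent observations, the correlation printed here should not be read with an ordinary parametric p-value; that non-independence is the reason permutation-based approaches are used for distance matrices.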