Integrating biotic interactions in niche analyses unravels the patterns underneath community composition in clownfishes
Data files
Apr 13, 2023 version files 46 MB
- 3U_estimates_simplified.csv: 23.73 KB
- amph_behavior.csv: 990 B
- environmental_dataset.csv: 18.15 MB
- GG_spatGLMmodel_predictions.csv: 45.08 KB
- GS_spatGLMmodel_predictions.csv: 47.52 KB
- interaction_matrix.csv: 1.45 KB
- marine_regions.csv: 6.30 MB
- niche_overlaps.csv: 44.08 KB
- overall_spatGLMmodel_predictions.csv: 44.61 KB
- README.md: 23.34 KB
- spatial_results.csv: 21.27 MB
- SS_spatGLMmodel_predictions.csv: 47.15 KB
- summary_spatial.csv: 3 KB
Sep 24, 2024 version files 46 MB
- 3U_estimates_simplified.csv: 23.73 KB
- amph_behavior.csv: 990 B
- environmental_dataset.csv: 18.15 MB
- GG_spatGLMmodel_predictions.csv: 45.08 KB
- GS_spatGLMmodel_predictions.csv: 47.52 KB
- interaction_matrix.csv: 1.45 KB
- marine_regions.csv: 6.30 MB
- niche_overlaps.csv: 44.08 KB
- overall_spatGLMmodel_predictions.csv: 44.61 KB
- README.md: 21.48 KB
- spatial_results.csv: 21.27 MB
- SS_spatGLMmodel_predictions.csv: 47.15 KB
- summary_spatial.csv: 3 KB
Abstract
Biotic interactions shape the ecology of species and communities, yet their integration into ecological niche modeling methods remains challenging. Despite being a central topic of research for the past decade, the impact of biotic interactions on species distributions and community composition is often overlooked. Mutualistic systems offer ideal case studies for examining the effects of biotic interactions on species niches and community dynamics. This study presents a novel approach to incorporating mutualistic interactions into niche modeling, using the clownfish-sea anemone system. By adapting existing niche quantification frameworks, we developed a method to estimate the partial effects of known interactions and refine ecological niche estimates. This approach allows for a more comprehensive understanding of how mutualistic relationships influence species distributions and community assembly patterns. We also used mutualistic information to investigate the resource-use overlap, identifying patterns of competition within clownfish communities. Our results reveal significant deviations in niche estimates when biotic interactions are considered, particularly for specialist species. Host partitioning among clownfish species reduces resource-use overlap, facilitating coexistence in species-rich habitats and highlighting mutualism’s role in promoting and maintaining diversity. We uncover complex dynamics in resource-use overlap among clownfish species, influenced by factors such as species richness, ecological niche overlap, and host specialization. Specialist-generalist interactions strike an optimal balance, supporting high species richness while minimizing competition. These insights enhance our understanding of clownfish biodiversity patterns, demonstrating how diverse mutualistic strategies contribute to diversity build-up and mitigate competitive exclusion in saturated communities.
The analytical framework presented has broad applications beyond the clownfish-sea anemone system, potentially extending to a broader range of interactions. It enables a more comprehensive understanding of biodiversity maintenance in complex ecosystems and constitutes a valuable tool for conservation planning and ecosystem management.
README: Reference Information
Provenance for this README
- File name: README_Dataset_IBINAOC.txt
- Authors: Alberto Garcia Jimenez
- Other contributors: Olivier Broennimann, Antoine Guisan, Théo Gaboriau and Nicolas Salamin
- Date created: 2023-02-20
- Date modified: 2023-03-30
Dataset Version and Release History
- Current Version:
- Number: 1.0.0
- Date: 2023-03-30
- Persistent identifier: DOI: 10.5061/dryad.2bvq83bv8
- Summary of changes: n/a
- Embargo Provenance: n/a
- Scope of embargo: n/a
- Embargo period: n/a
Dataset Attribution and Usage
- Dataset Title: Data for the article "Integrating biotic interactions in niche analyses unravels the patterns underneath community composition in clownfishes"
- Persistent Identifier: https://doi.org/10.5061/dryad.2bvq83bv8
- Dataset Contributors:
- Creators: Alberto Garcia Jimenez, Olivier Broennimann, Antoine Guisan, Théo Gaboriau and Nicolas Salamin
- Date of Issue: 2023-03-30
- Publisher: n/a
- License: Use of these data is covered by the following license:
- Title: CC0 1.0 Universal (CC0 1.0)
- Specification: https://creativecommons.org/publicdomain/zero/1.0/; the authors respectfully request to be contacted by researchers interested in the re-use of these data so that the possibility of collaboration can be discussed.
- Suggested Citations:
- Dataset citation: Garcia Jimenez, Alberto et al. (2023), Integrating biotic interactions in niche analyses unravels the patterns underneath community composition in clownfishes, Dryad, Dataset, https://doi.org/10.5061/dryad.2bvq83bv8
- Corresponding publication: n/a
Contact Information
- Name: Alberto Garcia Jimenez
- Affiliations: Department of Computational Biology, University of Lausanne
- ORCID ID: https://orcid.org/0000-0002-1532-8784
- Email: agarcia26286@gmail.com
- Alternate Email: alberto.garciajimenez@unil.ch
- Alternative Contact: PI
- Name: Nicolas Salamin
- Affiliations: Department of Computational Biology, University of Lausanne
- ORCID ID: https://orcid.org/0000-0002-3963-4954
- Email: nicolas.salamin@unil.ch
- Address: Genopode, University of Lausanne, 1015 Lausanne, Switzerland
- Contributor ORCID IDs:
- Alberto Garcia Jimenez: https://orcid.org/0000-0002-1532-8784
- Olivier Broennimann: https://orcid.org/0000-0001-9913-3695
- Antoine Guisan: https://orcid.org/0000-0002-3998-4815
- Théo Gaboriau: https://orcid.org/0000-0001-7530-2204
- Nicolas Salamin: https://orcid.org/0000-0002-3963-4954
Additional Dataset Metadata
Dates and Locations
- Dates of data collection: Data collected between May 2019 and July 2022
- Sources of data collection: Geo-referenced occurrences were retrieved from GBIF (last reference record: GBIF.org (30 March 2023) GBIF Occurrence Download https://doi.org/10.15468/dl.7vn4tq)
Occurrence data sets were obtained from RLS, GBIF, OBIS and Hexacoral (Atlas of Living Australia 2017; GBIF.org 2018; OBIS 2017; Fautin 2008 respectively)
Environmental data were obtained from GMED (Basher et al. 2018) and Bio-Oracle (Tyberghein et al. 2012; Assis et al. 2018) using the same resolution and extent for both datasets (0.083° x 0.083° cell size, approximately 9.2 km near the equator). They were filtered to locations corresponding to the shallow reefs of the Indo-Pacific Ocean plus the epipelagic zone above 50 m depth, using a map of warm-water coral reef locations from UNEP-WCMC (UNEP-WCMC 2018).
The marine provinces and realm delimitations (Figure S1) were obtained from MEOW (Spalding et al. 2007).
Methodological Information
- Methods of data collection/generation: see manuscript for details
Data and File Overview
Summary Metrics
- File count: 58
- Total file size: 83.05 MB
- Range of individual file sizes: 158 bytes - 38.84 MB
- File formats: .csv, .R, .png, .pdf, .tar.gz, .zip
Table of Contents
The contents of this dataset are:
- Data: selected environmental variables, defined marine provinces and filtered geo-referenced occurrences of 28 clownfish species and 10 sea anemone species used to build the ENMs.
- Scripts: ordered scripts to reproduce the analyses and results sequentially, and scripts to reproduce the main and supplementary figures and tables from the corresponding uncompressed zip file.
- Results: data sets containing the results of the study
- Figures and tables: main and supplementary figures in .png format and tables in .pdf format.
Data sets:
amph_behavior.csv
interaction_matrix.csv
environmental_dataset.csv
marine_regions.csv
Results:
3U_estimates_simplified.csv
niche_overlaps.csv
summary_spatial.csv
spatial_results.csv
GG_spatGLMmodel_predictions.csv
GS_spatGLMmodel_predictions.csv
SS_spatGLMmodel_predictions.csv
overall_spatGLMmodel_predictions.csv
Scripts:
supplementary_scripts.zip
spatialGLMM.R
GLM_and_moranItests.R
main_figures.R
0a_occ_datapreparation.R
0b_env.var.selection.R
5_spatialCompetition.R
4_nicheOverlaps.R
0c_marine_regions.R
3_3Umetrics.R
1_environmental_niche.R
2_biotic_correction.R
supp_figures.R
tables.R
Compressed files:
NINA_0.1.0.tar.gz
figures_scripts.zip
Setup
- Unpacking instructions: n/a
- Relationships between files/folders: n/a
- Recommended software/tools: RStudio 2021.09.2; R version > 3.6.3
File/Folder Details
Details for: amph_behavior.csv
- Description: a comma-delimited file containing the mutualistic behavior of clownfishes
- Format(s): .csv
- Size(s): 990 bytes
- Dimensions: 29 rows x 2 columns
- Variables:
- species: species name in the form of GENUS_SPECIES (character)
- mutualism: generalist or specialist (character)
- Missing data codes: NA
Details for: interaction_matrix.csv
- Description: a comma-delimited file containing the interaction information between clownfishes and sea anemones.
- Format(s): .csv
- Size(s): 1.45 KB
- Dimensions: 29 rows x 11 columns
- Variables:
- species: species name in the form of GENUS_SPECIES (character)
- Columns 2 to 11: values of 0 or 1 indicating absence of interaction or mutualistic interaction, respectively, with the species name in the form of GENUS_SPECIES of sea anemones in the header (numeric)
- Missing data codes: NA
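As a minimal sketch of how a table with this layout could be summarised, the R snippet below builds a small made-up interaction matrix and counts the hosts used by each clownfish species with `rowSums()`. The species and host names are placeholders rather than values from the file, and the snippet assumes the first column is `species` and the remaining columns are 0/1 indicators, as described above.

```r
# Toy interaction matrix with the same layout as interaction_matrix.csv.
# Species and host names are hypothetical placeholders.
interactions <- data.frame(
  species = c("Clownfish_a", "Clownfish_b", "Clownfish_c"),
  Anemone_x = c(1, 0, 1),
  Anemone_y = c(1, 0, 0),
  Anemone_z = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Drop the species column and keep the 0/1 indicators as a numeric matrix.
host_matrix <- as.matrix(interactions[, -1])
rownames(host_matrix) <- interactions$species

# Number of interacting hosts per clownfish species.
n_hosts <- rowSums(host_matrix, na.rm = TRUE)
print(n_hosts)
```

With the real file, a call such as `read.csv("interaction_matrix.csv")` (path adjusted to where the file is stored) would replace the toy data frame.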
Details for: environmental_dataset.csv
- Description: a comma-delimited file containing the environmental information of the study area.
- Format(s): .csv
- Size(s): 18.15 MB
- Dimensions: 1023560 rows x 10 columns
- Variables:
- x: decimal Longitude (numeric)
- y: decimal Latitude (numeric)
- Current.Velocity.Mean: Current velocity mean at max depth (m/s; numeric)
- Salinity.Mean: Salinity concentration mean (PSS; numeric)
- Temperature.Mean: Temperature mean at bottom (°C; numeric)
- Nitrate.Mean: Nitrate concentration mean (μmol/m3; numeric)
- Nitrate.Range: Nitrate concentration range (μmol/m3; numeric)
- Chlorophyll.Mean: Chlorophyll concentration mean (mg/m3; numeric)
- Dissolved.oxygen.Range: Dissolved oxygen concentration range (μmol/m3; numeric)
- Phytoplankton.Mean: Phytoplankton concentration mean (μmol/m3; numeric)
- Missing data codes: NA
Details for: marine_regions.csv
- Description: a comma-delimited file containing the province and realm delimitation of the study area.
- Format(s): .csv
- Size(s): 6.3 MB
- Dimensions: 102360 rows x 4 columns
- Variables:
- x: decimal Longitude (numeric)
- y: decimal Latitude (numeric)
- province: Province name (character)
- realm: Realm name (character)
- Missing data codes: NA
Details for: 3U_estimates_simplified.csv
- Description: a comma-delimited file containing the results of UUU analyses.
- Format(s): .csv
- Size(s): 23.73 KB
- Dimensions: 108 rows x 36 columns
- Variables:
- region: Province name (character)
- species: species name in the form of GENUS_SPECIES (character)
- D: Schoener's D value between the environmental niche and the corrected niche (numeric). 1 - D equals niche dissimilarity.
- CS: Centroid shift value (numeric)
- ER: Environmental Shift value (numeric)
- Unavailable: Unavailable proportion of the niche (numeric).
- Used: Used proportion of the niche (numeric).
- Unoccupied: Unoccupied proportion of the niche (numeric).
- n.hosts: number of interacting hosts (integer)
- realm: marine realm relative to the marine region (character).
- host_use: Proportion of host use relative to the number of sea anemones present inhabiting the region (numeric).
- n.hosts.reg: number of interacting hosts present in the region (integer).
- behavior.reg: mutualistic behavior relative to n.hosts.reg (character).
- spatial_Unoccupied: Unoccupied proportion of the distribution range (numeric).
- spatial_Used: Used proportion of the distribution range (numeric).
- spatial_Unavailable: Unavailable proportion of the distribution range (numeric).
- Cramer_statistic: Value of the Cramer statistic for the given observations (numeric).
- Cramer_pvalue: Estimated p-value of the Cramer statistic test (numeric).
- Missing data codes: NA
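For readers unfamiliar with the D column, the sketch below computes Schoener's D for two densities defined on the same grid cells, using the standard definition D = 1 - 0.5 * sum(|p1 - p2|) after scaling each density to sum to 1. The input vectors are invented for illustration, and this is not the NINA implementation, which may estimate and smooth the densities differently.

```r
# Schoener's D between two non-negative density vectors on the same grid cells.
schoener_d <- function(z1, z2) {
  stopifnot(length(z1) == length(z2), all(z1 >= 0), all(z2 >= 0))
  p1 <- z1 / sum(z1)  # scale each density so it sums to 1
  p2 <- z2 / sum(z2)
  1 - 0.5 * sum(abs(p1 - p2))
}

# Made-up densities standing in for an environmental and a corrected niche.
environmental <- c(0.1, 0.4, 0.3, 0.2, 0.0)
corrected     <- c(0.0, 0.5, 0.3, 0.2, 0.0)

d <- schoener_d(environmental, corrected)
dissimilarity <- 1 - d
print(c(D = d, dissimilarity = dissimilarity))
```

D ranges from 0 (no overlap) to 1 (identical densities), so 1 - D can be read directly as niche dissimilarity.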
Details for: niche_overlaps.csv
- Description: a comma-delimited file containing the results of niche overlap analyses.
- Format(s): .csv
- Size(s): 44 KB
- Dimensions: 275 rows x 10 columns
- Variables:
- region: Province name (character)
- spa: species name in the form of GENUS_SPECIES (character)
- spb: species name in the form of GENUS_SPECIES (character)
- Denvironmental: Niche overlap between spa and spb environmental niches (numeric)
- Dcorrected: Niche overlap between spa and spb environmental niches after correcting for biotic interactions (numeric)
- Dhost-specific: Niche overlap between spa and spb environmental niches considering habitat/resource partitioning (numeric)
- shared_hosts: number of shared hosts between spa and spb (integer)
- p_host_shared: proportion of shared hosts between spa and spb (numeric)
- interaction: type of interaction (specialist-specialist, generalist-specialist, generalist-generalist) (character)
- c_host_shared: category related to the number of shared hosts (none, some, all) (character)
- Missing data codes: NA
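The host-sharing columns can be illustrated with a short base R sketch on a made-up 0/1 interaction matrix. Two choices here are assumptions, since the README does not define them: the proportion uses the number of hosts used by either species as its denominator, and "all" means that both species use exactly the same hosts. All species and host names are placeholders.

```r
# Toy 0/1 interaction matrix (rows: clownfishes, columns: sea anemones).
# All names are hypothetical placeholders.
host_matrix <- matrix(
  c(1, 1, 1,
    0, 0, 1,
    0, 0, 1,
    1, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Clownfish_a", "Clownfish_b", "Clownfish_c", "Clownfish_d"),
    c("Anemone_x", "Anemone_y", "Anemone_z")
  )
)

# Entry [i, j] is the number of hosts used by both species i and species j.
shared <- tcrossprod(host_matrix)

# One row per unordered species pair.
pairs <- t(combn(rownames(host_matrix), 2))
result <- data.frame(spa = pairs[, 1], spb = pairs[, 2],
                     stringsAsFactors = FALSE)
result$shared_hosts <- shared[pairs]

# Assumed denominator: hosts used by at least one species of the pair.
n_hosts <- rowSums(host_matrix)
union_hosts <- unname(n_hosts[result$spa] + n_hosts[result$spb]) -
  result$shared_hosts
result$p_host_shared <- result$shared_hosts / union_hosts

# Assumed categories: none (no shared host), all (identical host sets), some.
result$c_host_shared <- ifelse(
  result$shared_hosts == 0, "none",
  ifelse(result$p_host_shared == 1, "all", "some")
)
print(result)
```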
Details for: summary_spatial.csv
- Description: a comma-delimited file containing the summary of spatial interaction analyses.
- Format(s): .csv
- Size(s): 3 KB
- Dimensions: 18 rows x 27 columns
- Variables:
- province: Province name (character)
- x: average decimal Longitude of the province (numeric)
- y: average decimal Latitude of the province (numeric)
- amph.richness: average clownfish richness per province (numeric)
- anem.richness: average sea anemone richness per province (numeric)
- nG: average number of generalists per province (numeric)
- nS: average number of specialists per province (numeric)
- T.int: average number of potential interactions (numeric)
- D.total: average distribution overlap among species (numeric)
- H.total: average habitat overlap among species (considering host use) (numeric)
- EN.total: average corrected environmental niche overlap among species (numeric)
- EC.total: average host-specific niche overlap among species (numeric)
- GG: average distribution overlap among generalist species (numeric)
- GS: average distribution overlap among generalist-specialist interactions (numeric)
- SS: average distribution overlap among specialist species (numeric)
- TGG: average number of generalist-generalist interactions (numeric)
- TGS: average number of generalist-specialist interactions (numeric)
- TSS: average number of specialist-specialist interactions (numeric)
- HGG: average habitat overlap among generalist species (considering host use) (numeric)
- HGS: average habitat overlap among generalist-specialist interactions (considering host use) (numeric)
- HSS: average habitat overlap among specialist species (considering host use) (numeric)
- ENGG: average corrected environmental niche overlap among generalist species (numeric)
- ENGS: average corrected environmental niche overlap among generalist-specialist interactions (numeric)
- ENSS: average corrected environmental niche overlap among specialist species (numeric)
- ECGG: average host-specific niche overlap among generalist species (numeric)
- ECGS: average host-specific niche overlap among generalist-specialist interactions (numeric)
- ECSS: average host-specific niche overlap among specialist species (numeric)
- Missing data codes: NA
Details for: spatial_results.csv
- Description: a comma-delimited file containing the results of the spatial interaction analyses.
- Format(s): .csv
- Size(s): 23.3 MB
- Dimensions: 75151 rows x 27 columns
- Variables:
- province: Province name (character)
- x: decimal Longitude of the location (numeric)
- y: decimal Latitude of the location (numeric)
- amph.richness: clownfish richness at the location (numeric)
- anem.richness: sea anemone richness at the location (numeric)
- nG: number of generalists at the location (numeric)
- nS: number of specialists at the location (numeric)
- T.int: number of potential interactions (numeric)
- D.total: distribution overlap among species (numeric)
- H.total: habitat overlap among species (considering host use) (numeric)
- EN.total: corrected environmental niche overlap among species (numeric)
- EC.total: host-specific niche overlap among species (numeric)
- GG: distribution overlap among generalist species (numeric)
- GS: distribution overlap among generalist-specialist interactions (numeric)
- SS: distribution overlap among specialist species (numeric)
- TGG: number of generalist-generalist interactions (numeric)
- TGS: number of generalist-specialist interactions (numeric)
- TSS: number of specialist-specialist interactions (numeric)
- HGG: habitat overlap among generalist species (considering host use) (numeric)
- HGS: habitat overlap among generalist-specialist interactions (considering host use) (numeric)
- HSS: habitat overlap among specialist species (considering host use) (numeric)
- ENGG: corrected environmental niche overlap among generalist species (numeric)
- ENGS: corrected environmental niche overlap among generalist-specialist interactions (numeric)
- ENSS: corrected environmental niche overlap among specialist species (numeric)
- ECGG: host-specific niche overlap among generalist species (numeric)
- ECGS: host-specific niche overlap among generalist-specialist interactions (numeric)
- ECSS: host-specific niche overlap among specialist species (numeric)
- Missing data codes: NA
Details for: GG_spatGLMmodel_predictions.csv
- Description: a comma-delimited file containing the model predictions and the CI of the host-specific niche overlap in response to species richness and environmental niche overlap in generalist-generalist interactions.
- Format(s): .csv
- Size(s): 45 KB
- Dimensions: 625 rows x 5 columns
- Variables:
- nG: number of generalist species (integer)
- ENGG: environmental niche overlap (numeric)
- predicted: predicted host-specific niche overlap (numeric)
- fixefVar_0.025: lower bound of the confidence interval (numeric)
- fixefVar_0.975: upper bound of the confidence interval (numeric)
- Missing data codes: NA
Details for: GS_spatGLMmodel_predictions.csv
- Description: a comma-delimited file containing the model predictions and the CI of the host-specific niche overlap in response to species richness and environmental niche overlap in generalist-specialist interactions.
- Format(s): .csv
- Size(s): 48 KB
- Dimensions: 625 rows x 5 columns
- Variables:
- amph.richness: number of species (integer)
- ENGS: environmental niche overlap (numeric)
- predicted: predicted host-specific niche overlap (numeric)
- fixefVar_0.025: lower bound of the confidence interval (numeric)
- fixefVar_0.975: upper bound of the confidence interval (numeric)
- Missing data codes: NA
Details for: SS_spatGLMmodel_predictions.csv
- Description: a comma-delimited file containing the model predictions and the CI of the host-specific niche overlap in response to species richness and environmental niche overlap in specialist-specialist interactions.
- Format(s): .csv
- Size(s): 47 KB
- Dimensions: 625 rows x 5 columns
- Variables:
- nS: number of specialist species (integer)
- ENSS: environmental niche overlap (numeric)
- predicted: predicted host-specific niche overlap (numeric)
- fixefVar_0.025: lower bound of the confidence interval (numeric)
- fixefVar_0.975: upper bound of the confidence interval (numeric)
- Missing data codes: NA
Details for: overall_spatGLMmodel_predictions.csv
- Description: a comma-delimited file containing the model predictions and the CI of the host-specific niche overlap in response to species richness and environmental niche overlap in all types of interactions.
- Format(s): .csv
- Size(s): 45 KB
- Dimensions: 625 rows x 5 columns
- Variables:
- amph.richness: number of species (integer)
- EN.Total: environmental niche overlap (numeric)
- predicted: predicted host-specific niche overlap (numeric)
- fixefVar_0.025: lower bound of the confidence interval (numeric)
- fixefVar_0.975: upper bound of the confidence interval (numeric)
- Missing data codes: NA
Description of the script files and structure
Scripts indexed 0 (0a, 0b and 0c) perform preliminary filtering and data manipulation. The shared input data sets are the output of those scripts.
1_environmental_niche.R fits the PCA-based ENMs and is run twice, once for the sea anemones and once for the clownfishes. Each run creates a folder containing the niche models.
Run the following commands in the shell:
Rscript scripts/1_environmental_niche.R data/environmental_dataset.csv data/anem_occ_final_dataset.csv marine_regions.csv results anem_ENMs
Rscript scripts/1_environmental_niche.R data/environmental_dataset.csv data/amph_occ_final_dataset.csv marine_regions.csv results amph_ENMs
2_biotic_correction.R corrects the niche based on the mutualistic associations. This script creates a folder with the corrected niche models.
Run the following command in the shell:
Rscript scripts/2_biotic_correction.R anem_ENMs amph_ENMs data/interaction_matrix.csv results amph_EBMs
3_3Umetrics.R estimates the UUU parameters to compare the ecological niche with the corrected niche and assess the impact of biotic interactions. This script outputs the file '3U_estimates.csv'.
Run the following command in the shell:
Rscript scripts/3_3Umetrics.R
4_nicheOverlaps.R estimates the niche overlap among clownfish species at the province level and outputs the file 'niche_overlaps.csv'.
5_spatialCompetition.R computes the spatial distributions of the different niche overlaps and also subsets them by interaction type. This script outputs the files 'spatial_results.csv' and 'summary_spatial.csv': the former holds the information per location and the latter the values averaged per province.
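The relationship between these two output files can be illustrated in base R. The sketch below averages per-location values within each province using `aggregate()`. The toy data frame, its province names and its values are invented, and the actual script may summarise the data differently (for example in how it handles missing values).

```r
# Toy per-location table mimicking a few columns of spatial_results.csv.
# Province names and values are hypothetical.
spatial_results <- data.frame(
  province = c("Province_1", "Province_1", "Province_2", "Province_2"),
  x = c(120.0, 121.0, 150.0, 152.0),
  y = c(-5.0, -6.0, -20.0, -22.0),
  amph.richness = c(4, 6, 2, NA),
  stringsAsFactors = FALSE
)

# Average every numeric column within each province, ignoring missing values.
numeric_columns <- spatial_results[, c("x", "y", "amph.richness")]
summary_spatial <- aggregate(
  numeric_columns,
  by = list(province = spatial_results$province),
  FUN = mean,
  na.rm = TRUE
)
print(summary_spatial)
```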
GLM_and_moranItests.R runs Generalized Linear Models and explores the spatial autocorrelation of the 'spatial_results.csv' data set.
spatialGLMM.R runs spatial Generalized Linear Mixed Models and creates model predictions for the surface plot shown in Figure S8. It outputs two sets of files. The files 'overall_spatGLMmodel.csv', 'GG_spatGLMmodel.csv', 'GS_spatGLMmodel.csv', and 'SS_spatGLMmodel.csv' contain the model results, which the script 'TableS4.R' uses to create the table. The files 'overall_spatGLMmodel_predictions.csv', 'GG_spatGLMmodel_predictions.csv', 'GS_spatGLMmodel_predictions.csv', and 'SS_spatGLMmodel_predictions.csv' contain the predictions, which are plotted by running the script 'FigureS8.R'.
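A prediction table of this kind has to be reshaped into a matrix before it can be drawn as a surface. The sketch below shows one way to do that with `tapply()`, assuming the rows form a regular grid over the two predictors (an assumption, since the grid layout is not stated here). The toy values are invented and the column names follow GG_spatGLMmodel_predictions.csv.

```r
# Toy prediction table with the column layout of
# GG_spatGLMmodel_predictions.csv; all values are invented.
predictions <- expand.grid(
  nG = 1:5,
  ENGG = seq(0, 1, by = 0.25)
)
predictions$predicted <- 0.1 * predictions$nG + 0.5 * predictions$ENGG

# Axis values of the grid, in increasing order.
nG_values <- sort(unique(predictions$nG))
ENGG_values <- sort(unique(predictions$ENGG))

# Long-to-wide reshape: rows follow nG, columns follow ENGG.
surface <- tapply(
  predictions$predicted,
  list(predictions$nG, predictions$ENGG),
  mean
)
print(dim(surface))
```

The resulting matrix, together with `nG_values` and `ENGG_values`, has the shape expected by base plotting functions such as `persp()` or `contour()`.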
Once all 5 indexed scripts have been run, all main figures can be obtained by running the script 'main_figures.R'.
Once all mentioned scripts have been run, all supplementary figures can be obtained by running the script 'supp_figures.R'.
Once all mentioned scripts have been run, all tables can be obtained by running the script 'tables.R'.
Other scripts with additional models tested and analyses are also provided in supplementary scripts.
Code/Software
These analyses are run in R and require the installation of certain packages, all but one of which are available from CRAN. The main functions used in this study are custom-made and bundled in the package NINA (not yet published on CRAN). This package is included in this dataset as a tar file named 'NINA_0.1.0.tar.gz'. It is installed automatically when the first script, '1_environmental_niche.R', is run.
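For a manual installation outside the first script, a minimal sketch is shown below. It assumes the tarball sits in the current working directory (the path is an assumption) and uses the standard `install.packages()` call for a local source package. Installing this way does not fetch dependencies, so any packages NINA depends on would need to be installed beforehand.

```r
# Install the bundled NINA package from the local source tarball if it is
# not already installed. The path assumes the tarball is in the working
# directory; adjust it if the file is stored elsewhere.
nina_tarball <- "NINA_0.1.0.tar.gz"

if (!requireNamespace("NINA", quietly = TRUE)) {
  if (file.exists(nina_tarball)) {
    install.packages(nina_tarball, repos = NULL, type = "source")
  } else {
    message("Tarball not found: ", nina_tarball)
  }
}
```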
END of README
Methods
Occurrences, environmental variables and all input data for the analyses have been collected from online databases and open-source datasets.
Usage notes
All analyses are done in R. The functions developed for this study are provided in the form of a package called NINA.