Leveraging species associations patterns of macroalga wrack to predict their spatial distribution
Data files
Jun 01, 2026 version files (8.01 MB)
- ALAMER_Dryad.zip (8.01 MB)
- README.md (2.36 KB)
Abstract
Species associations are increasingly recognized as an effective way to improve species distribution model (SDM) predictions by integrating biotic and abiotic information that is difficult to measure in situ. This approach appears promising for monitoring communities where biodiversity and environmental data are sparse or limited. To evaluate its applicability to such communities and to test whether associations estimated in a narrow spatial context can capture useful information and be transferred to predict spatial distributions across larger regions, we focused on macroalgal communities in beach wrack. To estimate the information captured by macroalgae species associations, we assessed the predictive improvement they provide when added to models of increasing complexity using a random forest approach. We also examined how species traits influence association patterns among 58 species observed at 130 sites in Brittany, western France. Finally, we quantified the improvement in predictive performance when applying these associations to species distribution models across 402 sites along the French Atlantic coast. We found that species associations both complement and substitute for remote sensing and fine-scale habitat descriptors in predicting individual species distributions. Association patterns were related to certain macroalgae traits (e.g., buoyancy status, initial benthic habitats). They significantly improved large-scale predictive performance, despite being derived from limited sample sizes and narrow spatial contexts, confirming that such associations capture more than initial or local environmental factors. Using macroalgal communities in beach wrack, we illustrate that species associations can serve as alternative predictors for communities where biotic and abiotic data are scarce or hard to access. We also highlight their usefulness for predicting at broader spatial scales than those used for their estimation.
Leveraging these associations offers promising tools to enhance spatial conservation planning, identify sampling biases, and guide efficient surveys (e.g., for exotic or rare species of high conservation concern).
Data and code description - Leveraging species associations patterns of macroalga wrack to predict their spatial distribution (ALAMER_Dryad.zip)
Data description:
The folder "Data" contains the following files:
- Data frames of species occurrence surveyed in the initial geographical area, for non-floating species ("Community_data_no_floating.RData") and floating species ("Community_data_floating.RData"). In the community data files, each variable is named after a species and records its occurrence (0 = absence / 1 = presence).
- Abiotic factors computed from remote sensing rasters from June 2020, described in section 2.2 of the manuscript ("XData_abiotic_RS.RData"), including:
  - Mean_Temp = average monthly sea surface temperature (in Kelvin)
  - Min/Mean/Max_Bathy = minimum, average and maximum bathymetry (in meters)
  - Mean_ChlA = chlorophyll a concentration (in mg/m³)
  - Mean_Trub = water turbidity (in Nephelometric Turbidity Units)
  - Mean_Sal = average water salinity (in g/L)
  - Mean_Hauteur = monthly average wave height (in meters)
- Fine-scale benthic coastal habitat categories, detailed in section 2.3 of the manuscript ("Y_hab_FineHabitat.RData"). All these variables follow the same pattern: the prefix ‘Aire/Area_’ followed by the name of the coastal category. All areas are expressed in square meters (m²).
All these data are used to fit the models in script 1.
Lastly, "Community_data_no_floating_outside.RData" contains species occurrence data (0 = absence / 1 = presence) surveyed outside the initial geographical area and used for prediction in script 2.
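As a minimal sketch of how files like these can be inspected, the R code below loads an .RData file into its own environment, checks that a community table contains only 0/1 codes, and converts a Kelvin temperature column to degrees Celsius. The helper names `load_rdata` and `is_presence_absence` are hypothetical and not part of the archive. The names of the objects stored inside the archive's .RData files are not documented here, so the sketch builds a small stand-in file with invented species and values rather than assuming them.

```r
# Hypothetical helper (not part of the archive): load an .RData file into its
# own environment so the stored objects do not overwrite the workspace.
load_rdata <- function(path) {
  env <- new.env()
  object_names <- load(path, envir = env)  # load() returns the restored names
  mget(object_names, envir = env)          # named list of the restored objects
}

# Hypothetical helper: TRUE if every cell is coded 0 (absence) or 1 (presence).
is_presence_absence <- function(community) {
  all(as.matrix(community) %in% c(0, 1))
}

# Stand-in data so the sketch runs without the archive; the species names and
# values are invented for illustration only.
toy_community <- data.frame(Species_A = c(1, 0, 1), Species_B = c(0, 0, 1))
toy_abiotic   <- data.frame(Mean_Temp = c(288.15, 289.15, 290.15))
toy_path <- file.path(tempdir(), "toy_data.RData")
save(toy_community, toy_abiotic, file = toy_path)

objects <- load_rdata(toy_path)
names(objects)                              # which objects the file contains
is_presence_absence(objects$toy_community)  # check the 0/1 coding

# Mean_Temp is documented in Kelvin; subtract 273.15 for degrees Celsius.
mean_temp_celsius <- objects$toy_abiotic$Mean_Temp - 273.15
```

With the archive unzipped, the same helper could be pointed at a file in the "Data" folder, for example `load_rdata(file.path("Data", "Community_data_no_floating.RData"))`, and `names()` on the result would show which objects that file actually holds.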
Code description:
The folder "Code" contains the main scripts used to run the three different analyses presented in the manuscript, using R (version 4.2.2).
- 1 - Fitting random forest models.R --> script used to analyze macroalgae species presence-absence with random forests. Results are presented in Figure 2, Figure 3, Figure 5, Figure S1, Figure S5, and Figure S6.
- 2 - Predicting distribution outside area.R --> script used to predict the presence-absence of non-floating macroalgae species with the random forests fitted in script 1. Results are presented in Figure 4, Figure S3, and Figure S4.
We also include a folder "Output" where the results of the code presented above are stored.
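The general idea behind the two scripts can be illustrated with a short, hedged sketch: fit a random forest for one focal species using abiotic predictors alone, fit a second one that also uses the occurrence of another species as a predictor, then predict at held-out sites. This is not the authors' code. It assumes the `randomForest` package (the archive's scripts may rely on a different one), and all data are simulated, with invented species names and values.

```r
library(randomForest)  # assumed package; the archive's scripts may use another

set.seed(1)
n_sites <- 200

# Simulated stand-ins for the real tables (all names and values are invented).
abiotic <- data.frame(Mean_Temp = rnorm(n_sites, 288, 1),
                      Mean_Sal  = rnorm(n_sites, 35, 0.5))
community <- data.frame(Species_A = rbinom(n_sites, 1, 0.5))
# Species_B tends to co-occur with Species_A, mimicking a species association.
community$Species_B <- rbinom(n_sites, 1,
                              ifelse(community$Species_A == 1, 0.8, 0.2))

focal    <- "Species_B"
response <- factor(community[[focal]], levels = c(0, 1))
others   <- community[, setdiff(names(community), focal), drop = FALSE]

train_rows <- 1:150    # sites used to fit the models
test_rows  <- 151:200  # held-out sites, standing in for an outside area

# Model 1: abiotic predictors only.
fit_abiotic <- randomForest(x = abiotic[train_rows, ],
                            y = response[train_rows], ntree = 500)

# Model 2: abiotic predictors plus the occurrence of the other species.
x_assoc   <- cbind(abiotic, others)
fit_assoc <- randomForest(x = x_assoc[train_rows, ],
                          y = response[train_rows], ntree = 500)

# Predicted probability of presence at the held-out sites.
p_abiotic <- predict(fit_abiotic, newdata = abiotic[test_rows, ],
                     type = "prob")[, "1"]
p_assoc   <- predict(fit_assoc, newdata = x_assoc[test_rows, ],
                     type = "prob")[, "1"]

# Share of held-out sites classified correctly at a 0.5 threshold.
accuracy <- function(p, truth) mean((p > 0.5) == (truth == "1"))
c(abiotic_only      = accuracy(p_abiotic, response[test_rows]),
  with_associations = accuracy(p_assoc, response[test_rows]))
```

Because the simulated Species_B depends on Species_A and not on the abiotic variables, the second model would typically classify the held-out sites better than the first. The manuscript's analyses use their own predictors, performance metrics, and validation design, which this sketch does not attempt to reproduce.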
