Data and R code for: Biogeography and conservation of bycatch decapods
Data files
Version of Feb 23, 2026 (total size: 4.71 MB)
- Analisys_Models_LMER__GAM_and_SPAMM.R (4.92 KB)
- CCDB_OBIS_GeoMar_data.csv (2.64 MB)
- clados_info.csv (11.05 KB)
- Editing_mitogenomic_tree.R (2.29 KB)
- eez.shp (1.91 MB)
- eez.shx (108 B)
- Get_merge_predictors_and_calculate_PD__PE__SR.R (11.72 KB)
- Indices_correlations.R (808 B)
- Maps.R (7.29 KB)
- MPAs_hotspot_and_coldspot.R (4.90 KB)
- RDA_models.R (5.76 KB)
- README.md (8.52 KB)
- Spatial_random_forest.R (13.77 KB)
- Tree_genoma_OK.treefile (9.41 KB)
- tree_pruned.treefile (5.34 KB)
- VariablesDataOK_pix1x1.csv (76.56 KB)
Abstract
Aim: To investigate the spatial patterns of species richness (SR), phylogenetic diversity (PD), and phylogenetic endemism (PE) in bycatch decapod crustaceans along the Brazilian Exclusive Economic Zone and to assess the influence of environmental variables and trawling effort on these diversity indices.
Location: Brazilian Exclusive Economic Zone.
Taxa: Decapoda.
Results: Decapod diversity showed strong spatial heterogeneity across the EEZ. SR peaked in southeastern Brazil in upwelling-influenced ecotones, while PD and PE were highest in offshore and equatorial regions. PD was strongly associated with bottom temperature, primary productivity (PP) and current velocity, SR was mainly driven by salinity, light availability, and PP, and PE was best explained by temperature and PP. Most hotspots of phylogenetic diversity and endemism fell outside Marine Protected Areas (MPAs), revealing significant conservation gaps in offshore and southern regions.
Main conclusions: This study provides the first macroecological framework of decapod diversity in the Brazilian EEZ, revealing that temperature, salinity, productivity, and current dynamics differentially shape biodiversity metrics. Southeastern and offshore zones host unique evolutionary lineages but remain underrepresented in current MPA networks. By integrating phylogenetic and environmental data, our findings offer a baseline to improve spatial biodiversity assessments and support more representative conservation planning under shifting ocean conditions.
This repository contains occurrence records, environmental predictors, phylogenetic trees, biodiversity indices, spatial layers, and R scripts used in the study:
“Decapod Biodiversity Hotspots and Environmental Drivers: A Macroecological Approach About Bycatch Species in Brazil”
Journal of Biogeography
https://doi.org/10.1111/jbi.70076
All files are described below to ensure independent interpretation and reuse without consulting the manuscript.
File inventory and description
CCDB_OBIS_GeoMar_data.csv
This file contains raw occurrence records of decapod crustaceans compiled from multiple sources, including:
- Crustacean Collection Database (CCDB)
- Ocean Biodiversity Information System (OBIS)
- GEOMAR Helmholtz Centre for Ocean Research
- Peer-reviewed literature
Each row represents a single occurrence record.
Variables:
- sp – Species name (as originally reported in the source database or publication).
- x – Longitude (decimal degrees, WGS84).
- y – Latitude (decimal degrees, WGS84).
Coordinates were standardized to decimal degrees (WGS84). No additional environmental or sampling metadata are included in this file.
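A quick sanity check of the occurrence table could look like the sketch below. It relies only on the three documented columns (sp, x, y); the helper name check_occurrences and the toy records are hypothetical and are not part of the archived scripts.

```r
# Hypothetical helper: basic checks on an occurrence table with the
# columns sp, x (longitude) and y (latitude) described above.
check_occurrences <- function(occ) {
  stopifnot(all(c("sp", "x", "y") %in% names(occ)))
  # Flag missing coordinates or values outside the valid WGS84 range
  bad <- is.na(occ$x) | is.na(occ$y) |
    occ$x < -180 | occ$x > 180 | occ$y < -90 | occ$y > 90
  list(
    n_records   = nrow(occ),
    n_species   = length(unique(occ$sp)),
    n_bad_coord = sum(bad)
  )
}

# Toy records (made up for illustration, not taken from the dataset)
toy <- data.frame(
  sp = c("Genus_a", "Genus_a", "Genus_b"),
  x  = c(-45.2, -44.8, -200),
  y  = c(-23.9, -24.1, -10.0)
)
toy_summary <- check_occurrences(toy)

# With the real file in the working directory:
if (file.exists("CCDB_OBIS_GeoMar_data.csv")) {
  occ <- read.csv("CCDB_OBIS_GeoMar_data.csv", stringsAsFactors = FALSE)
  print(check_occurrences(occ))
}
```

In the toy table the third record has an impossible longitude, so it is counted in n_bad_coord.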
clados_info.csv
This file links decapod crustacean species to higher taxonomic groupings used for phylogenetic tree visualization and downstream analyses.
Each row corresponds to a single species.
Variables:
- ID – Species identifier used in phylogenetic trees and analyses (formatted as Genus_species).
- family – Taxonomic family of each species.
- name_ok – Validated scientific name (Genus species).
- grupo – Higher-level taxonomic grouping (e.g., Achelata, Anomura) used for clade assignment and visualization.
Taxonomic names were standardized prior to analysis.
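Because ID is described as the Genus_species form of the validated name, one way to check the table for consistency is to rebuild the label from name_ok and compare. The sketch below uses two illustrative rows that mimic the documented columns; the values are examples only and are not claimed to match the file.

```r
# Illustrative rows with the documented columns (values are examples only)
clados <- data.frame(
  ID      = c("Callinectes_ornatus", "Panulirus_argus"),
  family  = c("Portunidae", "Palinuridae"),
  name_ok = c("Callinectes ornatus", "Panulirus argus"),
  grupo   = c("Brachyura", "Achelata"),
  stringsAsFactors = FALSE
)

# Rebuild the tree-style label from the validated name and compare with ID
clados$id_from_name <- gsub(" ", "_", clados$name_ok, fixed = TRUE)
mismatch <- clados[clados$id_from_name != clados$ID, ]

# Number of species per higher-level group
group_counts <- table(clados$grupo)
```

An empty mismatch table means every ID agrees with its validated name.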
Tree_genoma_OK.treefile
Maximum likelihood mitogenomic phylogeny generated with IQ-TREE 2 using mitochondrial sequences retrieved from GenBank.
This file contains the full phylogenetic tree prior to pruning.
Format: Newick.
tree_pruned.treefile
Pruned phylogenetic tree including only species with spatial occurrence data.
This tree was used to calculate the following phylogenetic diversity metrics:
- PD – Phylogenetic Diversity (total branch length connecting species within a spatial unit).
- PE – Phylogenetic Endemism (phylogenetic diversity weighted by geographic range restriction).
- ED – Evolutionary Distinctiveness (amount of unique evolutionary history represented by each species).
- WE – Weighted Endemism (species endemism weighted by inverse range size).
Format: Newick.
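To make the PD, PE and WE definitions above concrete, here is a minimal base-R sketch on a toy tree ((A:1,B:1):1,C:2) and three grid cells. The tree, the presence matrix, and the choice to count every branch down to the root are assumptions made for illustration; the archived scripts rely on picante and phyloraster, whose options may differ. ED is left out to keep the example short.

```r
# Presence/absence of species A, B, C in three toy cells
comm <- matrix(c(1, 1, 0,
                 1, 0, 1,
                 0, 0, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("cell1", "cell2", "cell3"), c("A", "B", "C")))

# Each branch of ((A:1,B:1):1,C:2): its length and its descendant tips
branches <- list(
  list(length = 1, tips = "A"),
  list(length = 1, tips = "B"),
  list(length = 1, tips = c("A", "B")),
  list(length = 2, tips = "C")
)
branch_len <- sapply(branches, function(b) b$length)

# cells x branches: 1 if any descendant of the branch occurs in the cell
branch_pres <- sapply(branches, function(b) {
  (rowSums(comm[, b$tips, drop = FALSE]) > 0) * 1
})

SR <- rowSums(comm)                          # species per cell
PD <- drop(branch_pres %*% branch_len)       # summed branch length per cell

# Range size = number of cells occupied (by a branch or by a species)
branch_range <- colSums(branch_pres)
sp_range     <- colSums(comm)

PE <- drop(branch_pres %*% (branch_len / branch_range))  # range-weighted PD
WE <- drop(comm %*% (1 / sp_range))                      # range-weighted SR
```

For this toy input PD is 3, 4, 2, PE is 2, 2, 1 and WE is 1.5, 1, 0.5 for cell1 to cell3.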
VariablesDataOK_pix1x1.csv
Environmental predictors and biodiversity indices aggregated to a spatial grid of 1° × 1° covering the Brazilian Exclusive Economic Zone (EEZ).
Each row represents one grid cell.
Geographic coordinates
- Longitude – Longitude of grid cell centroid (decimal degrees)
- Latitude – Latitude of grid cell centroid (decimal degrees)
Environmental variables
All environmental variables were spatially summarized per grid cell from global marine datasets.
- ph – Seawater pH
- chlomean – Mean chlorophyll-a concentration
- chloRange – Chlorophyll-a range
- chloSS – Chlorophyll-a seasonal variability
- curvel – Mean ocean current velocity
- O2 – Dissolved oxygen concentration
- O2range – Dissolved oxygen range
- O2Lmax – Maximum dissolved oxygen
- nit – Nitrate concentration
- phosp – Phosphate concentration
- sal – Mean salinity (PSU)
- salrange – Salinity range
- salLmax – Maximum salinity
- tempmean – Mean bottom temperature (°C)
- temprange – Temperature range
- tempSS – Temperature seasonal variability
- tempLmax – Maximum temperature
- bathym – Bathymetry (depth, meters)
- iron – Dissolved iron concentration
- sil – Silicate concentration
- pp – Primary productivity
- ppSS – Primary productivity seasonal variability
- pprange – Primary productivity range
- light – Surface light availability
- carbophyto – Carbon biomass of phytoplankton
- carbophytoLmax – Maximum carbon phytoplankton biomass
- carbophytorange – Range of carbon phytoplankton biomass
- carbophytoSS – Seasonal variability of carbon phytoplankton biomass
- calcite – Calcite concentration
- SalSS – Salinity seasonal variability
Environmental layers were primarily obtained from Bio-ORACLE and related global marine products.
Fishing pressure
- fishing_effort – Trawling effort derived from Global Fishing Watch
Biodiversity metrics
- SR – Species Richness (number of species per cell)
- PD – Faith’s Phylogenetic Diversity
- WE – Weighted Endemism
- PE – Phylogenetic Endemism
- ED – Evolutionary Distinctiveness
Standardized effect sizes:
- PD.SES – Standardized phylogenetic diversity
- WE.SES – Standardized weighted endemism
- PE.SES – Standardized phylogenetic endemism
- ED.SES – Standardized evolutionary distinctiveness
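A standardized effect size is conventionally the observed value minus the mean of a null distribution, divided by the standard deviation of that distribution. The sketch below shows only this arithmetic; the helper name ses and the null values are made up, and the null model actually used to produce the .SES columns is not described in this file.

```r
# Hypothetical helper: standardized effect size of an observed metric
# relative to values obtained under a null model.
ses <- function(observed, null_values) {
  (observed - mean(null_values)) / sd(null_values)
}

null_pd <- c(8, 10, 12)      # made-up null PD values for one cell
pd_ses  <- ses(13, null_pd)  # (13 - 10) / 2
```

Positive values indicate a metric higher than expected under the null model, negative values lower.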
Missing values:
Cells filled with “n/a” indicate unavailable environmental or biodiversity values due to lack of coverage in the original raster layers or absence of biological records after spatial aggregation.
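When the table is read into R, the "n/a" strings need to be declared as missing values, otherwise the affected columns are not imported as numeric. The sketch below demonstrates this with a small stand-in file (values are made up) so that it runs without the dataset; the same na.strings argument applies to VariablesDataOK_pix1x1.csv.

```r
# Write a tiny stand-in file so the example runs anywhere (values are made up)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Longitude,Latitude,tempmean,SR",
  "-44.5,-23.5,18.2,12",
  "-43.5,-23.5,n/a,7",
  "-42.5,-23.5,16.9,n/a"
), tmp)

# Default import: "n/a" prevents tempmean and SR from being numeric
grid_raw <- read.csv(tmp)

# Declaring "n/a" as missing keeps the columns numeric
grid_ok <- read.csv(tmp, na.strings = c("n/a", "NA", ""))

# Cells with a complete set of values
n_complete <- sum(complete.cases(grid_ok))

# For the real file:
# grid <- read.csv("VariablesDataOK_pix1x1.csv", na.strings = c("n/a", "NA", ""))
```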
eez.shp and eez.shx
Shapefile representing the Brazilian Exclusive Economic Zone (EEZ), used as a spatial mask and reference area for all analyses.
The EEZ spatial layer is provided as a shapefile and requires companion files to open correctly:
- eez.shp – geometry
- eez.shx – shape index
These files must be kept in the same folder for proper visualization and use.
Coordinate reference system: WGS84 (EPSG:4326)
Geometry type: polygon
Usage notes:
This shapefile can be opened using QGIS, ArcGIS, or R packages such as sf or terra.
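Example in R:
library(sf)
eez <- st_read("eez.shp")  # eez.shx must sit in the same folder
# If no CRS is detected on import, assign WGS84 (EPSG:4326) as documented above
if (is.na(st_crs(eez))) st_crs(eez) <- 4326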
The EEZ layer is used exclusively to constrain analyses to Brazilian marine waters.
R workflow description
Editing_mitogenomic_tree.R
Edits and prunes the mitogenomic phylogeny to match species occurrence data.
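The pruning step can be sketched with ape as follows. This is an illustration on a made-up Newick string and species list, not the content of Editing_mitogenomic_tree.R; with the archived files, the tree would be read from Tree_genoma_OK.treefile instead.

```r
# Toy Newick tree and species list (made up; not the archived tree)
newick <- "((sp_a:1,sp_b:1):1,(sp_c:1.5,sp_d:1.5):0.5);"
occ_species <- c("sp_a", "sp_c", "sp_d", "sp_x")  # sp_x is absent from the tree

if (requireNamespace("ape", quietly = TRUE)) {
  full_tree <- ape::read.tree(text = newick)

  # Keep only the tips that also have occurrence records
  shared <- intersect(full_tree$tip.label, occ_species)
  pruned <- ape::keep.tip(full_tree, shared)

  # Species with records but no tip in the tree
  missing_from_tree <- setdiff(occ_species, full_tree$tip.label)

  # Hypothetical output name, shown for completeness:
  # ape::write.tree(pruned, file = "my_pruned.treefile")
}
```

Listing the species that have records but no tip makes it easy to see what the pruned tree does not cover.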
Get_merge_predictors_and_calculate_PD__PE__SR.R
Calculates SR, PD, PE, ED, and WE using:
- picante
- phylobase
- phyloraster
Environmental predictors are merged with biodiversity metrics at 1° resolution.
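The aggregation of point records to 1° cells and the merge with predictors can be illustrated in base R. The sketch assumes a grid aligned to whole degrees, with each cell identified by its centroid; the archived script works with raster-based tools, so its cell alignment may differ. All values are made up.

```r
# Toy occurrence records (made up)
occ <- data.frame(
  sp = c("sp_a", "sp_b", "sp_a", "sp_c", "sp_c"),
  x  = c(-44.7, -44.2, -44.9, -43.1, -43.6),
  y  = c(-23.4, -23.8, -23.1, -23.5, -23.2)
)

# Assign each record to the centroid of its 1-degree cell
# (assumption: cell edges fall on whole degrees)
occ$Longitude <- floor(occ$x) + 0.5
occ$Latitude  <- floor(occ$y) + 0.5

# Species richness: count distinct species per cell
cell_sp <- unique(occ[, c("Longitude", "Latitude", "sp")])
sr <- aggregate(sp ~ Longitude + Latitude, data = cell_sp, FUN = length)
names(sr)[names(sr) == "sp"] <- "SR"

# Toy predictor table keyed by the same centroids
env <- data.frame(
  Longitude = c(-44.5, -43.5),
  Latitude  = c(-23.5, -23.5),
  tempmean  = c(18.2, 17.5)
)

# Keep every predictor cell, even if it has no records
grid <- merge(env, sr, by = c("Longitude", "Latitude"), all.x = TRUE)
```

Using all.x = TRUE leaves SR as NA in cells without records.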
Indices_correlations.R
Generates correlation matrices among biodiversity indices.
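A correlation matrix among indices can be obtained with base R alone. The Spearman method and pairwise deletion of missing values are assumptions made for this sketch, and the values are made up; the script itself may use different settings.

```r
# Toy index values for five cells (made up); one SR value is missing
idx <- data.frame(
  SR = c(5, 8, 12, 3, NA),
  PD = c(2.1, 3.0, 4.4, 1.5, 2.8),
  PE = c(0.4, 0.3, 0.9, 0.2, 0.5)
)

# Rank-based correlations, using all complete pairs for each index pair
cor_mat <- cor(idx, method = "spearman", use = "pairwise.complete.obs")
print(round(cor_mat, 2))
```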
Analisys_Models_LMER__GAM_and_SPAMM.R
Fits environmental models using:
- Linear Mixed-Effects Models (LMER)
- Generalized Additive Models (GAM)
- Spatial mixed models (spaMM)
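As an illustration of one of these model families, the sketch below fits a GAM with mgcv to simulated data. It uses the column names documented for VariablesDataOK_pix1x1.csv. The formula, the basis dimensions (k) and the simulated response are assumptions and do not reproduce the models in the script.

```r
# Simulated cells (not the real data); column names follow the README
set.seed(1)
n <- 120
sim <- data.frame(
  Longitude = runif(n, -50, -30),
  Latitude  = runif(n, -33, 4),
  tempmean  = runif(n, 2, 27),
  pp        = runif(n, 0, 1)
)
sim$PD <- 3 + sin(sim$tempmean / 5) + 2 * sim$pp + rnorm(n, sd = 0.3)

if (require("mgcv", quietly = TRUE)) {
  # Smooth effects of two predictors plus a 2-D spatial smooth
  gam_fit <- gam(
    PD ~ s(tempmean, k = 5) + s(pp, k = 5) + s(Longitude, Latitude, k = 20),
    data = sim, method = "REML"
  )
  print(summary(gam_fit)$s.table)
}
```

The two-dimensional smooth of the coordinates is a common way to absorb spatial structure in a GAM; the LMER and spaMM fits address the same problem with random effects.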
RDA_models.R
Redundancy Analysis (RDA) evaluating multivariate relationships between environmental predictors and biodiversity metrics.
Spatial_random_forest.R
Spatial Random Forest models (spatialRF) identifying key environmental drivers.
Maps.R
Produces spatial maps of biodiversity indices across the EEZ.
MPAs_hotspot_and_coldspot.R
Identifies biodiversity hotspots and coldspots and overlaps them with Marine Protected Areas to quantify conservation gaps.
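The logic of a hotspot and gap analysis can be sketched in base R. The top and bottom 20% thresholds and the logical in_mpa column are assumptions for illustration; the script defines its own criteria and derives protection status from a spatial overlay.

```r
# Toy grid (made up): one index value and a protection flag per cell
grid <- data.frame(
  cell   = 1:10,
  PD     = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  in_mpa = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)
)

# Assumed thresholds: top 20% = hotspot, bottom 20% = coldspot
hot_cut  <- quantile(grid$PD, 0.8, na.rm = TRUE)
cold_cut <- quantile(grid$PD, 0.2, na.rm = TRUE)

grid$class <- ifelse(grid$PD >= hot_cut, "hotspot",
              ifelse(grid$PD <= cold_cut, "coldspot", "other"))

# Conservation gap: share of hotspot cells outside protected areas
hot <- grid[grid$class == "hotspot", ]
gap <- mean(!hot$in_mpa)
```

Here cells 9 and 10 are hotspots and only cell 9 is protected, so the gap is 0.5.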
Software requirements
Core software
- R ≥ 4.4
- IQ-TREE 2
- Geneious Prime
Main R packages
tidyverse, dplyr, readr, CoordinateCleaner
sf, raster, terra
picante, ape, phylobase, phyloraster, letsR, SESraster
sdmpredictors, robis
vegan, adespatial, spaMM
randomForest, spatialRF
ggplot2, ggpubr, gridExtra
External data sources
- OBIS
- Global Fishing Watch
- GEOMAR
- Bio-ORACLE
- GenBank
- Protected Planet
Mitochondrial sequences used for phylogenetic reconstruction were retrieved from GenBank.
Citation
If using this dataset, please cite:
Teles, J. N. & Mantelatto, F. L. (2025). Decapod Biodiversity Hotspots and Environmental Drivers: A Macroecological Approach About Bycatch Species in Brazil. Journal of Biogeography.
https://doi.org/10.1111/jbi.70076
For the dataset:
Teles, J. N. (2025). Macroecology of bycatch decapods in the Brazilian EEZ: data and R workflows. Dryad Digital Repository. DOI: 10.5061/dryad.0zpc8678d
We compiled bycatch occurrence data from peer-reviewed studies and global biodiversity repositories. Diversity indices (SR, PD, PE, ED and WE) were estimated using a phylogenetic approach incorporating mitogenomic markers. Environmental and trawling effort data were extracted from global databases, and statistical analyses, including redundancy analysis (RDA), spatial regression and random forest models, were applied to determine the drivers of diversity.
