Data from: Deep refuges: The distribution of marine fish in warming subtropics
Data files
Jan 26, 2026 version files 4.78 MB
-
Diet.csv
1.06 KB
-
Fish_base_fish_size_a_b.csv
2.54 KB
-
fish_survey_db.csv
4.61 MB
-
Fish_Trait.csv
1.67 KB
-
Fish.FactorsGrp.wis.csv
117 KB
-
Lessepsian.csv
42.60 KB
-
monitoring_sites.csv
416 B
-
README.md
6.92 KB
Abstract
In light of global climate change, identifying critical marine habitats and conserving them is essential. Marine conservation planning recommends designating cooler habitats as marine protected areas. The ‘deep-reef refugia’ hypothesis suggests that deeper, suitable habitats may allow species to undergo the evolutionary changes necessary to adapt to the growing environmental threats they face. This hypothesis has rarely been tested outside tropical ecosystems. This study, using a systematic approach, is the first to evaluate this hypothesis regarding fish communities in the East Mediterranean Sea (EMS), which is warming at an unprecedented rate. Fish were surveyed twice a year, using closed-circuit rebreather systems, from 2015 to 2022 across three rocky habitats: shallow (10 m depth, 23 % ± 11 of 1 m Photosynthetically Active Radiation, PAR), upper mesophotic (25 m depth; 8 % ± 4 of 1 m PAR), and lower mesophotic (45 m depth; 3 % ± 2 of 1 m PAR), where summer maximal temperatures reach a mean of 29.69 °C, 28.66 °C, and 27.9 °C, respectively. Data collected from 357 belt transects indicate that: 1) species composition and functional diversity of the shallow habitat are encompassed within those of the deeper habitats; 2) species diversity is greater in the upper mesophotic community compared to the shallow community; 3) abundance is reduced at mesophotic depths. Unlike most findings on tropical coral ecosystems, our results suggest that a mixed fish community of indigenous and immigrant species is currently thriving at upper mesophotic depths. This habitat appears to act as a slightly cooler climate-change refuge for a less diverse, shallower community, or as a step along the way to tropicalization. The unique position of the EMS as a transitional marine environment emphasizes its potential role as an early indicator of changes in fish depth distributions that could globally impact subtropical ecosystems.
Dataset DOI: 10.5061/dryad.76hdr7t8x
Description of the data and file structure
The underwater visual fish surveys were carried out by the Morris Kahn Marine Research Station team as a part of a long-term ecological research of the rocky reefs of the southeastern Mediterranean, starting 2015.
Files and variables
File: Diet.csv
Description:
'Diet' variable characterized each species’ primary food source using six categories: obligate herbivore (H), macro-invertivore (Ma), micro-invertivore (Mi), planktivore (PK), piscivore (FC), and omnivore (O). The data was based on and expanded from the dataset used by (Givan et al. 2017). NA value was given when the Diet type for this species was not found in the literature.
Givan, O., Parravicini, V., Kulbicki, M., & Belmaker, J. (2017). Trait structure reveals the processes underlying fish establishment in the Mediterranean: Non-indigenous fish trait structure. Global Ecology and Biogeography: A Journal of Macroecology, 26(2), 142–153.
Variables
- species:
- Diet:
File: Fish.FactorsGrp.wis.csv
Description:
Variables
-
: row number
-
site_id: site name, underline, site depth in (m)
-
site: the site name
-
reserve: if the site is a no take reserve the value is R
-
sample_year: the data year of survey as integer
-
sample_date: date of survey
-
depth_category: site depth (m)
-
season: Fall/ Spring
-
visibility: estimated distance of visibility in meters
-
date_id: excell automated date value
-
taxonomy_group: Fishes
-
long: the specific site_id's longitude (WGS84)
-
lat: the specific site_id's latitude (WGS84)
-
tag: each belt transect id
For the next epibenthic coverage variables, the Hellinger transformed count of grid points where the group was present
-
Algae:
-
Bryozoan:
-
Hard.Substrate:
-
Hydrozoa:
-
Soft.Substrate:
-
Sponge:
For the next habitat heterogeneity indices, temperature and mean depth HOBO Onset depth gauge readings were used. Heterogeneity indices were calculated according to Lazarus:
Lazarus, M., & Belmaker, J. (2021). A review of seascape complexity indices and their performance in coral and rocky reefs. Methods in Ecology and Evolution / British Ecological Society, 12(4), 681–695.NA values are entered when the data was not collected
-
Rugosity:
-
SD:
-
VerRelief:
-
Slope:
-
Curvature:
-
mean_depth:
-
mean_temp:
-
LogVR:
-
TempDiff: the difference between the depth category of the specific transect and strata above it. in 45m habitats (temp at 45m- temp at 25m) or in the 25m habitat (temp. at 25m -temp at 10m)
-
n_species_sample: number of species in the belt transect (integer)
-
n_ind_sample: number of fish individuals in the belt transect
The next are the fish species name and counts in each belt transect
-
Acantholabrus palloni:
-
Argyrosomus regius:
-
Atherinomorus forskalii:
-
Balistes capriscus:
-
Cheilodipterus novemstriatus:
-
Chromis chromis:
-
Coris julis:
-
Dasyatis pastinaca:
-
Diplodos cervinus:
-
Diplodus cervinus:
-
Diplodus puntazzo:
-
Diplodus sargus:
-
Diplodus vulgaris:
-
Echeneis sp:
-
Epinephelus costae:
-
Epinephelus marginatus:
-
Fistularia commersonii:
-
Himantura uarnak:
-
Muraena helena:
-
Mycteroperca rubra:
-
Pagrus auriga:
-
Pagrus caeruleostictus:
-
Parablennius rouxi:
-
Parupeneus forsskali:
-
Pteragogus pelycus:
-
Pterois miles:
-
Rhabdosargus haffara:
-
Salpa salpa:
-
Saprisoma cretense:
-
Sardin sp:
-
Sargocentron rubrum:
-
Scarus ghobban:
-
Sciaena umbra:
-
Scomber sp:
-
Scomberomorus commerson:
-
Seriola dumerili:
-
Serranus cabrilla:
-
Serranus scriba:
-
Siganus luridus:
-
Siganus rivulatus:
-
Sparisoma cretense:
-
Symphodus mediterraneus:
-
Symphodus sp:
-
Symphodus tinca:
-
Synodus saurus:
-
Taeniurops grabatus:
-
Thalassoma pavo:
-
Torquigener flavimaculosus:
File: monitoring_sites.csv
Description:
Morris Kahn Marine Research Station monitoring sites are named and located in this file. Only Sdot-Yam and Achziv are used in the current publication.
Variables
- site_id: site name, underline, site depth in (m)
- long: site_id's longitude
- lat: site_id's latitude
File: Fish_Trait.csv
Description: The traits of fishes in this study . NA values are used when data is missing
Variables
- species:
- Home range: Mobile (MOV), very mobile (VMob), sedentary (Sed)
- Activity: Day/ Night
- Schooling: Solitary (Sol), Pair, large groups (LargeG), small groups (smallG), medium sized groups (MedG)
- Water level: swimming high above the reef( High), low, or at the bottom
- Diet: 'Diet' variable characterized each species’ primary food source using six categories: obligate herbivore (H), macro-invertivore (Ma), micro-invertivore (Mi), planktivore (PK), piscivore (FC), and omnivore (O). The partition to micro/macro invertivore was according to the available information in FishBase (Froese and Pauly 1994)
- size category: Body size was coded using six ordered categories: 0–7 cm (1), 7.1–15 cm (2), 15.1–30 cm (3), 30.1–50 cm (4), 50.1–80 cm (5), and >80 cm (6), according to maximal fish sizes available at the FishBase dataset in the ‘rfishbase’ R package.
File: Fish_base_fish_size_a_b.csv
Description: Fish Base size to biomass conversion constants a and b, taken from the ‘rfishbase’ R package, with the locality where it was estimated. NA values are used when data is missing
Variables
- Fbspecies_chosen:
- a:
- b:
- Locality:
File: fish_survey_db.csv
Description:
Variables
- transect:
- a_b: disregard this one
- site:
- reserve:
- habitat:
- sample_date:
- sample_year:
- depth:
- depth_category:
- season:
- site_id:
- tag:
- visibility:
- factor_visibility: 1/visibility
- observer: the name of the observer
- species: fish species
- size: estimated size
- distance: estimated distance from the transect line
- date_id: the date
- taxonomy_group: not relevant
File: Lessepsian.csv
Description: Is this species a Lesspsian introduced fish (Red sea origin) or native
Variables
- species:
- Lessepsian: introduced = 1, native = 0 . NA values are used when data is missing
Code/software
The data is .csv, can be viewed in excel or text editor and any data analysis programs.
Access information
Other publicly accessible locations of the data:
Morris Kahn Marine Research Station data base access: https://med-lter.haifa.ac.il/database-hub/
download from:
Data was derived from the following sources:
- Invertebrate survey/ photo survey |for epibenthic coverage
- Fish survey/ Fish survey
The EMS water temperatures range from 17 ℃ during the winter to over 30 ℃ when the water column is highly stratified (Ben Ezra et al. 2021). The subtidal rocky reefs along this coast are made of Kurkar (aeolian-formed sandstone), which can be found on the shelf up to 100 m in depth (Neev & Ben-Avraham, 1977), as far north as Haifa (Figure 1). From Haifa northwards, most of the reef base is limestone.
The study was performed in two sites: Sdot-Yam and Achziv (Figure 1). The Sdot-Yam rocky reefs are on outcrops of Kurkar stone, ranging from 6 to 48 m depth. The reefs of Achziv are limestone-based and located within a 100 km2 managed marine protected area on the verge of an underwater canyon formation and at a distance of 3.2 km from the nearest border of the reserve. The Sdot-Yam site is not protected from fishing. The study locations (at depths of 10 m, 25 m, and 45 m) in both sites were selected by examining their bathymetric profiles: sites with relatively large rock size, complexity, slope, and distance from the rock’s edge.
Underwater visual censusThe study sites were reached by boat and located using Geographic Positioning Systems coordinates between 9 and 10 a.m. Four belt transects (25 m long and 10 m wide) were performed at each site and depth. The same surveyor surveyed these transects twice a year in the spring (April to May) and autumn (October to November) between 2015 and 2022, thus avoiding surveyor bias. Megalodon closed-circuit rebreather systems allowed divers to survey at 45 m depth and also minimized the disturbance to the fish community created by the bubbles of standard SCUBA diving systems (Pyle, Lobel, and Tomoleoni 2016).
Upon arrival at the study site, a weighted buoy was deployed as a plumb, and two divers descended along the line. Five minutes later, the fish survey started with the fish surveyor swimming to the north while extending a 25 m roulette meter behind them and recording the assessed visibility distance of that transect. The fishes were identified, and their size, abundance, and distance from the transect line were narrated in the recorder. After returning to the plumb, the surveyors continued 10 m to the south and carried out another transect in the southern direction. Upon returning to the plumb, the survey team extended another roulette 50 m to the west, and the third transect was surveyed again to the north; after returning to the origin, they continued 10 m to the south and then completed the fourth transect to the south. This resulted in 12 belt transects per study site and season, four per depth,, each with an area of 250 m2. A total of 357 fish census belt transects were conducted: 159 in Achziv and 158 in Sdot-Yam (185 in spring and 172 in autumn) between 2015 and 2022. Of these, 126 transects were in the shallow habitat, 124 in the upper mesophotic, and 107 in the lower mesophotic (a detailed table can be found in the Supplementary Information, SI).
We conducted the surveys using the underwater visual census method (Jennings and Polunin 1995), suitable for the most abundant rocky habitat species (Guidetti et al. 2014). However, following Brokovich et al. (2008), we used video as our registry tool instead of pencil and board, where the diver speaks and annotates the fish when they were observed. Videography enables reviewing previous surveys and consulting with experts to identify newly encountered species and can be used as an archive. Sedentary and small cryptic species were excluded from statistical analyses due to technical difficulties in surveying them (Ault and Johnson 1998). We held a water level data logger (Onset HOBO) and waved it at a high frequency to mark the beginning and end of each transect (Dustan, Doherty, and Pardede 2013), then dragged it along each of the 25m transect lines at a slow and stable pace (~0.2 m/s), logging at 1- s intervals. In the lab, we later assessed structural complexity (according to Lazarus and Belmaker 2021) and the mean temperature of the study habitats.
The environmental dataset was complemented by long-term, weekly CTD (Seabird SBE 19v2) measurements taken by the Ruppin Marine Station from 2014-2023. This temperature data (station 1) was measured 12 km south of the Sdot-Yam site at 48 m depth by Suari et al. (2019). The temperature data readings for the 20 hottest surface temperature casts were selected to represent the summer months.
Photosynthetically active radiation (PAR, Biosperical/Licor) data was obtained from the seasonal CTD casts (2018-2024) conducted 20 km north of Sdot-Yam, at 45-70 m seafloor depth stations (data available at: https://med-lter.haifa.ac.il/database-hub/ stations TSY 45, TSY 55, and TSY 70). Additional descriptions and analyses of environmental parameters can be found in SI.
Data management was carried out in R (R Core Team 2024) using the following packages: ‘readr’(Wickham, Hester, and Bryan 2024), ‘plyr’(Wickham 2011), ‘dplyr’ (Wickham et al. 2023), ‘reshape2’ (Wickham 2007), ‘labdsv’ (Roberts 2019), ’tidyverse’ (Wickham et al. 2019), ‘stringr’ (Wickham 2019) and ‘here’(Müller 2020). The figures were prepared in R with ‘ggplot2’ (Wickham 2016) and ‘ggpubr’ (Kassambara 2020). Statistical analysis was carried out with ‘stats’(R Core Team 2024) and ‘rstatix’ (Kassambara 2021). When statistical assumptions—independence of observations, normality of residuals, homogeneity of residuals, or the absence of significant outliers for using ANOVA—were not met, we used the Wilcoxon rank-sum (W) test as an alternative.
Multivariate analysesTo investigate patterns in species assemblages, we employed Redundancy Analysis (RDA, available in the 'vegan' package; Oksanen et al. 2022), a multivariate statistical technique that extends principal components analysis to directly relate multiple response variables to multiple explanatory variables.
Given the presence of several schooling species occurring in high abundances, which could potentially skew the analysis, we applied Log2 and Hellinger transformations to the species abundance data before analysis. Both transformations are particularly useful for species abundance data as they assign lower weights to rare species, and make the data more suited to meet the assumptions of linear methods such as RDA. We implemented a forward stepwise model selection procedure (‘ordistep()’ function available in the 'vegan' package; Oksanen et al. 2022) using a comprehensive set of explanatory factors: season (autumn/fall), visibility distance, marine protected area (yes/no), HOBO measured rugosity and depth, the temperature difference between habitats, and the amount of habitat covered in sediment (see SI for details). This approach allowed us to systematically identify and include the factors that best explained the variability in the fish assemblage composition. The stepwise selection helps in building a parsimonious model by including only the most influential variables, thereby reducing noise and improving the interpretability of the results.
DiversityAlpha, beta, and gamma diversity are ecological concepts that measure biodiversity at different spatial scales (Whittaker 1972). Alpha diversity refers to the diversity within a particular sample. Beta diversity measures the change in species composition of a community along a gradient in an ecosystem. High beta diversity suggests significant differences in species composition between areas, while low beta diversity indicates similarity. Gamma diversity refers to the overall diversity within a geographical area (Whittaker, 1972). Gamma diversity is influenced by alpha and beta diversities; it can be viewed as a combination of local (alpha) diversity and differentiation (beta) among those localities. The degree of differentiation between the samples would be the total number of species divided by the mean number of species per sample (Whittaker, 1972).
Diversity was calculated according to the generalization of Hill numbers suggested by Chao et al. (2019) to measure taxonomic and functional diversity. The order of the index, set by the value of Hill number ‘q’, determines the sensitivity to species relative abundance with the three common diversity indexes as special cases: Richness (q = 0), Shannon diversity (q = 1), and Simpson diversity (q = 2). It is calculated according to the following equation per species ‘i’ and its relative proportion in the community (p); the resulting D is the “equivalent number of species” .
Habitat gamma diversity, accounting for the different sampling efforts, was carried out using rarefaction profiles with bootstrapped confidence intervals with the iNEXT() function from the R package `iNEXT’(Hsieh, Ma, and Chao 2016). Alpha diversity was computed using Hill number diversity orders (Jost 2010). We applied two approaches to calculate beta diversity. The multiplicative approach, where beta = gamma/alpha, and beta diversity (Sorensen), composed of the degree of nestedness and turnover within a community, using beta.sample() of the R ‘betapart’ package (Baselga and Orme 2012) randomly choosing five sampling units at each iteration and repeating this a hundred times (samples = 100) to accommodate for the different sampling efforts of each site.
Functional diversityWe categorized the fish data according to six traits related to ecological functions: mobility, activity, schooling, position in the water column, diet, and size category. These traits affect food acquisition, movement range, and behavior. ‘Mobility’ included sedentary, mobile (MOB), and very mobile (VMOb) criteria. ‘Activity’ described times of activity: diurnal (Day), nocturnal (Night), or both. ‘Schooling’ consisted of solitary (SOL), pairs, small groups (SmallG), medium-sized groups (MedG), and large groups (LargeG). Position in the water column related to high, low, and bottom. 'Diet' variable characterized each species’ primary food source using six categories: obligate herbivore (H), macro-invertivore (Ma), micro-invertivore (Mi), planktivore (PK), piscivore (FC), and omnivore (O). The data was based on and expanded from the dataset used by (Givan et al. 2017). The partition to micro/macro invertivore was according to the available information in FishBase (Froese and Pauly 1994). Body size was coded using six ordered categories: 0–7 cm (1), 7.1–15 cm (2), 15.1–30 cm (3), 30.1–50 cm (4), 50.1–80 cm (5), and >80 cm (6), according to maximal fish sizes available at the FishBase dataset in the ‘rfishbase’ R package. The resulting trait combination for each fish species is available in the supplementary information.
Alpha functional diversity per Hill number ‘q’ was calculated according to (Chao et al., 2019), with equal attribute contribution weights . Beta Sorensen distances were calculated using minimum trait tau (Magneville et al. 2022).
A Principal Component Analysis ordination was carried out on log2, and Hellinger transformed abundance data per unique trait combination using ‘princomp()’ to portray the fish community functional space while reducing the effect of schooling and highly abundant species. The overlaying trait combination correlation vector data with PCA axes was calculated using the envfit() function.
Species-specific modelingGeneral Additive Models (GAMs) are a powerful and flexible extension of traditional linear models, and are used to model the relationship using smooth, non-linear functions. This allows them to capture complex patterns in the data without making rigid assumptions about the shape of the relationship. We used R libraries ‘stringr’ and ‘mgcv’ to model the depth preferences of species with more than 40 occurrences at all depths: Thalassoma pavo, Coris julis, Parupeneus forsskali, Serranus cabrilla, Siganus rivulatus, Diplodus vulgaris, Pterois miles, Epinephelus marginatus, Mycteroperca rubra, Sargocentron rubrum, Sparisoma cretense, and Siganus luridus. These abundance data were modeled for the effect of mean depth (k = 5), season, rugosity (k = 5), curvature (k = 5), slope(k = 5), and site on their abundance data, using a negative binomial distribution and a logarithmic link function.
