Data from: Similar physical, geographic, and historical factors shape fish species richness in lakes across Ontario and Northern and Western Europe
Data files
Mar 19, 2026 version files (4.81 MB total)
- EU_SurveySpecies_PresAbs_V1_EU5_Species_ManuscriptData.csv (17.62 KB)
- EU_SurveySpecies_PresAbs_V1_EU9_Species_ManuscriptData.csv (174.31 KB)
- hybas_eu_lev09_v1c_selected_TableToExcel_PFAF_IDsplit_manuscript.csv (328.43 KB)
- hybas_na_lev09_v1c_TableToExcel_PFAF_IDsplit_manuscript.csv (417.42 KB)
- LakeID_PFAFID_EU.csv (43.80 KB)
- LakeID_PFAFID_ON.csv (219.83 KB)
- Project1_EU_PFAF59_count_PASpUpdated_manuscript.csv (37.17 KB)
- Project1_joinedON_Rich_NA59.2.PASpUpdated_manuscript.csv (191 KB)
- Project1_NestedTempAnalysis_RCode_Clean.R (18.26 KB)
- Project1_RandomInterceptAnalysis_5_9_Clean.R (23.02 KB)
- Project1_Rarefaction_Chao_RCode_Clean.R (136.85 KB)
- Project1_RichnessModelling_Data_EU.csv (213.63 KB)
- Project1_RichnessModelling_Data_ON.csv (1.12 MB)
- Project1_RichnessModelling_Data_ONandEU.csv (1.36 MB)
- Project1_SpeciesRichnessModels_AHI_Clean.R (46.66 KB)
- Project1_SpeciesRichnessModels_AHI_EU_Clean.R (30.05 KB)
- Project1_SpeciesRichnessModels_EU_Clean.R (26.72 KB)
- README.md (10.55 KB)
- SurveySpecies_PresAbs_V1_NA5_Species_ManuscriptData.csv (12.44 KB)
- SurveySpecies_PresAbs_V1_NA9_Species_ManuscriptData.csv (373.44 KB)
Abstract
Species diversity patterns and their drivers are essential for understanding and mitigating threats to freshwater ecosystems worldwide. We examined limnetic fish biodiversity across 9350 lakes in Ontario and 1824 lakes in northern and western Europe, using data on lake glacial history and environmental conditions to identify the main factors influencing species richness. We applied log-log linear mixed-effects models (LMM) and generalized linear mixed-effects models (GLMM) with a Poisson distribution to assess relationships between fish species richness and climatic, geographic, physical, and historical variables. Local variation in diversity was further explored using beta diversity and nested ‘temperature’ analyses based on sub-basins defined by the HydroBASINS database. Across both regions, lake area emerged as the strongest predictor of species richness, with additional influences from lake elevation, morphometry, age, and longitude. LMM and GLMM results were broadly consistent, and model error structures were shaped by the sub-basin organization in each landscape. Beta diversity was consistently high (>0.9), with species turnover driving most variation and nestedness contributing no more than 22%. Matrix ‘temperature’ values were similar between continents (~4° to ~40°). Overall, physical, geographic, and historical factors similarly affected fish richness in Ontario and Europe, and sub-basin spatial structure played a key role in shaping model error. These findings highlight variables that generally influence biodiversity and reveal local diversity patterns, providing insights for landscape-level conservation planning for lakes in both regions.
https://doi.org/10.5061/dryad.2547d7x1m
Description of the data and file structure
For Ontario lakes, fish species richness data were collected by the Aquatic Habitat Inventory Program through the Ministry of Natural Resources in Ontario. In addition to presence-absence data for the fish species, measurements of lake area, maximum lake depth, latitude, and longitude were included. For fine-scale data, please reach out to the OMNR (https://www.ontario.ca/feedback/contact-us?id=26930&nid=65620). For European lakes, fish species richness measurements were collected between 1990 and 2010 for the European Water Framework Directive and assembled by the European project WISER (Water Bodies in Europe). Lake characteristic measurements for European lakes were obtained using methodologies comparable to those used for Ontario lakes.
Lake age, the length of time since the last glaciation, was taken from the ICE_7G_NA dataset developed by Roy and Peltier (2018) using models of glacial isostatic adjustment.
Individual lake elevation values came from the open-source provincial digital elevation model (PDEM) created by the Ministry of Natural Resources and Forestry in Ontario and from the WISER database in Europe.
Watersheds come from the open online database HydroBASINS.
Maximum monthly air temperature measurements were acquired from the Climatic Research Unit’s website at the University of East Anglia.
To use the same lakes/sub-basins in the beta-diversity analyses as were used in the richness models, the "survey species" data files were matched up with their respective Ontario/Europe richness data files using the PFAF_ID columns.
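The matching step described above amounts to a key-based filter on the sub-basin identifier. The following is a minimal sketch in Python using only the standard library; it is not the authors' R code. The inline sample rows, the species column names, and the helper name `filter_survey_rows` are hypothetical. Only the column names `Lake_ID`, `PFAF_ID5`, `PFAF_ID9`, and `PFAF_ID` are taken from the file descriptions.

```python
import csv
import io

# Hypothetical rows standing in for LakeID_PFAFID_ON.csv.
LAKE_KEY_CSV = """Lake_ID,PFAF_ID5,PFAF_ID9
L001,71234,712345678
L002,71234,712345679
"""

# Hypothetical rows standing in for a level 9 survey species file.
SURVEY_CSV = """PFAF_ID,SpeciesA,SpeciesB
712345678,1,0
712345679,0,1
712349999,1,1
"""

def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def filter_survey_rows(survey_rows, key_rows,
                       survey_col="PFAF_ID", key_col="PFAF_ID9"):
    """Keep only survey rows whose sub-basin ID appears in the lake key file."""
    wanted = {row[key_col] for row in key_rows}
    return [row for row in survey_rows if row[survey_col] in wanted]

matched = filter_survey_rows(read_rows(SURVEY_CSV), read_rows(LAKE_KEY_CSV))
matched_ids = [row["PFAF_ID"] for row in matched]
print(matched_ids)
```

In this sketch the third survey row is dropped because its sub-basin has no lake in the key file, which mirrors restricting the beta-diversity analyses to the sub-basins used in the richness models.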
Files and variables
File: Project1_RichnessModelling_Data_ON.csv
Description: Species richness values and predictor variables for species richness modelling for lakes in Ontario. Also used for rarefaction and Chao richness estimation.
Variables
- Lake_ID: Unique lake identifier
- Rich_AHEU: Species richness values
- Area_km2: Area (km squared)
- Elev_mean_dev9_PDEM: Individual lake elevation minus the mean elevation of its level 9 sub-basin, i.e. the deviation of each lake from its sub-basin mean elevation
- MxMonTP: Maximum monthly temperature in degrees Celsius
- LakeAge_I7G: Lake age in years
- Latitude
- Depth_Max: Maximum depth in metres
- Longitude
- HYBAS_ID#: Unique identifier to associate the lake with the sub-basin that it is located in (sub basin levels 5-9)
- MEAN__m_9: Sub-basin level 9 mean elevation
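The elevation deviation predictor defined in the list above is a simple difference. A minimal Python sketch follows, assuming both elevations are in metres (suggested by the `MEAN__m_9` column name but not stated explicitly). The lake names, the elevation values, and the function name `elevation_deviation` are hypothetical.

```python
def elevation_deviation(lake_elevation_m, subbasin_mean_elevation_m):
    """Deviation of a lake from the mean elevation of its level 9 sub-basin."""
    return lake_elevation_m - subbasin_mean_elevation_m

# Hypothetical lakes: (name, lake elevation, sub-basin mean elevation).
lakes = [
    ("lake_a", 312.0, 300.0),  # sits above the sub-basin mean
    ("lake_b", 287.5, 300.0),  # sits below the sub-basin mean
]

deviations = {name: elevation_deviation(elev, mean) for name, elev, mean in lakes}
print(deviations)
```

A positive value indicates a lake above its sub-basin mean and a negative value a lake below it.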
File: Project1_RichnessModelling_Data_EU.csv
Description: Species richness values and predictor variables for species richness modelling for lakes in Europe. Also used for rarefaction and Chao richness estimation.
Variables
- Lake_ID: Unique lake identifier
- Rich_AHEU: Species richness values
- Area_km2: Area (km squared)
- Elev_mean_dev9: Individual lake elevation minus the mean elevation of its level 9 sub-basin, i.e. the deviation of each lake from its sub-basin mean elevation
- MxMonTP: Maximum monthly temperature
- LakeAge_I7G: Lake age
- Latitude
- Depth_Max: Maximum depth
- Longitude
- HYBAS_ID#: Unique identifier to associate the lake with the sub-basin that it is located in (sub basin levels 5-9)
- MEAN__m_9: Sub-basin level 9 mean elevation
File: Project1_RichnessModelling_Data_ONandEU.csv
Description: Species richness values and predictor variables for species richness modelling for lakes in Ontario and Europe.
Variables
- Lake_ID: Unique lake identifier
- Rich_AHEU: Species richness values
- Area_km2: Area (km squared)
- Elev_mean_dev9: Individual lake elevation minus the mean elevation of its level 9 sub-basin, i.e. the deviation of each lake from its sub-basin mean elevation
- MxMonTP: Maximum monthly temperature
- LakeAge_I7G: Lake age
- Latitude
- Depth_Max: Maximum depth
- Longitude
- HYBAS_ID#: Unique identifier to associate the lake with the sub-basin that it is located in (sub basin levels 5-9)
- MEAN__m_9: Sub-basin level 9 mean elevation
- Continent_num: Continent identifier (Ontario - 0, Europe - 1)
File: SurveySpecies_PresAbs_V1_NA5_Species_ManuscriptData.csv
Description: Data for beta diversity and nested temperature analyses. Specifically, presence-absence data for fish species found within the level 5 sub-basins in Ontario.
Variables
- PFAF_ID: Sub-basin level 5 unique identifier
- Species Names: Unique species identifiers. Provides presence-absence values for each species
- Richness_BH: Species richness values
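For a presence-absence table like the one described above, richness per sub-basin can be read as the number of species columns with a value of 1. The sketch below illustrates that relationship in Python. It assumes, without confirmation from the data files, that the richness column equals this row sum. The species names, the sample row, and the function name `richness_from_presence` are hypothetical.

```python
def richness_from_presence(row, non_species_columns=("PFAF_ID", "Richness_BH")):
    """Count species recorded as present (1) in one presence-absence row."""
    return sum(
        int(value)
        for column, value in row.items()
        if column not in non_species_columns
    )

# Hypothetical level 5 sub-basin row with four species columns.
example_row = {
    "PFAF_ID": "71234",
    "Species_A": "1",
    "Species_B": "0",
    "Species_C": "1",
    "Species_D": "1",
    "Richness_BH": "3",
}

computed = richness_from_presence(example_row)
print(computed, computed == int(example_row["Richness_BH"]))
```

A check of this kind can be used to confirm that the species columns and the stored richness value agree after the file is loaded.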
File: SurveySpecies_PresAbs_V1_NA9_Species_ManuscriptData.csv
Description: Data for nested temperature, beta diversity, and rarefaction analyses. Specifically, presence-absence data for fish species found within the level 9 sub-basins in Ontario.
Variables
- HYBAS_ID5: Sub-basin level 5 unique identifier
- HYBAS_ID9: Sub-basin level 9 unique identifier
- PFAF_ID: Sub-basin level 9 unique identifier
- Species Names: Unique species identifiers. Provides presence-absence values for each species
- Richness: Species richness values
File: EU_SurveySpecies_PresAbs_V1_EU5_Species_ManuscriptData.csv
Description: Data for beta diversity and nested temperature analyses. Specifically, presence-absence data for fish species found within the level 5 sub-basins in Europe.
Variables
- PFAF_5: Sub-basin level 5 unique identifier
- Species Names: Unique species identifiers. Provides presence-absence values for each species
- Richness: Species richness values
File: EU_SurveySpecies_PresAbs_V1_EU9_Species_ManuscriptData.csv
Description: Data for nested temperature, beta diversity, and rarefaction analyses. Specifically, presence-absence data for fish species found within the level 9 sub-basins in Europe.
Variables
- HYBAS_ID5: Sub-basin level 5 unique identifier
- HYBAS_ID9: Sub-basin level 9 unique identifier
- PFAF_ID9: Sub-basin level 9 unique identifier
- Species Names: Unique species identifiers. Provides presence-absence values for each species
- Richness: Species richness values
File: LakeID_PFAFID_ON.csv
Description: PFAF_ID variables used to match beta-diversity files to richness files for Ontario.
Variables
- Lake_ID: Unique lake identifier
- PFAF_ID5: Sub-basin level 5 unique identifier
- PFAF_ID9: Sub-basin level 9 unique identifier
File: LakeID_PFAFID_EU.csv
Description: PFAF_ID variables used to match beta-diversity files to richness files for Europe.
Variables
- Lake_ID: Unique lake identifier
- PFAF_ID5: Sub-basin level 5 unique identifier
- PFAF_ID9: Sub-basin level 9 unique identifier
File: Project1_joinedON_Rich_NA59.2.PASpUpdated_manuscript.csv
Description: Data used for rarefaction and Chao estimation for Ontario.
Variables
- PFAF_ID9: Sub-basin level 9 unique identifier
- Lake_ID: Unique lake identifier
- Richness_9: Species richness at sub-basin level 9
File: Project1_EU_PFAF59_count_PASpUpdated_manuscript.csv
Description: Data used for rarefaction and Chao estimation for Europe.
Variables
- PFAF_ID9: Sub-basin level 9 unique identifier
- Lake_ID: Unique lake identifier
- Richness_9: Species richness at sub-basin level 9
File: hybas_na_lev09_v1c_TableToExcel_PFAF_IDsplit_manuscript.csv
Description: Data used to count the total number of level 9 sub-basins within each level 5 sub-basin (North America).
Variables
- PFAF_: Sub-basin level 9 unique identifier
File: hybas_eu_lev09_v1c_selected_TableToExcel_PFAF_IDsplit_manuscript.csv
Description: Data used to count the total number of level 9 sub-basins within each level 5 sub-basin (Europe).
Variables
- PFAF_: Sub-basin level 9 unique identifier
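Counting level 9 sub-basins within each level 5 sub-basin can be sketched by grouping on the leading digits of the identifier. The Python example below assumes the nested Pfafstetter coding used by HydroBASINS, in which a level 9 code has nine digits and its first five digits give the enclosing level 5 code. Whether the authors' files are split in exactly this way is an assumption. The sample codes and the function name `count_level9_per_level5` are hypothetical.

```python
from collections import Counter

def count_level9_per_level5(pfaf_ids_level9):
    """Count level 9 sub-basins per level 5 sub-basin via the 5-digit prefix."""
    return Counter(str(pfaf_id)[:5] for pfaf_id in pfaf_ids_level9)

# Hypothetical level 9 Pfafstetter codes.
sample_level9 = [712345678, 712345679, 712350001, 712360002, 712360003]

counts = count_level9_per_level5(sample_level9)
print(dict(counts))
```

These totals give the denominator needed to express how many of the level 9 sub-basins in each level 5 sub-basin were actually sampled.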
R Code Files (with associated .csv files):
R code to run the nested temperature analysis for each level 5 sub-basin in both Ontario and northern and western Europe. Nested ‘temperature’ can be used to identify the types of sub-basins that are occupied by rare and invasive species and to inform decisions about which sub-basins are of conservation concern:
- Project1_NestedTempAnalysis_RCode_Clean.R
- SurveySpecies_PresAbs_V1_NA9_Species_ManuscriptData.csv
- SurveySpecies_PresAbs_V1_NA5_Species_ManuscriptData.csv
- EU_SurveySpecies_PresAbs_V1_EU9_Species_ManuscriptData.csv
- EU_SurveySpecies_PresAbs_V1_EU5_Species_ManuscriptData.csv
R code to estimate total beta diversity, species turnover, and nestedness using the Sørensen family of dissimilarity measures in the Ontario and northern and western Europe datasets:
- Project1_RandomInterceptAnalysis_5_9_Clean.R
- SurveySpecies_PresAbs_V1_NA5_Species_ManuscriptData.csv
- SurveySpecies_PresAbs_V1_NA9_Species_ManuscriptData.csv
- LakeID_PFAFID_ON.csv
- Project1_RichnessModelling_Data_ON.csv
- EU_SurveySpecies_PresAbs_V1_EU5_Species_ManuscriptData.csv
- EU_SurveySpecies_PresAbs_V1_EU9_Species_ManuscriptData.csv
- LakeID_PFAFID_EU.csv
- Project1_RichnessModelling_Data_EU.csv
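The partition of Sørensen dissimilarity into turnover and nestedness named above can be illustrated for a single pair of sub-basins. The Python sketch below uses the standard pairwise form of that partition: total dissimilarity is the Sørensen index, turnover is the Simpson index, and nestedness is their difference. It is not the authors' R script, which may use multiple-site versions of these measures. The species sets and the function name `sorensen_partition` are hypothetical.

```python
def sorensen_partition(site1, site2):
    """Return (total, turnover, nestedness) for two non-empty species sets."""
    if not site1 or not site2:
        raise ValueError("both sites must contain at least one species")
    a = len(site1 & site2)   # species shared by both sites
    b = len(site1 - site2)   # species found only in site 1
    c = len(site2 - site1)   # species found only in site 2
    beta_sor = (b + c) / (2 * a + b + c)      # Sorensen dissimilarity (total)
    beta_sim = min(b, c) / (a + min(b, c))    # Simpson dissimilarity (turnover)
    beta_sne = beta_sor - beta_sim            # nestedness-resultant component
    return beta_sor, beta_sim, beta_sne

# Hypothetical species lists for two sub-basins.
basin_1 = {"sp_A", "sp_B", "sp_C", "sp_D"}
basin_2 = {"sp_A", "sp_B", "sp_E"}

total, turnover, nestedness = sorensen_partition(basin_1, basin_2)
print(round(total, 4), round(turnover, 4), round(nestedness, 4))
```

Turnover and nestedness always sum to the total, so the share of each component can be reported as a fraction of total beta diversity.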
R code to run rarefaction analyses and the Chao species richness estimator to understand the effects of variation in sampling intensity at the sub-basin scale:
- Project1_Rarefaction_Chao_RCode_Clean.R
- Project1_joinedON_Rich_NA59.2.PASpUpdated_manuscript.csv
- Project1_EU_PFAF59_count_PASpUpdated_manuscript.csv
- SurveySpecies_PresAbs_V1_NA9_Species_ManuscriptData.csv
- EU_SurveySpecies_PresAbs_V1_EU9_Species_ManuscriptData.csv
- hybas_na_lev09_v1c_TableToExcel_PFAF_IDsplit_manuscript.csv
- hybas_eu_lev09_v1c_selected_TableToExcel_PFAF_IDsplit_manuscript.csv
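The Chao estimator mentioned above extrapolates total richness from how many species are detected in exactly one or exactly two sampling units. The Python sketch below shows the incidence-based Chao2 form, treating each lake in a sub-basin as one sampling unit. The variant actually used in the authors' R script is not stated here, so this choice is an assumption. The incidence matrix and the function name `chao2` are hypothetical.

```python
def chao2(incidence_rows):
    """Chao2 richness estimate from a units-by-species 0/1 incidence matrix."""
    m = len(incidence_rows)                       # number of sampling units
    n_species = len(incidence_rows[0])
    # Number of sampling units in which each species was detected.
    freq = [sum(row[j] for row in incidence_rows) for j in range(n_species)]
    s_obs = sum(1 for f in freq if f > 0)         # observed richness
    q1 = sum(1 for f in freq if f == 1)           # species in exactly one unit
    q2 = sum(1 for f in freq if f == 2)           # species in exactly two units
    correction = (m - 1) / m
    if q2 > 0:
        return s_obs + correction * q1 ** 2 / (2 * q2)
    # Bias-corrected form for the case with no species in exactly two units.
    return s_obs + correction * q1 * (q1 - 1) / 2

# Hypothetical sub-basin: 4 lakes (rows) by 5 species (columns).
lakes_by_species = [
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]

estimate = chao2(lakes_by_species)
print(estimate)
```

An estimate close to the observed richness suggests that sampling within the sub-basin captured most species, whereas a much larger estimate points to undersampling.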
R code to run log–log linear mixed-effects models (LMM) and generalized linear mixed-effects models (GLMM) with a Poisson distribution, which were used to identify associations between fish species richness and climatic, geographic, physical, and historical variables:
- Project1_SpeciesRichnessModels_AHI_EU_Clean.R
- Project1_RichnessModelling_Data_ONandEU.csv
- Project1_SpeciesRichnessModels_AHI_Clean.R
- Project1_RichnessModelling_Data_ON.csv
- Project1_SpeciesRichnessModels_EU_Clean.R
- Project1_RichnessModelling_Data_EU.csv
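The general structure of the two model families named above can be written out as follows. This is a generic sketch and not the authors' fitted specification: it assumes a random intercept $u_j$ for sub-basin $j$, shows only lake area $A_{ij}$ on the log scale, and collects the remaining predictors (elevation deviation, temperature, lake age, depth, latitude, longitude) in $x_{k,ij}$. Which predictors are transformed, and how zero richness values are handled, is not specified here.

```latex
% Log-log LMM for richness S_{ij} of lake i in sub-basin j
\log S_{ij} = \beta_0 + \beta_1 \log A_{ij} + \sum_{k} \beta_k x_{k,ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Poisson GLMM with log link
S_{ij} \sim \mathrm{Poisson}(\lambda_{ij}),
\qquad \log \lambda_{ij} = \beta_0 + \beta_1 \log A_{ij} + \sum_{k} \beta_k x_{k,ij} + u_j
```

In both forms $\beta_1$ plays the role of the slope of the species-area relationship, and $\sigma_u^2$ captures variation among sub-basins that the fixed effects do not explain.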
