
Datasets relating (i) A wetland fish multimetric index to variation in agricultural stress among Laurentian Great Lakes coastal wetlands, (ii) Cyanobacteria biomass to total phosphorus concentrations among Canadian lakes

Cite this dataset

Tomal, Jabed; Ciborowski, Jan (2020). Datasets relating (i) A wetland fish multimetric index to variation in agricultural stress among Laurentian Great Lakes coastal wetlands, (ii) Cyanobacteria biomass to total phosphorus concentrations among Canadian lakes [Dataset]. Dryad.


We present two datasets relating biological responses to environmental stressors. In the first dataset, the biological response is a fish multimetric index of community health and the environmental stress variable is agricultural stress in watersheds draining into the Laurentian Great Lakes. In the second dataset, the biological response is cyanobacterial biomass and the environmental stress variable is total phosphorus concentration in Canadian lakes.


Relating wetland fish multimetric index to variation in agricultural stress among Laurentian Great Lakes coastal wetlands:

The first dataset concerns estimating threshold effects of agricultural activity in watersheds draining into the Laurentian Great Lakes on scores of a multimetric index of fish community composition in the bordering coastal wetlands (Bhagat et al., 2007). Run-off associated with agriculture is a major source of human-induced disturbance contributing to the loss of natural habitat for fishes. Danz et al. (2005) used GIS-based data to derive a composite agricultural stress index (AG) that characterizes the risk of degradation of natural habitat. The measure of biological condition is a wetland fish multimetric index (MMI), which represents the inferred health of the fish assemblage in an ecoregion or watershed. Uzarski et al. (2005) developed the fish multimetric index, and Bhagat et al. (2007) validated it by assessing fish assemblages in stands of bulrush (Schoenoplectus spp.) in 30 coastal wetlands distributed across the US Great Lakes coast. MMI scores range from 0 to 100, with larger scores representing greater ecological health of the fish assemblage. Traditionally, MMI scores falling in the lowest and highest quintiles are classified as “degraded” and “excellent” conditions, respectively. Bhagat et al. (2007) observed a statistically significant negative linear association between fish MMI and AG scores and suggested the presence of threshold responses, but they did not quantitatively test for breakpoints. Tomal and Ciborowski (2020) derived ecological models to test for the presence of two environmental breakpoints. Dr. Jan Ciborowski, a coauthor of Bhagat et al. (2007) and the principal investigator, provided the dataset (shown in the top panel of Figure 3 of Bhagat et al., 2007). Tomal and Ciborowski (2020) rescaled AG (a PCA score) to a 0 to 1 range, with larger values reflecting more extensive agricultural activity.
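The rescaling of AG onto a 0 to 1 range can be illustrated with a short Python sketch. It assumes a plain min-max transformation, which the description above does not confirm, and the input scores are hypothetical values chosen for illustration rather than entries from DataS1.csv. The function name `rescale_unit_interval` is likewise introduced here only for the example.

```python
def rescale_unit_interval(values):
    """Min-max rescale a sequence of numbers onto [0, 1].

    Assumption: the 0-to-1 rescaling of AG is a min-max transform.
    The dataset description does not state the exact method.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical PCA scores, for illustration only (not from DataS1.csv).
ag_pca_scores = [-2.0, -0.5, 0.0, 1.0, 3.0]
ag_scaled = rescale_unit_interval(ag_pca_scores)
print(ag_scaled)
```

Under this transform the smallest score maps to 0 and the largest to 1, so larger rescaled values still correspond to more extensive agricultural activity.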

Relating cyanobacteria biomass to total phosphorus concentrations among Canadian lakes:

The second dataset concerns identifying putative threshold effects of total phosphorus (TP) on the risk of development of harmful algal blooms (dominated by toxigenic Cyanobacteria) in Canadian lakes (Beaulieu et al., 2014). TP is a limiting nutrient whose loads to lakes and rivers reflect contributions of sewage from urban centres, agricultural runoff, and other manifestations of human activity. Cyanobacteria biomass (CB) per unit volume is a standard index of concentration and is often used as a proxy for the risk of toxicity of harmful algal blooms. Cyanobacteria blooms are manifestations of eutrophication whose prevalence is increasing globally. CB harbours compounds that can be acutely toxic and that are linked to diseases such as carcinoma. Thus, CB is directly related to risks to human and animal health. Opinion on the shape of the relationship between TP and CB varies. TP is arguably one of the top single predictors of CB (Chlorophyll a), and empirically derived linear models are widely used in lake management. However, sigmoidal relationships between TP and CB are also well documented. Using linear regression, nonlinear regression, and mixed-effects models, Beaulieu et al. (2014) concluded that linear models explained the data pattern better than nonlinear approaches. Yet scatterplots (the top-left panel of Figure 3 of Beaulieu et al., 2014) appear to indicate discontinuities in the TP-CB relationship. Tomal and Ciborowski (2020) used a piecewise-linear quantile regression model to estimate two environmental thresholds of TP on CB. The dataset is obtained from, where the original data were collected by the Ministries of the Environment of Alberta (43 lakes), British Columbia (10 lakes) and Ontario (97 lakes) relating to CB and TP concentrations. Tomal and Ciborowski (2020) extracted, aggregated, and scaled (using the base-10 logarithm) the cyanobacterial biomass and total phosphorus variables.
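The aggregation and base-10 log scaling step might look roughly like the Python sketch below. The description does not say how repeated measurements were aggregated, so averaging per lake is an assumption made purely for illustration. The function name `log10_lake_means`, the lake names, and the sample values are also hypothetical.

```python
import math
from collections import defaultdict


def log10_lake_means(records):
    """Average CB and TP per lake, then take base-10 logarithms.

    `records` is an iterable of (lake, cb, tp) tuples with cb, tp > 0.
    Assumption: aggregation is a per-lake arithmetic mean; the source
    only says the variables were "aggregated".
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # cb total, tp total, count
    for lake, cb, tp in records:
        acc = sums[lake]
        acc[0] += cb
        acc[1] += tp
        acc[2] += 1
    return {
        lake: (math.log10(cb_sum / n), math.log10(tp_sum / n))
        for lake, (cb_sum, tp_sum, n) in sums.items()
    }


# Hypothetical samples in µg/L, for illustration only.
samples = [
    ("Lake A", 50.0, 8.0),
    ("Lake A", 150.0, 12.0),
    ("Lake B", 1000.0, 100.0),
]
result = log10_lake_means(samples)
print(result)
```

Each lake ends up with one (log10 CB, log10 TP) pair, which is the shape of data a TP-CB threshold model would take as input.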


  1. Beaulieu, M., F. Pick, M. Palmer, S. Watson, J. Winter, R. Zurawell, and I. Gregory-Eaves (2014). Comparing predictive cyanobacterial models from temperate regions. Canadian Journal of Fisheries and Aquatic Sciences 71 (12), 1830-1839.
  2. Bhagat, Y., J. Ciborowski, L. Johnson, D. Uzarski, T. Burton, S. Timmermans, and M. Cooper (2007). Testing a fish index of biotic integrity for responses to different stressors in Great Lakes coastal wetlands. Journal of Great Lakes Research 33, 224-235.
  3. Danz, N., R. Regal, G. Niemi, V. Brady, T. Hollenhorst, L. Johnson, G. Host, J. Hanowski, C. Johnston, T. Brown, J. Kingston, and J. Kelly (2005). Environmentally stratified sampling design for the development of Great Lakes environmental indicators. Environmental Monitoring and Assessment 102 (1), 41-65.
  4. Tomal, J., and J. Ciborowski (2020). Ecological models for estimating breakpoints and prediction intervals. Ecology and Evolution, in press.
  5. Uzarski, D., T. Burton, M. Cooper, J. Ingram, and S. Timmermans (2005). Fish habitat use within and across wetland classes in coastal wetlands of the five Great Lakes: Development of a fish-based index of biotic integrity. Journal of Great Lakes Research 31, 171-187.

Usage notes

Tomal and Ciborowski (2020) used the fish multimetric index versus agricultural stress dataset and the cyanobacterial biomass versus total phosphorus dataset to estimate two environmental thresholds, using a piecewise-linear regression model and a piecewise-linear quantile regression model. The datasets can also be used to explore other forms of linear and non-linear relationships between the biological responses and the environmental stress variables. Both datasets are provided in CSV format (DataS1.csv and DataS2.csv). R code to read the datasets is provided in the DataS1.R and DataS2.R files. The R code to run the models in Tomal and Ciborowski (2020) can be obtained by emailing the corresponding author Dr. Jabed Tomal at
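To make the idea of two thresholds concrete, here is a generic Python sketch of a continuous piecewise-linear least-squares fit with two breakpoints, found by grid search over candidate breakpoint pairs. It is not the model of Tomal and Ciborowski (2020), whose code is available only from the authors. The hinge parameterisation, the function names, the candidate grid, and the synthetic noise-free data are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(x, c1, c2):
    """Hinge basis: intercept, slope, and a slope change at each breakpoint."""
    return [1.0, x, max(0.0, x - c1), max(0.0, x - c2)]


def fit_for_breakpoints(xs, ys, c1, c2):
    """Least-squares fit for fixed breakpoints; returns (coefficients, SSE)."""
    rows = [design_row(x, c1, c2) for x in xs]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    sse = sum(
        (y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(rows, ys)
    )
    return beta, sse


def fit_two_breakpoints(xs, ys, candidates):
    """Grid search over ordered breakpoint pairs, keeping the lowest SSE."""
    best = None
    for i, c1 in enumerate(candidates):
        for c2 in candidates[i + 1:]:
            try:
                beta, sse = fit_for_breakpoints(xs, ys, c1, c2)
            except ValueError:
                continue  # skip pairs that give a rank-deficient design
            if best is None or sse < best[0]:
                best = (sse, c1, c2, beta)
    return best


# Synthetic, noise-free response: flat, then declining, then flat again.
xs = [i / 20 for i in range(21)]
ys = [80.0 - 100.0 * max(0.0, x - 0.3) + 100.0 * max(0.0, x - 0.7) for x in xs]
candidates = [k / 10 for k in range(1, 10)]
sse, c1, c2, beta = fit_two_breakpoints(xs, ys, candidates)
print(c1, c2)
```

A quantile-regression variant would replace the squared-error criterion with the check (pinball) loss at a chosen quantile, which generally needs a linear-programming or other dedicated solver rather than the normal equations used here.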

The DataS1.csv dataset contains three variables: (1) Fish_MMI - Fish multimetric index of community health; (2) AG - Agricultural stress; and (3) Type - coded U (Uzarski) or GLEI (Great Lakes environmental indicators).

The DataS2.csv dataset contains four variables: (1) CB - Cyanobacterial biomass (µg/L); (2) TP - Total phosphorus (µg/L); (3) Lake - Name of the lake; and (4) Province - Province code: AB (Alberta), BC (British Columbia), or ON (Ontario).
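For readers working outside R, the files could be read along the following lines in Python. This sketch assumes each CSV has a header row whose column names match the variable names listed above exactly, which has not been checked against the files. The helper `read_dataset` and the in-memory demo rows are hypothetical and are not data from the repository.

```python
import csv
import io

# Column names as listed in the usage notes (assumed to match the CSV headers).
EXPECTED_COLUMNS = {
    "DataS1.csv": ["Fish_MMI", "AG", "Type"],
    "DataS2.csv": ["CB", "TP", "Lake", "Province"],
}


def read_dataset(handle, filename):
    """Read rows from an open CSV handle and check the expected columns exist."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS[filename] if c not in header]
    if missing:
        raise ValueError(f"{filename} is missing columns: {missing}")
    return list(reader)


# In-memory stand-in with made-up values, so the sketch runs without the files.
demo = io.StringIO("Fish_MMI,AG,Type\n62.5,0.18,U\n31.0,0.74,GLEI\n")
rows = read_dataset(demo, "DataS1.csv")
fish_mmi = [float(r["Fish_MMI"]) for r in rows]
print(fish_mmi)

# With the real file on disk, the call would be:
# with open("DataS1.csv", newline="") as f:
#     rows = read_dataset(f, "DataS1.csv")
```

Checking the header up front gives a clear error if the column names in the downloaded files differ from those in the description.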


Natural Sciences and Engineering Research Council