Data from: Warming conditions reduce the impacts of an aquatic invasive macrophyte across a latitudinal gradient
Data files
Sep 19, 2025 version files 16.26 MB
-
Muthukrishnan_and_Kalinowski_2025_Journal_of_Ecology_dataset.zip
16.25 MB
-
README.md
13.56 KB
Abstract
This study evaluates the dynamics of invasions by the invasive freshwater macroalga Nitellopsis obtusa in lakes across the upper Midwest of the US. Using interannual variability and a latitudinal gradient as a proxy for future climate change, we assess the interaction between two global change stressors: climate change and biological invasions. This dataset includes all underlying data and code used for all analyses presented in the article and for the creation of all data figures. The data include sampling locations, field data on plant community composition and abundance, and climate data for sampling sites used to quantify annual conditions. All analysis code is written in the R statistical programming language.
Dataset DOI: 10.5061/dryad.qrfj6q5vf
Description of the data and file structure
This data package accompanies the Journal of Ecology paper:
Muthukrishnan, R. and Kalinowski, C. 2025. Warming conditions reduce the impacts of an aquatic invasive macrophyte across a latitudinal gradient
Files and variables
File: Muthukrishnan_and_Kalinowski_2025_Journal_of_Ecology_dataset.zip
Description:
The compressed archive (Muthukrishnan_and_Kalinowski_2025_Journal_of_Ecology_dataset.zip) contains a series of data files and R scripts that load and process all relevant data, run the analyses, and produce the statistical results and figures reported in the associated manuscript. When expanded, the archive creates a file structure with all analysis scripts in the "src" subfolder, data files in the "data" subfolder, and additional "graphs" and "outputs" subfolders. The "graphs" and "outputs" subfolders are initially empty; figures and tables produced by the analysis scripts are saved there. The R working directory must be set to the main "Archive" folder for the scripts to work correctly; beyond that, the scripts locate all files themselves.
To run the analyses, open (or run via source) the file "Starry_stonewort_climate_analysis_manager.R" in the "src" folder. This file loads all required packages (listed below; the user must install them locally before running the script), then runs the associated scripts for data loading, processing, and analysis. All figures and tables in the manuscript are saved to the "graphs" and "outputs" subfolders; other relevant analyses are printed to the console or graphics window.
This script was developed and tested on R version 4.3.0.
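The run procedure described above can be sketched as follows. This is a minimal sketch and not part of the archive: the helper name `run_analysis` and its `archive_dir` argument are hypothetical, and it assumes the expanded archive contains the "src" subfolder and the manager script named in this README.

```r
# Hypothetical helper (not part of the archive): run the manager script with
# the working directory set to the main folder of the expanded archive.
run_analysis <- function(archive_dir) {
  manager <- file.path("src", "Starry_stonewort_climate_analysis_manager.R")
  if (!file.exists(file.path(archive_dir, manager))) {
    stop("Manager script not found under: ", archive_dir)
  }
  old_wd <- setwd(archive_dir)        # scripts resolve file paths from this folder
  on.exit(setwd(old_wd), add = TRUE)  # restore the previous working directory
  source(manager)                     # loads packages, then runs all analyses
  invisible(TRUE)
}

# Usage (the path is a placeholder for wherever the archive was expanded):
# run_analysis("path/to/Archive")
```

Restoring the previous working directory on exit keeps the call from changing the state of the R session.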
R Scripts (in the src subfolder):
Starry_stonewort_climate_analysis_manager.R
Main analysis file. This is the only script that needs to be run to recreate analyses. It will load required packages and then call additional helper and analysis scripts which will load data files, run analyses, and plot figures.
Confidence_interval_plot.R
A helper script with a function to calculate confidence intervals for predictions of linear models
Climate_data_setup.R
This script loads the climate data and processes it so that it can be linked to the sampling data and used in individual analyses.
Sampling_sites_map.R
This script plots a map of all sampling sites. It also plots maps of each monitoring lake, with points overlaid to indicate the locations of the monitoring transects.
Transect_analysis.R
This script processes and analyzes benthic cover and diversity data collected from monitoring transects, including calculating year-to-year changes in the community at each meter location along the transects.
Annual_biomass_analysis.R
This script processes and analyzes biomass samples taken during peak biomass across multiple years
Phenology_analysis.R
This script processes and analyzes biomass data collected across the growing season to evaluate phenological patterns. This also includes analysis of historical data and calculation of date of peak biomass.
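As an illustration of the kind of calculation the confidence interval helper script is described as performing, confidence intervals for linear model predictions can be obtained in base R with `predict()`. The sketch below shows the general approach only and is not the contents of Confidence_interval_plot.R; the function name `lm_confidence_band` and the toy data are hypothetical.

```r
# Hypothetical helper: predictions from a fitted lm with confidence limits.
lm_confidence_band <- function(fit, newdata, level = 0.95) {
  ci <- predict(fit, newdata = newdata, interval = "confidence", level = level)
  cbind(newdata, as.data.frame(ci))  # adds columns fit, lwr, upr
}

# Toy data for illustration only (not taken from the dataset).
toy <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9)
)
fit <- lm(y ~ x, data = toy)
band <- lm_confidence_band(fit, data.frame(x = seq(1, 10, by = 0.5)))
head(band)
```

The `lwr` and `upr` columns can then be drawn around the fitted line, for example with `lines()` or `polygon()`.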
Data files (in the data subfolder):
Most data files store tabular data (biomass, benthic cover, or temperature data) for individual sites collected at particular times. These files are saved as .csv files and are described below. One file, "lake_OSM_data.rdata", stores more complex data, including polygon geometries for lakes, so it is saved as a .rdata file that can be loaded directly in R.
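For readers who want to inspect individual files without running the full analysis, the two file types can be read as sketched below. The helper names are hypothetical and are not taken from the archive scripts. Because this README does not name the objects stored in the .rdata file, the sketch loads them into a separate environment so they can be listed rather than guessed.

```r
# Hypothetical helper: read one tabular file from the "data" subfolder.
read_archive_csv <- function(data_dir, file_name) {
  read.csv(file.path(data_dir, file_name), stringsAsFactors = FALSE)
}

# Hypothetical helper: load an .rdata file into its own environment and
# return that environment, so stored object names can be inspected with ls().
load_archive_rdata <- function(data_dir, file_name = "lake_OSM_data.rdata") {
  env <- new.env()
  load(file.path(data_dir, file_name), envir = env)
  env
}

# Usage, with the working directory set to the main archive folder:
# phen <- read_archive_csv("data", "all_year_phenology_data.csv")
# osm  <- load_archive_rdata("data")
# ls(osm)  # names of the objects stored in the .rdata file
```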
all_year_phenology_data.csv
Tabular data of aquatic plant biomass collected at multiple timepoints across the year to quantify phenological patterns
Column A: state = US state of sampling location ("Minnesota", "Wisconsin", or "Indiana"; categorical)
Column B: lake = Lake name of sampling location ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little_Muskego", "Pike", "Winnibigoshish", "Detroit_River"; categorical)
Column C: transect = Transect name of sampling location (varied; categorical)
Column D: sample = Sample number of sampling location (1-4; integer)
Column E: rect_date = Rectified sampling date (date values; ordinal)
Column F: year = Sampling year (2017-2021; integer)
Column G: Nitellopsis_obtusa = biomass of Nitellopsis obtusa in grams (0-7165; continuous)
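One way to estimate a date of peak biomass from a table with these columns is sketched below. This is an assumption about a reasonable approach, not the method implemented in Phenology_analysis.R. The function name `peak_biomass_date`, the assumed ISO date format, and the toy values are all hypothetical; check the format of rect_date in the actual file and adjust `date_format` as needed.

```r
# Hypothetical helper: for each lake and year, average biomass across samples
# on each sampling date and return the date with the largest mean.
peak_biomass_date <- function(phen, date_format = "%Y-%m-%d") {
  means <- aggregate(
    phen$Nitellopsis_obtusa,
    by = list(lake = phen$lake, year = phen$year,
              rect_date = as.character(phen$rect_date)),
    FUN = mean
  )
  names(means)[names(means) == "x"] <- "mean_biomass"
  groups <- split(means, list(means$lake, means$year), drop = TRUE)
  peaks <- do.call(rbind, lapply(groups, function(g) g[which.max(g$mean_biomass), ]))
  rownames(peaks) <- NULL
  peaks$rect_date <- as.Date(peaks$rect_date, format = date_format)  # assumed format
  peaks
}

# Toy values for illustration only (not taken from the dataset).
toy_phen <- data.frame(
  lake = "Wall", year = 2019,
  rect_date = rep(c("2019-06-01", "2019-08-01", "2019-10-01"), each = 2),
  Nitellopsis_obtusa = c(10, 20, 100, 200, 50, 50)
)
peaks <- peak_biomass_date(toy_phen)
```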
Climate_data_raw.csv
Tabular data of aquatic plant benthic cover collected along monitoring transects
Column A: meter_mark = meter location along sampling transect (1-30; integer)
Column B: state = US state of sampling location ("Minnesota", "Wisconsin", or "Indiana"; categorical)
Column C: lake = Lake name of sampling location ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little_Muskego", "Pike", "Winnibigoshish"; categorical)
Column D: transect = Transect name of sampling location ("T1", "T2", "T3", "T4", "T5"; categorical)
Column E: transect_ID_name = unique transect location name based on lake, transect and meter (varied, ex. "WallT1-1"; categorical)
Column F: date = sampling date (date values; ordinal)
Column G: year = Sampling year (2019-2021; integer)
Column H: run = which of multiple sampling rounds during the year that the data was collected (1-5; integer)
Column I: Nitellopsis_obtusa = Percent cover of Nitellopsis obtusa (0-100; continuous)
Column J: bare = Percent cover of bare ground (0-100; continuous)
Column K: Chara_species = Percent cover of Chara species (0-100; continuous)
Column L: Stuckenia_pectinata = Percent cover of Stuckenia pectinata (0-100; continuous)
Column M: Myriophyllum_sibiricum = Percent cover of Myriophyllum sibiricum (0-100; continuous)
Column N: Myriophyllum_spicatum = Percent cover of Myriophyllum spicatum (0-100; continuous)
Column O: Ceratophyllum_demersum = Percent cover of Ceratophyllum demersum (0-100; continuous)
Column P: Potamogeton_crispus = Percent cover of Potamogeton crispus (0-100; continuous)
Column Q: Potamogeton_richardsonii = Percent cover of Potamogeton richardsonii (0-100; continuous)
Column R: Potamogeton_praelongus = Percent cover of Potamogeton praelongus (0-100; continuous)
Column S: Potamogeton_friesii = Percent cover of Potamogeton friesii (0-100; continuous)
Column T: Potamogeton_pusillus = Percent cover of Potamogeton pusillus (0-100; continuous)
Column U: Potamogeton_gramineus = Percent cover of Potamogeton gramineus (0-100; continuous)
Column V: Potamogeton_illinoensis = Percent cover of Potamogeton illinoensis (0-100; continuous)
Column W: Potamogeton_amplifolius = Percent cover of Potamogeton amplifolius (0-100; continuous)
Column X: Potamogeton_zosteriformis = Percent cover of Potamogeton zosteriformis (0-100; continuous)
Column Y: Potamogeton_obtusifolius = Percent cover of Potamogeton obtusifolius (0-100; continuous)
Column Z: Potamogeton_strictifolius = Percent cover of Potamogeton strictifolius (0-100; continuous)
Column AA: Potamogeton_foliosus = Percent cover of Potamogeton foliosus (0-100; continuous)
Column AB: Elodea_canadensis = Percent cover of Elodea canadensis (0-100; continuous)
Column AC: Elodea_nuttallii = Percent cover of Elodea nuttallii (0-100; continuous)
Column AD: Vallisneria_americana = Percent cover of Vallisneria americana (0-100; continuous)
Column AE: Bidens_beckii = Percent cover of Bidens beckii (0-100; continuous)
Column AF: Utricularia_macrorhiza = Percent cover of Utricularia macrorhiza (0-100; continuous)
Column AG: Heteranthera_dubia = Percent cover of Heteranthera dubia (0-100; continuous)
Column AH: Najas_guadalupensis = Percent cover of Najas guadalupensis (0-100; continuous)
Column AI: Najas_marina = Percent cover of Najas marina (0-100; continuous)
Column AJ: Najas_flexilis = Percent cover of Najas flexilis (0-100; continuous)
Column AK: Lemna_trisulca = Percent cover of Lemna trisulca (0-100; continuous)
Column AL: Potamogeton_nodosus = Percent cover of Potamogeton nodosus (0-100; continuous)
Column AM: Potamogeton_natans = Percent cover of Potamogeton natans (0-100; continuous)
Column AN: Nuphar_variegata = Percent cover of Nuphar variegata (0-100; continuous)
Column AO: Nymphaea_odorata = Percent cover of Nymphaea odorata (0-100; continuous)
Column AP: Sagittaria_graminea = Percent cover of Sagittaria graminea (0-100; continuous)
Column AQ: Hippuris_vulgaris = Percent cover of Hippuris vulgaris (0-100; continuous)
Column AR: Filamentous_algae = Percent cover of Filamentous algae (0-100; continuous)
Column AS: reeds = Percent cover of reeds (0-100; continuous)
Column AT: Total_native = Total percent cover of native species as sum of all individual native species (0-160.1; continuous)
Column AU: notes = Notes about individual data points (varied; text)
Column AV: missing = Flag for missing data points ("Missing"; categorical)
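Year-to-year change in cover at each meter location can be computed from these columns roughly as sketched below. This is an illustrative sketch, not the code in Transect_analysis.R. The function name `cover_change` is hypothetical, and averaging across sampling runs within a year is an assumption made here for simplicity.

```r
# Hypothetical helper: change in mean annual percent cover between consecutive
# years for each unique transect location (transect_ID_name).
cover_change <- function(cover, species = "Nitellopsis_obtusa") {
  yearly <- aggregate(
    cover[[species]],
    by = list(transect_ID_name = cover$transect_ID_name, year = cover$year),
    FUN = mean, na.rm = TRUE  # assumption: average over runs within a year
  )
  names(yearly)[3] <- "cover"
  nxt <- yearly
  nxt$year <- nxt$year - 1    # shift year t + 1 onto year t for the join
  names(nxt)[3] <- "cover_next"
  both <- merge(yearly, nxt, by = c("transect_ID_name", "year"))
  both$change <- both$cover_next - both$cover
  both
}

# Toy values for illustration only (not taken from the dataset).
toy_cover <- data.frame(
  transect_ID_name = "WallT1-1",
  year = c(2019, 2019, 2020, 2021),
  Nitellopsis_obtusa = c(10, 30, 50, 35)
)
changes <- cover_change(toy_cover)
```

Each output row describes the change from `year` to the following year, so locations sampled in only one year drop out of the result.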
Climate_gradient_peak_annual_biomass_sampling.csv
Tabular data of aquatic plant biomass collected at annual peak
Column A: year = Sampling year (2017-2021; integer)
Column B: date = sampling date (date values; ordinal)
Column C: state = US state of sampling location ("Minnesota", "Wisconsin", or "Indiana"; categorical)
Column D: lake = Lake name of sampling location ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little_Muskego", "Pike", "Winnibigoshish"; categorical)
Column E: transect = Transect name of sampling location ("T1", "T2", "T3", "T4", "T5"; categorical)
Column F: sample = Sample number of sampling location (1-4; integer)
Column G: N_obtusa_g = biomass of Nitellopsis obtusa in grams (0-9580; continuous). Note: Some entries are stored as character values; these are corrected in the associated data processing script.
Column H: Natives_g = biomass of all native species combined in grams, when collected (0-7880; continuous)
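Because N_obtusa_g contains some character entries, R may read the whole column as character. The sketch below shows one generic way to coerce such a column and list the entries that fail to convert. It is not the correction applied in the archive's processing script; the function name `coerce_biomass` and the example values are hypothetical.

```r
# Hypothetical helper: convert a column to numeric and report the distinct
# non-empty entries that could not be converted.
coerce_biomass <- function(x) {
  x_chr <- trimws(as.character(x))
  num <- suppressWarnings(as.numeric(x_chr))
  failed <- unique(x_chr[is.na(num) & !is.na(x_chr) & nzchar(x_chr)])
  list(value = num, failed = failed)
}

# Made-up example values, for illustration only.
res <- coerce_biomass(c("12.5", " 3 ", "trace", NA))
res$value   # 12.5, 3, NA, NA
res$failed  # "trace"
```

Listing the failed entries before replacing them makes it easier to decide how each one should be handled.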
climate_transect_locations.csv
Tabular data of GPS locations for all monitoring transects in individual lakes
Column A: Lake = Lake name of GPS location ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little_Muskego", "Pike", "Winnibigoshish"; categorical)
Column B: Transect = Transect name of GPS location ("T1", "T2", "T3", "T4", "T5"; categorical)
Column C: Side = Native dominated vs Starry stonewort dominated end of transect for GPS location ("N", "S"; categorical)
Column D: lat = Latitude value of GPS location in decimal degrees (numeric value; continuous)
Column E: Lon = Longitude value of GPS location in decimal degrees (numeric value; continuous)
Column F: ele = Elevation value of GPS location in meters (numeric value; continuous)
lake_NOAA_climate_data.csv
Tabular data of monthly climate records for each monitoring lake. From the NOAA NClimGrid monthly dataset
Column A: date = Date for climate record (date values; ordinal)
Column B: lake = Lake name of climate record ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little_Muskego", "Pike", "Winnibigoshish"; categorical)
Column C: year = Year of climate record (1984-2021; integer)
Column D: month = Month of climate record (1-12; integer)
Column E: season = Season of climate record ("winter", "spring", "summer", "fall"; categorical)
Column F: season_adj_year = Year of climate record adjusted to include December as the first month of winter associated with the following year (1984-2022; integer)
Column G: t_max = Monthly average of maximum daily temperature in degrees Celsius of climate record (numeric value; continuous)
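The relationship between year, month, season, and season_adj_year can be illustrated as follows. The December rule follows the column description above. The assignment of the remaining months to seasons (winter December to February, spring March to May, summer June to August, fall September to November) is an assumption, and the function name `add_season` is hypothetical.

```r
# Hypothetical helper: derive season and season-adjusted year from year and month.
add_season <- function(year, month) {
  # Assumed month-to-season mapping; only December = winter is stated in this README.
  season_by_month <- c("winter", "winter", "spring", "spring", "spring",
                       "summer", "summer", "summer", "fall", "fall", "fall",
                       "winter")
  data.frame(
    year = year,
    month = month,
    season = season_by_month[month],
    # December counts toward the winter of the following year.
    season_adj_year = ifelse(month == 12, year + 1, year)
  )
}

seasons <- add_season(year = c(2020, 2020, 2021), month = c(11, 12, 1))
```

In this example, December 2020 and January 2021 both fall in winter with season_adj_year 2021, so the two months are grouped into the same winter.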
OSM_lake_bboxes.csv
Tabular data of bounding boxes for Open Street Maps shapefiles of lakes
Column A: lake = Lake name of spatial record ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little Muskego", "Pike", "Winnibigoshish"; categorical)
Column B: x_min = Minimum x coordinate for Open Street Maps bounding box for lake in decimal degrees (numeric value; continuous)
Column C: y_min = Minimum y coordinate for Open Street Maps bounding box for lake in decimal degrees (numeric value; continuous)
Column D: x_max = Maximum x coordinate for Open Street Maps bounding box for lake in decimal degrees (numeric value; continuous)
Column E: y_max = Maximum y coordinate for Open Street Maps bounding box for lake in decimal degrees (numeric value; continuous)
Column F: polygon_num = Index value of appropriate polygon for lake shapefile in associated OSM data file (1-7; integer)
Column G: poly_or_multipoly = Code for whether lake shapefile in associated OSM data file is a polygon or multipolygon type ("poly", "multipoly"; categorical)
SSW_monitoring_lake_coordinates.csv
Tabular data of monitoring lake coordinates
Column A: lake = Lake name of spatial record ("Koronis", "Moose", "Wall", "Crooked", "Syracuse", "Wind", "Little Muskego", "Pike", "Winnibigoshish"; categorical)
Column B: longitude = Longitude value of lake location centroid in decimal degrees (numeric value; continuous)
Column C: latitude = Latitude value of lake location centroid in decimal degrees (numeric value; continuous)
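The lake names listed for the two spatial files use "Little Muskego" with a space, while the sampling files list "Little_Muskego" with an underscore. If the files themselves differ in this way, lake names need to be harmonized before joining tables by lake. The sketch below shows one way to do that; the function name `normalize_lake` is hypothetical and all values are placeholders rather than real records.

```r
# Hypothetical helper: replace spaces with underscores in lake names.
normalize_lake <- function(x) gsub(" ", "_", trimws(x), fixed = TRUE)

# Placeholder values for illustration only (not real coordinates or biomass).
coords <- data.frame(
  lake = c("Little Muskego", "Pike"),
  longitude = c(-90, -91),
  latitude = c(45, 46)
)
biomass <- data.frame(
  lake = c("Little_Muskego", "Pike"),
  N_obtusa_g = c(10, 20)
)

coords$lake <- normalize_lake(coords$lake)
joined <- merge(biomass, coords, by = "lake")
```

Without the normalization step, `merge()` would silently drop the lake whose name differs between the two tables.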
lake_OSM_data.rdata
R list with Open Street Maps shapefiles for all monitoring lakes.
Code/software
Required additional R packages
tidyverse
dplyr
colorRamps
scales
terra
sf
lme4
lmerTest
maps
arm
osmdata
ggthemes
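Before running the manager script, the package list above can be checked against the local R library as sketched below. The function name `missing_packages` is hypothetical, and the commented installation line assumes the packages are installed from CRAN.

```r
# Packages listed in this README as required by the analysis scripts.
required_pkgs <- c("tidyverse", "dplyr", "colorRamps", "scales", "terra", "sf",
                   "lme4", "lmerTest", "maps", "arm", "osmdata", "ggthemes")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

missing <- missing_packages(required_pkgs)
if (length(missing) > 0) {
  message("Not installed: ", paste(missing, collapse = ", "))
  # install.packages(missing)  # uncomment to install (assumes CRAN)
}
```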
