Data from: Interactions of wood accumulations, channel dynamics, and geomorphic heterogeneity within a river corridor
Data files
May 03, 2024 version files 27.28 KB
Abstract
Natural rivers are inherently dynamic. Spatial and temporal variations in water, sediment, and wood fluxes both cause and respond to an increase in geomorphic heterogeneity within the river corridor. We analyze 16 two-kilometer river corridor segments of the Swan River in Montana, USA to examine relationships between wood accumulations (wood accumulation distribution density, count, and persistence), channel dynamism (total sinuosity and average channel migration), and geomorphic heterogeneity (density, aggregation, interspersion, and evenness of patches in the river corridor). We hypothesize that i) more dynamic river segments correlate with a greater presence, persistence, and distribution of wood accumulations; ii) years with higher peak discharge correspond with greater channel dynamism and wood accumulations; and iii) all river corridor variables analyzed play a role in explaining river corridor spatial heterogeneity. Our results suggest that decadal-scale channel dynamism, as reflected in total sinuosity, corresponds to greater numbers of wood accumulations per surface area and greater persistence of these wood accumulations through time. Second, higher peak discharges correspond to greater values of wood distribution density, but not to greater channel dynamism. Third, persistent values of geomorphic heterogeneity, as reflected in the heterogeneity metrics of aggregation, interspersion, patch density, and evenness, are explained by potential predictor variables analyzed here. Our results reflect the complex interactions of water, sediment, and large wood in river corridors; the difficulties of interpreting causal relationships among these variables through time; and the importance of spatial and temporal analyses of past and present river processes to understand future river conditions
README: Data from: Interactions of wood accumulations, channel dynamics, and geomorphic heterogeneity within a river corridor
https://doi.org/10.5061/dryad.k0p2ngff3
All of our data was derived from publicly available data sources and is reproducible. We include the locations for data acquisition as well as scripts developed in Google Earth Engine for imagery acquisition/analyses and in R for calculation of spatial heterogeneity metrics.
Description of the data and file structure
This dataset contains a .xlsx file with the raw numerical data. There are three tabs within the .xlsx file.
Tab 1 "2013-2022" contains the data for each of our multi-year river corridor variables and each column represents a different variable. "Site" corresponds with which segment the data is associated with and does not have units; "Year" corresponds with which imagery year data is from and the units are year; "Wood_Dis_Dens" corresponds with the distribution density of wood accumulations and the number of wood accumulations/river corridor area; "Wood_Count" corresponds with the total number of wood accumulations and the units are number of logjams/segment; "Total_Sinuosity" corresponds with total sinuosity and the units are meters of active channel length/meters of valley length; "Migration" corresponds with average channel migration and the units are square meters of migration area/meters of migration length; and "Peak_Discharge" corresponds with the annual peak discharge and the units are cubic meters per seconds.
Tab 2 "Integrated time" contains the data that is not separated by year and only includes data for 2022. Once again, each column represents a different variable. In addition to the variables mentioned above, this dataset also includes the variables "beaver_meadow","sticky_sites", "aggregation", "interspersion", "patch_density", and "evenness". "beaver_meadow" corresponds with the proportion of beaver meadow area (sq. meters) to floodplain area (sq. meters); "sticky_sites" corresponds with the number of wood accumulations that persisted throughout the years of this study (count); "aggregation", "interspersion", "patch_density", and "evenness" are all spatial heterogeneity metrics and are measured as a % with the exception of patch density, which is measured as (patches/100 ha).
Tab 3 "Site Comparisons" contains data from 14 additional rivers in the U.S in addition to our site. None of the variables in this tab are new, but the column "Site" contains the names of the rivers used for comparison of heterogeneity metrics (aggregation, interspersion, patch density, and evenness).
A detailed description of how variables represented in each column were collected is included below.
For clarity, we have structured the description of data and data acquisition information by sections that correspond with the methods of the companion manuscript - Marshall et al. (2023).
Data for wood accumulations and beaver modifications
- This data was collected manually using Google Earth imagery. We conducted manual aerial wood accumulation surveys using available Google Earth imagery between 2013 and 2022 (four years of available imagery: 2013, 2016, 2020, 2022). We mapped all logjams that could be detected via the aerial imagery. Wood accumulations that were under canopy, too small for the spatial resolution of imagery, not interacting with base flows, or containing less than three visible wood pieces were not included. We recorded the number of wood accumulations per 2-km segment for each available imagery year as a minimum wood-accumulations count and divided the wood count by floodplain area for each segment to get the wood distribution density. We also noted the occurrence of persistent wood accumulations that were continually present in the Google Earth imagery, in what we refer to as “sticky sites”.
Data for channel dynamism and annual peak discharge
- To collect average channel migration data, we used a combination of Google Earth Engine and ArcGIS Pro. To measure active channel migration, we developed a semi-automated approach to map surface water extent and planimetric centerline movement (code included). Surface water extent was delineated for 2013, 2016, 2020, and 2022 to keep the timestep consistent with our wood surveys. We used publicly-available satellite imagery. Imagery collected for the National Agriculture Imagery Program (NAIP) was used when available (2013 and 2016). For 2020 and 2022, cloud-free multispectral composite images were created in Google Earth Engine from Sentinel-2 imagery from average baseflow months (August-October). Surface water was classified using the normalized difference water index (NDWI) for NAIP imagery, and modified normalized difference water index (MNDWI) in Sentinel-2 imagery. A unique threshold was empirically determined for each year to optimize the identification of the river surface while minimizing false-positive water identification, resulting in binary water and non-water masks for each year. Gaps and voids in the Sentinel-2 derived water masks (from shadow-covered areas, thin river segments, or mixed pixels along the river edge) were filled by sequentially buffering the water areas outwards by 30 meters (three pixels) and then inwards by 15 m. Similarly, gaps and voids in NAIP-derived water masks were filled using a sequential 20 m outwards then inwards buffer. The resulting binary water masks were imported into ArcGIS Pro (academic license) and vectorized. Manual adjustments were made to remove any remaining misclassified areas and join disconnected segments. We delineated centerlines of our channel masks in using the ArcGIS Pro Polygon to Centerline tool. When multiple channels were present, the dominant channel branch was chosen for the channel centerline. Consequently, our analysis represents a minimum value of channel migration during each time step because it does not include secondary channel movements. The Feature to Polygon tool was used to extract area differences between two centerlines at each segment. Areas between the centerlines for each segment were divided by centerline length to get a horizontal change distance.
- We measured total sinuosity manually in each 2-km segment for 2013, 2016, 2020, 2022 using Google Earth imagery and the built-in Measure tool in Google Earth (https://earth.google.com/web/). We measured total sinuosity as the ratio of total channel length of all active channels/valley length.
- We obtained annual peak discharge from the nearest US Geological Survey gauge (12370000, Swan River near Bigfork, MT).
Data for geomorphic heterogeneity
- Data included in the geomorphic heterogeneity remote sensing analysis included: Sentinel-2 imagery mosaic prepared in Google Earth Engine (code included), normalized difference vegetation index (NDVI) and normalized difference moisture index (NDMI) rasters calculated from the Sentinel-2 mosaic in ArcGIS Pro. The Sentinel mosaic was prepared for the approximate growing season in Montana, USA, (June 1 to October 31) based on available annual phenology activity curves (2018-2022) from the US National Phenology Network of the existence of leaves or needles on flowering plants.
- We performed an unsupervised remote sensing classification on a stack of data containing a 2022 Sentinel-2 imagery mosaic prepared in GEE, and normalized difference vegetation index (NDVI) and normalized difference moisture index (NDMI) rasters calculated from the Sentinel-2 mosaic in ArcGIS Pro. The ISO Cluster Unsupervised Classification ArcGIS Pro tool was used to perform the classification. Inputs to the tool were a maximum of 10 classes, a minimum class size of 20 pixels (tool default), and a sample interval of 10 pixels (tool default). The entire reach was classified once, and then clipped into individual 2-km segments.
Statistical analyses
- Statistical analyses were conducted in R (open source). We used an alpha (probability of rejecting the null hypothesis when the null hypothesis is true) of 0.05 in all statistical analyses.
- To understand the influence of time on our river corridor variables, we examined whether there was significant variation in the medians of river corridor variables between timesteps using a Kruskal-Wallis Rank Sum test. For any variable where there was a significant change between timesteps, we used a Dunn Test to determine exactly which groups were different. We also conducted the same exploratory statistical analysis to understand whether there was any significant variation in medians for each variable between segments.
- To understand the predictors of wood distribution density and wood count (hypothesis i), we ran a multiple regression model with total sinuosity, channel migration, and peak annual discharge as predictor variables. We performed a stepwise model selection with the stepAIC function from the MASS package to provide a relative indication of the quality of statistical model for our given set of data.
- To address hypotheses i and ii, we calculated both Pearson (r) and Kendall (τ) correlation coefficients. Given our small sample, we report both r and τ values. All correlation coefficients were calculated using the cor.test() function in base R. Hypothesis iii was further addressed through multiple linear regression models and stepwise model selection to determine whether proportion of beaver meadows, sticky sites, total sinuosity, channel migration, wood distribution density, or wood count are clear predictors of spatial heterogeneity metrics. We only used data from 2022 to keep a consistent timestep across all variables. Response variables include four spatial heterogeneity metrics: aggregation, interspersion, density, and evenness.
All calculated river corridor metrics are included as an .xlsx file.
Sharing/Access information
Publicly-available data used in this study can be found in the following locations:
NAIP and Sentinel-2 imagery - Google Earth Engine Code (see links in the Code/Software section)
Google Earth imagery- https://earth.google.com/web/
USGS gage data- https://waterdata.usgs.gov/monitoring-location/12370000/#parameterCode=00065&period=P7D&showMedian=true
Code/Software
Software Used
Google Earth Engine: https://code.earthengine.google.com/
R (version 4.2.3): open-source and available for download: https://www.r-project.org/
ArcGIS Pro (version 3.1): academic license needed
Google Earth Engine Code:
Channel migration and surface water mapping: https://code.earthengine.google.com/572a187e196bcd3d32fdb6b7d38003f9
Geomorphic heterogeneity imagery acquisition: https://code.earthengine.google.com/fa6632e24a67a5ba8c9d6e44f92016be
R
Code used to calculate heterogeneity metrics: uploaded as Swan_River_Analysis_landscapemetrics.R
Methods
This data was collected using field and remote sensing methods.
To provide spatial context for the measurements of wood distributions, geomorphic heterogeneity, and channel dynamism along our 32-km study reach, we segmented the study reach at uniform 2-km intervals prior to data collection. The downstream-most 8 segments were selected based on the naturalness of the river corridor and the presence of abundant large wood accumulations in the active channel(s). We focused on these segments for ground-based measurements. We subsequently expanded analyses to include an additional eight upstream segments. These segments were included because of anecdotal evidence of at least localized timber harvest in the river corridor, bank stabilization, and large wood removal from the active channel. We included these sites to provide a greater range of values within some of the variables analyzed and thus potentially increase the power of our statistical analyses.
Wood accumulations and beaver modifications
We conducted aerial wood accumulation surveys using available Google Earth imagery between 2013 and 2022 (four years of available imagery: 2013, 2016, 2020, 2022). We mapped all logjams that could be detected via the aerial imagery. Wood accumulations that were under canopy, too small for the spatial resolution of imagery, not interacting with base flows, or containing less than three visible wood pieces were not included. We recorded the number of wood accumulations per 2-km segment for each available imagery year as a minimum wood-accumulations count and divided the wood count by floodplain area for each segment to get the wood distribution density. We also noted the occurrence of persistent wood accumulations that were continually present in the Google Earth imagery, in what we refer to as “sticky sites”. GPS coordinates of wood accumulations were collected in the field during August 2022 to verify imagery identification.
We also manually identified active and remnant beaver meadows using Google Earth. Similar to large wood, American beaver (Castor canadensis) both respond to spatial heterogeneity in the river corridor (e.g., preferentially damming secondary channels) and create spatial heterogeneity through their ecosystem modifications. Beaver-modified portions of the river corridor (beaver meadows) were identified based on presence of standing water in ponds with a visible berm (beaver dam); different vegetation (wetland vegetation including rushes, sedges, and willow carrs that appear as a lighter green color in imagery) than adjacent floodplain areas; and detectable active or relict beaver dams (linear berms with different vegetation than adjacent areas). Several of the sites identified in imagery were also visited in the field to verify identification.
Channel dynamism and annual peak discharge
Channel dynamism was quantified using metrics of active channel migration and total sinuosity over time. To measure active channel migration, we developed a semi-automated approach to map surface water extent and planimetric centerline movement, which are commonly used to understand morphological evolution in rivers. We followed existing methodologies using base flow conditions as a conservative delineation of planimetric change given our goal of looking at relative channel change over time to understand which segments of our study area were the most dynamic.
Surface water extent was delineated for 2013, 2016, 2020, and 2022 to keep the timestep consistent with our wood surveys. Imagery collected for the National Agriculture Imagery Program (NAIP) was used when available (2013 and 2016). For 2020 and 2022, cloud-free multispectral composite images were created in Google Earth Engine (GEE) from Sentinel-2 imagery from average baseflow months (August-October). Surface water was classified using the normalized difference water index (NDWI) (Gao, 1996) for NAIP imagery, and modified normalized difference water index (MNDWI) in Sentinel-2 imagery. A unique threshold was empirically determined for each year to optimize the identification of the river surface while minimizing false-positive water identification, resulting in binary water and non-water masks for each year. Gaps and voids in the Sentinel-2 derived water masks (from shadow-covered areas, thin river segments, or mixed pixels along the river edge) were filled by sequentially buffering the water areas outwards by 30 meters (three pixels) and then inwards by 15 m. Similarly, gaps and voids in NAIP-derived water masks were filled using a sequential 20 m outwards then inwards buffer. The resulting binary water masks were imported into ArcGIS Pro and vectorized. Manual adjustments were made to remove any remaining misclassified areas and join disconnected segments.
We delineated centerlines of our channel masks using the ArcGIS Pro Polygon to Centerline tool. When multiple channels were present, the dominant channel branch was chosen for the channel centerline. Consequently, our analysis represents a minimum value of channel migration during each time step because it does not include secondary channel movements. The Feature to Polygon tool was used to extract area differences between two centerlines at each segment. Areas between the centerlines for each segment were divided by centerline length to get a horizontal change distance.
We measured total sinuosity in each 2-km segment for 2013, 2016, 2020, and 2022 using Google Earth imagery and the built-in Measure tool in Google Earth. We measured total sinuosity as the ratio of total channel length of all active channels/valley length.
We obtained annual peak discharge from the nearest US Geological Survey gauge (12370000, Swan River near Bigfork, MT). This site is below Swan Lake, a natural lake, into which the Swan River in our study area flows. Consequently, the gauge records reflect relative inter-annual fluctuations in peak discharge, but not actual discharge at the study site. We used annual peak discharge for the same time intervals used for analyzing channel position.
Geomorphic heterogeneity
We performed an unsupervised remote sensing classification on a stack of data containing a 2022 Sentinel-2 imagery mosaic prepared in GEE, and normalized difference vegetation index (NDVI) and normalized difference moisture index (NDMI) rasters calculated from the Sentinel-2 mosaic in ArcGIS Pro. The Sentinel mosaic was prepared for the approximate growing season in Montana, USA, (June 1 to October 31) based on annual phenology activity curves (2018-2022) of the existence of leaves or needles on flowering plants. The unsupervised classification was completed on the floodplain extent of the Swan, delineated manually in ArcGIS Pro using the 10-m 3DEP DEM, hillshade prepared from the DEM, Sentinel-2 imagery, and ArcGIS Pro Imagery basemap as visual references.
Although the classification is unsupervised, the classes were intended to represent distinct types of habitats within the river corridor that blend geomorphic features and vegetation communities as observed in the field, including, but not limited to: active channels, secondary channels, accretionary bars, backswamps, natural levees, old-growth forest, wetlands, and beaver meadows. The ISO Cluster Unsupervised Classification ArcGIS Pro tool was used to perform the classification. Inputs to the tool were a maximum of 10 classes, a minimum class size of 20 pixels (tool default), and a sample interval of 10 pixels (tool default). The entire reach was classified once, and then clipped into individual 2-km segments. The classified Swan raster was brought into R for statistical analysis of heterogeneity metrics. Data were visualized using the tidyverse and terra packages. All heterogeneity metrics were calculated using the landscapemetrics package using the Queen’s case.
Statistical analyses
Statistical analyses were conducted in R. The data we collected span different time intervals, and we conduct our statistical analyses to match the temporal and spatial scales of data we have for each of our hypotheses. We used an alpha (probability of rejecting the null hypothesis when the null hypothesis is true) of 0.05 in all statistical analyses.
To understand the influence of time on our river corridor variables, we examined whether there was significant variation in the medians of river corridor variables between timesteps using a Kruskal-Wallis Rank Sum test. For any variable where there was a significant change between timesteps, we used a Dunn Test to determine exactly which groups were different. We also conducted the same exploratory statistical analysis to understand whether there was any significant variation in medians for each variable between segments.
To understand the predictors of channel dynamism (hypothesis i), we created mixed-effect models with total sinuosity or average channel migration as response variables and logjam distribution density, count, persistence, and flow as fixed effects and segment as a random effect. We performed an AICc model selection, corrected for small sample size to provide a relative indication of the quality of statistical model for our given set of data. We included segment as a random effect in our mixed effect models to account for any potential spatial autocorrelation between segments.
To address hypotheses i and ii, we calculated both Pearson (r) and Kendall (τ) correlation coefficients. Given our small sample, we report both r and τ values. All correlation coefficients were calculated using the cor.test() function in base R. Hypothesis iii was further addressed through mixed effect models and AICc model selection to determine whether proportion of beaver meadows, sticky sites, total sinuosity, channel migration, wood distribution density, or wood count are clear predictors of patch density as a spatial heterogeneity metric (response variable). We only used data from 2022 to keep a consistent timestep across all variables. We included segment as a random effect in our mixed effect models to account for any potential spatial autocorrelation between segments.