Data Archival for Economic Cost Modeling of Chinook Habitat Restoration in the Stillaguamish River Basin
Data files
May 23, 2024 version files, 334.23 MB
- elevation.zip (53.99 MB)
- final_results.zip (1.21 MB)
- HARP.zip (3.34 MB)
- land_use.zip (243.33 MB)
- POC_WA.zip (8.95 MB)
- Poverty_WA.zip (8.95 MB)
- README.md (13.40 KB)
- roads.zip (4.71 MB)
- Tribal_Jurisdiction_Sno.zip (770.49 KB)
- Unemployment_WA.zip (8.95 MB)
Abstract
We used geospatial data to model economic cost estimates of habitat restoration in the Stillaguamish River Basin in the Puget Sound. To do so, we used data on the streams/rivers, floodplain habitat, subbasins, elevation, distance to roads, demographics, and land use within the Stillaguamish River Basin. The analysis used these attributes of the basin to create low and high cost estimates for floodplain, engineered log jam, and riparian planting habitat restoration. Our model inputs were the slope and size of streams, the area of habitat that needed to be restored, the slopes of the riparian area, the distance to the nearest road, and canopy angles. We followed cost estimate guidance provided by the Puget Sound Shared Strategy to identify our cost ranges and updated them to today's prices using the producer price index. An additional land use analysis was performed to quantify the total area and cost of potential agricultural land in the basin. Lastly, we investigated the demographics of the region to identify areas with POC and low-income populations in relation to proposed restoration actions.
This archive includes data and code for the project “Prioritizing Chinook Salmon Habitat Restoration for Southern Resident Killer Whale Recovery” for the Master of Environmental Science and Management: Master’s Group Project (2024). Refer to the supplemental information link for details.
Here is a brief summary of the main data sets we used and created in our study. This summary contextualizes how we used the data, but one should refer to the final report’s “Methods” section for a full account of how the data was used. Analyses and annotated data manipulation are provided in the full GitHub repository scripts. All data are publicly available, with locations sourced below, and are located in the parent directory labeled data. Some data sets are included in the repository but not used for the analyses.
Stream line (STL) - a folder containing a geopackage that has shapefiles for stream reaches within the Stillaguamish River Basin. This data was used in our analysis to generate cost estimate model inputs pertaining to the terrain of the streams in a given subbasin (such as its slope and width). These parameters were used throughout the 3 habitat restoration action scenarios as they all required stream reach attributes as input for the cost estimates.
Floodplain (STL) - a folder containing a geopackage that has shapefiles of floodplain habitat in the Stillaguamish River Basin. Polygons were used to quantify the total floodplain habitat in each subbasin and the amount that needed to be restored. If a hab_unit was labeled curr, it is current floodplain habitat. If labeled hist, it was historical floodplain habitat. If labeled both, it was historically floodplain habitat and still is. We used this to determine whether restoration was needed and how much. This data was not used in assessing engineered log jams or riparian planting actions.
NOAA subbasins (STL) - a folder containing a geopackage that has shapefiles of all of the subbasins within the Stillaguamish River Basin. Polygons were used in all aspects of analysis to crop floodplain habitat, streams, elevation, land use polygons, and other physical characteristics of the basin to individual subbasins. All cost estimates were created at a subbasin level using this data set. It was also used to visualize our final results in maps. Overall, this data was used to segment our analysis into individual subbasins that we then compared to one another.
Elevation - A folder containing DEM files of elevation raster data for the Stillaguamish River Basin. Raster data is shown at a 10m resolution across the basin. It was cropped to the basin and used in the riparian planting analyses to estimate the steepness of terrain and difficulty of accessing riparian vegetation. The raster data was cropped to each stream reach within each subbasin where there was proposed riparian planting restoration.
land_use - a folder containing a geopackage that has shapefiles of the land use of land parcels throughout Snohomish County. These parcels were clipped to the Stillaguamish River Basin and the polygons were intersected with those of each subbasin. We used this to estimate the total area and percentage of each land use throughout the basin and to identify areas of existing agricultural land that intersect with historical floodplain habitat and could potentially be used for restoration.
benefits - a folder containing a CSV file with the estimated increases in Chinook, Steelhead, and Coho salmon from each of the 3 different restoration actions by subbasin. It highlights the current population and the modeled change in population following restoration of the subbasin to historical conditions as modeled by the HARP model.
roads - a folder containing a geopackage that has shapefiles of all roads in Stillaguamish County including public roads, country roads, park roads and national forest roads. This data was used to measure distance of stream reaches to the nearest road in the engineered log jams and riparian planting analyses.
POC_WA - a folder containing a geopackage that has the total people and number of people of color by census tract.
Poverty_WA - a folder containing a geopackage that has the total people and number of people living under the poverty line by census tract.
Unemployment_WA - a folder containing a geopackage that has the total people and number of unemployed people by census tract.
Tribal_Jurisdiction_Sno - a folder containing a geopackage that has shapefiles of Indian Trust Lands, Pending Trust Lands, and Fee Simple Lands owned by Tribal Members or Tribal Associations.
final_results - a folder containing a geopackage that has the final costs and benefits resulting from each proposed action for each applicable subbasin. If a subbasin is not included, there was no change in Chinook population from the resulting action.
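The curr, hist, and both labels described for the Floodplain (STL) data lend themselves to a simple area tally. The project's own scripts are written in R; the following is an independent Python sketch, not the authors' code. It assumes that only features labeled hist count as area needing restoration, that each feature has already been assigned to a subbasin, and it uses made-up subbasin names and areas.

```python
from collections import defaultdict

# Hypothetical floodplain records: (subbasin name, period label, area in hectares).
# The labels follow this README (curr, hist, both); names and values are made up.
records = [
    ("Subbasin A", "curr", 2.0),
    ("Subbasin A", "hist", 5.5),
    ("Subbasin A", "both", 1.5),
    ("Subbasin B", "both", 4.0),
    ("Subbasin B", "hist", 0.5),
]

VALID_LABELS = {"curr", "hist", "both"}


def restorable_area(rows):
    """Sum historical-only floodplain area (ha) per subbasin.

    Assumption: only features labeled 'hist' (historical floodplain that is
    no longer floodplain) count as area needing restoration.
    """
    totals = defaultdict(float)
    for subbasin, period, area_ha in rows:
        if period not in VALID_LABELS:
            raise ValueError(f"unexpected period label: {period!r}")
        # Touch the key so subbasins with nothing to restore still appear.
        totals[subbasin] += area_ha if period == "hist" else 0.0
    return dict(totals)


areas = restorable_area(records)
for name, area in sorted(areas.items()):
    print(f"{name}: {area} ha of historical floodplain to restore")
```

A subbasin whose features are all labeled curr or both would appear with a total of 0.0, which matches the idea that no floodplain restoration is needed there.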
Key Variables by Data Set
Streamline (STL) - Key variables used were:
noaaid
Unique reach identifier
Habitat
Description of reach habitat type. Large and small non-tidal streams are reclassified by width within the HARP model
Area_km2
Area of catchment draining to reach (National Elevation Dataset, NED) (km2)
slope
Stream gradient (National Elevation Dataset, NED) (m/m)
BF_width
Bankfull width, modeled by NOAA (m)
length
Reach length (m)
fpw
Width of floodplain at reach (WDNR LiDAR and National Elevation Dataset, NED) (m)
can_ang
Current canopy opening angle, modeled by NOAA, NOAA riparian condition dataset (°)
hist_ang
Historical canopy opening angle, modeled by NOAA, NOAA riparian condition dataset (°)
geometry
line strings of the streams
Floodplain (STL) - Key variables used were:
HabUnit
Code indicating habitat type
Period
Code indicating time period in which feature exists or existed
Hab_cond
Code indicating whether feature appears to be natural in origin, manmade, or natural in origin with human modification
noaaid
Numeric code of nearby reach
Area_ha
Area of feature (ha)
geometry
Polygons of the floodplain habitat
Subbasins (STL) - Key variables used were:
noaa_subba
Name of subbasin
geometry
Polygons of the subbasins
Elevation - Key variables used were:
elevation
elevation of the raster pixel (m)
land_use - Key variables used were:
MASTER_CAT
category of land use
geometry
polygons of the land use parcels
benefits - Key variables used were:
pop
the population of salmon
subbasin
name of the subbasin
scenario
proposed intervention or restoration action
n
modeled population following restoration
n_curr
modeled current population
perc_change
percent change in population of salmon
roads - Key variables used were:
geometry
polygons of roads
POC_WA - Key variables used were:
percent_people_of_color
percent people of color
Poverty_WA - Key variables used were:
percent_living_in_poverty
percent of population living in poverty
Unemployment_WA - Key variables used were:
percent_unemployed
percent of population unemployed
all_cost_benefit - Key variables used were:
noa_sbb
subbasin within the Stillaguamish River Basin
ttl_lw_
total lower cost estimate
ttl_pp_
total upper cost estimate
ttl_vg_
total average cost estimate
pop
salmon species being considered
n
modeled population following restoration
n_curr
modeled current population
prc_chn
percent change in population of salmon
n_diff
difference in population before and after intervention
cb_rati
cost effectiveness ratio
rstrtn_
proposed restoration action
sbbsn_n
clean subbasin name
geometry
polygons of the subbasins
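To illustrate how the derived columns above relate to one another, here is a small Python sketch (the project's scripts are in R, so this is not the authors' code). It assumes that the average cost is the midpoint of the lower and upper estimates and that the cost effectiveness ratio is average cost divided by the modeled gain in fish. Both are assumptions made for illustration; the cb_rati column may be defined differently in the repository. All numbers are hypothetical.

```python
def cost_effectiveness(avg_cost, n, n_curr):
    """Return (n_diff, percent change, cost per additional fish).

    Assumption: the ratio is average cost divided by the modeled gain in
    fish. This is an illustrative definition, not necessarily the one used
    for the cb_rati column.
    """
    n_diff = n - n_curr
    # Percent change is undefined when the current population is zero.
    perc_change = 100.0 * n_diff / n_curr if n_curr else None
    # Cost per fish is only meaningful when the action adds fish.
    ratio = avg_cost / n_diff if n_diff > 0 else None
    return n_diff, perc_change, ratio


# Hypothetical lower and upper cost estimates for one subbasin and action.
low_cost, high_cost = 800_000.0, 1_200_000.0
avg_cost = (low_cost + high_cost) / 2  # assumed midpoint average

n_diff, perc_change, ratio = cost_effectiveness(avg_cost, n=1250, n_curr=1000)
print(n_diff, perc_change, ratio)
```

Returning None for the ratio when there is no population gain mirrors the note that subbasins without a change in Chinook population are left out of the final results.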
File Structure
All data is stored in subfolders underneath the Data folder in the repository. Relevant folders include:
elevation
which includes elevation data
HARP
which includes all of the Flowline_STL, Floodplain_STL, Subbasins_STL, and benefits data
roads
which includes all the public road data
land_use
which includes all land use data in Snohomish County
POC_WA
which includes all of the data pertaining to demographics
Poverty_WA
which contains data pertaining to poverty rates
Unemployment_WA
which contains unemployment data
Tribal_Jurisdiction_Sno
which contains tribal land data for Snohomish County
final_results
which contains the all_cost_benefit data and our final results
These data are primarily raw data folders pulled from the specified online sources. However, as annotated in the scripts, certain data frames were read in, modified, and re-written to minimize computation and time spent on the analyses. Other data folders may not have a specified purpose because they were used in unrelated analyses or only in exploratory work.
Sharing/Access information
Links to other publicly accessible locations of the data:
Data and data description for Streamline (STL), Floodplain (STL), and Subbasins (STL) data are publicly available at:
Data and data description for the Elevation data are publicly available at:
https://gis.ess.washington.edu/data/raster/tenmeter/byquad/info.html
Data and data description for the land_use data are publicly available at:
Data and data description for the benefits data can be requested from: Tim Beechie, Supervisory Research Fish Biologist, Northwest Fisheries Science Center, NOAA, tim.beechie@noaa.gov
Data and data description for the roads data are publicly available at:
https://geo.wa.gov/datasets/a12a43c5b10b498ca6612454616bc7fa/about
Data and data description for the POC_WA data are publicly available at:
Data and data description for the Poverty_WA data are publicly available at:
https://geo.wa.gov/datasets/WADOH::population-living-in-poverty-current-version/about
Data and data description for the Unemployment_WA data are publicly available at:
https://geo.wa.gov/datasets/67c699681b4f49c0adb1b5cada9e1919_0/explore
Data and data description for the Tribal_Jurisdiction_Sno data are publicly available at:
https://www.arcgis.com/home/item.html?id=3bcd4cdf822440e7a84e46b8f0b27fba&sublayer=0#overview
Data and data description for the final_results data were created in this analysis and stored in this repository
Code/Software
All analyses were performed in RStudio, version 2023.12.1+402. Annotated code, scripts, final products, and data are provided in the following GitHub repository:
https://github.com/ramhunte/gp_anadromites
Notes on the Analysis:
Analyses and code are annotated throughout the scripts, explaining the data wrangling, cleaning, and analysis process. The repository contains mainly RMD files with annotated code, along with some sourced R files for common functions and data sources used throughout the analyses. Some files were written in R and generated into specified subfolders. Raw data is stored in the Data folder, common functions and data read in throughout the analyses are written in the common.R file, and the scripts folder contains all of our working analyses divided into actions (floodplain (cost_floodplain.Rmd), engineered log jams (cost_elj.Rmd), and riparian planting (cost_rp.Rmd)) as well as demographic analysis (demographic_overlap.Rmd) and land use (landuse.Rmd). Our figures were constructed in the figures.Rmd file, and a common functionsR script was used to source common functions across the analyses in the scripts folder. benefit_data.Rmd wrangles and generates HARP model benefits (increased number of Chinook) that are used throughout the analyses, and cost_data.Rmd reads in and wrangles the costs associated with land use and agriculture. Note that the variables in the raw data files look different than in the individual analyses because names were modified in the common.R file.
All data came from publicly available datasets collected since 1999. Individual analyses for floodplain, engineered log jam, and riparian planting habitat restoration were completed in RStudio through a mix of spatial analysis and cost modeling. The data was all processed as csv (tabular), DEM (elevation raster), and shp (spatial) files. Data was manipulated from the raw data in the analysis, and each step is documented in the R code. All scripts used for the analyses were either R Markdown or R files. We primarily used base R, tidyverse, sf, and other spatial packages for our data manipulation and analysis. Figures and final products were created in the analysis and are included in the GitHub repository, along with the associated code, for reproducibility.
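The abstract notes that cost ranges were updated to today's prices using the producer price index. As a rough illustration of that kind of adjustment, the Python sketch below scales a low and high cost estimate by the ratio of two index values. It is not taken from the project's R scripts; the function name, the cost range, and the index values are hypothetical placeholders rather than real PPI figures.

```python
def adjust_to_current_prices(cost, index_then, index_now):
    """Scale a historical cost by the ratio of two price index values.

    Assumption: a simple ratio adjustment, cost * (index_now / index_then).
    The index values passed in below are placeholders, not real PPI data.
    """
    if index_then <= 0:
        raise ValueError("index_then must be positive")
    return cost * index_now / index_then


# Hypothetical low and high cost estimates in the guidance's original dollars.
low_then, high_then = 10_000.0, 25_000.0

# Placeholder index values for the guidance year and the current year.
ppi_then, ppi_now = 100.0, 150.0

low_now = adjust_to_current_prices(low_then, ppi_then, ppi_now)
high_now = adjust_to_current_prices(high_then, ppi_then, ppi_now)
print(f"Adjusted cost range: {low_now:.2f} to {high_now:.2f}")
```

Because the same ratio is applied to both ends of the range, the relative spread between the low and high estimates is unchanged by the adjustment.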