Mean NDVI values from July 2017 to December 2023 for selected beaver complex sites and control sites along the Salinas River
Data files
Apr 28, 2025 version files 16.30 KB
- data_nullExclude.csv (10.43 KB)
- ModifyHeadings.py (1.79 KB)
- NDVI_noDATA.py (1.11 KB)
- README.md (2.98 KB)
Abstract
This study used remotely sensed mean normalized difference vegetation index (NDVI), a measure of vegetative greenness, as a primary indicator of resilience. Mean NDVI for riparian vegetation was collected in 3 beaver complexes and 5 non-beaver developed sections along the Salinas River in California from July 2017 to December 2023.
https://doi.org/10.5061/dryad.v41ns1s4t
This dataset contains remotely sensed mean normalized difference vegetation index (NDVI) data. NDVI is a measure of vegetative greenness on a scale of -1 to 1. Mean NDVI for riparian vegetation was collected in 3 beaver complexes and 5 non-beaver developed sections along the Salinas River in California from July 2017 to December 2023. Mean NDVI for each site at each available date was derived from Landsat C2 U.S. Analysis Ready Data (ARD) obtained from the USGS EarthExplorer website.
Description of the data and file structure
Each row in the dataset represents a study site. Provided below is a list of the columns and their descriptions.
OBJECTID: A unique numeric value automatically assigned to the site polygons that were created in ArcGIS Pro.
Type: Specifies whether a site is beaver influenced (complex) or has no beaver influence (control).
UniqueID: A user-assigned code for each site. Beaver complexes share the prefix “BW” followed by a unique number, and control sites share the prefix “C” followed by a unique number.
AREA: Calculated area in square meters of the site polygon.
Shape_Length: Default column in ArcGIS Pro containing polygon perimeter measurement in meters.
Shape_Area: Default column created in ArcGIS Pro containing polygon area in square meters.
CID: When I used the geoprocessing tool in ArcGIS Pro to create 100 random sample points within each polygon, the CID column was created to link the points to the polygon that contains them.
FREQUENCY: The number of sample points within each polygon.
mean_{date}: The remaining columns follow this format, with "mean_" as the prefix followed by a numeric suffix representing a date. The {date} is an 8-digit integer in which the first four digits are the year, the next two digits are the month, and the last two digits are the day of the month. Each of these columns holds the mean NDVI value for each site on the date given in the suffix.
For clarification, OBJECTID and CID hold the same values. OBJECTID was created first, when I made the polygon features. CID was linked to this OBJECTID when the random sample points were created. I could then join information from one table to another based on this shared field, despite the different names. The different names have no significance other than indicating which table the field comes from. Values are nonconsecutive due to human error: at times, polygons were deleted so I could create more accurate outlines. Also, in two cases, I merged polygons representing beaver complexes due to proximity, where it would have been difficult to discern the influences of each complex. Thus, the numbers are not sequential.
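As an illustration of the mean_{date} naming convention described above, the following is a minimal sketch of how the date suffix could be parsed and the one-row-per-site table reshaped into one record per site and date. It uses only the Python standard library. The function names and the two-row sample are hypothetical and are not taken from the dataset or from the scripts in this repository.

```python
import csv
import io
from datetime import date, datetime

def parse_mean_column(name):
    """Return the date encoded in a 'mean_YYYYMMDD' heading, or None otherwise."""
    prefix = "mean_"
    if not name.lower().startswith(prefix):
        return None
    suffix = name[len(prefix):]
    if len(suffix) != 8 or not suffix.isdigit():
        return None
    try:
        return datetime.strptime(suffix, "%Y%m%d").date()
    except ValueError:  # eight digits that do not form a real calendar date
        return None

def wide_to_long(rows):
    """Turn one-row-per-site records into (UniqueID, Type, date, NDVI) tuples."""
    records = []
    for row in rows:
        for column, value in row.items():
            day = parse_mean_column(column)
            if day is None or value in ("", None):
                continue  # skip non-date columns and empty cells
            records.append((row["UniqueID"], row["Type"], day, float(value)))
    return records

# Hypothetical two-site sample, used only to show the layout.
sample = io.StringIO(
    "OBJECTID,Type,UniqueID,mean_20170704,mean_20170720\n"
    "3,complex,BW1,0.41,0.39\n"
    "7,control,C2,0.22,\n"
)
long_records = wide_to_long(csv.DictReader(sample))
```

To read the real file, the sample could be replaced with an open handle to data_nullExclude.csv, assuming its headings match the column list above.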
Landsat imagery was collected using the United States Geological Survey’s (USGS) web tool EarthExplorer (“EarthExplorer,”). EarthExplorer is a freely accessible tool that requires only a USGS account, which is free to create. In the search criteria tab, the date range was set to the study’s time frame, 2017-06-30 to 2023-12-31. Cloud cover was also limited so that only images with 10% or less cloud cover would be returned. In the data sets tab, Landsat C2 U.S. Analysis Ready Data (ARD) was selected. Using the spacecraft identifier dropdown menu under the additional criteria tab, I selected Landsat 8 and 9, the most recently launched Landsat satellites, which produce imagery with a resolution of 30 meters. The satellites have global coverage and, operating together, revisit the same spot every 8 days. To get Landsat imagery that encompassed the entirety of the study area, the tile grid horizontal and vertical were set to 2 and 10, respectively, under the additional criteria tab. The fill (no data) dropdown menu under the additional criteria tab was also set to return only results with less than 10% no data. Sometimes data is missing from captured images, so this option ensures that my search only produces images in which less than 10% of the tile is no data. The results tab then listed the images that met these parameters.
Band 4 (red) and Band 5 (near infrared) .tif files were downloaded for each image, as these are the bands needed to compute the normalized difference vegetation index. To calculate NDVI from the downloaded Landsat imagery, a Python notebook was created within ArcGIS Pro. Files from the same date share the same prefix; thus, a script was needed to loop through the files and compute NDVI from bands 4 and 5 for each date of collection. The output of this script was an NDVI .tif file for each date of collection. NDVI files were also re-projected to a consistent coordinate reference system.
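The band pairing and NDVI computation described above can be sketched as follows. This is not the ArcGIS Pro notebook used for the dataset; it is a standard-library illustration of the general approach, using the standard formula NDVI = (NIR - Red) / (NIR + Red) with Band 5 as near infrared and Band 4 as red. The file names and the assumption that band files end in "_B4.tif" or "_B5.tif" are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: each band file is named "<shared prefix>_B4.tif" or "<shared prefix>_B5.tif".
BAND_PATTERN = re.compile(r"^(?P<prefix>.+)_B(?P<band>[45])\.tif$", re.IGNORECASE)

def pair_bands(filenames):
    """Group Band 4 and Band 5 files by shared prefix; keep only complete pairs."""
    groups = defaultdict(dict)
    for name in filenames:
        match = BAND_PATTERN.match(name)
        if match:
            groups[match.group("prefix")][int(match.group("band"))] = name
    return {
        prefix: (bands[4], bands[5])
        for prefix, bands in sorted(groups.items())
        if 4 in bands and 5 in bands
    }

def ndvi(red, nir):
    """Per-pixel NDVI; returns None where the denominator is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total

# Hypothetical file names: one complete pair and one scene missing Band 5.
files = [
    "scene_20170704_B4.tif",
    "scene_20170704_B5.tif",
    "scene_20170720_B4.tif",
]
pairs = pair_bands(files)        # only the 20170704 scene forms a pair
example_value = ndvi(red=0.05, nir=0.45)  # dense vegetation gives a high NDVI
```

In a real workflow the `ndvi` function would be applied to whole raster arrays rather than single values; the per-pixel form is used here only to keep the sketch free of external dependencies.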
Another Python notebook was used to alter the NDVI raster set so that any negative values were treated as no data (null). Using “Create Random Points” in the geoprocessing tools for ArcGIS Pro, a feature class of 100 points was randomly placed within each site polygon. Next, the tool “Extract Multi Values to Points” was used to extract the daily NDVI for each point. Finally, using the “Summary Statistics” tool in the geoprocessing toolkit for ArcGIS Pro, the average daily NDVI was computed for each site from the point sample values. This tool skipped null values in the computation of the mean, so for each site and date, an average of 98.9 of the 100 points were used to derive the mean NDVI. The exclusion of null values ensured that open water, which can have negative values, was not used in the computation of the mean NDVI.
Each image has a distinct date of collection; thus, the date was used to distinguish the data headings. This renaming was run using another Python notebook. The “Join Field” tool, run as a batch, was used to join the mean NDVI data to the “Sites” feature class attribute table. The input table was the “Sites” feature class, and the join field was the “ObjectID” column. The summary statistics table served as the join table, and the join table field was “CID.” Other parameters were kept at their default settings, and the tool ran successfully. The “Delete Field” tool was then used to clean up the “Sites” attribute table and remove irrelevant columns. The fields that were kept were ObjectID, Type, Location, UniqueID, AREA, and the statistic columns that follow the format MEAN_{date}.
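The masking, null-skipping mean, and join steps described above can be illustrated with a short standard-library sketch. It is not the ArcGIS Pro workflow itself, only an assumed equivalent of its logic: negative values become null, the mean is taken over the remaining points, and the result is attached to the site whose OBJECTID equals the summary row's CID. All values and names in the example are hypothetical.

```python
from statistics import mean

def mask_negative(values):
    """Replace negative NDVI values (e.g. open water) with None, i.e. null."""
    return [v if v is not None and v >= 0 else None for v in values]

def mean_skip_null(values):
    """Return (mean, count) over non-null values; (None, 0) if all are null."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None, 0
    return mean(valid), len(valid)

# Hypothetical NDVI samples at four points of one site on one date.
point_values = [0.42, 0.38, -0.12, 0.40]
site_mean, n_points = mean_skip_null(mask_negative(point_values))

# Join: summary rows keyed by CID are attached to site rows keyed by OBJECTID,
# using a "mean_{date}" heading for the new column.
sites = {3: {"UniqueID": "BW1", "Type": "complex"}}
summary = [{"CID": 3, "date": "20170704", "mean": site_mean}]
for row in summary:
    sites[row["CID"]][f"mean_{row['date']}"] = row["mean"]
```

Here the negative sample is excluded, so the mean is computed from three of the four points, which mirrors why fewer than 100 points contribute to some site means in the dataset.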
