A pathogen’s spatial range is not constrained by geographical features in the flax rust pathosystem
Data files
Sep 27, 2023 version files 32.05 MB
- all_populations.RDS
- all_transects.RDS
- BC_cleaned.gpx
- BG_cleaned.gpx
- CB_cleaned.gpx
- CL_cleaned.gpx
- CM_cleaned.gpx
- CT_cleaned.gpx
- DC_cleaned.gpx
- ER_cleaned.gpx
- GL_cleaned.gpx
- HM_cleaned.gpx
- landscape_transects.csv
- LG_cleaned.gpx
- MB_cleaned.gpx
- ME_cleaned.gpx
- NP_cleaned.gpx
- OBJ_cleaned.gpx
- README.md
- RG_cleaned.gpx
- RL_cleaned.gpx
- SH_cleaned.gpx
- TC_cleaned.gpx
- TR_cleaned.gpx
- transect_meta_data.csv
- UL_cleaned.gpx
- VB_cleaned.gpx
- WM_cleaned.gpx
- WS_cleaned.gpx
Abstract
In this study, we performed several transect surveys over the course of the 2021 summer field season to assess potential ecogeographical range determinants for Lewis flax (Linum lewisii) and its pathogen, flax rust (Melampsora lini), in the area surrounding the Rocky Mountain Biological Laboratory in Gothic, Colorado. Additionally, we used generalized additive models to examine the effects of host population density and metapopulation structure on disease presence and prevalence.
README: A pathogen’s spatial range is not constrained by geographical features in the flax rust pathosystem
Access this dataset on Dryad: https://doi.org/10.5061/dryad.fbg79cp23
Contact Information:
Corresponding Author: Keenan Duggal (keenanduggal@gmail.com)
Principal Investigator: Jessica Metcalf (cmetcalf@princeton.edu)
Overview:
The purpose of this study was to characterize ecogeographical features of the environment that influence the spatial distribution of plants and plant pathogens. The sub-alpine plant Lewis flax and its fungal pathogen, flax rust, were chosen as the model system. Transect surveys were performed over the course of the 2021 summer field season in the area surrounding the Rocky Mountain Biological Laboratory (Colorado). This repository contains 1) raw data describing the precise route of each transect survey and the locations of associated flax populations, 2) summarized data that collate ecogeographical information for each transect, and 3) information about how to access the R scripts used to generate the summarized data and perform each statistical analysis.
Description of the data and file structure
Data Key:
RG | GL | WM | TR | NP | OBJ |
---|---|---|---|---|---|
Rustler Gulch<br>Trail | Green Lake<br>Trail | West Maroon <br>Pass Trail | Treasury <br>Mountain | North Pole <br>Basin | Oh Be Joyful <br>Trail |
BG | CM | BC | CL | LG | HM |
---|---|---|---|---|---|
Baxter <br>Gulch Trail | Cinnamon <br>Mountain | Brush Creek + <br>Twin Lakes | Copper <br>Lake Trail | Lupine <br>+ Gunsight | High Meadow <br>(Trail #403 Ext.) |
CT | DC | UL | RL | ER | CB |
---|---|---|---|---|---|
Caves Trail | Deer Creek <br>Trail | Upper Loop <br>Trail | Red Lady <br>Trail | East River<br>Trail | Mt. Crested Butte |
SH | VB | WS | ME | MB | TC |
---|---|---|---|---|---|
Strand Hill <br>Trail | Virginia Basin | Warm Springs <br>Trail #406 | Mt. Emmons | Mt. Belview | Mt. Teocalli |
Raw Data:
- "landscape_transects.csv"
- This file contains the location and date of observation for each flax population, along with host density, disease prevalence and disease presence information.
- Columns:
- transect: Abbreviated transect code (refer above to data key)
- date: Date of observation
- start.lat: Latitude of transect start (format = decimal degrees)
- start.long: Longitude of transect start (format = decimal degrees)
- end.lat: Latitude of transect end (format = decimal degrees)
- end.long: Longitude of transect end (format = decimal degrees)
- num.H: Total number of healthy flax plants within the 25 m² survey area of population
- num.D: Total number of diseased flax plants within the 25 m² survey area of population
- presence: Binary presence/absence of disease within entire flax population
- "transect_meta_data.csv"
- This file contains the summarized metadata for each transect.
- Columns:
- transect: Name of hiking / mountain biking trail used for transect line
- tag: Abbreviated transect code (refer above to data key)
- start.lat: Latitude of transect start (format = decimal degrees)
- start.long: Longitude of transect start (format = decimal degrees)
- end.lat: Latitude of transect end (format = decimal degrees or physical description of landmark)
- end.long: Longitude of transect end (format = decimal degrees or physical description of landmark)
- date: Date of transect observations
- gpx.file: Name of associated transect.gpx file
- strava.link: Link to raw GPX tracks recorded on Strava
- Missing Data Code: "NA"
- Applicable for end.lat and end.long which were extracted from the appropriate .gpx file in the cases where a numerical value was not accessible.
- Files ending in "_cleaned.gpx"
- These files contain the precise path information for each transect survey. They are derived from the raw data found in the strava.links (see: transect_meta_data.csv). The algorithm used to process these files is described below:
- File Cleaning Algorithm:
- Open .gpx file in GPX editor
- Zoom in all the way with 'Apple Standard' view
- Trace path from beginning
- Whenever an irregular pattern (apparent deviation from a normal path) occurs: a) switch view to 'Apple Satellite' (this will zoom the map out a bit); b) check whether the deviation is real based on satellite imagery; c) if it is real, delete points
- Delete all waypoints backtracking from end of transect to start
- Save file as x_cleaned.gpx
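As a rough illustration of how the cleaned transect paths could be read outside of R, the sketch below parses track points from GPX text and sums great-circle distances between consecutive points. It assumes the files store the path as `<trkpt>` elements with `lat` and `lon` attributes, as in the GPX 1.1 format; the sample coordinates and the helper names (`track_points`, `track_length_m`) are hypothetical and are not taken from the dataset or the authors' scripts.

```python
import math
import xml.etree.ElementTree as ET

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (an approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_points(gpx_text):
    """Return (lat, lon) tuples for every <trkpt>, ignoring the XML namespace."""
    root = ET.fromstring(gpx_text)
    return [
        (float(el.attrib["lat"]), float(el.attrib["lon"]))
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "trkpt"
    ]


def track_length_m(points):
    """Sum the distances between consecutive track points."""
    return sum(haversine_m(*a, *b) for a, b in zip(points, points[1:]))


# Hypothetical three-point track; real files would be read with open(path).read().
SAMPLE_GPX = """<gpx version="1.1" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="38.9580" lon="-106.9890"/>
    <trkpt lat="38.9590" lon="-106.9890"/>
    <trkpt lat="38.9600" lon="-106.9890"/>
  </trkseg></trk>
</gpx>"""

points = track_points(SAMPLE_GPX)
print(f"{len(points)} points, {track_length_m(points):.1f} m")
```

Matching on the tag suffix rather than a fixed namespace keeps the sketch tolerant of files exported with a different GPX version, at the cost of being less strict about the schema.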
Summarized Data:
"all_transects.RDS"
- Description: Aggregated eco-geographical information for the surveyed area in each transect
Type: Data.frame containing 502,597 observations of 9 variables:
Columns (data class) | Description |
---|---|
transect (character) | Transect code |
chunk (numeric) | Location used for raster extraction |
flax.presence (numeric) | Binary plant presence/absence |
incidence (numeric) | Binary disease presence/absence |
elevation (numeric) | Altitude (meters) |
slope (numeric) | Slope grade (degrees) |
slope_southness (numeric) | Degree of 'southerness' (range: -1 to 1) |
slope_westness (numeric) | Degree of 'westerness' (range: -1 to 1) |
landcover (numeric) | Modal landcover category<br>(Number Code Key: <br>https://www.rmbl.org/scientists/resources/data-catalog/data-catalog-entry/?catalog-id=95) |
"all_populations.RDS"
- Description: Aggregated eco-geographical, plant density and epidemiological information for the subset of the surveyed area in each transect containing flax
Type: Data.frame containing 306 observations of 19 variables:
Columns (data class) | Description |
---|---|
transect (character) | Transect code |
chunk (numeric) | Location used for raster extraction |
density (numeric) | Total number of plants |
num.H (numeric) | Number of healthy plants |
num.D (numeric) | Number of diseased plants |
incidence (numeric) | Binary disease presence/absence |
nearest.pop.dist (integer) | Distance to nearest population (meters) |
nearest.D.pop.dist (integer) | Distance to nearest diseased population (meters) |
elevation (numeric) | Altitude (meters) |
slope (numeric) | Slope grade (degrees) |
slope.southness (numeric) | Degree of 'southerness' |
slope.westness (numeric) | Degree of 'westerness' |
mode.landcover (numeric) | Modal landcover category <br>(Number Code Key: <br>https://www.rmbl.org/scientists/resources/data-catalog/data-catalog-entry/?catalog-id=95) |
p.landcover.1 (numeric) | Percentage of chunk containing landcover category 1 |
p.landcover.2 (numeric) | Percentage of chunk containing landcover category 2 |
p.landcover.3 (numeric) | Percentage of chunk containing landcover category 3 |
p.landcover.4 (numeric) | Percentage of chunk containing landcover category 4 |
p.landcover.5 (numeric) | Percentage of chunk containing landcover category 5 |
p.landcover.6 (numeric) | Percentage of chunk containing landcover category 6 |
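To give a feel for how the southness and westness columns fall in the range -1 to 1, the sketch below uses one common convention: with aspect measured in degrees clockwise from north, southness is the negative cosine of aspect and westness is the negative sine. This convention is an assumption made for illustration; the README does not state how the RMBL raster products define these indices, and the function names are hypothetical.

```python
import math


def southness(aspect_deg):
    """Assumed convention: +1 for a due-south-facing slope, -1 for due north."""
    return -math.cos(math.radians(aspect_deg))


def westness(aspect_deg):
    """Assumed convention: +1 for a due-west-facing slope, -1 for due east."""
    return -math.sin(math.radians(aspect_deg))


# Aspect is taken as degrees clockwise from north (0 = N, 90 = E, 180 = S, 270 = W).
for label, aspect in [("north", 0), ("east", 90), ("south", 180), ("west", 270)]:
    print(f"{label:>5}: southness={southness(aspect):+.2f}, westness={westness(aspect):+.2f}")
```

Splitting aspect into two bounded components avoids the wrap-around at 0/360 degrees that makes raw aspect awkward to use as a model covariate.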
Sharing/Access information
The raster datasets used to collate eco-geographical data for each transect can be found on the RMBL Spatial Data Platform:
https://www.rmbl.org/scientists/resources/data-catalog/. Specifically, the following raster datasets were used:
- Topography: https://rmbl-sdp.s3.us-east-2.amazonaws.com/data_products/released/release3/UG_dem_1m_v1.tif
- Slope Grade: https://rmbl-sdp.s3.us-east-2.amazonaws.com/data_products/released/release3/UG_dem_slope_1m_v1.tif
- Slope Southness: https://rmbl-sdp.s3.us-east-2.amazonaws.com/data_products/released/release3/UG_dem_aspect_southness_1m_v1.tif
- Slope Westness: https://rmbl-sdp.s3.us-east-2.amazonaws.com/data_products/released/release3/UG_dem_aspect_westness_1m_v1.tif
- Landcover: https://rmbl-sdp.s3.us-east-2.amazonaws.com/data_products/released/release3/UG_landcover_1m_v4.tif
Additionally, a temperature raster dataset from the PRISM Climate Group was used; it can be accessed at https://prism.oregonstate.edu/normals/.
- Time Frame = 1991 - 2020
- Spatial Resolution = 800m
- Climate Variable = Mean Temperature
- Temporal Period = Annual Values
Code/Software
Statistical analyses and data wrangling were performed in R. Scripts are housed in the following GitHub repository: https://github.com/ianfmiller/flax.rust/tree/main/landscape.transmission.dynamics
- "load and plot data.R" contains the R script used to turn the individual GPX files and landscape transect survey data into the summarized "all_populations.RDS" and "all_transects.RDS" files
- "manuscript2.0.Rmd" contains all statistical analyses and figure-creation code
Each script includes detailed line-by-line comments to facilitate understanding.
Methods
The methodology used for collecting this data was purely observational. Geographic coordinates bounding each flax population were recorded alongside measurements of disease presence and prevalence. Later, eco-geographical data associated with each population was extracted from online raster datasets (https://www.rmbl.org/scientists/resources/spatial-data-platform/) and collated in R.
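As a minimal sketch of how prevalence might be derived from the healthy and diseased plant counts, the example below reads rows shaped like "landscape_transects.csv" and computes the proportion of diseased plants per population. It assumes prevalence means num.D / (num.H + num.D), which the README does not define explicitly; the sample rows, the date format, and the `prevalence` helper are hypothetical and only the column names come from the file description.

```python
import csv
import io

# Hypothetical rows using a subset of the documented column names.
SAMPLE_CSV = """transect,date,num.H,num.D,presence
RG,2021-07-01,18,2,1
GL,2021-07-03,25,0,0
"""


def prevalence(num_healthy, num_diseased):
    """Proportion of diseased plants; None when no plants were counted."""
    total = num_healthy + num_diseased
    return num_diseased / total if total else None


# For the real file, replace io.StringIO(...) with open("landscape_transects.csv", newline="").
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    row["prevalence"] = prevalence(int(row["num.H"]), int(row["num.D"]))
    print(row["transect"], row["prevalence"])
```

Returning None for an empty survey area keeps populations with no counted plants distinguishable from populations with zero prevalence.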