TuMV infection in Hesperis matronalis across Kingston, Ontario
Data files
Oct 09, 2025 version files 271.40 KB
- Fitness_data.csv (40.38 KB)
- Infection_between_sites.csv (90.95 KB)
- README.md (5.79 KB)
- Within_stand_density.csv (134.28 KB)
Abstract
Understanding the factors that influence pathogen prevalence is essential to mitigating the negative consequences of disease. Infection prevalence should be influenced by the abundance and distribution of the pathogen’s hosts, yet tests of this general expectation from natural populations are few. Furthermore, human activity is profoundly altering species distributions, which may have consequences for the pathogens that are associated with them. We investigated whether urbanization influences infection prevalence by, in part, affecting host population size and density in a study that surveyed the occurrence of turnip mosaic virus (TuMV) infection across 132 populations of the invasive mustard plant Hesperis matronalis in Ontario, Canada, along an urban-rural gradient. We scored TuMV infection by the appearance of flower color breaking for a total of ∼38,000 plants across three generations. Overall, 39% of populations included at least one infected individual, and 10% of individuals were infected within these populations. As predicted, the probability of population infection increased with human activity, even after controlling for the positive effect of population size. Larger populations in areas of high human activity were also more likely to remain infected across generations. The effect of human activity on the infection frequency within populations was less consistent. Within populations, the probability of individuals being infected increased with local density of conspecifics, yet mean density did not influence population infection or infection frequency within infected populations. Our results highlight how urbanization can influence the prevalence of infection due to an economically important plant virus.
Dataset DOI: 10.5061/dryad.5x69p8dj1
Description of the data and file structure
We surveyed naturalized populations of the plant Hesperis matronalis over three years along an urban-rural gradient throughout Kingston, ON, Canada. We used night sky brightness (NSB) as a proxy for human activity. At each population, we estimated the prevalence of TuMV infection by scoring the colour morph of each individual as either normal (uniform white, pink, or violet) or colour-broken. We estimated the reproductive census size as the total number of flowering individuals (N). During 2022 and 2023, we also counted the number of flowering stems within a 1-m radius (D) of every third plant sampled for infection status. In 22 populations surveyed in 2021, we also randomly tagged plants, scored them as infected or not, and then harvested the aboveground portion of each plant to estimate correlates of fitness.
Files and variables
File: Infection_between_sites.csv
Description: Survey data of TuMV infection prevalence in populations of Hesperis matronalis surveyed from 2021-2023.
Missing values are denoted as NA
Variables
- order: row number
- sitecode: Code for the location where the sample was taken. We know the GPS coordinates of these sites
- gen_location: Road name for each population that was surveyed.
- date_sampled: Date data was collected
- year: Year data was collected
- three_yrs: Binary variable describing whether a site was surveyed in all 3 sampling years (1 = yes, 0 = no)
- fate: Describes the number of years a site was surveyed (s3y = sampled 3 years, ns = not sampled, inacc = not accessible)
- lon: Longitude
- lat: Latitude
- N: Estimated population size
- W: Number of plants with white colour morph
- LP: Number of plants with light pink colour morph
- DP: Number of plants with dark pink morph
- V: Number of plants with violet colour morph
- S: Number of plants with colour-broken (infected) morph
- morph_freqs: Binary variable describing if morph frequencies were counted (yes or no)
- other_mustards: Binary variable describing if other mustard species were detected at the population
- site_distinctness: Binary variable describing if a site was a distinct cluster of plants (1 = yes, 0 = no)
- notes: Notes column
- SQM: Night sky brightness measured with a Sky Quality Meter (SQM)
- dist_cc_km: Distance of population to Kingston city centre (city hall) in km
- n: Number of plants surveyed
- fW: Frequency of white morph
- fLP: Frequency of light pink morph
- fDP: Frequency of dark pink morph
- fV: Frequency of violet morph
- nwS: Number of plants that are not infected
- fS: Frequency of infected plants
- Sgt0: Binary variable stating if population had infected plants detected or not (0 = no, 1 = yes)
- log10_N: Log10-transformed population size
- negSQM: Negative of the Sky Quality Meter (SQM) value
- med_den: Median density of a population
- mlog10_den: Mean of log10-transformed density of conspecific plants within a 1 m radius
- mden: Mean density of conspecific plants within a 1 m radius
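Several columns in this file appear to be derived from the raw counts. The sketch below shows one way they could be recomputed as a consistency check. The formulas are assumptions inferred from the variable descriptions (fS = S / n, nwS = n - S, Sgt0 = 1 when S > 0, negSQM = -SQM), not the authors' confirmed code, and the sample rows are hypothetical.

```python
import csv
import io
import math

# Hypothetical rows using the documented column names; not real data.
SAMPLE = """sitecode,N,n,S,SQM
A01,250,100,12,19.5
B02,40,40,0,21.2
C03,NA,NA,NA,20.1
"""


def parse(value):
    """Return a float, or None for the documented missing-value marker NA."""
    return None if value in ("NA", "") else float(value)


def derive(row):
    """Recompute derived columns under the assumed definitions."""
    N, n, S, sqm = (parse(row[k]) for k in ("N", "n", "S", "SQM"))
    has_counts = n is not None and S is not None
    return {
        "sitecode": row["sitecode"],
        # Assumed: uninfected plants = plants surveyed minus infected plants.
        "nwS": n - S if has_counts else None,
        # Assumed: infection frequency = infected / surveyed.
        "fS": S / n if has_counts and n > 0 else None,
        "Sgt0": int(S > 0) if S is not None else None,
        "log10_N": math.log10(N) if N is not None and N > 0 else None,
        "negSQM": -sqm if sqm is not None else None,
    }


derived = [derive(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for d in derived:
    print(d)
```

To run the same check on the real file, replace `io.StringIO(SAMPLE)` with an open handle to `Infection_between_sites.csv` and compare each recomputed value with the stored column.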
File: Within_stand_density.csv
Description: Density data of individual Hesperis matronalis around a given focal plant measured across populations.
Variables
- year: Year data was collected
- sitecode: Code for the location where the sample was taken. We know the GPS coordinates of these sites
- morph: Describes colour morph of plant that was measured (W = white, LP = light pink, DP = dark pink, V = violet, S = infected/colour-broken)
- density_pmr: Number of conspecific plants within a 1 m radius of a focal plant
- dens_pm2: Number of conspecific plants within a 1 m radius of a focal plant
- broken: Binary variable describing whether the focal plant was colour-broken (i.e. infected), where 1 = yes, 0 = no.
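The per-population density summaries in Infection_between_sites.csv (mden, med_den, mlog10_den) can presumably be rebuilt from these focal-plant counts. The sketch below uses hypothetical records and assumes all counts are positive, because log10 is undefined at zero. It also shows a conversion to plants per square metre by dividing by the area of a 1 m radius circle (π m²). Whether dens_pm2 was computed this way is an assumption, since both density columns carry the same description.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical (sitecode, density_pmr) records; not real data.
records = [("A01", 4), ("A01", 10), ("A01", 25), ("B02", 2), ("B02", 8)]

# Area of a circular plot with a 1 m radius, in square metres.
CIRCLE_AREA_M2 = math.pi * 1.0 ** 2

# Group the focal-plant counts by site.
by_site = defaultdict(list)
for site, count in records:
    by_site[site].append(count)

summary = {}
for site, counts in by_site.items():
    summary[site] = {
        "mden": statistics.mean(counts),
        "med_den": statistics.median(counts),
        # Assumes every count is > 0; zero counts would need separate handling.
        "mlog10_den": statistics.mean([math.log10(c) for c in counts]),
        # Assumed conversion from count per 1 m radius plot to plants per m^2.
        "mean_per_m2": statistics.mean(counts) / CIRCLE_AREA_M2,
    }

for site, stats in summary.items():
    print(site, stats)
```

The mean of log10-transformed values is not the log10 of the mean, which is why mden and mlog10_den are stored as separate columns.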
File: Fitness_data.csv
Description: Fitness correlates of Hesperis matronalis plants measured in 2021
Variables
- order: Row number
- year: Year data was collected
- sitecode: Code for the location where the sample was taken. We know the GPS coordinates of these sites
- plant: The ID that was on the tag of the plant. Unique within a location, but not across locations
- plant_id: Sitecode and plant combined for a unique ID
- colour_morph: The colour of the plant's flowers. V is violet/purple, LP is light pink, DP is dark pink, W is white, and S is sectored/multicoloured/variegated
- total_stem_mass_g: Mass in g of everything above the first inflorescence stem (main stem, fruit, seeds, inflorescence stems)
- fruit_number: Total number of developed fruit on the plant
- flower_number: Fruit number + aborted fruit number
- ppn_parasitized: Proportion of fruit for an individual that had dark marks or frass
- mseptumindents: Number of potential seeds (the number of indentations in the outer layer of fruit that were made by developing seeds)
- fruit_set: Number of fruit that successfully set seed
- total_seeds: The number of seeds
- ppn_seedsdestroyed: Proportion of seeds that were destroyed
- log10_total_stem_mass_g: Log10-transformed stem mass
- log10_fruit_number: Log10-transformed fruit number
- log10_total_seeds: Log10-transformed total seed number
- infected: Binary variable describing whether a plant was infected (yes or no)
- N: Population size
- fW: Frequency of white colour morph at that population
- fP: Frequency of pink colour morph at that population
- fV: Frequency of violet colour morph in that population
- NSB: Night sky brightness (negative of the Sky Quality Meter value)
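Because plant tags repeat across locations, any join or grouping on individual plants should use plant_id rather than plant. The sketch below illustrates this with hypothetical rows. The underscore separator and the rule that infected is "yes" exactly when colour_morph is S are assumptions made for illustration; the actual plant_id format in the file may differ.

```python
# Hypothetical rows using the documented column names; not real data.
rows = [
    {"sitecode": "A01", "plant": "1", "colour_morph": "V"},
    {"sitecode": "A01", "plant": "2", "colour_morph": "S"},
    {"sitecode": "B02", "plant": "1", "colour_morph": "LP"},
]


def add_keys(row, sep="_"):
    """Add a site-qualified plant ID and an infection flag to one row."""
    out = dict(row)
    # Assumed format: sitecode and tag joined by a separator.
    out["plant_id"] = f"{row['sitecode']}{sep}{row['plant']}"
    # Assumed rule: the sectored (S) morph marks an infected plant.
    out["infected"] = "yes" if row["colour_morph"] == "S" else "no"
    return out


tagged = [add_keys(r) for r in rows]
tags = [r["plant"] for r in tagged]
plant_ids = [r["plant_id"] for r in tagged]

# Tags collide across sites, while the combined IDs stay unique.
print("unique tags:", len(set(tags)), "of", len(tags))
print("unique plant_ids:", len(set(plant_ids)), "of", len(plant_ids))
```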
Code/software
All analyses were performed using the R statistical environment (ver. 4.4.1) run in RStudio (ver. 2024.09.0+375; Posit Software, Boston, MA).
