Range-wide study in a sexually polymorphic wild strawberry reveals climatic and soil associations of sex ratio, sexual dimorphism, and sex chromosomes
Data files
Apr 17, 2025 version files 7.65 MB
Abstract
The contemporary environment, which influences local resource pools and mate access, is rapidly changing in the Anthropocene, posing unique challenges for sexually polymorphic plants. A landscape scale understanding of climate and soil drivers of sex-specific factors can help predict how global change will impact these species.
Using ~7,000 herbarium and iNaturalist specimens we determine how sex ratio, sexual dimorphism, and sex chromosomes vary with geographic, climatic and soil gradients in Fragaria virginiana and whether these conform to predictions from theory.
Sex ratio was hermaphrodite/male-biased and was driven more by soil attributes than climatological ones. Sex ratio-environment associations matched predictions for subdioecious species in the West but for gynodioecious species in the East. Climatic, not soil factors, affected sexual dimorphism in traits related to carbon acquisition but not mate access (petal size and flowering time). Greater sexual dimorphism was due to one sex being more responsive (females for leaf length and hermaphrodite/male for runnering while flowering) to precipitation or temperature. Sex chromosome variation was biased towards the ancestral type and frequencies varied with different environmental factors between East-West regions.
A landscape-level perspective of environmental drivers of sex-specific factors provides insight into how anthropogenic disturbance may impact sexually polymorphic species.
https://doi.org/10.5061/dryad.zcrjdfnmc
Description of the data and file structure
Data are organized into three data files that are used to conduct the analyses shown in the manuscript. All data files include standard collection and housekeeping metadata (sample IDs, collection date, database ID/catalog number and image URL) as well as spatial and environmental predictors (latitude, longitude, region, soil nitrogen, soil bulk density, winter temperature mean, summer temperature mean, winter precipitation mean and summer precipitation mean). We additionally include the shape file used to generate figures containing mapped records.
- Cullen_etal_2024_MFscores_metadata_final_03-04-24_NA_Updated.csv contains all data used for analyses of sex ratio based on hermaphrodite/male or female scores, flowering time using collection date, and presence of runners. For the runner analysis, the data are subset to herbarium records only using a line of code in the R script. Column names and descriptions are as follows:

| Column Name | Description |
| --- | --- |
| db_sample_id | Sample identification code formatted as DATABASE_PROJECTNAME_IDNUMBER (e.g., iDigBio_FSH2022_0001). |
| database | The database that the record was downloaded from. |
| catalog_number | The museum catalog number associated with the sample. This column is blank for iNaturalist records. |
| institutionCode | Code indicating the herbarium storing the record. This column is blank for iNaturalist records. |
| access.URL | URL link to access digitized image of herbarium record or iNaturalist record. All images are maintained on servers that the authors of this manuscript are not responsible for, so we cannot guarantee availability of all images. |
| inaturalist_ID | Unique ID for in-house record keeping of iNaturalist records. This column is blank for herbarium records. |
| record.type | Whether record was downloaded from an herbarium database or iNaturalist. |
| male_fertility | Scored fertility of flower, recorded as sex (either Female [not male fertile] or Male/Hermaphrodite [male fertile]). |
| m.f.binary | Sex score as binary for use in binomial GLMER. 0 = male/hermaphrodite, 1 = female. |
| eventDate | Date in the form of YYYY-MM-DD. |
| elevation | Elevation in meters. Elevation values estimated with package elevatr. |
| col.year | Collection year of the record. |
| collection_doy | Collection day of year = flowering day of year. 365 day of year score (1 = January 1st, 365 = December 31st). Calculated from |
| lat | Latitude (decimal) |
| lon | Longitude (decimal) |
| longitude_bins | Longitude bins = Region. Records were "binned" into regions east and west of 102°W. |
| runners | Yes (Y) or No (N) as to whether plants had produced runners during the year they were collected. |
| mean_tmp_coldest_Q | Mean temperature of the coldest quarter = winter temperature (°C). Mean calculated for month of collection |
| mean_pre_coldest_Q | Mean precipitation of the coldest quarter = winter precipitation (mm/month) |
| mean_tmp_warm_Q | Mean temperature of the warmest quarter = summer temperature (°C) |
| mean_pre_warm_Q | Mean precipitation of the warmest quarter = summer precipitation (mm/month) |
| bdodmean_5_15 | Bulk Density (of the fine earth fraction) in kg/dm^3 estimated for the 5-15cm depth range. More details of soil metrics at https://www.isric.org/explore/soilgrids/faq-soilgrids. |
| nitrogenmean_5_15 | Nitrogen concentration in g/kg estimated for the 5-15cm depth range. More details of soil metrics at https://www.isric.org/explore/soilgrids/faq-soilgrids. |

- Cullen_etal_2024_Petal-Leaflet_measures_metadata_final_03-04-24_NA_Updated.csv contains all data used for analyses of petal and leaflet length dimorphism. These data constitute a subset of the "Cullen_etal_2024_MFscores_metadata_final_03-04-24_NA_Updated.csv" dataset with the addition of petal and leaflet measurements, so all other columns can be identically interpreted. Column names and descriptions are as follows:

| Column Name | Description |
| --- | --- |
| db_sample_id | Sample identification code formatted as DATABASE_PROJECTNAME_IDNUMBER (e.g., iDigBio_FSH2022_0001). |
| database | The database that the record was downloaded from. |
| catalog_number | The museum catalog number associated with the sample. This column is blank for iNaturalist records. |
| institutionCode | Code indicating the herbarium storing the record. This column is blank for iNaturalist records. |
| access.URL | URL link to access digitized image of herbarium record or iNaturalist record. All images are maintained on servers that the authors of this manuscript are not responsible for, so we cannot guarantee availability of all images. |
| male_fertility | Scored fertility of flower, recorded as sex (either Female [not male fertile] or Male/Hermaphrodite [male fertile]). |
| m.f.binary | Sex score as binary for use in binomial GLMER. 0 = male/hermaphrodite, 1 = female. |
| elevation | Elevation in meters. Elevation values estimated with package elevatr. |
| eventDate | Date in the form of YYYY-MM-DD. |
| col.year | Collection year of the record. |
| lat | Latitude (decimal) |
| lon | Longitude (decimal) |
| longitude_bins | Longitude bins = Region. Records were "binned" into regions east and west of 102°W. |
| petal_length | Length of petal from base to tip in mm. |
| petal_width | Width of petal at widest point, measured perpendicular to midvein in mm. |
| leaflet_length | Length of longest central leaflet from base to tip in mm. |
| leaflet_width | Width of longest central leaflet at widest point, measured perpendicular to midvein in mm. |
| bdodmean_5_15 | Bulk Density (of the fine earth fraction) in kg/dm^3 estimated for the 5-15cm depth range. More details of soil metrics at https://www.isric.org/explore/soilgrids/faq-soilgrids. |
| nitrogenmean_5_15 | Nitrogen concentration in g/kg estimated for the 5-15cm depth range. More details of soil metrics at https://www.isric.org/explore/soilgrids/faq-soilgrids. |
| mean_tmp_coldest_Q | Mean temperature of the coldest quarter = winter temperature (°C). Mean calculated for month of collection |
| mean_pre_coldest_Q | Mean precipitation of the coldest quarter = winter precipitation (mm/month) |
| mean_tmp_warm_Q | Mean temperature of the warmest quarter = summer temperature (°C) |
| mean_pre_warm_Q | Mean precipitation of the warmest quarter = summer precipitation (mm/month) |

- Cullen_etal_2024_SDRhaplotypes_metadata_final_03-04-24_NA_Updated.csv contains all data used for analyses of sex determining region haplotypes. Column names and descriptions are as follows:

| Column Name | Description |
| --- | --- |
| pcr_sample_id | Sample identification code. Codes from Tennessen et al. 2018 follow different formatting. |
| record.type | Whether sample tissue was obtained from an herbarium record, the National Clonal Germplasm Repository (NCGR), or was based on haplotyping from Tennessen et al., 2018. |
| eventDate | Date in the form of YYYY-MM-DD. |
| col.year | Collection year of the record. Derived from eventDate |
| catalog_number | Catalog number from herbarium. Not all records had catalog numbers due to updating from catalog to barcode numbers. |
| db_sample_id | ID of sample if it could be linked to the digital dataset. Many herbarium records were sampled in person and thus could not be directly linked to samples in the digital dataset. |
| lat | Latitude (decimal) |
| lon | Longitude (decimal) |
| longitude_bins | Longitude bins = Region. Records were "binned" into regions east and west of 102°W. |
| elevation | Elevation in meters. Elevation values estimated with package elevatr. |
| inferred_SDR_clade | SDR Haplotype (clade) as inferred from PCR assays. |
| a.bg.binary | Binary SDR haplotypes grouped into either Alpha = 0, or Beta/Gamma = 1. Used in GLMER logistic regression. |
| bdodmean_5_15 | Bulk Density (of the fine earth fraction) in kg/dm^3 estimated for the 5-15cm depth range. More details of soil metrics at https://www.isric.org/explore/soilgrids/faq-soilgrids. |
| nitrogenmean_5_15 | Nitrogen concentration in g/kg estimated for the 5-15cm depth range. More details of soil metrics at https://www.isric.org/explore/soilgrids/faq-soilgrids. |
| mean_tmp_coldest_Q | Mean temperature of the coldest quarter = winter temperature (°C). Mean calculated for month of collection |
| mean_pre_coldest_Q | Mean precipitation of the coldest quarter = winter precipitation (mm/month) |
| mean_tmp_warm_Q | Mean temperature of the warmest quarter = summer temperature (°C) |
| mean_pre_warm_Q | Mean precipitation of the warmest quarter = summer precipitation (mm/month) |

Missing data code: NA

- harvard_na_shp is a folder containing various shapefiles used to generate map figures and test for spatial autocorrelation. Do not rename this folder, or reading the shapefile (bound_p.shp) into R will throw an error.
Further descriptions of data collection and processing are described in the manuscript and associated supplementary information.
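As a rough illustration of how the files described above might be read, the following Python sketch (not the authors' R script) converts the `NA` missing-data code to an empty value, keeps herbarium records only (as described for the runner analysis), and computes the fraction of records scored female from `m.f.binary`. The rows are made up for illustration, and the `record.type` labels `herbarium` and `iNaturalist` are assumptions; check the actual values in the CSV before relying on them.

```python
import csv
import io

# Made-up rows for illustration only; real values come from the MFscores CSV.
# The record.type labels used here are assumptions, not confirmed by the README.
SAMPLE = """db_sample_id,record.type,m.f.binary,longitude_bins,runners
toy_0001,herbarium,1,East,Y
toy_0002,herbarium,0,East,N
toy_0003,iNaturalist,0,West,NA
toy_0004,herbarium,NA,West,N
"""


def read_records(handle):
    """Read CSV rows, converting the missing-data code 'NA' to None."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({key: (None if value == "NA" else value)
                     for key, value in row.items()})
    return rows


def herbarium_only(rows, label="herbarium"):
    """Keep rows whose record.type equals the (assumed) herbarium label."""
    return [row for row in rows if row["record.type"] == label]


def female_fraction(rows):
    """Fraction of scored rows with m.f.binary == 1 (female); None if unscored."""
    scored = [int(row["m.f.binary"]) for row in rows
              if row["m.f.binary"] is not None]
    return sum(scored) / len(scored) if scored else None


records = read_records(io.StringIO(SAMPLE))
herbarium_records = herbarium_only(records)
print(len(herbarium_records), female_fraction(records))
```

To use a real file, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open(path, newline="")`.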
Sharing/Access information
Data were derived from the following sources:
- Digitized images and associated collection records:
  - iDigBio
  - GBIF
  - Consortium of Intermountain Herbaria
  - Consortium of Pacific Northwest Herbaria
  - iNaturalist
- Elevation data were derived from the elevatr package in R
- Climate data (temperature and precipitation) were derived from the CRU TS database
- Soil data were derived from the SoilGrids database
Code/Software
We coded analyses and figure generation in R. Raw data filtering and transformations are not included in this version of the script because they are lengthy and can be recreated with a working knowledge of the dplyr and tidyverse packages. We include an R Markdown file, used primarily for organizational purposes, NOT for producing knitted output (executing all code in the script and outputting the results to a Word, PDF, or HTML file). Model selection procedures can take several hours to run, so we caution against running all of the code at once. Instead, we suggest running the code line by line, running only the pieces needed for your inquiry.
Briefly, we downloaded digitized, imaged herbarium records of Fragaria virginiana from online herbarium databases (iDigBio, GBIF, the Consortium of Pacific Northwest Herbaria, and the Consortium of Intermountain Herbaria) and iNaturalist records. We filtered out all digital records that were missing geolocation coordinates (latitude, longitude) or a collection date. We scored imaged records for flower sex, flowering time (collection date), and presence of collection-year runners, and measured petal and leaf length and width on a subset of herbarium records. We additionally used targeted primers to characterize the sex determining region (SDR) haplotypes on a subset of herbarium records, which we had sampled in person and requested from specific herbaria. To associate sex ratio, petal/leaflet phenotype, flowering time (from collection date), runner presence, and SDR haplotypes with spatial and climate variables, we gathered data on elevation, temperature and precipitation (from the CRU time series dataset), and soil nitrogen and bulk density (from the SoilGrids dataset). Herbarium records were binned into east and west regions using a dividing meridian of 102° W (a natural gap in the continent-wide range). Temperature and precipitation averages were estimated for the coldest (winter) and warmest (summer) months for each record, averaging the monthly temperature for the month of collection and the two preceding months for 10 years prior to the collection year of the record (10-year pre-collection climatological means). Together these data were used to conduct the analyses presented in the linked manuscript.
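The derived variables described above can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the authors' R code: it bins a longitude east or west of 102°W, converts a YYYY-MM-DD date to a day of year, and averages a monthly climate series over a three-month window (a target month plus the two preceding months) for the 10 years before collection. The function names, the "East"/"West" labels, the treatment of a record exactly on the meridian, the window of years (collection year minus 10 through collection year minus 1), and the toy series are all assumptions made for this example.

```python
from datetime import date


def region_from_longitude(lon, divide=-102.0):
    """Bin a decimal longitude into 'West' or 'East' of 102°W.

    Labels and the handling of lon == divide are assumptions.
    """
    return "West" if lon < divide else "East"


def day_of_year(event_date):
    """Day of year (1 = January 1st) from a YYYY-MM-DD string.

    Note: December 31st of a leap year gives 366, not 365.
    """
    return date.fromisoformat(event_date).timetuple().tm_yday


def quarter_months(end_year, end_month):
    """Return (year, month) pairs for end_month and the two preceding months."""
    pairs = []
    for back in range(3):
        index = end_year * 12 + (end_month - 1) - back
        pairs.append((index // 12, index % 12 + 1))
    return pairs


def pre_collection_mean(monthly, collection_year, end_month, n_years=10):
    """Mean of a monthly series over a 3-month window in each prior year.

    `monthly` maps (year, month) to a value, e.g. temperature in °C.
    Assumes the window covers the n_years before the collection year.
    """
    values = []
    for year in range(collection_year - n_years, collection_year):
        for key in quarter_months(year, end_month):
            values.append(monthly[key])
    return sum(values) / len(values)


# Toy series for illustration only: each month's value equals its month number.
toy_series = {(year, month): float(month)
              for year in range(1985, 2001) for month in range(1, 13)}

print(region_from_longitude(-110.5))
print(day_of_year("1998-05-20"))
print(pre_collection_mean(toy_series, 2000, 1))
```

Indexing months as `year * 12 + month` lets a window that ends in January reach back into November and December of the previous year without special cases.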
- Cullen, Nevin; Richardson, Ethan; Budinsky, Trezalka et al. (2025). Range-wide study in a sexually polymorphic wild strawberry reveals climatic and soil associations of sex ratio, sexual dimorphism, and sex chromosomes. Zenodo. https://doi.org/10.5281/zenodo.10783544
- Cullen, Nevin; Richardson, Ethan; Budinsky, Trezalka et al. (2025). Range-wide study in a sexually polymorphic wild strawberry reveals climatic and soil associations of sex ratio, sexual dimorphism, and sex chromosomes. Zenodo. https://doi.org/10.5281/zenodo.10783543
- Cullen, Nevin; Richardson, Ethan; Budinsky, Trezalka et al. (2025). Range‐wide study in a sexually polymorphic wild strawberry reveals climatic and soil associations of sex ratio, sexual dimorphism and sex chromosomes. Journal of Ecology. https://doi.org/10.1111/1365-2745.70056
