Novel major loci shape habitat-associated flowering time variation in Yellowstone monkeyflowers
Data files
Dec 06, 2025 version files 8.41 MB
- AHQ_soilMoisture_forDryad.txt (50.92 KB)
- AHQCanyon_12h_growout_forDryad.txt (16.95 KB)
- AHQgradient_tempscleansynched_forDryad.txt (6.04 MB)
- AHQsd_linkage_mapforDryad.txt (339.32 KB)
- AHQsd_QTL_map_forDryad.txt (178.01 KB)
- inbredLine_Daylength_screen_forDryad.txt (7.48 KB)
- README.md (7.50 KB)
- sd11_mafClust_WGS_med_stand_coverage_forDryad.txt (1.10 MB)
- sd6_inbredLines_genotypes_forDryad.txt (668.91 KB)
Abstract
Plants harbor remarkable genetic diversity in flowering phenology, particularly in their responses to environmental cues such as photoperiod. Understanding the genetic basis of repeated evolution in flowering cues, which are key to reproduction, illuminates adaptation with gene flow and parallel evolution. We characterized variation in minimum critical daylength for flowering (MCD) in yellow monkeyflower (Mimulus guttatus) accessions from a geothermal soil mosaic in Yellowstone National Park, mapped loci underlying the most extreme MCD in focal thermal annuals, and investigated environmental variables shaping phenology in the field. Yellowstone monkeyflowers range in MCD from 12-15 hours, paralleling range-wide variation in M. guttatus; plants from thermal habitats flower under significantly shorter daylengths. Two QTLs govern the most extreme 12-hour MCD. Both contain candidates from gene families previously implicated in phenological evolution in monkeyflowers and other angiosperms, but the major loci appear novel. The frequency of 12-hour flowering across a microgeographic gradient is predicted by variation in soil temperature and the timing of dry-down. Adaptation to Yellowstone’s geothermal soil mosaic has generated dramatic evolution of flowering cues over short spatial scales. The genetic basis of 12-hour flowering does not indicate re-use of known M. guttatus alleles, but strong candidate genes nonetheless suggest molecular parallelism.
Dataset DOI: 10.5061/dryad.pvmcvdnxr
Description of the data and file structure
These files contain the data from experiments investigating the diversity of, genomic basis of, and environmental influences on minimum daylength for flowering in Yellowstone National Park (YNP) Mimulus guttatus.
- inbredLine_Daylength_screen_forDryad.txt contains information about which YNP inbred lines flowered under 12, 13, 14, and 15-hour days.
- AHQsd_linkage_mapforDryad.txt and AHQsd_QTL_map_forDryad.txt contain the data from QTL mapping of 12-hour flowering using an AHQT x AHQN (thermal x nonthermal) F2 mapping population.
- sd6_inbredLines_genotypes_forDryad.txt contains genotypes from whole-genome sequenced inbred lines, within the strongest daylength QTL on Chr 6.
- sd11_mafClust_WGS_med_stand_coverage_forDryad.txt contains median-standardized coverage values for each site within a cluster of MADS AFFECTING FLOWERING (MAF) transcription factor genes, which lie within the second 12-hour flowering QTL on Chr 11.
- AHQCanyon_12h_growout_forDryad.txt contains data from a growout of wild-collected seeds in a gradient from highly thermal habitat to highly nonthermal habitat, under 12-hour days.
- AHQgradient_tempscleansynched_forDryad.txt contains soil and air temperature data in the field along quadrats in a thermal - nonthermal gradient.
- AHQ_soilMoisture_forDryad.txt contains soil moisture measurements in the field in the same quadrats as the temperature data.
Files and variables
File: sd11_mafClust_WGS_med_stand_coverage_forDryad.txt
Description: median-standardized coverage at each site in the cluster of MADS AFFECTING FLOWERING (MAF) transcription factor genes on Chr 11.
Variables
- CHROM: chromosome. All genes are on Chromosome 11.
- POS: bp position on chromosome 11
- gene_name: Gene name from the AHQTv1 reference genome
- AHQN: inbred line whole genome sequence from the nonthermal parent. The AHQN column shows median-standardized coverage for AHQN at each site.
- AHQT: inbred line whole genome sequence from the thermal parent. The AHQT column shows median-standardized coverage for AHQT at each site.
File: sd6_inbredLines_genotypes_forDryad.txt
Description: contains genotypes from whole-genome sequenced inbred lines, within the 12-hour flowering QTL on Chr 6.
Variables
- bp: base pair position on Chromosome 6, AHQTv1 reference genome
- individual: all other columns are individual inbred lines. Genotypes are coded as 0 for homozygous reference, 1 for heterozygous, and 2 for homozygous alternate.
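As a rough illustration of the 0/1/2 coding, the sketch below computes the alternate-allele frequency at each position. The sample text is invented, and the tab delimiter, the line names, and the treatment of empty or `NA` cells as missing calls are all assumptions rather than documented properties of the file.

```python
import csv
import io

# Toy stand-in for sd6_inbredLines_genotypes_forDryad.txt.
# Line names and genotype values are invented; the tab delimiter is assumed.
toy = "bp\tlineA\tlineB\tlineC\n1000\t0\t1\t2\n2000\t2\t2\t\n"

def alt_allele_frequency(handle, delimiter="\t"):
    """Return {bp: alternate-allele frequency} using the 0/1/2 coding."""
    freqs = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        # Every column other than bp is an inbred line; skip missing calls
        # (assumed here to be empty cells or "NA").
        calls = [int(v) for k, v in row.items()
                 if k != "bp" and v and v != "NA"]
        # Each diploid call contributes two alleles to the denominator.
        freqs[int(row["bp"])] = sum(calls) / (2 * len(calls)) if calls else None
    return freqs

freqs = alt_allele_frequency(io.StringIO(toy))
```

To run this on the real file, the `io.StringIO(toy)` argument would be replaced with an open file handle, after checking the actual delimiter and missing-data code.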
File: AHQsd_QTL_map_forDryad.txt
Description: QTLCartographer mapping results for 12-hour flowering in an AHQTxAHQN F2 mapping population
Variables
- Chromosome: M. guttatus IM62v2 Chromosome Number (1 - 14)
- Marker: Marker number
- Position (cM): position in centimorgans
- columns D - U: QTLCartographer mapping results
File: inbredLine_Daylength_screen_forDryad.txt
Description: contains information about which YNP inbred lines flowered under 12, 13, 14, and 15-hour days.
Variables
- Light: hours of daylength (12, 13, 14, 15-hour days)
- Site-habitat: Habitat call (thermal annual or nonthermal perennial)
- Line: inbred line identity
- N Rows: number of replicates
- proportion_flowered: proportion of replicates that flowered
- canFlower: whether a line can flower under the specified daylength
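One way to summarize this table is the shortest daylength at which each line flowered, which approximates its minimum critical daylength. The sketch below is only an illustration: the sample rows are invented, the tab delimiter and numeric `Light` values are assumed, and "any replicate flowered" (`proportion_flowered` above a threshold of 0) is a stand-in criterion that may not match how `canFlower` was scored.

```python
import csv
import io

# Invented rows using three of the documented column names.
toy = (
    "Light\tLine\tproportion_flowered\n"
    "12\tL1\t0\n13\tL1\t0.5\n14\tL1\t1\n"
    "12\tL2\t0.8\n13\tL2\t1\n"
    "12\tL3\t0\n"
)

def minimum_critical_daylength(handle, threshold=0.0, delimiter="\t"):
    """Return {line: shortest daylength with flowering, or None if never}."""
    mcd = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        line = row["Line"]
        mcd.setdefault(line, None)  # remember lines that never flowered
        if float(row["proportion_flowered"]) > threshold:
            hours = int(row["Light"])
            if mcd[line] is None or hours < mcd[line]:
                mcd[line] = hours
    return mcd

mcd = minimum_critical_daylength(io.StringIO(toy))
```

A line that never flowers in the screened treatments maps to `None`, which only says that its critical daylength exceeds the longest treatment tested.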
File: AHQsd_linkage_mapforDryad.txt
Description: Linkage map from the AHQT x AHQN (thermal x nonthermal) F2 mapping population. F2 ddRAD sequences were aligned to the M. guttatus IM62v2 reference genome.
Variables
- pos: chromosome_bp position on the M. guttatus IM62v2 reference genome.
- cm: centimorgan position of the marker
- remaining columns: individual F2 names. Flowered.yes.no indicates whether they flowered under 12-hour days. Genotypes are coded as A, H, B. A = homozygous nonthermal genotype, H = heterozygote, B = homozygous thermal genotype.
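The A/H/B coding can be converted to a count of thermal (AHQT) alleles for downstream summaries. The following is a minimal sketch with invented genotype calls; the `-` symbol for a missing call is an assumption, and any code other than A, H, or B is simply treated as missing.

```python
# Number of thermal (AHQT) alleles carried by each documented genotype code.
THERMAL_DOSAGE = {"A": 0, "H": 1, "B": 2}

def thermal_dosage(codes):
    """Map A/H/B codes to thermal-allele counts; other codes become None."""
    return [THERMAL_DOSAGE.get(code.strip().upper()) for code in codes]

def thermal_allele_frequency(codes):
    """Frequency of the thermal allele among non-missing F2 genotypes."""
    dosages = [d for d in thermal_dosage(codes) if d is not None]
    return sum(dosages) / (2 * len(dosages)) if dosages else None

# Invented calls for five F2 individuals at one marker ("-" = assumed missing).
marker = ["A", "H", "B", "H", "-"]
dosages = thermal_dosage(marker)
frequency = thermal_allele_frequency(marker)
```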
File: AHQCanyon_12h_growout_forDryad.txt
Description: data from a growout of wild-collected seeds in a gradient from highly thermal habitat to highly nonthermal habitat, under 12-hour days.
Variables
- Plate: Number of 96-well flat the individual was planted in during the experiment.
- ID: identity of the maternal individual from which the seeds were collected in the greenhouse
- Flower_binary: whether the individual flowered under 12-hour days. A blank cell means the individual died before flowering was recorded.
- Flowered yes/no: verbal description of whether individual flowered. “dead” means the individual died before flowering was recorded.
- Habitat: area in which the maternal individual was collected (Canyon, AHQT, or AHQN).
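A typical summary of this table is the proportion of surviving plants that flowered in each habitat. The sketch below uses invented rows and assumes that `Flower_binary` holds 1 for flowered and 0 for not flowered; blank values are skipped as plants that died before scoring, following the variable description.

```python
from collections import Counter

def flowering_proportion(rows):
    """rows: iterable of (Habitat, Flower_binary) string pairs.

    Returns {habitat: proportion of scored plants that flowered}.
    """
    flowered = Counter()
    scored = Counter()
    for habitat, flower_binary in rows:
        value = flower_binary.strip()
        if value == "":
            continue  # blank: died before flowering was recorded
        scored[habitat] += 1
        flowered[habitat] += int(value)  # assumes 1 = flowered, 0 = did not
    return {habitat: flowered[habitat] / scored[habitat] for habitat in scored}

# Invented example rows, not values from the dataset.
toy = [("AHQT", "1"), ("AHQT", "1"), ("AHQT", ""),
       ("Canyon", "1"), ("Canyon", "0"), ("AHQN", "0")]
proportions = flowering_proportion(toy)
```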
File: AHQgradient_tempscleansynched_forDryad.txt
Description: soil and air temperature data in the field along quadrats in a thermal - nonthermal gradient. Blank cells indicate that no temperature was recorded.
Variables
- Quad#: Number of the quadrat at which temperature data was collected
- Quadname: name of the quadrat at which temperature data was collected
- AHQlocation: area of the gradient at which temperature was collected (T_upper for AHQT, T_canyon for AHQ Canyon, or N_bog for AHQN)
- Type: indicates whether the record is plant temperature (at plant height) or soil temperature (beneath the soil)
- Date/Time: Time stamp from data logger in the format m/dd/yyyy h:mm:ss AM/PM
- Date: Date in format m/dd/yy
- Year: Year temperature was recorded
- Month: month temperature was recorded (1 – 12)
- Day: day of month temperature was recorded
- Time: time temperature was recorded (0 – 24)
- tempC: temperature in degrees Celsius
- am/pm: whether time recorded was AM or PM
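The `Date/Time` format described above (m/dd/yyyy h:mm:ss AM/PM) can be parsed with the standard library. The sketch below parses timestamps and averages temperature by quadrat and day, skipping blank readings. The records, including their dates and quadrat names, are invented, and the format string assumes an English-language locale for the AM/PM marker.

```python
from collections import defaultdict
from datetime import datetime

# strptime pattern for "m/dd/yyyy h:mm:ss AM/PM"; %m and %I accept
# one- or two-digit values.
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def parse_stamp(text):
    """Convert a Date/Time string into a datetime object."""
    return datetime.strptime(text.strip(), STAMP_FORMAT)

def daily_means(records):
    """records: iterable of (Quadname, Date/Time, tempC) strings.

    Returns {(quadname, date): mean tempC}, ignoring blank temperatures.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for quad, stamp, temp in records:
        if temp.strip() == "":
            continue  # blank cell: no temperature was recorded
        key = (quad, parse_stamp(stamp).date())
        totals[key][0] += float(temp)
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}

# Invented logger records, not values from the dataset.
toy = [
    ("Q1", "6/05/2019 1:30:00 PM", "20.0"),
    ("Q1", "6/05/2019 11:30:00 PM", "10.0"),
    ("Q1", "6/06/2019 12:00:00 AM", ""),
    ("Q2", "6/05/2019 9:00:00 AM", "31.5"),
]
means = daily_means(toy)
```

Separating records by the `Type` column before averaging would keep plant-height and soil readings from being mixed.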
File: AHQ_soilMoisture_forDryad.txt
Description: soil moisture measurements in the field in the same quadrats as the temperature data.
Variables
- year: year moisture data was collected
- month: month moisture data was collected (1 – 12)
- day: day of month moisture data was collected
- Sample Date: date sample was collected (mm/dd/yy)
- Quad#: number of the quadrat at which data was collected
- Quadname: name of quadrat at which data was collected
- T0: Mass of soaked toothpick (mg)
- T2: Mass of dried toothpick (mg)
- Water: difference in mass between wet and dry toothpick
- Habitat: area of quadrat (N_bog = AHQN, Canyon = AHQCanyon, AHQT = AHQT)
- Max: proportion water saturation of toothpick (0 – 1), calibrated to maximum saturation. If T2 > T0, saturation was considered to be 1 (fully saturated)
Code/software
ddRAD sequences from the AHQT x AHQN F2s were processed using the Fishman Lab genotype processing pipeline, described in detail here: https://www.protocols.io/view/processing-ddrad-data-from-raw-fastqs-to-vcf-q26g7bor3lwz/v1
Whole genome sequences were aligned with bwa mem using default parameters, and genotypes were called with GATK-4.0.
Access information
Other publicly accessible locations of the data:
- All ddRAD sequences are available on the NCBI Sequence Read Archive under BioProject PRJNA1051082
- All whole genome sequence data are available on the NCBI Sequence Read Archive under BioProject PRJNA1050826
