Data from: Fluctuating selection in a Monkeyflower hybrid zone
Data files
Dec 10, 2024 version files (1.26 MB total)
- CF2021.csv (218.57 KB)
- CF2023.csv (48.06 KB)
- EC_jdate.csv (91.64 KB)
- EC2021PCA.csv (13.44 KB)
- EC2023jdate_edited.csv (224.76 KB)
- ECBOTH.csv (46.63 KB)
- FieldSites.csv (126 B)
- HH2021.csv (188.07 KB)
- HH2023.csv (47.50 KB)
- hybridseeds2023.csv (12.78 KB)
- Leaves2021.csv (116.87 KB)
- README.md (10.11 KB)
- Seeds2021.csv (15.97 KB)
- survcurv23.csv (29.69 KB)
- TK2021v2.csv (151.39 KB)
- TK2023.csv (44.46 KB)
Abstract
While hybridization was viewed as a hindrance to adaptation and speciation by early evolutionary biologists, recent studies have demonstrated the importance of hybridization in facilitating evolutionary processes. However, it is still not well-known what role spatial and temporal variation in natural selection play in the maintenance of naturally occurring hybrid zones. To identify whether hybridization is adaptive between two closely related monkeyflower species, Mimulus guttatus and Mimulus laciniatus, we performed repeated reciprocal transplants between natural hybrid and pure species’ populations. We planted parental genotypes along with multiple experimental hybrid generations in a dry (2021) and extremely wet (2023) year in the Sierra Nevada, CA. By taking fine scale environmental measurements, we found that the environment of the hybrid zone is more similar to M. laciniatus’s seasonally dry rocky outcrop habitat than M. guttatus’s moist meadows. In our transplants hybridization does not appear to be maintained by a consistent fitness advantage of hybrids over parental species in hybrid zones, but rather a lack of strong selection against hybrids. We also found higher fitness of the drought adapted species, M. laciniatus, than M. guttatus in both species’ habitats, as well as phenotypic selection for M. laciniatus-like traits in the hybrid habitat in the dry year of our experiment. These findings suggest that in this system hybridization might function to introduce drought-adapted traits and genes from M. laciniatus into M. guttatus, specifically in years with limited soil moisture. However, we also find evidence of genetic incompatibilities in second generation hybrids in the wetter year, which may balance a selective advantage of M. laciniatus introgression. 
Therefore, we find that hybridization in this system is both potentially adaptive and costly, and that the interaction of positive and negative selection likely determines patterns of gene flow between these Mimulus species.
README: Fluctuating selection in a Monkeyflower hybrid zone
Diana Tataru
Last edited: Dec 4, 2024
https://doi.org/10.5061/dryad.k98sf7mg4
This is a dataset corresponding to a replicated reciprocal transplant experiment conducted in 2021 and 2023 in Yosemite National Park, CA with Mimulus laciniatus, M. guttatus, and experimental hybrids (first generation, second generation, and back-crossed). Methods and analyses are published in Evolution Letters:
Diana Tataru, Max De Leon, Spencer Dutton, Fidel Machado Perez, Alexander Rendahl, Kathleen G Ferris. Fluctuating selection in a monkeyflower hybrid zone. Evolution Letters, 2024, qrae050. https://doi.org/10.1093/evlett/qrae050
Description of the data and file structure
FieldSites.csv is a simple CSV file with latitude and longitude coordinates for the three experimental field sites where transplants occurred.
2021 Data:
1. HH2021.csv, CF2021.csv, TK2021v2.csv are the three phenotypic data collection sheets from the field data in 2021. HH is the hybrid site, CF is the parental M. guttatus site, and TK is the parental M. laciniatus site. Sites were surveyed approximately every three days. Empty cells in the datasheet indicate positions where a plant was not planted due to limited germinants, or where a plant died before data could be collected.
*Site = Site that data was collected from (HH, CF, or TK)
*Block = Block of 36 plants, of which there were 100 at each site
*Position = Location of plant in the 36-plant blocks
*Plant = Type of plant (L= Hybrid zone M. laciniatus, G= Hybrid zone M. guttatus, F1= First generation hybrid, F2= second generation hybrid, BCG= Back-crossed F1 to HHG, BCL= Back-crossed F1 to HHL)
*death = Date the plant was identified as dead
*firstfl = Date of the first flower of the plant
*planting date = Date the germinant was planted at the respective site
*daysto = number of days between planting and first flower
*fruit = Total number of fruits a plant produced
*flwidth = width of the first flower in millimeters (mm), taken as the widest part of the mouth of the corolla
*height = height at first flower (mm), taken from base at soil to apical meristem
*herb = date that a plant showed any signs of herbivory
*stigma = length of the stigma of the first flower (mm)
*anther = length of the longest anther of the first flower (mm)
*stanthsep = length of the stigma minus length of the longest anther (mm)
*cleist = Y/N whether a plant exhibited cleistogamy
*bud = date that the first bud was seen
*Tissue = Y/N whether flower tissue was collected; in CF2021.csv this column is NT, which stands for "No Tissue"
*BE = Y/N, stands for "bud eaten", whether there was herbivory on a flower bud
*decap = Y/N whether a plant was accidentally decapitated during measurements (in TK2021v2.csv this is recorded in Notes)
*BD = Y/N, stands for "bud dead", whether a flower bud was formed but died before maturation (in TK2021v2.csv this is recorded in Notes; does not exist for CF2021.csv)
*Notes = additional notes (does not exist for CF2021.csv)
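The two derived columns above, daysto and stanthsep, can be recomputed from the raw columns as a consistency check. The following is a minimal Python sketch, not the authors' code (the published analysis is the R script hybridreciprocaltransplant_final.R). The date format `M/D/YY` and the example row are assumptions made for illustration; adjust `DATE_FORMAT` to match the files.

```python
from datetime import datetime

# Assumption: dates are stored as M/D/YY. Change this if the files differ.
DATE_FORMAT = "%m/%d/%y"


def days_to_flower(planting_date, first_flower, fmt=DATE_FORMAT):
    """Days between planting and first flower; None if either date is missing."""
    if not planting_date or not first_flower:
        return None
    planted = datetime.strptime(planting_date, fmt)
    flowered = datetime.strptime(first_flower, fmt)
    return (flowered - planted).days


def stigma_anther_separation(stigma, anther):
    """Stigma length minus longest anther length (mm); None if either is missing."""
    if stigma in ("", None) or anther in ("", None):
        return None
    return float(stigma) - float(anther)


# Hypothetical row for illustration; these values are not from the dataset.
row = {"planting date": "4/20/21", "firstfl": "5/14/21", "stigma": "6.5", "anther": "5.0"}
row["daysto"] = days_to_flower(row["planting date"], row["firstfl"])
row["stanthsep"] = stigma_anther_separation(row["stigma"], row["anther"])
```

Returning `None` for missing inputs mirrors the empty cells described above, where a plant was never planted or died before data could be collected.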
2. Seeds2021.csv is the data collection sheet with seed number counted from individuals collected in the three sites in 2021. Site, Block, and Position correspond to the same values in datasheets (1).
* Seed Count #1 = number of seeds produced by an entire plant at that respective Site/Block/Position combination
3. Leaves2021.csv is the data collection sheet of leaf area and lobing index calculated using the program ImageJ. Each leaf has two rows from the analysis: the first row for each Site/Block/Position combination holds the measurements for the raw leaf, and the second row holds the same measurements for the convex hull, along with the lobing and area values used in the analysis. Site, Block, and Position correspond to the same values in datasheets (1) and are merged with those datasheets in the R script.
*Collection Date = date the leaf was collected in MDDYYY
*Area = Area of the leaf in square pixels
*Mean = calculations from ImageJ not used in the analysis
*Min = calculations from ImageJ not used in the analysis
*Max = calculations from ImageJ not used in the analysis
*Leaf lobing = measurement calculated as (convex hull area - raw area) / convex hull area
*Notes = additional notes
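The leaf lobing index can be recomputed from the paired Area values. This is a minimal Python sketch under stated assumptions, not the authors' code: it assumes the rows are ordered raw leaf first, convex hull second, and that each row carries Site, Block, Position, and Area. The function names and example values are hypothetical.

```python
def lobing_index(raw_area, hull_area):
    """(convex hull area - raw area) / convex hull area."""
    if hull_area <= 0:
        raise ValueError("convex hull area must be positive")
    return (hull_area - raw_area) / hull_area


def lobing_from_paired_rows(rows):
    """Pair consecutive rows (raw leaf, then convex hull) and compute lobing."""
    if len(rows) % 2 != 0:
        raise ValueError("expected two rows per leaf")
    results = []
    for raw, hull in zip(rows[0::2], rows[1::2]):
        key = (raw["Site"], raw["Block"], raw["Position"])
        results.append((key, lobing_index(float(raw["Area"]), float(hull["Area"]))))
    return results


# Hypothetical rows for illustration; areas are in square pixels.
leaf_rows = [
    {"Site": "HH", "Block": "1", "Position": "3", "Area": "800"},   # raw leaf
    {"Site": "HH", "Block": "1", "Position": "3", "Area": "1000"},  # convex hull
]
lobing = lobing_from_paired_rows(leaf_rows)
```

An unlobed leaf fills its convex hull and scores near 0; deeper lobes leave more of the hull empty and push the index toward 1.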
4. EC_jdate.csv is the fine-scale environmental data taken in 2021, with Site and Plot corresponding to Site and Block in datasheets (1)
*uniqueblock = combined Site + Plot
*Collection.Date= Date that environmental data was collected in M/DD/YY
*Light.Levels = Light levels in micromoles per square meter per second
*Soil.Moisture = soil moisture, measured in milliVolts (mV) to the same depth in every block
*Surface.Temp = soil surface temperature in Fahrenheit, measured with a laser thermometer one foot from the ground
*Time= Time that data was collected
*propsurv = proportion of plants that were still surviving in that block at that date
*Numbersurv = total number of plants that were still surviving in that block at that date
*totalinplot = total number of plants planted in the block
*Number Died = Number of plants that died in that block up to that date
*jdate= Julian Dates calculated such that January 1=1, January 2=2, etc.
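The jdate and propsurv columns follow directly from the other columns. Here is a minimal Python sketch, not the authors' code, assuming Collection.Date parses as `M/DD/YY` and that propsurv is Numbersurv divided by totalinplot. The function names are hypothetical.

```python
from datetime import datetime


def julian_date(date_text, fmt="%m/%d/%y"):
    """Day of year, with January 1 = 1."""
    return datetime.strptime(date_text, fmt).timetuple().tm_yday


def proportion_surviving(number_surviving, total_in_plot):
    """Fraction of the plants planted in a block that are still alive."""
    if total_in_plot <= 0:
        raise ValueError("total_in_plot must be positive")
    return number_surviving / total_in_plot


# Hypothetical values for illustration; not taken from the dataset.
example_jdate = julian_date("6/15/21")
example_propsurv = proportion_surviving(27, 36)
```

Because the day of year restarts every January 1, jdate values are comparable within a year and can be lined up between 2021 and 2023 without further conversion.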
5. EC2021PCA.csv is the soil moisture data from (4) in milliVolts, binned by weekly measurement (one per site), with each week as a different variable and soil moisture value for each block as the rows. This data sheet was created in Excel after the data were prepared in the dataset-cleaning section of the R script (hybridreciprocaltransplant_final.R) and exported. Empty columns arise because some sites had repeated measures in some weeks while other sites had only one. When repeated measures occur, the value used for the analysis (labelled Week#) is the mean of the measurements within those binned dates, calculated in Excel; the columns (#,#]TWO/ONE hold the raw values in that case.
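The weekly binning with averaging of repeated measures can be illustrated as follows. This is a Python sketch only: the dataset itself was binned in R and averaged in Excel. The sketch assumes right-closed bins written as (lower, upper], matching the (#,#] column labels; the break points, block label, and readings are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import mean


def week_bin(jdate, breaks):
    """Index of the right-closed bin (breaks[i], breaks[i+1]] holding jdate, else None."""
    i = bisect_left(breaks, jdate) - 1
    if i < 0 or i >= len(breaks) - 1:
        return None
    return i


def weekly_means(measurements, breaks):
    """Average soil moisture (mV) per (block, week number); weeks are numbered from 1."""
    grouped = defaultdict(list)
    for block, jdate, millivolts in measurements:
        bin_index = week_bin(jdate, breaks)
        if bin_index is not None:
            grouped[(block, bin_index + 1)].append(millivolts)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical break points and readings for illustration.
breaks = [120, 127, 134]
readings = [("HH1", 122, 300.0), ("HH1", 126, 340.0), ("HH1", 130, 250.0)]
week_values = weekly_means(readings, breaks)
```

In this example the block has two readings in the first week, which are averaged, and one reading in the second week, which is used as-is.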
2023 Data:
1. HH2023.csv, CF2023.csv, TK2023.csv are the three phenotypic data collection sheets from the field data in 2023. HH is the hybrid site, CF is the parental M. guttatus site, and TK is the parental M. laciniatus site. Sites were surveyed approximately every three days. Empty cells in the datasheet indicate positions where a plant was not planted due to limited germinants (common for AF1), or where a plant died before data could be collected.
*Block = Block of 18 plants, of which there were 75 at each site
*Position = Location of plant in the 18-plant blocks
*Plant = Type of plant (HHL= Hybrid zone M. laciniatus, HHG= Hybrid zone M. guttatus, F1= First generation hybrid, F2= second generation hybrid, BCG= Back-crossed F1 to HHG, BCL= Back-crossed F1 to HHL, TKL= parental site M. laciniatus, CFG= parental site M. guttatus, AF1= first generation hybrid between TKL and CFG, excluded from analysis due to small sample)
*Planting Date = Date the germinant was planted at the respective site
*Death Date = Date the plant was identified as dead
*Date 1st Fl = Date of the first flower of the plant
*Fruit # = Total number of fruits a plant produced
*Seed# = Total number of seeds a plant produced (this column is also merged with hybridseeds2023.csv)
*Herbiv? = Y/N whether a plant showed any signs of herbivory
*Initials (COMPUTER INPUT) = technician who input data from physical sheets into the computer
*Notes = additional notes
2. hybridseeds2023.csv is the data collection sheet with seed number counted from individuals collected in the three sites in 2023. Site, Block, and Position correspond to the same values in datasheets (1), and this data sheet is merged with (1) in the R script so that Seed Count #1 fills in Seed# in (1).
* Data Collecter #1 = the person who counted the seeds
* Seed Count #1 = number of seeds produced by an entire plant at that respective Site/Block/Position combination
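The merge described above, in which Seed Count #1 fills in Seed# for matching Site/Block/Position combinations, can be sketched in Python as follows. This is not the authors' R code; it assumes each phenotypic row carries a Site value alongside Block and Position, and the function name and example rows are hypothetical.

```python
def fill_seed_counts(plants, seed_rows):
    """Return copies of the plant rows with Seed# filled from matching seed rows."""
    counts = {
        (r["Site"], r["Block"], r["Position"]): r["Seed Count #1"] for r in seed_rows
    }
    merged = []
    for plant in plants:
        key = (plant["Site"], plant["Block"], plant["Position"])
        updated = dict(plant)  # copy so the input rows are left unchanged
        if key in counts:
            updated["Seed#"] = counts[key]
        merged.append(updated)
    return merged


# Hypothetical rows for illustration; not taken from the dataset.
plants = [
    {"Site": "HH", "Block": "1", "Position": "4", "Plant": "F2", "Seed#": ""},
    {"Site": "HH", "Block": "1", "Position": "5", "Plant": "BCL", "Seed#": ""},
]
seeds = [{"Site": "HH", "Block": "1", "Position": "4", "Seed Count #1": "57"}]
merged_plants = fill_seed_counts(plants, seeds)
```

Plants without a matching seed row keep their existing Seed# value, which corresponds to a left join on the three key columns.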
3. EC2023jdate_edited.csv is the fine-scale environmental data taken weekly at each Block in 2023, with Site and Block corresponding to Site and Block in datasheets (1). Empty cells in the datasheet indicate blocks where all plants died and were removed the week prior.
*Collection.Date= Date that environmental data was collected
*Intials= Technician recording the data
*Time= Time that data was collected
*Soil.Moisture..mV. = soil moisture, measured in milliVolts (mV) to the same depth in every block
*Surface.Temp..F. = soil surface temperature in Fahrenheit, measured with a laser thermometer one foot from the ground
*Light.Levels..umol.m.2.s.1. = Light levels in micromoles per square meter per second
*Data.Entered.By= technician who input data from physical sheets into the computer
*NOTE= additional notes
*jdate= Julian Dates calculated such that January 1=1, January 2=2, etc.
*propsurv = proportion of plants that were still surviving in that block at that date
*numbersurv = total number of plants that were still surviving in that block at that date
*numberdied = Number of plants that died in that block up to that date
*totalinplot = total number of plants planted in the block
4. survcurv23.csv is the soil moisture data from datasheet (3) in milliVolts, binned by weekly measurement at each Block/Site combination, with each week as a different variable and soil moisture value for each block as the rows. This data sheet was created in Excel after the data were prepared in the dataset-cleaning section of the R script (hybridreciprocaltransplant_final.R) and exported.
5. ECBOTH.csv is a combined data sheet of EC2021PCA.csv and survcurv23.csv. Due to the much longer growing season in 2023, survcurv23.csv contains soil moisture values for more weeks. Those weeks at the end of the season are filled in with 0s for 2021, to represent dry conditions.
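The zero-filling used to align the two years can be illustrated with a short Python sketch. It is not the authors' procedure verbatim: the block labels, week counts, and readings are hypothetical, and the (year, block) key layout is an assumption made for illustration.

```python
def pad_weeks(values, n_weeks, fill=0.0):
    """Extend a list of weekly soil moisture values to n_weeks using a fill value."""
    if len(values) > n_weeks:
        raise ValueError("more weekly values than n_weeks")
    return list(values) + [fill] * (n_weeks - len(values))


def combine_years(weeks_2021, weeks_2023):
    """Stack both years, padding the shorter 2021 season with 0 (dry conditions)."""
    n_weeks = max(len(v) for v in weeks_2023.values())
    combined = {}
    for block, values in weeks_2021.items():
        combined[("2021", block)] = pad_weeks(values, n_weeks)
    for block, values in weeks_2023.items():
        combined[("2023", block)] = list(values)
    return combined


# Hypothetical weekly soil moisture values (mV) for one block per year.
weeks_2021 = {"HH1": [310.0, 120.0]}
weeks_2023 = {"HH1": [400.0, 380.0, 350.0, 90.0]}
both_years = combine_years(weeks_2021, weeks_2023)
```

Filling with 0 rather than leaving the cells empty treats the late-season weeks of 2021 as fully dry, so analyses that cannot handle missing values can still use every week column.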
Code/Software
All data sheets above are analyzed in the R script hybridreciprocaltransplant_final.R, in R version 4.2.1.
Annotations are provided throughout the script in sections for 1) library loading, 2) dataset loading and cleaning, 3) analyses and models, and 4) figures.