Crossing data and genomic data for: Strong postmating reproductive isolation in Mimulus section Eunanus

Cite this dataset

Farnitano, Matthew (2024). Crossing data and genomic data for: Strong postmating reproductive isolation in Mimulus section Eunanus [Dataset]. Dryad. https://doi.org/10.5061/dryad.7wm37pvzc

Abstract

Postmating reproductive isolation can help maintain species boundaries when premating barriers to reproduction are incomplete. The strength and identity of postmating reproductive barriers are highly variable among diverging species, leading to questions about their genetic basis and evolutionary drivers. These questions have been tackled in model systems but are less often addressed with broader phylogenetic resolution. In this study we analyze patterns of genetic divergence alongside direct measures of postmating reproductive barriers in an overlooked group of sympatric species within the model monkeyflower genus, Mimulus. Within this Mimulus brevipes species group, we find substantial divergence among species, including a cryptic genetic lineage. However, rampant gene discordance and ancient signals of introgression suggest a complex history of divergence. In addition, we find multiple strong postmating barriers, including postmating prezygotic isolation, hybrid seed inviability, and hybrid male sterility, leading to complete or substantial postmating isolation in all species pairs. Hybrid seed inviability appears linked to differences in seed size, providing a window into possible developmental mechanisms underlying this reproductive barrier. While geographic proximity and incomplete mating isolation may have allowed gene flow within this group in the distant past, strong postmating reproductive barriers today are likely to prevent any ongoing hybridization. By producing foundational information about reproductive isolation and genomic divergence in this understudied group, we add new diversity and phylogenetic resolution to our understanding of the mechanisms of plant speciation.

README: Crossing data for 'Strong postmating reproductive isolation in Mimulus section Eunanus.'


This repository contains data on genomic relationships among selected members of Mimulus (syn. Diplacus) section Eunanus, and on postmating reproductive barriers between Mimulus brevipes, Mimulus fremontii, Mimulus johnstonii, and Mimulus sp. 'Sespe Creek'.

Phenotypic data were collected using plants grown in growth chambers at the University of Georgia, using a combination of wild-collected seeds and seeds produced by hand-pollination of growth-chamber-grown individuals. Data were collected by Matthew Farnitano between 2019 and 2022.

SNP data and phylogenetic trees were produced from raw Illumina sequencing data (separately archived at the NCBI Sequence Read Archive, accession PRJNA922914), using a subset of the same individuals for which phenotypic data were collected. Samples were sequenced at the Duke University Center for Genomic and Computational Biology, and the resulting data were processed at the University of Georgia using the Georgia Advanced Computing Resource Center cluster. Data processing details are provided in the associated manuscript.

Description of the data and file structure

Raw data files included in this repository:

  1. Crosses_seed_counts.csv
  2. Seed_sizes_measured.csv
  3. Pollen_counts.csv
  4. Eunanus_allsamples.complete.vcf.gz
  5. Eunanus_allsamples.SNPs.RAxML.tre
  6. Eunanus_allsamples.SNPs.NJ.tre
  7. Eunanus_allsamples.genes.ASTRAL.tre

File descriptions

'Crosses_seed_counts.csv' contains counts of viable and inviable seeds from hand-pollination crosses. Seeds were scored by eye as viable or inviable based on shape (plumpness, filled vs. empty or shriveled). Each line represents an individual fruit which is the result of one attempted hand pollination. Fruits which failed to produce any seeds after pollination are included with counts of 0 for both viable and inviable seeds; these are analyzed under "crossing success" in the paper. Some very small particles were assumed to be unfertilized ovules and were not included in either inviable or viable seed counts; some of these may have been seeds that aborted very early.
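
As a rough illustration of how these per-fruit counts might be summarized, the sketch below computes two quantities per cross type: the fraction of fruits that produced any seed, and the pooled proportion of seeds scored as viable. Treating "crossing success" as "at least one seed in the fruit" is an assumption made for this example and may not match the exact definition used in the paper. The column names follow the variable list for this file, and the three sample rows are hypothetical, not taken from the dataset.

```python
import csv
import io

# Hypothetical rows for illustration only; these are not values from the real file.
SAMPLE = """Cross_type,Good seeds,Shriveled seeds,Total seeds
B X B,40,2,42
B X F,1,30,31
B X F,0,0,0
"""

def summarize(rows):
    """Summarize fruits by cross type.

    crossing_success: fraction of fruits with at least one seed (assumed definition).
    prop_viable: good seeds divided by total seeds, pooled across fruits.
    """
    stats = {}
    for row in rows:
        s = stats.setdefault(
            row["Cross_type"], {"fruits": 0, "seeded": 0, "good": 0, "total": 0}
        )
        total = int(row["Total seeds"])
        s["fruits"] += 1
        if total > 0:
            s["seeded"] += 1
        s["good"] += int(row["Good seeds"])
        s["total"] += total
    return {
        cross: {
            "crossing_success": s["seeded"] / s["fruits"],
            # Undefined when no fruit of this cross type produced seeds.
            "prop_viable": s["good"] / s["total"] if s["total"] else None,
        }
        for cross, s in stats.items()
    }

summary = summarize(csv.DictReader(io.StringIO(SAMPLE)))
```

To run this on the real file, replace the in-memory sample with an open handle to 'Crosses_seed_counts.csv', after checking that the header spelling matches the names used here.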

Variables included in 'Crosses_seed_counts.csv':

  • Batch: Which of three growth chamber grow-outs these seeds came from. 
  • Maternal: The ID of the maternal (seed) parent of the seeds counted. The format of the ID is Population_MaternalFamily_Individual. Population codes starting with BF indicate F1 hybrids between M. brevipes and M. fremontii; otherwise the starting letter indicates species, B=M. brevipes, F=M. fremontii, J=M. johnstonii, S=M. 'Sespe Creek'. See supplemental information in the associated article for details about each Population and F1 line. An -X- in the ID indicates self-fertilization, with the following number indicating the offspring resulting from that self-fertilization. For example, S25_1_2-X-6 is the sixth offspring from a self-fertilization of individual S25_1_2. Population JX2 is derived from an intraspecific outcross between different families of population J31.
  • Paternal: The ID of the paternal (pollen) parent of the seeds counted. The format is the same as for Maternal.
  • Mat_family: The population or hybrid line that the maternal parent of the cross came from (see Maternal for details).
  • Pat_family: The population or hybrid line that the paternal parent of the cross came from (see Maternal for details).
  • Mat_Species: The species of the maternal parent of the cross. BxF indicates the maternal parent was an F1 hybrid between M. brevipes and M. fremontii.
  • Pat_Species: The species of the paternal parent of the cross. BxF indicates the paternal parent was an F1 hybrid between M. brevipes and M. fremontii.
  • Cross_type: categorization of the type and direction of the cross, indicating maternal and paternal species. Format is Maternal X Paternal. H indicates an F1 hybrid between M. brevipes and M. fremontii
  • Good seeds: the number of seeds that were counted and categorized as 'good' (i.e., assumed to be viable). Scoring is by eye based on plumpness, size, and shape. 
  • Shriveled seeds: the number of seeds that were counted and categorized as 'shriveled' (i.e., assumed to be inviable). Again, scoring is by eye. All seeds were categorized as either 'good' or 'shriveled'. 
  • Total seeds: the sum of good and shriveled seeds, equal to the total number of seeds produced by this fruit. 
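
The ID convention described under Maternal can be split into its parts programmatically. The following is a minimal sketch, assuming every ID follows the Population_MaternalFamily_Individual pattern with optional -X-N suffixes for selfed offspring. The function name parse_plant_id and the returned keys are made up for this example, and real IDs in the file may include variants this sketch does not handle.

```python
# Species letters as given in the README; "BF" populations are F1 hybrids.
SPECIES = {
    "B": "M. brevipes",
    "F": "M. fremontii",
    "J": "M. johnstonii",
    "S": "M. 'Sespe Creek'",
}

def parse_plant_id(plant_id):
    """Split an ID such as 'S25_1_2-X-6' into its components (hypothetical helper)."""
    # Everything after each '-X-' is the offspring number from a self-fertilization.
    base, *selfed = plant_id.split("-X-")
    parts = base.split("_")
    population = parts[0]
    if population.startswith("BF"):
        group = "BxF"
    else:
        group = SPECIES.get(population[0])
    return {
        "population": population,
        "family": parts[1] if len(parts) > 1 else None,
        "individual": parts[2] if len(parts) > 2 else None,
        "selfed_offspring": selfed,
        "group": group,
    }

example = parse_plant_id("S25_1_2-X-6")
```

Here the README's own example, S25_1_2-X-6, resolves to population S25, family 1, individual 2, with 6 recorded as the selfed offspring number.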

'Seed_sizes_measured.csv' contains seed length and width data for a subset of seeds from a subset of crosses. For majority 'viable' crosses, only viable seeds were measured; for majority 'inviable' crosses, only inviable seeds were measured. These data therefore represent a comparison of inviable hybrid seeds to their viable parental counterparts. For fruits with many seeds, a random subset of seeds from the fruit was measured. Measurements were made by photographing seeds under a dissecting scope and measuring lengths in the software ImageJ relative to a standard length. Length is the distance between the two furthest points of the seed (in mm), while width is the largest distance perpendicular to the length measurement (in mm).

Variables included in 'Seed_sizes_measured.csv':

  • Maternal: The ID of the maternal (seed) parent of the measured seeds. See description of 'Crosses_seed_counts.csv' for formatting details. 
  • Paternal: The ID of the paternal (pollen) parent of the measured seeds. The format is the same as for Maternal.
  • Mat_family: The population or hybrid line that the maternal parent of the cross came from (see Maternal for details).
  • Pat_family: The population or hybrid line that the paternal parent of the cross came from (see Maternal for details).
  • Mat_species: The species of the maternal parent of the cross, abbreviated. BREV=M. brevipes, FREM=M. fremontii, JOHN=M. johnstonii, SESP=M. 'Sespe Creek'
  • Pat_species: The species of the paternal parent of the cross, abbreviated (same format as Mat_species). 
  • Cross_type: categorization of the type and direction of the cross, indicating maternal and paternal species. Format is Maternal X Paternal.
  • Fruit_ID: identifier of the fruit within a particular cross type. Individual seeds with the same fruit ID and the same Cross type are from the same fruit. 
  • Seed_ID: identifies the seed within a particular fruit and cross type. 
  • Length: distance in mm between the two furthest points of the seed, measured manually in ImageJ from a microscopy photograph.
  • Width: largest distance (in mm) across the seed perpendicular to the length measurement, measured manually in ImageJ from a microscopy photograph. 

'Pollen_counts.csv' contains counts of viable and inviable pollen after staining with aniline blue. Pollen was counted using a hemocytometer under a dissecting scope. Squares_counted refers to the number of 1mm squares of the hemocytometer that were counted; each square represents a volume of 0.1mm^3 = 0.1uL out of a total sample volume of 50uL for each flower. All anthers from a single flower were placed in a 50uL sample.

Variables included in 'Pollen_counts.csv':

  • Flower: unique identifier of the flower from which pollen was collected. If only a single flower was collected from an individual, this is just the individual ID. If multiple flowers were collected from one individual, _flowerX or _fX is used to distinguish each flower, where X is a unique number. 
  • Group: The species or hybrid type of the individual from which pollen was collected. BxF indicates an F1 hybrid between M. brevipes and M. fremontii. 
  • Population_Line: The population or hybrid line of the individual that pollen was collected from. BF2, BF3, and BF5 represent independent F1 hybrid crosses from which multiple seeds were derived. Details on the parentage of these crosses are in the associated article supplemental materials. 
  • Individual: The individual ID of the plant that pollen was collected from. Format is as in Maternal in 'Crosses_seed_counts.csv' above. 
  • Squares: the number of 1mmx1mmx0.1mm grid squares on a hemocytometer that were counted. Entire squares were always counted.
  • n_viable: the number of blue-stained (i.e. viable) pollen grains counted within the hemocytometer squares. 
  • n_inviable: the number of unstained (i.e. inviable) pollen grains counted within the hemocytometer squares. 
  • Date_counted: the date that pollen was counted using the hemocytometer.
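
The hemocytometer volumes given above allow raw counts to be scaled to a per-flower estimate. The sketch below works through that arithmetic, using 0.1 uL per square and 50 uL per flower as stated in this README. The function name and the example counts are hypothetical, and the paper's own analysis may summarize pollen differently.

```python
SAMPLE_VOLUME_UL = 50.0  # all anthers from one flower were suspended in 50 uL
SQUARE_VOLUME_UL = 0.1   # one 1 mm x 1 mm x 0.1 mm hemocytometer square

def pollen_summary(squares, n_viable, n_inviable):
    """Return the proportion of viable pollen and an estimated grain count per flower."""
    counted = n_viable + n_inviable
    volume_counted_ul = squares * SQUARE_VOLUME_UL
    grains_per_ul = counted / volume_counted_ul
    return {
        # Undefined when no pollen grains were seen at all.
        "prop_viable": n_viable / counted if counted else None,
        "grains_per_flower": grains_per_ul * SAMPLE_VOLUME_UL,
    }

# Hypothetical counts: 30 stained and 10 unstained grains across 4 squares.
example = pollen_summary(squares=4, n_viable=30, n_inviable=10)
```

With these made-up counts, 40 grains in 0.4 uL gives 100 grains per uL, or about 5000 grains per flower, of which 75% stained as viable.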

'Eunanus_allsamples.complete.vcf.gz' is a variant call file containing SNP calls for 33 Eunanus section samples plus 5 outgroup samples in section Diplacus. The file contains both invariant sites and biallelic SNP variants. Sites have been filtered for various quality metrics, and to retain only sites for which at least 80% of samples have a called genotype. The file is in gzipped .vcf format and can be viewed using the bcftools utilities (https://samtools.github.io/bcftools/) or other open-source tools for reading vcf files.
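
For readers who prefer not to install bcftools, the file can also be scanned with standard-library Python. The sketch below reports, for each site, whether it is variant and what fraction of samples have a called genotype (the quantity behind the 80% filter described above). It assumes invariant sites carry '.' in the ALT column and that missing genotypes contain '.', which is common but should be checked against the real file. The toy records and sample names are invented for illustration.

```python
import gzip
import io

def open_vcf(path):
    """Open a gzipped VCF as text, e.g. 'Eunanus_allsamples.complete.vcf.gz'."""
    return gzip.open(path, "rt")

def site_summaries(handle):
    """Yield (chrom, pos, is_variant, call_rate) for each record in a VCF text stream."""
    for line in handle:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        gt_index = fields[8].split(":").index("GT")
        genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
        # Assumption: a genotype containing '.' is uncalled (e.g. './.').
        called = sum("." not in gt for gt in genotypes)
        # Assumption: ALT of '.' marks an invariant site.
        yield chrom, pos, alt != ".", called / len(genotypes)

# Toy VCF with five hypothetical samples: one invariant site and one SNP.
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
          "s1", "s2", "s3", "s4", "s5"]
records = [
    ["chr1", "100", ".", "A", ".", ".", "PASS", ".", "GT",
     "0/0", "0/0", "0/0", "0/0", "0/0"],
    ["chr1", "101", ".", "C", "T", ".", "PASS", ".", "GT",
     "0/0", "0/1", "1/1", "./.", "0/0"],
]
toy_vcf = "##fileformat=VCFv4.2\n" + "\n".join(
    "\t".join(row) for row in [header] + records
) + "\n"

sites = list(site_summaries(io.StringIO(toy_vcf)))
```

On the real file, passing the handle returned by open_vcf to site_summaries streams through the records without loading the whole file into memory.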

'Eunanus_allsamples.SNPs.RAxML.tre' is a phylogenetic tree file in newick (text) format, giving the inferred relationships between samples. The tree was produced using the maximum-likelihood program RAxML (https://raxml-ng.vital-it.ch/#/) with model GTR+Gamma and an ascertainment bias correction. The dataset was an alignment constructed from the variant sites in 'Eunanus_allsamples.complete.vcf.gz', with heterozygous sites randomly called to one or the other allele. Bootstrap proportions (from 1000 rapid bootstraps in RAxML) are provided as branch labels.
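
The step of randomly calling heterozygous sites to one allele can be pictured as follows. This is a minimal sketch of the general idea, not the pipeline used to build the alignment: the function name, the use of 'N' for missing genotypes, and the restriction to biallelic diploid genotypes are all assumptions for the example.

```python
import random

def haploid_call(ref, alt, genotype, rng):
    """Collapse a diploid biallelic genotype such as '0/1' to a single base.

    Homozygous genotypes return their allele; heterozygous genotypes return
    REF or ALT with equal probability; missing genotypes return 'N'.
    """
    alleles = [ref, alt]
    indices = genotype.replace("|", "/").split("/")
    if "." in indices:
        return "N"
    # For a homozygote both entries are identical, so the choice is deterministic.
    return alleles[int(rng.choice(indices))]

rng = random.Random(1)  # fixed seed so the example is repeatable
calls = [haploid_call("A", "G", gt, rng) for gt in ["0/0", "0/1", "1/1", "./."]]
```

Seeding the random number generator keeps the resulting alignment reproducible, which matters if the tree is to be rebuilt from the same variant calls.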

'Eunanus_allsamples.SNPs.NJ.tre' is a phylogenetic tree file in newick (text) format, constructed using a neighbor-joining approach with the R package phangorn (https://cran.r-project.org/web/packages/phangorn/index.html). The dataset was the same alignment as the above RAxML tree. Bootstrap proportions (from 1000 bootstrap replicates) are included in the tree.

'Eunanus_allsamples.genes.ASTRAL.tre' is a phylogenetic tree file in newick (text) format, constructed using ASTRAL from a set of gene trees. Each gene tree was constructed using RAxML. The tree is provided in ASTRAL's full annotation format, with quartet and posterior probability information embedded in a branch label string.
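
The annotation strings embedded in this tree can be unpacked into numeric fields for plotting or filtering. The sketch below assumes each branch label is a bracketed list of semicolon-separated key=value pairs (for example quartet support values q1 to q3 and posterior probabilities pp1 to pp3). The example label is made up, and the exact set of keys should be confirmed against the tree file itself.

```python
import re

# Matches pairs such as 'q1=0.90' or 'EN=50.0' inside an annotation string.
PAIR = re.compile(r"([A-Za-z]+\d*)=([-+0-9.eE]+)")

def parse_annotation(label):
    """Convert an annotation string into a dict mapping each key to a float."""
    return {key: float(value) for key, value in PAIR.findall(label)}

# Hypothetical branch label, not copied from the real tree.
label = "[q1=0.90;q2=0.06;q3=0.04;f1=45.0;f2=3.0;f3=2.0;pp1=0.99;pp2=0.005;pp3=0.005;QC=50;EN=50.0]"
parsed = parse_annotation(label)
```

Tree-reading libraries differ in how they expose such labels, so extracting the label text from the newick string is left to whichever parser is in use.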

Sharing/Access information

These data are associated with the following publication:

Farnitano, M.C. and Sweigart, A.L. Strong postmating reproductive isolation in Mimulus section Eunanus. Journal of Evolutionary Biology, Volume 36, Issue 10, 1 October 2023, Pages 1393–1410. https://doi.org/10.1111/jeb.14219

Corresponding Author:
Matthew C. Farnitano
mattfarnitano@gmail.com

An earlier version of this manuscript was preprinted at: https://doi.org/10.1101/2022.12.21.521469

Usage notes

Crossing data files are provided in .csv format. Variant calls are provided in .vcf.gz (gzipped variant call format), which can be opened and processed using bcftools (https://samtools.github.io/bcftools/) or other open-source software. Phylogenetic tree files are provided in newick tree format, which can be opened as a text file or in various tree visualization programs. 

Funding

National Science Foundation, Award: DEB1856180

Office of the Director, Award: 5T32GM007103