Genome divergence during speciation is a dynamic process that is affected by various factors, including the genetic architecture of barriers to gene flow. Herein we quantitatively describe aspects of the genetic architecture of two sets of traits, male genitalic morphology and oviposition preference, that putatively function as barriers to gene flow between the butterfly species Lycaeides idas and L. melissa. Our analyses are based on unmapped DNA sequence data and a recently developed Bayesian regression approach that includes variable selection and explicit parameters for the genetic architecture of traits. A modest number of nucleotide polymorphisms explained a small to large proportion of the variation in each trait, and average genetic variant effects were non-negligible. Several genetic regions were associated with variation in multiple traits or with trait variation both within and among populations. In some instances, genetic regions associated with trait variation also exhibited exceptional genetic differentiation between species or exceptional introgression in hybrids. These results are consistent with the hypothesis that divergent selection on male genitalia has contributed to heterogeneous genetic differentiation, and that both sets of traits affect fitness in hybrids. Although these results are encouraging, we highlight several difficulties related to understanding the genetics of speciation.
oviposition
This file contains the oviposition preference data. The file includes the following fields: population name (pop), individual number (indnum), individual id (indid), sex, number of eggs laid on Astragalus miser (nast), and the number of eggs laid on Medicago sativa (nmed).
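As an illustration of how the egg counts might be summarized, the sketch below computes the proportion of eggs each individual laid on Astragalus miser. It is not the analysis used in the manuscript. The field names indid, nast, and nmed come from the description above; the comma-delimited layout with a header row, and the function names, are assumptions.

```python
import csv


def astragalus_proportions(rows):
    """Map each individual id to nast / (nast + nmed).

    `rows` is an iterable of dicts with the keys 'indid', 'nast', and 'nmed'.
    Individuals that laid no eggs on either plant get None.
    """
    proportions = {}
    for row in rows:
        nast = int(row["nast"])
        nmed = int(row["nmed"])
        total = nast + nmed
        proportions[row["indid"]] = nast / total if total > 0 else None
    return proportions


def read_oviposition(path, delimiter=","):
    """Read the oviposition file into a list of dicts.

    Assumption: the file is delimited text whose first row holds field names.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

Individuals with zero eggs in total are reported as None rather than 0 so that they are not mistaken for a strong preference for Medicago sativa.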
morphology
This file contains the male genitalic morphology measurements. The file includes fields for species, population (Pop), individual id (Ind), whether the left or right side of the genitalia was measured (Meas), and each of the nine measurements. These measurements are described in the manuscript.
afsource.tar
This file contains source code for software that implements the allele frequency model described in the manuscript. Please e-mail zgompert@gmail.com for additional details; full documentation will likely be released online in the future.
1. The software depends on the GNU Scientific Library (GSL). It can be compiled with the g++ compiler as follows (this assumes the GSL was installed in a standard location):
g++ -o af main.C func.C -lm -lgsl -lgslcblas
2. You can view the command line arguments to the software by typing the name of the compiled binary. The infile format is similar to the infile format for bgc.
3. Three output files are produced. The first (gprob) contains the posterior probabilities for the genotype at each locus for each individual. The second (afreq) contains a matrix of samples from the posterior probability distribution of the allele frequencies, with each row corresponding to an MCMC iteration and each column corresponding to a locus. The third (alpha) contains samples from the posterior probability distribution of the genetic diversity parameter.
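Because the afreq output is described as a matrix with one row per MCMC iteration and one column per locus, it can be summarized column by column. The sketch below is a minimal illustration, not part of the distributed software. It assumes the matrix is plain text with whitespace- or comma-separated values, and the function names and the burnin parameter are hypothetical.

```python
import math


def load_matrix(path):
    """Read a numeric matrix from text (assumed whitespace- or comma-separated)."""
    with open(path) as handle:
        return [
            [float(value) for value in line.replace(",", " ").split()]
            for line in handle
            if line.strip()
        ]


def summarize_afreq(samples, burnin=0, level=0.95):
    """Return (mean, lower, upper) for each locus (column).

    `samples` is a list of rows, one per MCMC iteration. The first `burnin`
    rows are discarded. The bounds form an equal-tail interval taken from the
    sorted samples by a simple nearest-rank rule.
    """
    kept = samples[burnin:]
    if not kept:
        raise ValueError("no MCMC samples left after discarding burn-in")
    tail = (1.0 - level) / 2.0
    summaries = []
    for locus in range(len(kept[0])):
        column = sorted(row[locus] for row in kept)
        n = len(column)
        mean = sum(column) / n
        lower = column[math.floor(tail * (n - 1))]
        upper = column[math.ceil((1.0 - tail) * (n - 1))]
        summaries.append((mean, lower, upper))
    return summaries
```

With few retained samples the nearest-rank rule simply returns the smallest and largest values, so the interval is only informative for reasonably long chains.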
allele counts
This gzip-compressed file contains the number of reads matching each of two alternative alleles for each individual and locus. Loci are delineated by a 'contig' line that gives the fragment number, the position of the SNP on the pseudo-reference genome, and the estimated average error probability for the SNP. The other lines begin with an individual id, which is followed by the number of reads of each allele.
alleleCounts.txt.gz
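The block structure described above (a 'contig' line followed by one line per individual) can be read with a small parser. The sketch below is a starting point under stated assumptions: that locus lines begin with the text "contig", that fields are separated by whitespace, and that each individual line holds an id followed by two integer counts. The exact layout of the 'contig' line is not assumed, so it is kept as an unparsed string. The function names are hypothetical.

```python
import gzip


def parse_allele_counts(lines):
    """Group per-individual allele read counts by locus.

    Returns a dict mapping each 'contig' header line (kept verbatim) to a
    dict of individual id -> (reads of allele 1, reads of allele 2).
    """
    loci = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.lower().startswith("contig"):
            # Assumed marker for the start of a new locus block.
            current = line
            loci[current] = {}
            continue
        if current is None:
            raise ValueError("count line found before any 'contig' line")
        fields = line.split()
        loci[current][fields[0]] = (int(fields[1]), int(fields[2]))
    return loci


def read_allele_counts(path):
    """Parse a gzip-compressed allele count file such as alleleCounts.txt.gz."""
    with gzip.open(path, "rt") as handle:
        return parse_allele_counts(handle)
```

Keeping the header line verbatim as the dictionary key avoids guessing at how the fragment number, SNP position, and error probability are laid out; those can be split out once the real file has been inspected.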