Coalescent theory has provided a basis for evolutionary biologists to build sophisticated methods for inferring population history from variation in genetic markers, but these methods leave out a major conceptual cornerstone of modern evolutionary theory: natural selection. I provide the first quantitative analysis of the effects of selection on genealogical patterns in a continuously distributed population in which the selective optimum for a trait linked to the marker varies gradually and continuously across the landscape. Simulations show that relatively weak selection for local adaptation can lead to strong phylogeographic structure, in which highly divergent genealogical groups (i.e. clades) are geographically localized and differentially adapted, and dramatically increased standing variation (e.g. coalescence time) compared to neutral expectations. This pattern becomes more likely with increasing population size and with decreasing dispersal distances, mutation rates, and mutation sizes. Under some conditions, the system alternates between a nearly-neutral behavior and a behavior in which highly divergent clades are locally adapted. Natural selection on markers commonly used in phylogeographic studies (such as mitochondrial DNA) presents a major challenge to the inference of biogeographic history but also provides exciting opportunities to study how selection affects both between- and within-species biodiversity.
Irwin_2012_AmNat_Dryad_submission
This file was prepared by Darren E. Irwin on March 3, 2012, and contains data from the following paper: Irwin, Darren E. 2012. Local adaptation along smooth ecological gradients causes phylogeographic breaks and phenotypic clustering. The American Naturalist, in press.

The format of the data is as follows:

For each set of simulations (i.e., run under the same parameter values), I first provide values for the following parameters:
  set_name (an arbitrary 1- or 2-letter code to identify that set)
  N (initial total population size across entire range)
  sigma_disp (the width of the dispersal curve)
  sigma_w (the width of the selection curve at each geographic location; the smaller this is, the stronger the selection)
  mu (the mutation rate, per individual per generation)
  sigma_mut (the width of the mutation size curve)

The parameters for the set are then followed by the genealogies and trait values for a series of independent simulations, as follows:

I provide the name of the genealogy (the set_name plus an integer indicating the particular simulation), followed by a long string containing the genealogy of 60 individuals from the population, in typical Newick or Nexus format (here, they are the same); this genealogy string can be copied and pasted into a program such as TreeView or some other tree-drawing program.

Individuals in the genealogy are identified by an integer as follows:
  Individuals 1-10: Closest to location x=0.0
  Individuals 11-20: Closest to location x=0.2
  Individuals 21-30: Closest to location x=0.4
  Individuals 31-40: Closest to location x=0.6
  Individuals 41-50: Closest to location x=0.8
  Individuals 51-60: Closest to location x=1.0

Branch lengths in the genealogy represent the number of generations.

Following the genealogy, I provide a table showing the trait value (y) of each individual, using the same integer identifiers of individuals as in the genealogy.

Data from different sets of simulations are separated by: --------------------
Data from different simulations within a set are separated by: --
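
The layout described above could be read with a short Python sketch along the following lines. It is only an illustration: it assumes that each separator sits on a line of its own and that leaf labels in the genealogy string are plain integers followed by a branch length. The function names (`split_records`, `sampling_location`, `leaf_ids`) are hypothetical and are not part of the data file or of any library.

```python
import re

# Separators as given in the format description.
SET_SEPARATOR = "-" * 20  # "--------------------" between sets of simulations
SIM_SEPARATOR = "--"      # "--" between simulations within a set


def split_records(text, separator):
    """Split text into chunks at lines consisting solely of `separator`.

    Assumption: each separator appears alone on its own line.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip() == separator:
            chunks.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    chunks.append("\n".join(current).strip())
    # Drop empty chunks (e.g. from a trailing separator).
    return [chunk for chunk in chunks if chunk]


def sampling_location(individual_id):
    """Map an individual identifier (1-60) to its nearest location x."""
    if not 1 <= individual_id <= 60:
        raise ValueError("individual_id must be between 1 and 60")
    # IDs 1-10 -> 0.0, 11-20 -> 0.2, ..., 51-60 -> 1.0
    return ((individual_id - 1) // 10) / 5


def leaf_ids(newick):
    """Return the integer leaf labels of a Newick string, in order.

    Leaves follow "(" or "," and precede ":", "," or ")"; branch lengths
    follow ":" and are therefore not matched.
    """
    return [int(m) for m in re.findall(r"[(,]\s*(\d+)(?=\s*[:,)])", newick)]


# Usage with a tiny made-up genealogy (not taken from the data file).
example_tree = "((1:3,11:3):2,(21:4,60:4):1);"
for ind in leaf_ids(example_tree):
    print(ind, sampling_location(ind))
```

Splitting first on the set separator and then on the simulation separator gives one chunk per simulation; the exact layout of the parameter values and the trait table within each chunk should be checked against the file itself before parsing further.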