A narrow window for geographic cline analysis using genomic data: effects of age, drift, and migration on error rates
Jofre, Gaston; Rosenthal, Gil (2021), A narrow window for geographic cline analysis using genomic data: effects of age, drift, and migration on error rates, Dryad, Dataset, https://doi.org/10.5061/dryad.0p2ngf1z9
The use of genomic and phenotypic data to scan for outliers is a mainstay for studies of hybridization and speciation. Geographic cline analysis of natural hybrid zones is widely used to identify putative signatures of selection by detecting deviations from baseline patterns of introgression. As with other outlier-based approaches, demographic histories can make neutral regions appear to be under selection and vice versa. In this study, we use a forward-time individual-based simulation approach to evaluate the robustness of geographic cline analysis under different evolutionary scenarios. We modeled multiple stepping-stone hybrid zones with distinct ages, deme sizes, and migration rates, evolving under different types of selection. We found that drift distorts cline shapes and increases false positive rates for signatures of selection. This effect increases with hybrid zone age, particularly if migration between demes is low. Drift can also distort the signature of deleterious effects of hybridization, with genetic incompatibilities and particularly underdominance prone to spurious typing as adaptive introgression. Our results suggest that geographic clines are most useful for outlier analysis in young hybrid zones with large populations of hybrid individuals. Current approaches may overestimate adaptive introgression and underestimate selection against maladaptive genotypes.
Our simulated dataset consisted of 36 types of hybrid zones, simulated in Admix'em: 12 without selection, 12 with directional selection and underdominance, and 12 with BDM interactions. We calculated the average allele frequency per marker per deme and the average genome-wide hybrid index per deme. We then fitted geographic clines to all markers. A marker was considered an outlier if the 95% confidence interval of either its cline center or its cline width did not overlap the corresponding confidence interval of the genome-wide average cline. With this process, we categorized all markers that did not undergo selection in the simulations as false positives or true negatives. We categorized markers simulated under directional selection as false negatives or true positives. Finally, we categorized markers under underdominance or in a BDM incompatibility as false negatives, true positives, or spurious outliers.
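The outlier rule and the two-way categories described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used to produce the dataset: the names `Interval`, `ClineFit`, `is_outlier`, and `classify` are hypothetical, the numbers in the usage example are invented, and the confidence intervals are assumed to come from a separate cline-fitting step. The three-way split for underdominance and BDM markers (false negative, true positive, spurious outlier) needs an additional criterion that is not specified in this description, so it is left out.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A 95% confidence interval (hypothetical container)."""
    low: float
    high: float


class ClineFit(NamedTuple):
    """Confidence intervals for the center and width of a fitted cline."""
    center_ci: Interval
    width_ci: Interval


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two closed intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


def is_outlier(marker: ClineFit, genome: ClineFit) -> bool:
    """A marker is an outlier if either its center CI or its width CI
    fails to overlap the matching CI of the genome-wide average cline."""
    return (not overlaps(marker.center_ci, genome.center_ci)
            or not overlaps(marker.width_ci, genome.width_ci))


def classify(marker: ClineFit, genome: ClineFit, under_selection: bool) -> str:
    """Two-way categories for neutral and directionally selected markers."""
    outlier = is_outlier(marker, genome)
    if under_selection:
        return "true positive" if outlier else "false negative"
    return "false positive" if outlier else "true negative"


# Invented example values: deme positions for centers, deme counts for widths.
genome_wide = ClineFit(Interval(4.5, 5.5), Interval(3.0, 4.0))
neutral_marker = ClineFit(Interval(6.0, 7.0), Interval(3.2, 3.8))
selected_marker = ClineFit(Interval(4.8, 5.2), Interval(3.1, 3.9))

neutral_call = classify(neutral_marker, genome_wide, under_selection=False)
selected_call = classify(selected_marker, genome_wide, under_selection=True)
```

In this invented example the neutral marker has a center interval that lies entirely to the right of the genome-wide one, so it is counted as a false positive, while the selected marker overlaps the genome-wide cline on both parameters and is counted as a false negative.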
National Institutes of Health, Award: R01GM121750
National Science Foundation, Award: 1354172
National Science Foundation, Award: 1755327