Data from: Evaluating the use of ABBA-BABA statistics to locate introgressed loci
Data files
Sep 30, 2014 version (files: 39.89 MB)
- compare_f_estimators.r (18 KB)
- egglib_sliding_windows.py (27.42 KB)
- Figure_1.R (2.75 KB)
- Figure_4.R (7.32 KB)
- Figure_5.R (3.77 KB)
- Figure_S2.R (9.33 KB)
- Figure_S4.R (7 KB)
- Figures_3_S3.R (6.42 KB)
- generate_summary_statistics.R (11.89 KB)
- Heliconius_autosome_windows_10kb.csv (7.29 MB)
- Heliconius_autosome_windows_20kb.csv (3.77 MB)
- Heliconius_autosome_windows_50kb.csv (1.93 MB)
- Heliconius_autosome_windows_5kb.csv (17.05 MB)
- Heliconius_Zchromosome_windows_10kb.csv (251.58 KB)
- Heliconius_Zchromosome_windows_20kb.csv (131.12 KB)
- Heliconius_Zchromosome_windows_50kb.csv (61.63 KB)
- Heliconius_Zchromosome_windows_5kb.csv (567.23 KB)
- Hmel1-1_Zupdated_Zscafs.txt (996 B)
- model_files_win10000_s0.01_l5000_r5.alternate_models.dxy.summary.sg.tsv (3.58 MB)
- model_files_win10000_s0.01_l5000_r5.alternate_models.partition.summary.sg.tsv (262.81 KB)
- model_files_win10000_s0.01_l5000_r5.null_models.dxy.summary.sg.tsv (341.64 KB)
- model_files_win10000_s0.01_l5000_r5.null_models.partition.summary.sg.tsv (22.55 KB)
- model_files_win10000_s0.01_l5000_r50.alternate_models.dxy.summary.sg.tsv (3.48 MB)
- model_files_win10000_s0.01_l5000_r50.alternate_models.partition.summary.sg.tsv (264.28 KB)
- model_files_win10000_s0.01_l5000_r50.null_models.dxy.summary.sg.tsv (345.30 KB)
- model_files_win10000_s0.01_l5000_r50.null_models.partition.summary.sg.tsv (22.65 KB)
- Model_parameter_list.csv (14.73 KB)
- model_results_table.R (4.81 KB)
- model_results_table.txt (360.32 KB)
- README.txt (6.49 KB)
- run_model_combinations.py (5.97 KB)
- shared_ancestry_simulator.R (18.58 KB)
Abstract
Several methods have been proposed to test for introgression across genomes. One method tests for a genome-wide excess of shared derived alleles between taxa using Patterson's D statistic, but does not establish which loci show such an excess or whether the excess is due to introgression or ancestral population structure. Several recent studies have extended the use of D by applying the statistic to small genomic regions, rather than genome-wide. Here, we use simulations and whole genome data from Heliconius butterflies to investigate the behavior of D in small genomic regions. We find that D is unreliable in this situation as it gives inflated values when effective population size is low, causing D outliers to cluster in genomic regions of reduced diversity. As an alternative, we propose a related statistic f̂d, a modified version of a statistic originally developed to estimate the genome-wide fraction of admixture. f̂d is not subject to the same biases as D, and is better at identifying introgressed loci. Finally, we show that both D and f̂d outliers tend to cluster in regions of low absolute divergence (dXY), which can confound a recently proposed test for differentiating introgression from shared ancestral variation at individual loci.
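As an illustration of the two statistics named in the abstract, the following minimal sketch computes D and f̂d for one window from per-site derived allele frequencies. It is not taken from the scripts in this dataset. It assumes the commonly used frequency-based definitions for a four-taxon arrangement (((P1, P2), P3), O), in which the ABBA and BABA terms are products of allele frequencies and the f̂d denominator replaces P2 and P3 at each site with whichever of the two has the higher derived allele frequency. The function name `abba_baba_window` and its argument names are hypothetical.

```python
import math


def abba_baba_window(p1, p2, p3, p_out):
    """Return (D, f_d) for one window.

    Each argument is a sequence of derived allele frequencies (0..1),
    one value per site, for populations P1, P2, P3 and the outgroup O.
    Formulas are the assumed standard frequency-based definitions,
    not code copied from the deposited scripts.
    """
    abba_sum = baba_sum = 0.0          # terms for D and the f_d numerator
    abba_d_sum = baba_d_sum = 0.0      # terms for the f_d denominator

    for a, b, c, o in zip(p1, p2, p3, p_out):
        # Observed ABBA and BABA pattern weights at this site.
        abba_sum += (1 - a) * b * c * (1 - o)
        baba_sum += a * (1 - b) * c * (1 - o)

        # "Donor" frequency: the higher of P2 and P3 at this site.
        d = max(b, c)
        abba_d_sum += (1 - a) * d * d * (1 - o)
        baba_d_sum += a * (1 - d) * d * (1 - o)

    numerator = abba_sum - baba_sum
    d_denominator = abba_sum + baba_sum
    f_denominator = abba_d_sum - baba_d_sum

    # Windows with no informative sites give an undefined statistic.
    d_stat = numerator / d_denominator if d_denominator else math.nan
    f_d = numerator / f_denominator if f_denominator else math.nan
    return d_stat, f_d


# Toy window with two sites: only the first carries an ABBA pattern.
d_stat, f_d = abba_baba_window(
    p1=[0.0, 0.0],
    p2=[1.0, 0.0],
    p3=[1.0, 1.0],
    p_out=[0.0, 0.0],
)
print(d_stat, f_d)
```

In the toy window, D reaches its maximum because the single informative site is entirely ABBA, while f̂d is lower because its denominator also counts the second site, where P3 carries the derived allele but P2 does not. This mirrors the abstract's point that D can be extreme when few sites are informative.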