Correlated and geographically predictable Neanderthal and Denisovan legacies are difficult to reconcile with a simple model based on inter-breeding

Cite this dataset

Amos, William (2021). Correlated and geographically predictable Neanderthal and Denisovan legacies are difficult to reconcile with a simple model based on inter-breeding [Dataset]. Dryad.


Although the presence of archaic hominin legacies in humans is taken for granted, little attention has been given to how the data fit with how humans colonised the world. Here I show that Neanderthal and Denisovan legacies are strongly correlated and that, like heterozygosity, legacy size is predicted by distance from Africa. Simulations confirm that, once created, legacy size is extremely stable: it may be reduced through admixture with lower-legacy populations but cannot increase detectably through neutral drift. Consequently, populations carrying the highest legacies must also be those whose ancestors inter-bred most with archaics. However, the populations with the highest legacies are globally scattered and are unified, not by having origins within the known Neanderthal range, but by living in locations that lie furthest from Africa. Furthermore, the Simons Genome Diversity Project data reveal two very distinct correlations between Neanderthal and Denisovan legacies: one that starts in North Africa and increases west to east across Eurasia and into some parts of Oceania, and a second, much steeper trend that starts in Africa, peaks with the San and Ju/’hoansi and, if extrapolated, predicts the large inferred legacies of both archaics found in Oceania / Australia. These trends are difficult to reconcile with classical models of how introgression occurred but may fit a speculative model in which the loss of diversity that occurred when humans moved further from Africa created a gradient in heterozygosity that in turn progressively reduced mutation rate, such that populations furthest from Africa have diverged less from our common ancestor and hence from the archaics. The two distinct trends could be interpreted in terms of two ‘out of Africa’ events, an early one ending in Oceania and Australia and a later one that colonised Eurasia and the Americas.


These data files contain the processed outputs from individual chromosome VCF files for the Denisovan genome, downloaded from  For analyses presented in this paper I focused only on homozygous archaic bases, accepting only sites with 10 or more reads, fewer than 250 reads, and where >80% of reads were of one particular base. Each file contains, in order, the following fields: location in base pairs, human reference base, and read counts for each of the bases A, C, G and T.
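The filter described above can be sketched in a few lines of Python. This is only an illustration, not the script used for the paper: it assumes that fields are whitespace-delimited, that "reads" means the sum of the four base counts at a site, and that lines without exactly six fields can be skipped. The function names (`parse_line`, `call_homozygous_base`, `filter_sites`) are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Sequence, Tuple

BASES = "ACGT"  # order of the count columns, as listed in the file description


def call_homozygous_base(
    counts: Sequence[int],
    min_reads: int = 10,       # "10 or more reads"
    max_reads: int = 250,      # "fewer than 250 reads"
    min_fraction: float = 0.8  # ">80% of reads were of one particular base"
) -> Optional[str]:
    """Return the accepted archaic base, or None if the site fails the filter.

    Assumption: read depth is the sum of the A, C, G and T counts.
    """
    total = sum(counts)
    if total < min_reads or total >= max_reads:
        return None
    top = max(range(len(BASES)), key=lambda i: counts[i])
    if counts[top] > min_fraction * total:
        return BASES[top]
    return None


def parse_line(line: str) -> Optional[Tuple[int, str, Tuple[int, ...]]]:
    """Parse 'position ref A C G T'; assumes whitespace-delimited fields."""
    fields = line.split()
    if len(fields) != 6:
        return None  # skip blank or unexpected lines (assumption)
    position = int(fields[0])
    reference = fields[1].upper()
    counts = tuple(int(value) for value in fields[2:6])
    return position, reference, counts


def filter_sites(lines: Iterable[str]) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, human reference base, archaic base) for accepted sites."""
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        position, reference, counts = record
        base = call_homozygous_base(counts)
        if base is not None:
            yield position, reference, base
```

Passing an open file handle for one unzipped chromosome file to `filter_sites` would then yield only the sites that meet all three criteria. If the files turn out to be tab- or comma-delimited, only `parse_line` should need changing.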

Usage notes

Once unzipped, these files are in the correct format for use by the C++ script used to calculate nd10 values for the 1000 Genomes populations. For companion files for chimpanzee alignments and equivalent files for the Neanderthal genomes, please go to the Dryad entry for "Signals interpreted as archaic introgression are driven primarily by accelerated evolution in Africa".