Dryad

Data from: Geographic distribution and adaptive significance of genomic structural variants: an anthropological genetics perspective

Data files

Dec 30, 2015 version files 4.67 KB

Abstract

Anthropological geneticists have successfully used single nucleotide and short tandem repeat variations across human genomes to reconstruct human history. These markers have also been used extensively to identify adaptive and phenotypic variation. The recent advent of high-throughput genomic technologies revealed an overlooked type of genomic variation, namely structural variants (SVs). In fact, some SVs may contribute to human adaptation in substantial and previously unexplored ways. SVs include deletions, insertions, duplications, inversions and translocations of genomic segments that vary among individuals from the same species. SVs are much less numerous than single nucleotide variants, but account for at least seven times more variable base pairs than do single nucleotide variants when two human genomes are compared. Moreover, recent studies have shown that SVs have higher mutation rates than single nucleotide variants when the affected base pairs are considered, especially in certain parts of the genome. The null hypothesis for the evolution of SVs, like single nucleotide variants, is neutrality. Hence, drift is the primary force that shapes the current allelic distribution of most SVs. However, due to their size, a larger proportion of SVs appear to evolve under non-neutral forces (mostly purifying selection) than single nucleotide variants. In fact, as exemplified by several groundbreaking studies, SVs contribute to anthropologically relevant phenotypic variation and local adaptation among humans. In this review, we argue that with the advent of affordable genomic technologies, anthropological scrutiny of genomic structural variation emerges as a fertile area of inquiry to better understand human phenotypic variation. To motivate potential studies, we discuss scenarios through which SVs affect phenotypic variation among humans within an anthropological context.
We further provide a methodological workflow in which we analyzed 1000 Genomes deletion variants and identified 16 exonic deletions that are specific to the African continent. We analyzed two of these deletion variants, which affect the keratin-associated protein (KAP) cluster, in a locus-specific manner. Our analysis revealed that these deletions may indeed affect phenotype and likely evolved under geography-specific positive selection. We outline all the major software and datasets for these analyses and also provide the basic R and Perl code we used for this example workflow analysis. Overall, we hope that this review will encourage and facilitate incorporation of genomic structural variation in anthropological research programs.
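The filtering step described above (exonic deletions found only in African populations) can be illustrated with a minimal sketch. This is not the authors' R or Perl code, and Python is used here only for illustration. The record layout, the function names, the toy coordinates and frequencies, and the rule "non-zero frequency in AFR and zero everywhere else" are all assumptions. The exact criteria used in the workflow are not stated in the abstract.

```python
from typing import Dict, List, NamedTuple


class Deletion(NamedTuple):
    """Hypothetical deletion record; coordinates are half-open [start, end)."""
    chrom: str
    start: int
    end: int
    freqs: Dict[str, float]  # continental group code -> allele frequency


class Exon(NamedTuple):
    """Hypothetical exon interval; coordinates are half-open [start, end)."""
    chrom: str
    start: int
    end: int


def overlaps(d: Deletion, e: Exon) -> bool:
    # Two half-open intervals on the same chromosome overlap
    # when each one starts before the other ends.
    return d.chrom == e.chrom and d.start < e.end and e.start < d.end


def focal_specific_exonic(deletions: List[Deletion],
                          exons: List[Exon],
                          focal: str = "AFR") -> List[Deletion]:
    """Keep deletions seen only in the focal group that overlap an exon.

    Assumed rule: frequency > 0 in the focal group and exactly 0 in
    every other group listed for that deletion.
    """
    hits = []
    for d in deletions:
        if d.freqs.get(focal, 0.0) <= 0.0:
            continue  # absent from the focal group
        if any(f > 0.0 for pop, f in d.freqs.items() if pop != focal):
            continue  # also present outside the focal group
        if any(overlaps(d, e) for e in exons):
            hits.append(d)
    return hits


# Toy data, invented for illustration only (not real variants).
toy_exons = [Exon("chr21", 150, 300)]
toy_deletions = [
    Deletion("chr21", 100, 200, {"AFR": 0.12, "EUR": 0.0, "EAS": 0.0}),
    Deletion("chr21", 1000, 1100, {"AFR": 0.20, "EUR": 0.05, "EAS": 0.0}),
    Deletion("chr21", 5000, 5100, {"AFR": 0.08, "EUR": 0.0, "EAS": 0.0}),
    Deletion("chr21", 160, 180, {"AFR": 0.0, "EUR": 0.0, "EAS": 0.0}),
]

result = focal_specific_exonic(toy_deletions, toy_exons)
```

With the toy input, only the first record passes both filters. The second is also present in EUR, the third does not overlap an exon, and the fourth is absent everywhere. A real analysis would read variant calls and gene annotations from files and would likely use a frequency threshold rather than strict absence.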