Data from: Genomics reveals the role of admixture in the evolution of structure among sperm whale populations within the Mediterranean Sea

Cite this dataset

Violi, Biagio et al. (2023). Data from: Genomics reveals the role of admixture in the evolution of structure among sperm whale populations within the Mediterranean Sea [Dataset]. Dryad.


In oceanic ecosystems, the nature of barriers to gene flow and the processes by which populations may become isolated differ from those in the terrestrial environment and are less well understood. In this study, we investigate a highly mobile species (the sperm whale, Physeter macrocephalus) that is genetically differentiated between an open North Atlantic population and the populations in the Mediterranean Sea. We apply high-resolution single nucleotide polymorphism (SNP) analysis to study the nature of barriers to gene flow in this system, comparing gene flow across the putative boundary into the Mediterranean (Strait of Gibraltar and Alboran Sea region) with novel analyses of structuring among sperm whale populations within the Mediterranean basin. Our data support a recent founding of the Mediterranean population, around the time of the last glacial maximum, and show concerted historical demographic profiles in both the Atlantic and the Mediterranean. In each region, there is evidence for a population decline around the time of the founder event, more extreme within the Mediterranean Sea, where effective population size is substantially lower. While differentiation is strongest at the Atlantic/Mediterranean boundary, there is also significant differentiation between the Eastern and Western basins of the Mediterranean Sea. We propose, however, that the mechanisms are different. While post-founding gene flow was reduced between the Mediterranean and Atlantic populations, within the Mediterranean an important factor differentiating the basins is likely a greater degree of admixture between the Western basin and the North Atlantic.


Tissue samples were obtained during various research projects between 1999 and 2018. DNA was extracted both by kit (OMEGA BIOTEK and MN MACHEREY-NAGEL) following the manufacturer's protocol, and by the phenol chloroform method (after Hoelzel, 1998). Genomic DNA concentration was quantified using the Qubit High Sensitivity kit (Thermo Fisher Scientific). We applied the ddRADseq methodology (Peterson et al., 2012).

Hoelzel, A. R. (1998). Molecular analysis of populations: A practical approach. Oxford, UK: Oxford University Press.

Peterson, B. K., Weber, J. N., Kay, E. H., Fisher, H. S., & Hoekstra, H. E. (2012). Double digest RADseq: An inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE, 7(5), e37135.

Usage notes

The two files (.ped and .map) can be used within the R package SambaR (de Jong et al., 2021) for SNP data management and analyses.


Once the dataset is imported, SNPs have to be filtered using the command filterdata(indmiss=0.5,snpmiss=0.05) to generate dataset A and filterdata(indmiss=0.25,snpmiss=0.05) for dataset B.
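
To make the two thresholds concrete, the following is a minimal Python sketch of missingness filtering on PLINK-style .ped rows. It is not SambaR's implementation. It assumes that indmiss and snpmiss are maximum allowed proportions of missing genotypes per individual and per SNP, that the thresholds are inclusive, and that individuals are filtered before SNPs. The function names and the toy data are hypothetical.

```python
from typing import List, Tuple

MISSING = "0"  # PLINK .ped convention for a missing allele


def parse_ped_line(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (individual ID, list of allele pairs) for one .ped row."""
    fields = line.split()
    sample_id = fields[1]   # column 2 holds the individual ID
    alleles = fields[6:]    # the first six columns are pedigree/phenotype metadata
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return sample_id, genotypes


def is_missing(genotype: Tuple[str, str]) -> bool:
    """A genotype counts as missing if either allele is coded as missing."""
    return MISSING in genotype


def filter_by_missingness(
    ped_lines: List[str], indmiss: float, snpmiss: float
) -> Tuple[List[str], List[int]]:
    """Return (retained individual IDs, retained SNP indices).

    Assumption: individuals are filtered first, and SNP missingness is
    then computed among the retained individuals only.
    """
    samples = [parse_ped_line(line) for line in ped_lines if line.strip()]
    if not samples:
        return [], []
    n_snps = len(samples[0][1])

    # Step 1: drop individuals whose proportion of missing genotypes exceeds indmiss.
    kept = [
        (sid, gts)
        for sid, gts in samples
        if sum(is_missing(g) for g in gts) / n_snps <= indmiss
    ]
    if not kept:
        return [], []

    # Step 2: drop SNPs whose proportion of missing genotypes exceeds snpmiss.
    kept_snps = [
        j
        for j in range(n_snps)
        if sum(is_missing(gts[j]) for _, gts in kept) / len(kept) <= snpmiss
    ]
    return [sid for sid, _ in kept], kept_snps


# Toy data (hypothetical): four individuals genotyped at four SNPs.
DEMO_PED = [
    "F1 ind1 0 0 0 -9 A A C C G G T T",
    "F1 ind2 0 0 0 -9 A A 0 0 0 0 T T",
    "F1 ind3 0 0 0 -9 A A 0 0 0 0 0 0",
    "F1 ind4 0 0 0 -9 A G C C G G T T",
]

print(filter_by_missingness(DEMO_PED, indmiss=0.5, snpmiss=0.05))
print(filter_by_missingness(DEMO_PED, indmiss=0.25, snpmiss=0.05))
```

In this toy example the looser individual threshold (as used for dataset A) retains more individuals but fewer SNPs, whereas the stricter one (dataset B) retains fewer individuals and more SNPs.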


Input files for BayesAss, TreeMix, ADMIXTURE, and FST estimation in Arlequin can be generated using the command exportsambarfiles().

PCoA, LEA, and f4 statistics can be run using the command findstructure() in SambaR.

de Jong, M. J., de Jong, J. F., Hoelzel, A. R., & Janke, A. (2021). SambaR: An R package for fast, easy, and reproducible population-genetic analyses of biallelic SNP data sets. Molecular Ecology Resources, 21(4), 1369-1379.


University of Genoa