
Quantifying and reducing cross-contamination in single- and multiplex hybridization capture of ancient DNA

Cite this dataset

Zavala, Elena et al. (2022). Quantifying and reducing cross-contamination in single- and multiplex hybridization capture of ancient DNA [Dataset]. Dryad.


The use of hybridization capture has enabled a massive upscaling in sample sizes for ancient DNA studies, allowing the analysis of hundreds of skeletal remains (Mathieson et al., 2015; Narasimhan et al., 2019) or sediments (Vernot et al., 2021; Wang et al., 2021; Zavala et al., 2021) in single studies. Yet demands in throughput continue to grow, and hybridization capture has become a limiting step in sample preparation due to the large consumption of reagents, consumables and time. Here we explore the possibility of improving the economics of sample preparation via multiplex capture, i.e. the hybridization capture of pools of double-indexed ancient DNA libraries. We demonstrate that this strategy is feasible for small genomic targets, such as mitochondrial DNA, if the annealing temperature is increased and PCR cycles are limited in post-capture amplification to avoid index swapping by jumping PCR, which manifests as cross-contamination in resulting sequence data. We also show that the re-amplification of double-indexed libraries to PCR plateau before or after hybridization capture can sporadically lead to small, but detectable cross-contamination even if libraries are amplified in separate reactions. We provide protocols for both manual capture and automated capture in 384-well format that are compatible with single- and multiplex capture and effectively suppress cross-contamination and artefact formation. Last, we provide a simple computational method for quantifying cross-contamination due to index swapping in double-indexed libraries, which we recommend using for routine quality checks in studies that are sensitive to cross-contamination. 


Each file in this dataset is an input file that was used in the paper "Quantifying and reducing cross-contamination in single- and multiplex hybridization capture of ancient DNA" to quantify the amount of cross-contamination from index swapping present per library using the script at

Each file consists of six tab-separated columns (with a header), where each line gives the number of sequences assigned (column 1: "#seqs") to an index pair before any filtering has been performed. The sequences of the two indices are provided in columns 2 and 4, with their unique identifiers (used at MPI-EVA, Leipzig) provided in columns 3 and 5. The sixth column provides read group ("RG") information. If the index pair belongs to a library expected in the sequencing pool, the RG column contains a library ID. If an index pair was not expected to be present in the sequencing pool, the RG is noted as "unexpected". If only a single index is recognized, the RG is noted as "unknown".
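The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not the script used in the paper: it assumes only that the first column holds the sequence count and the sixth column holds the RG value, and it sums counts into three categories (library ID present, "unexpected", "unknown"). The function name, the header labels other than "#seqs" and "RG", and the example rows are all made up for illustration. The share of sequences in unexpected index pairs computed at the end is a crude summary only and should not be taken as the cross-contamination estimate described in the paper.

```python
import csv
import io
from collections import Counter


def summarize_index_pairs(handle):
    """Sum sequence counts per read-group category from one tab-separated file.

    Hypothetical helper: relies only on column 1 (count) and column 6 (RG).
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header line
    totals = Counter()
    for row in reader:
        if len(row) < 6:
            continue  # ignore blank or truncated lines
        n_seqs = int(row[0])
        rg = row[5].strip()
        if rg in ("unexpected", "unknown"):
            totals[rg] += n_seqs
        else:
            # any other RG value is treated as a library ID
            totals["expected"] += n_seqs
    return totals


# Made-up rows for illustration; only "#seqs" and "RG" are documented header names.
example = (
    "#seqs\tseq1\tid1\tseq2\tid2\tRG\n"
    "1000\tACGTACG\t101\tTTGCAAC\t201\tLibA\n"
    "500\tGGATCCA\t102\tCATGGTA\t202\tLibB\n"
    "5\tACGTACG\t101\tCATGGTA\t202\tunexpected\n"
    "20\tACGTACG\t101\t-\t-\tunknown\n"
)

totals = summarize_index_pairs(io.StringIO(example))

# Crude summary: fraction of all counted sequences that fall in unexpected index pairs.
unexpected_share = totals["unexpected"] / sum(totals.values())
print(dict(totals), unexpected_share)
```

To apply the sketch to a real file from this dataset, open the file in text mode and pass the file object to `summarize_index_pairs` in place of the `io.StringIO` wrapper.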

The file names indicate which parameters were being tested. The "Original" files are from the sequencing of the first and second capture rounds from each singleplex and multiplex capture using the automated capture protocol outlined in Fu et al., 2013, with adjustments described in Slon et al., 2017, implemented in 384-well format (Zavala et al., 2021). The "ShortPCRcycles" files follow the same capture protocol, but with the PCR amplification cycles limited to prevent any libraries from reaching PCR plateau. The "AnnealingTemp" files contain sequencing data from 92 pooled un-captured libraries that were amplified to plateau using primers with different annealing temperatures. The "New" files contain sequencing data generated with the finalized, updated manual and automated capture protocols in singleplex and multiplex format.

Fu, Q., Meyer, M., Gao, X., Stenzel, U., Burbano, H. A., Kelso, J., & Pääbo, S. (2013). DNA analysis of an early modern human from Tianyuan Cave, China. Proc Natl Acad Sci U S A, 110(6), 2223-2227. doi:10.1073/pnas.1221359110

Slon, V., Hopfe, C., Weiss, C. L., Mafessoni, F., de la Rasilla, M., Lalueza-Fox, C., . . . Meyer, M. (2017). Neandertal and Denisovan DNA from Pleistocene sediments. Science, 356(6338), 605-608. doi:10.1126/science.aam9695

Zavala, E. I., Jacobs, Z., Vernot, B., Shunkov, M. V., Kozlikin, M. B., Derevianko, A. P., . . . Meyer, M. (2021). Pleistocene sediment DNA reveals hominin and faunal turnovers at Denisova Cave. Nature, 595(7867), 399-403. doi:10.1038/s41586-021-03675-0

Usage notes

See Methods or ReadMe for details.