Data from: Non-specific amplification compromises environmental DNA metabarcoding with COI


1. Metabarcoding extra-organismal DNA from environmental samples is now a key technique in aquatic biomonitoring and ecosystem health assessment. However, choice of genetic marker and primer set is a critical consideration when designing experiments, especially when developing community standards and legislative frameworks. Mitochondrial cytochrome c oxidase subunit I (COI), the standard DNA barcode marker for animals, with its extensive reference library, taxonomic discriminatory power, and predictable sequence variation, is the natural choice for many metabarcoding applications such as the bulk sequencing of invertebrates. Yet the overall utility of COI for environmental sequencing of targeted taxonomic groups has not been fully scrutinised.

2. Here, using a case study of marine and freshwater fishes from the British Isles, we quantify the in silico performance of twelve mitochondrial primer pairs from COI, cytochrome b, 12S and 16S, in terms of reference library coverage, taxonomic discriminatory power, and primer universality. We subsequently test three COI primer pairs and one 12S pair in vitro for their specificity, reproducibility, and congruence with independent datasets derived from traditional survey methods at five estuarine and coastal sites on the English Channel and North Sea coasts.

3. Our results show that for aqueous extra-organismal DNA at low template concentrations, both metazoan and fish-targeted COI primers perform poorly in comparison to 12S, exhibiting low levels of reproducibility due to non-specific amplification of prokaryotic and non-target eukaryotic DNAs.

4. An ideal metabarcode would have an extensive reference library for which custom primer sets can be designed for either broad assessments of biodiversity or taxon-specific surveys. Unfortunately, low primer specificity hinders the use of COI, while the paucity of reference sequences is problematic for 12S. The latter, however, can be mitigated by expanding the concept of DNA barcodes to include whole mitochondrial genomes generated by genome-skimming existing tissue collections.

