
Metabarcoding for biodiversity inventory blind spots: A test case using the beetle fauna of an insular cloud forest

Cite this dataset

Arjona, Yurena et al. (2022). Metabarcoding for biodiversity inventory blind spots: A test case using the beetle fauna of an insular cloud forest [Dataset]. Dryad.


Soils harbour a rich arthropod fauna, but many species are still not formally described (Linnaean shortfall), and the distribution of those already described is poorly understood (Wallacean shortfall). Metabarcoding holds much promise to fill these gaps; however, nuclear copies of mitochondrial genes and other artefacts lead to taxonomic inflation, which compromises the reliability of biodiversity inventories. Here we explore the potential of a bioinformatic approach to jointly “denoise” and filter non-authentic mitochondrial sequences from metabarcode reads to obtain reliable soil beetle inventories and address open questions in soil biodiversity research, such as the scale of dispersal constraints in different soil layers. We sampled cloud forest arthropod communities from 49 sites in the Anaga peninsula of Tenerife (Canary Islands). We performed whole organism community DNA (wocDNA) metabarcoding and built a local reference database with COI barcode sequences of 310 species of Coleoptera for filtering reads and identifying metabarcoded species. This resulted in reliable haplotype data after considerably reducing nuclear mitochondrial copies and other artefacts. Comparing our results with previous beetle inventories, we found: (i) new species records, potentially representing undescribed species; (ii) new distribution records; and (iii) validated phylogeographic structure when compared with traditional sequencing approaches. Analyses also revealed evidence for higher dispersal constraint within deeper soil beetle communities compared to those closer to the surface. The combined power of barcoding and metabarcoding contributes to mitigating the important shortfalls associated with soil arthropod diversity data, and thus to addressing unresolved questions for this vast biodiversity fraction.


Extended local reference database: mtDNA COI barcode sequences used as a reference to identify metabarcoding reads and to filter putative non-authentic mitochondrial reads with metaMATE. These sequences were obtained from two sources: (1) Sanger sequencing of the COI barcode region of beetles collected at sites across the island of Tenerife; and (2) sequences from the NCBI nt database that matched metabarcoding reads.

Metabarcoding raw data: raw metabarcoding sequences of the COI barcode region from 98 libraries obtained by sampling the soil beetle community in the Anaga laurel forest of the island of Tenerife.

Metabarcoding community table: community table with the distribution of metabarcoding ASVs (amplicon sequence variants) in each library after denoising and metaMATE filtering.
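As an illustration of how a community table of this kind might be summarised, the following minimal Python sketch counts the ASVs detected in each library. It assumes a comma-separated layout with ASVs as rows, libraries as columns, and read counts as values; that layout, the toy data, and the names `asv_id`, `lib_01` to `lib_03`, and `asv_richness` are hypothetical and are not taken from the deposited file, which may be organised differently.

```python
import csv
import io

# Hypothetical toy table: rows are ASVs, columns are libraries, values are read counts.
# The layout and every name here are assumptions, not the published file format.
TOY_TABLE = """asv_id,lib_01,lib_02,lib_03
ASV_1,120,0,15
ASV_2,0,0,42
ASV_3,7,3,0
"""


def asv_richness(handle):
    """Return {library: number of ASVs with at least one read in that library}."""
    reader = csv.DictReader(handle)
    libraries = reader.fieldnames[1:]  # first column holds the ASV identifier
    richness = {lib: 0 for lib in libraries}
    for row in reader:
        for lib in libraries:
            if int(row[lib]) > 0:
                richness[lib] += 1
    return richness


richness = asv_richness(io.StringIO(TOY_TABLE))
print(richness)
```

To apply the same function to a real file, the `io.StringIO` object could be replaced with an open file handle, after adjusting the delimiter and orientation to match the actual table.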


Agencia Estatal de Investigación, Award: CGL2015-74178-JIN

European Commission, Award: 705639

Agencia Estatal de Investigación, Award: CGL2017-85718-P

Agencia Estatal de Investigación, Award: PID2020-116788GB-I00

Fundación CajaCanarias, Award: 2017RCE03