Data from: Estimating belowground plant abundance with DNA metabarcoding


Most work on plant community ecology has been performed aboveground, neglecting the processes that occur in the soil. DNA metabarcoding, where multiple species are computationally identified in bulk samples, can help overcome the logistical limitations involved in sampling plant communities belowground. A major limitation of this methodology is, however, the quantification of species’ abundances based on the percentage of sequences assigned to each taxon. Using root tissues of the five dominant species in a semiarid Mediterranean shrubland (Bupleurum fruticescens, Helianthemum cinereum, Linum suffruticosum, Stipa pennata and Thymus vulgaris), we built pairwise mixtures at known relative abundances (20, 50 and 80% biomass) and implemented two methods (linear model fits and correction indices) to improve root biomass estimates. We validated both methods with multispecies mixtures that simulate field-collected samples. For all species, we found a positive and highly significant relationship between the percentage of sequences and the percentage of biomass in the mixtures (R² = 0.44-0.66), but the fitted equations (slope and intercept) differed among species, and two species were consistently over- and underestimated. The correction indices greatly improved the estimates of biomass percentage for all five species in the multispecies mixtures, and reduced the overall error from 17% to 6%. Our results show that, through the use of post-sequencing quantification methods on mock communities, DNA metabarcoding can be effectively used to determine not only species’ presence but also their relative abundance in field samples of root mixtures. Importantly, knowledge of these aspects will make it possible to study key, yet poorly understood, belowground processes.
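As a rough illustration of the two kinds of post-sequencing correction mentioned above, the sketch below fits a per-species line relating sequence percentage to biomass percentage, and applies a multiplicative per-species correction followed by renormalisation. It is only a minimal sketch under stated assumptions: the function names, the form of the correction (a simple multiplicative factor) and all numeric values are hypothetical and are not taken from the dataset or from the authors' analysis.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correct_and_renormalise(seq_pct, factors):
    """Scale each species' sequence percentage by a per-species factor
    (an assumed form of correction), then rescale to sum to 100."""
    scaled = {sp: seq_pct[sp] * factors[sp] for sp in seq_pct}
    total = sum(scaled.values())
    return {sp: 100.0 * v / total for sp, v in scaled.items()}


# Invented illustrative values (NOT from the dataset): sequence percentages
# for one species in mixtures built at 20, 50 and 80% biomass.
seq_percent = [10.0, 30.0, 50.0]
biomass_percent = [20.0, 50.0, 80.0]

slope, intercept = fit_line(seq_percent, biomass_percent)

# Estimate the biomass percentage for a new sample with 40% of sequences.
estimate = slope * 40.0 + intercept

# Hypothetical two-species sample with made-up correction factors.
corrected = correct_and_renormalise(
    {"species_A": 60.0, "species_B": 40.0},
    {"species_A": 0.5, "species_B": 1.5},
)
```

In this toy setting, a species that yields fewer sequences than its biomass share would receive a factor above 1, and an over-represented species a factor below 1; the renormalisation keeps the corrected percentages summing to 100.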

