Data from: A next-generation sequencing approach to river biomonitoring using benthic diatoms

Cite this dataset

Kermarrec, Lenaïg et al. (2014). Data from: A next-generation sequencing approach to river biomonitoring using benthic diatoms [Dataset]. Dryad.


Diatoms are the main bioindicators used to assess the ecological quality of rivers, but their identification is difficult and time-consuming. Next Generation Sequencing (NGS) can be used to study communities of microorganisms, so we carried out a test of the reliability of 454 pyrosequencing for estimating diatom inventories in environmental samples. We used small subunit ribosomal deoxyribonucleic acid (SSU rDNA), ribulose-1,5-bisphosphate carboxylase (rbcL), and cytochrome oxidase I (COI) markers and examined reference libraries to define thresholds between the intra- and interspecific and intra- and intergeneric genetic distances. Based on tests of 1 mock community, we used a threshold of 99% identity for SSU rDNA and rbcL sequences to study freshwater diatoms at the species level. We applied 454 pyrosequencing to 4 contrasting environmental samples (with one in duplicate), assigned taxon names to environmental sequences, and compared the qualitative and quantitative molecular inventories to those obtained by microscopy. Species richness detected by microscopy was always higher than that detected by pyrosequencing. Some morphologically detected taxa may have been persistent frustules from dead cells. Some taxa detected by molecular analysis were not detected by morphology and vice versa. The main source of divergence appears to be inadequate taxonomic coverage in DNA reference libraries. Only a small percentage of species (but almost all genera) in morphological inventories were included in DNA reference libraries. DNA reference libraries contained a smaller percentage of species from tropical (27.1–38.1%) than from temperate samples (53.7–77.8%). Agreement between morphological and molecular inventories was better for species with relative abundance >1% than for rare species.
The rbcL marker appeared to provide more reproducible results (94.9% species similarity between the 2 duplicates) and was very useful for molecular identification, but procedural standardization is needed. The water-quality ranking assigned to a site via the Pollution Sensitivity diatom index was the same whether calculated with molecular or morphological data. Pyrosequencing is a promising approach for detecting all species, even rare ones, once reference libraries have been developed.
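The 99% identity threshold mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the read and the reference sequences are already aligned to the same length, counts every differing position (including gaps) as a mismatch, and uses hypothetical names (`percent_identity`, `assign_species`, `taxon_A`) chosen only for illustration.

```python
from typing import Dict, Optional


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def assign_species(
    read: str, references: Dict[str, str], threshold: float = 99.0
) -> Optional[str]:
    """Return the best-matching reference name, or None below the threshold.

    `references` maps a taxon name to its (pre-aligned) reference sequence.
    The default threshold of 99% follows the value stated in the abstract.
    """
    best_name, best_identity = None, 0.0
    for name, ref in references.items():
        identity = percent_identity(read, ref)
        if identity > best_identity:
            best_name, best_identity = name, identity
    return best_name if best_identity >= threshold else None


# Hypothetical 200-base reference and two reads derived from it.
reference = "ACGT" * 50
read_one_mismatch = "T" + reference[1:]       # 199/200 positions match
read_three_mismatches = "TTT" + reference[3:]  # 197/200 positions match

library = {"taxon_A": reference}
print(assign_species(read_one_mismatch, library))
print(assign_species(read_three_mismatches, library))
```

With a 200-base alignment, a single mismatch leaves the read above the threshold, whereas three mismatches drop it below, so the second read stays unassigned at the species level. A real analysis would first align reads against the reference library with a dedicated tool.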

Usage notes


La Réunion
Mainland France