Data from: High-throughput amplicon sequencing of rRNA genes requires a copy number correction to accurately reflect the effects of management practices on soil nematode community structure
Darby, Brian J., Kansas State University
Todd, Tim C., Kansas State University
Herman, Michael A., Kansas State University
Published Aug 01, 2013 on Dryad.
Cite this dataset
Darby, Brian J.; Todd, Tim C.; Herman, Michael A. (2013). Data from: High-throughput amplicon sequencing of rRNA genes requires a copy number correction to accurately reflect the effects of management practices on soil nematode community structure [Dataset]. Dryad. https://doi.org/10.5061/dryad.t8g16
Nematodes are abundant consumers in grassland soils, but more sensitive and specific methods of enumeration are needed to improve our understanding of how different nematode species affect, and are affected by, ecosystem processes. High-throughput amplicon sequencing is used to enumerate microbial and invertebrate communities at a high level of taxonomic resolution, but the method requires validation against traditional specimen-based morphological identifications. To investigate the consistency between these approaches, we enumerated nematodes from a 25-year field experiment using both morphological and molecular identification techniques in order to determine the long-term effects of annual burning and nitrogen enrichment on soil nematode communities. Family-level frequencies based on amplicon sequencing were not initially consistent with specimen-based counts, but correction for differences in rRNA gene copy number using a genetic algorithm improved quantitative accuracy. Multivariate analysis of corrected sequence-based abundances of nematode families was consistent with, but not identical to, analysis of specimen-based counts. In both cases, herbivores, fungivores and predator/omnivores generally were more abundant in burned than nonburned plots, while bacterivores generally were more abundant in nonburned or nitrogen-enriched plots. Discriminant analysis of sequence-based abundances identified putative indicator species representing each trophic group. We conclude that high-throughput amplicon sequencing can be a valuable method for characterizing nematode communities at high taxonomic resolution as long as rRNA gene copy number variation is accounted for and accurate sequence databases are available.
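The copy number correction mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a relative rRNA gene copy number per family is already known. It is not the authors' genetic algorithm, which estimates those factors. The function name, family labels, and numbers below are hypothetical and chosen only for illustration.

```python
def correct_for_copy_number(read_counts, copy_number):
    """Divide each taxon's read count by its relative copy number,
    then renormalize so the corrected values sum to 1."""
    adjusted = {
        taxon: reads / copy_number[taxon]
        for taxon, reads in read_counts.items()
    }
    total = sum(adjusted.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {taxon: value / total for taxon, value in adjusted.items()}


# Hypothetical example: familyA carries four times as many rRNA gene
# copies per individual as familyB, so its reads are down-weighted.
reads = {"familyA": 800, "familyB": 200}
factors = {"familyA": 4.0, "familyB": 1.0}
corrected = correct_for_copy_number(reads, factors)
```

In this made-up case the raw reads suggest a 4:1 ratio, whereas the corrected proportions are equal, which is the kind of shift the correction is meant to capture.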
This comma-delimited file contains the raw number of sequences for each of 131 "species" (Genbank accessions) for 64 field samples (16 plots x 2 seasons x 2 reps) and two "test" samples. Field samples are coded as with (1) or without (0) burning or nitrogen enrichment.
This comma-delimited file contains the raw abundance (individuals per 100g) for each of 28 families for 64 field samples (16 plots x 2 seasons x 2 reps). Field samples are coded as with (1) or without (0) burning or nitrogen enrichment.
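As a rough illustration of how the 0/1 treatment coding in these comma-delimited files might be used, the sketch below averages abundances within each burning-by-nitrogen combination. The column names (sample, season, burn, nitrogen), the sample labels, and the values are assumptions made for this example; the actual headers and layout of the deposited files may differ and should be checked first.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout and made-up values, for illustration only.
EXAMPLE_CSV = """sample,season,burn,nitrogen,Cephalobidae,Tylenchidae
P1_spring_r1,spring,1,0,120,40
P1_spring_r2,spring,1,0,100,60
P2_spring_r1,spring,0,1,300,10
P2_spring_r2,spring,0,1,280,30
"""

# Columns that describe the sample rather than a taxon (assumed names).
META_COLUMNS = {"sample", "season", "burn", "nitrogen"}


def mean_abundance_by_treatment(handle):
    """Return {(burn, nitrogen): {taxon: mean abundance}} for a CSV handle."""
    sums = defaultdict(lambda: defaultdict(float))
    n_rows = defaultdict(int)
    for row in csv.DictReader(handle):
        key = (int(row["burn"]), int(row["nitrogen"]))
        n_rows[key] += 1
        for column, value in row.items():
            if column not in META_COLUMNS:
                sums[key][column] += float(value)
    return {
        key: {taxon: total / n_rows[key] for taxon, total in taxa.items()}
        for key, taxa in sums.items()
    }


means = mean_abundance_by_treatment(io.StringIO(EXAMPLE_CSV))
```

To run this on a real file, replace the in-memory example with `open(path, newline="")` and adjust `META_COLUMNS` to the file's actual header.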
Region 1 (subsamples "A") raw 454 sequencing reads
Text file with all raw 454 amplicon sequencing reads from region 1, containing all "A" subsamples, in FASTA format. This is one of the two data files (along with "region 2") pre-processed by the Python script "amPy30.py".
Amplicon sequencing pipeline
Bash script used to call the sequence processing programs. Internal dependencies: amPy30.py, blastAnalysis30.py. External dependencies: cutadapt 1.0, USEARCH 5.1, ssaha2.
Python script used to pre-process raw sequencing reads. It reads three data files: two raw sequencing read files in FASTA format ("X.TCA.454Reads.fna") and "barcodeIDs.txt".
TXT file mapping the multiplex identifier barcode of each read to its sample and subsample ID. Treatment information (season, burning, and nitrogen) is also indicated in the file.
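The barcode-to-sample mapping described here can be sketched as follows. This is not the logic of amPy30.py. It assumes a tab-delimited layout of barcode, sample, subsample, and it uses exact matching at the start of each read with no tolerance for sequencing errors. The function names, barcodes, and sample labels are hypothetical.

```python
def load_barcodes(lines):
    """Parse 'barcode<TAB>sample<TAB>subsample' lines (assumed layout)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        barcode, sample, subsample = line.split("\t")[:3]
        mapping[barcode.upper()] = (sample, subsample)
    return mapping


def assign_read(sequence, mapping):
    """Return (sample, subsample, trimmed_sequence) for the barcode found
    at the start of the read, or None if no barcode matches exactly."""
    seq = sequence.upper()
    for barcode, (sample, subsample) in mapping.items():
        if seq.startswith(barcode):
            return sample, subsample, seq[len(barcode):]
    return None


# Hypothetical barcode table and reads, for illustration only.
barcode_lines = [
    "# barcode\tsample\tsubsample",
    "ACGAGTGCGT\tplot01\tA",
    "ACGCTCGACA\tplot02\tA",
]
mapping = load_barcodes(barcode_lines)
hit = assign_read("ACGAGTGCGTTTGACC", mapping)
miss = assign_read("GGGGTTGACC", mapping)
```

Exact prefix matching is the simplest choice and works when all barcodes have the same length. A production pipeline would usually also handle mismatches and adapter trimming.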
Eukaryotic SSU database used in this study. Called by AmpSeqPpipeline30.sh.
Python script used to process the results of matching non-redundant sequencing reads against the database. Called by "AmpSeqpipeline30.sh".