Dryad

A novel method to assess the integrity of frozen archival DNA samples: Alpha-diversity ratios of short and long-read 16S rRNA gene sequences


Abstract

Archival DNA samples collected and analysed for a range of research and applied questions have accumulated for decades in the laboratories of universities, government agencies, and commercial service providers. These DNA archives represent a valuable, yet largely untapped, repository of genomic information. With the falling cost of, and increasing access to, high-throughput sequencing, we predict an increase in retrospective research that explores the wealth of information residing in these archival samples. For this to occur, however, we need confidence in the integrity of the DNA samples, which are often stored under sub-optimal conditions, and in their fitness for purpose in downstream genomic analysis. Here, we borrow a well-established concept from ancient DNA research to evaluate the integrity of frozen DNA samples, defined in terms of loss of information content in recovered amplicons, based on the ratio of α-diversity between short- and long-read 16S rRNA gene sequences. Eighty-seven stored DNA samples, extracted from soil collected in the western and southern agricultural regions of Australia between 2001 and 2021, were sequenced for the 16S rRNA gene using both PacBio full-length reads (V1-V9, 1.5 kbp) and Illumina short reads (V3-V4, 200-450 bp). When α-diversity ratios between the long and short reads were calculated to assess DNA degradation, the ratio did not decrease in older samples relative to younger samples. We propose this as a novel method to confirm the integrity of DNA before embarking on large-scale diversity profiling projects using archival DNA.
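
The ratio described in the abstract can be illustrated with a minimal sketch. The abstract does not state which α-diversity index was used or which way the ratio was taken, so the code below assumes the Shannon index and a long-read over short-read ratio. The function names and the taxon counts are hypothetical and are not taken from the dataset.

```python
import math
from typing import Dict


def shannon_index(counts: Dict[str, int]) -> float:
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts.

    Assumption: Shannon is used here only as an example of an alpha-diversity
    metric; the abstract does not name the index.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample contains no reads")
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


def alpha_diversity_ratio(
    long_counts: Dict[str, int], short_counts: Dict[str, int]
) -> float:
    """Ratio of long-read to short-read alpha-diversity for one sample.

    Assumption: the ratio is taken as long / short, so that a loss of long
    fragments would lower the value.
    """
    h_long = shannon_index(long_counts)
    h_short = shannon_index(short_counts)
    if h_short == 0:
        raise ValueError("short-read diversity is zero; ratio is undefined")
    return h_long / h_short


# Hypothetical per-taxon read counts for a single soil DNA sample.
long_read_counts = {"taxon_a": 120, "taxon_b": 80, "taxon_c": 40}
short_read_counts = {"taxon_a": 130, "taxon_b": 70, "taxon_c": 45, "taxon_d": 5}

ratio = alpha_diversity_ratio(long_read_counts, short_read_counts)
print(f"alpha-diversity ratio (long/short): {ratio:.3f}")
```

Under these assumptions, a ratio that falls with sample age would suggest that long templates are being lost faster than short ones. A ratio that stays stable across sampling years is consistent with the result reported in the abstract.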