Optimizing DNA extraction protocols for the diet analysis of a baleen whale (Eubalaena australis)
Data files
Oct 17, 2024 version files (330.36 KB)
- 16S-asv-table.csv (80.07 KB)
- 18S-asv-table.csv (223.63 KB)
- protocol_levels.csv (24.56 KB)
- README.md (2.10 KB)
May 06, 2025 version files (1.25 MB)
- 16S-18S-raw-data.xlsx (919.58 KB)
- 16S-asv-table.csv (80.07 KB)
- 18S-asv-table.csv (223.63 KB)
- protocol_levels.csv (24.56 KB)
- README.md (2.82 KB)
Oct 09, 2025 version files (1.25 MB)
- 16S-18S-raw-data.xlsx (919.58 KB)
- 16S-asv-table.csv (80.07 KB)
- 18S-asv-table.csv (223.63 KB)
- protocol_levels.csv (24.56 KB)
- README.md (2.57 KB)
Abstract
Faecal metabarcoding is widely used for mammalian diet analysis. However, most extraction protocols are designed to target high molecular weight genomic DNA and not the short DNA sequences associated with digested prey items. To examine the prey composition in southern right whale (Eubalaena australis) faecal samples we trialled a phosphate buffer DNA extraction method along with two commercial extraction kits (QIAamp Fast DNA Stool Mini and QIAGEN DNEasy PowerSoil) with the following variations: 1) incubation time in a phosphate buffer (1 and 24 hours); 2) processing both pellet and supernatant from the phosphate buffer incubations; and 3) two different concentrations of DNA binding buffer. We found that the choice of extraction protocol influenced the richness, diversity and composition of eukaryotes (18S rDNA) and crustaceans (Crust16S mtDNA) in the faecal samples. The PowerSoil protocol performed well for both markers, delivering the highest target richness for 18S rDNA and highest diversity for Crust16S mtDNA, with the pellet of the phosphate buffer also performing comparably. Taxonomic composition in the phosphate buffer supernatant was influenced by the incubation period and concentration of binding buffer and differed from its corresponding pellet. To maximise taxonomic coverage, we recommend combining the extracts from both the supernatant and pellet. In faecal studies, our findings reinforce the importance of defining the community attributes (richness versus diversity versus composition) of key interest prior to performing DNA extraction, as the inference of these variables is likely to be altered by the choice of extraction protocol.
https://doi.org/10.5061/dryad.9s4mw6mrx
Description of the data and file structure
The data associated with this study consist of OTU tables for the 18S rDNA eukaryote and 16S mtDNA crustacean datasets, obtained by metabarcoding southern right whale (SRW) faecal samples to amplify prey DNA. The tables are provided in two .csv files, "18S-asv-table.csv" (eukaryotes) and "16S-asv-table.csv" (crustaceans). A third file, "protocol_levels.csv", contains metadata on the extraction protocols and is required to run the R script for analysis. Additionally, an Excel file, "16S-18S-raw-data.xlsx", contains the raw OTU tables with both target and non-target taxa prior to dataset cleaning. This file was used for the supplementary analyses, whose results are available in the supplemental information.
Supplementary tables and figures (Zenodo)
Supplementary tables and figures can be found in the file "Supplemental_Information_Parikh-etal.pdf".
Code (Zenodo)
All coding for this study was performed in R, and the associated R script can be found in the file "Parikh-et-al-2024.R".
Files and variables
Files: 16S-asv-table.csv, 18S-asv-table.csv
Description:
Each of these files is an OTU table, with OTUs as rows and samples as columns. The first column (column A) contains the unique OTU identifier, followed by columns listing the taxonomic classification of each OTU (columns B-H) and a column with the total reads for each OTU (column I). From the 10th column (column J) onwards, each column is a sample and gives the number of reads of each OTU detected in that sample.
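The column layout described above can be illustrated with a minimal Python sketch. This is not the authors' R script. The header names, OTU identifiers, taxon labels and sample names in the miniature table are hypothetical placeholders, and the sketch assumes that read counts are stored as plain integers.

```python
import csv
import io

# Hypothetical miniature table that mimics the described layout:
# column A = OTU identifier, columns B-H = taxonomy, column I = total reads,
# column J onwards = one column of read counts per sample.
# The real header names and values in the .csv files may differ.
EXAMPLE_CSV = """otu_id,rank1,rank2,rank3,rank4,rank5,rank6,rank7,total_reads,sample_A,sample_B
OTU_1,t1,t2,t3,t4,t5,t6,t7,30,10,20
OTU_2,t1,t2,t3,t4,t5,t6,t8,5,5,0
"""


def read_otu_table(handle):
    """Split each row into identifier, taxonomy, total reads and per-sample counts."""
    reader = csv.reader(handle)
    header = next(reader)
    sample_names = header[9:]  # column J onwards
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = {
            "taxonomy": row[1:8],  # columns B-H
            "total_reads": int(row[8]),  # column I (assumed integer)
            "counts": {name: int(v) for name, v in zip(sample_names, row[9:])},
        }
    return sample_names, table


sample_names, table = read_otu_table(io.StringIO(EXAMPLE_CSV))
for otu_id, record in table.items():
    # In this toy data the per-sample counts add up to the total reads column.
    print(otu_id, record["total_reads"], sum(record["counts"].values()))
```

To try this on a downloaded file, pass an open file handle (for example `open("16S-asv-table.csv", newline="")`) instead of the `io.StringIO` object. Adjust the column offsets if the actual file deviates from the description.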
File: protocol_levels.csv
Description:
This file contains brief metadata associated with the extraction protocols which is used in the corresponding R script to perform analyses.
File: 16S-18S-raw-data.xlsx
Description:
This file contains raw OTU data for Crust16S mtDNA and 18S rDNA datasets, prior to dataset cleaning for the analysis included in the main manuscript.
Code/software
All analysis was performed in the software R (version 4.3.1). The code used can be found in the R script file "Parikh-et-al-2024.R" which lists all the R packages required to perform the analysis as well as brief comments explaining what analyses are being performed. All files and datasets required to run the analysis have been included.
Southern right whale faecal samples were collected opportunistically over decades of research. DNA was extracted from the faecal samples using three different methods (phosphate buffer extraction, the PowerSoil kit and the QIAamp Fast DNA Stool Mini Kit), with modifications to each method giving 12 unique protocols. The modifications were: incubating samples in a phosphate buffer for 1 hr or 24 hrs, processing both the pellet and the supernatant from the phosphate buffer incubation, and adding 1x or 2x DNA binding buffer to the silica column. Extracted samples were amplified using a universal 18S marker and a crustacean-specific 16S marker, then sequenced. Analyses of taxonomic richness, diversity and composition were performed in R to compare results from the different protocols.
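As a rough illustration of the richness and diversity comparison, the following Python sketch computes both quantities from a vector of per-OTU read counts. It is not the authors' R code. This description does not state which diversity index was used, so the Shannon index here is an assumption, and the protocol names and counts are invented for demonstration only.

```python
import math


def richness(counts):
    """Number of OTUs detected (read count above zero) in one sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero reads.

    Assumption: the study's diversity metric may be a different index.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per OTU for two made-up protocols.
example = {
    "protocol_1": [10, 10, 0, 0],
    "protocol_2": [5, 5, 5, 5],
}

for name, counts in example.items():
    print(name, richness(counts), round(shannon(counts), 3))
```

Richness only counts which OTUs are present, whereas the Shannon index also reflects how evenly reads are spread across them, so the two measures can rank protocols differently.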
Changes after Oct 17, 2024: An additional Excel file with raw read data for the full 18S rDNA and Crust16S mtDNA datasets was added, and the Supplementary material was updated in response to reviewer feedback to include additional analyses of the raw data.
Changes after May 6, 2025: Supplemental_Information_Parikh-etal.pdf was added, and the Abstract was copyedited.
