Optimizing DNA extraction protocols for the diet analysis of a baleen whale (Eubalaena australis)

Parikh, Aashi 1; O'Rorke, Richard 2; Carroll, Emma 2; Vermeulen, Els 3; Harcourt, Robert 1; Chariton, Anthony 1

Research facility: Macquarie University

Published Oct 17, 2024; Updated Oct 09, 2025 on Dryad. https://doi.org/10.5061/dryad.9s4mw6mrx

Data files

Oct 17, 2024 version files 330.36 KB

16S-asv-table.csv: 80.07 KB
18S-asv-table.csv: 223.63 KB
protocol_levels.csv: 24.56 KB
README.md: 2.10 KB

May 06, 2025 version files 1.25 MB

16S-18S-raw-data.xlsx: 919.58 KB
16S-asv-table.csv: 80.07 KB
18S-asv-table.csv: 223.63 KB
protocol_levels.csv: 24.56 KB
README.md: 2.82 KB

Oct 09, 2025 version files 1.25 MB

16S-18S-raw-data.xlsx: 919.58 KB
16S-asv-table.csv: 80.07 KB
18S-asv-table.csv: 223.63 KB
protocol_levels.csv: 24.56 KB
README.md: 2.57 KB

Abstract

Faecal metabarcoding is widely used for mammalian diet analysis. However, most extraction protocols are designed to target high molecular weight genomic DNA and not the short DNA sequences associated with digested prey items. To examine the prey composition in southern right whale (Eubalaena australis) faecal samples, we trialled a phosphate buffer DNA extraction method along with two commercial extraction kits (QIAamp Fast DNA Stool Mini and QIAGEN DNeasy PowerSoil) with the following variations: 1) incubation time in a phosphate buffer (1 and 24 hours); 2) processing both pellet and supernatant from the phosphate buffer incubations; and 3) two different concentrations of DNA binding buffer. We found that the choice of extraction protocol influenced the richness, diversity and composition of eukaryotes (18S rDNA) and crustaceans (Crust16S mtDNA) in the faecal samples. The PowerSoil protocol performed well for both markers, delivering the highest target richness for 18S rDNA and the highest diversity for Crust16S mtDNA, and the pellet from the phosphate buffer performed comparably. Taxonomic composition in the phosphate buffer supernatant was influenced by the incubation period and the concentration of binding buffer, and differed from that of its corresponding pellet. To maximise taxonomic coverage, we recommend combining the extracts from both the supernatant and pellet. Our findings reinforce the importance, in faecal studies, of defining the community attributes (richness versus diversity versus composition) of key interest before performing DNA extraction, as inference about these attributes is likely to be altered by the choice of extraction protocol.


Description of the data and file structure

The data associated with this study consist of OTU tables for the 18S rDNA eukaryote and 16S mtDNA crustacean datasets, obtained by metabarcoding southern right whale (SRW) faecal samples to amplify prey DNA. They are provided in two .csv files, "18S-asv-table.csv" and "16S-asv-table.csv", which contain the OTU tables for the eukaryote and crustacean datasets respectively. A third file, "protocol_levels.csv", contains metadata on the extraction protocols and is required to run the R script for the analysis. Additionally, an Excel file, "16S-18S-raw-data.xlsx", contains the raw OTU tables with both target and non-target taxa prior to dataset cleaning. This file was used for the supplementary analyses, whose results are available in the supplemental information.

Supplementary tables and figures (Zenodo)

Supplementary tables and figures can be found in the file "Supplemental_Information_Parikh-etal.pdf".

Code (Zenodo)

All coding for this study was performed in R, and the associated R script can be found in the file "Parikh-et-al-2024.R".

Files and variables

File: 16S-asv-table.csv, 18S-asv-table.csv

Description:

Each file is in the format of an OTU table, with OTUs as rows and samples as columns. The first column (column A) contains the unique OTU identifier. It is followed by columns that list the taxonomic classification of each OTU (columns B-H) and a column with the total reads in each OTU (column I). From the 10th column (column J) onwards, each column is a sample and holds the number of reads of each OTU detected in that sample.
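The column layout described above can be sketched as a small reader. This is an illustrative Python sketch only, separate from the authors' R script. The function name `read_otu_table`, the header names and the two demo rows are made up for the example; the real files may use different header names, and the sketch assumes every count is written as a plain integer.

```python
import csv
import io


def read_otu_table(handle):
    """Split an OTU table (layout as described above) into its parts.

    Assumed layout: column A = OTU identifier, columns B-H = taxonomy,
    column I = total reads, column J onwards = one column per sample.
    """
    reader = csv.reader(handle)
    header = next(reader)
    tax_cols = header[1:8]     # columns B-H
    sample_cols = header[9:]   # column J onwards
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = {
            "taxonomy": dict(zip(tax_cols, row[1:8])),
            "total_reads": int(row[8]),
            "counts": {s: int(v) for s, v in zip(sample_cols, row[9:])},
        }
    return sample_cols, table


# Made-up two-row table for illustration; not taken from the dataset.
demo = io.StringIO(
    "otu_id,kingdom,phylum,class,order,family,genus,species,total,S1,S2\n"
    "OTU_1,Eukaryota,Arthropoda,Hexanauplia,Calanoida,Calanidae,Calanus,NA,15,10,5\n"
    "OTU_2,Eukaryota,Arthropoda,Malacostraca,Euphausiacea,Euphausiidae,Euphausia,NA,7,0,7\n"
)
samples, table = read_otu_table(demo)

# Optional sanity check: does column I equal the sum of the sample columns?
mismatched = [
    otu for otu, rec in table.items()
    if rec["total_reads"] != sum(rec["counts"].values())
]
```

To try this on a real file, one would pass an open file handle, for example `open("18S-asv-table.csv", newline="")`, in place of `demo`. Whether the totals in column I match the per-sample sums in the published files is not stated in the description, so the check is only a suggestion.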

File: protocol_levels.csv

Description:

This file contains brief metadata associated with the extraction protocols which is used in the corresponding R script to perform analyses.

File: 16S-18S-raw-data.xlsx

Description:

This file contains raw OTU data for Crust16S mtDNA and 18S rDNA datasets, prior to dataset cleaning for the analysis included in the main manuscript.

Code/software

All analyses were performed in R (version 4.3.1). The code can be found in the R script file "Parikh-et-al-2024.R", which lists all the R packages required and includes brief comments explaining each analysis. All files and datasets required to run the analysis have been included.
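As a rough illustration of two of the community attributes named in the abstract, the sketch below computes richness and a diversity index from one sample's read counts. It is written in Python rather than R and is not taken from the authors' script. The choice of the Shannon index is an assumption, since this description does not say which diversity measure was used, and the read counts are made up.

```python
import math


def richness(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Made-up read counts for two hypothetical samples (one value per OTU).
even = [10, 10, 0]    # two OTUs, equally abundant
skewed = [18, 1, 1]   # three OTUs, one dominant

print(richness(even), round(shannon(even), 3))      # 2 0.693
print(richness(skewed), round(shannon(skewed), 3))  # 3 0.394
```

The second sample has more OTUs but a lower index because one OTU dominates its reads. This shows how a protocol can rank first on richness without ranking first on diversity.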
