Data from: MetaBARFcoding: DNA-barcoding of regurgitated prey yields insights into Christmas Shearwater (Puffinus nativitatis) foraging ecology at Hōlanikū (Kure Atoll), Hawaiʻi
Data files
Nov 08, 2021 version files 92.30 KB
Abstract
Morphological identification of digested prey remains from a generalist predator can be challenging, especially when attempting to match degraded remains to taxonomic keys. DNA techniques, whereby prey is sequenced and matched to large public nucleotide sequence databases, are increasingly being used to augment morphological identification. We used “metaBARFcoding” (DNA metabarcoding) to target a region of the cytochrome c oxidase subunit I mitochondrial gene to identify prey in highly digested regurgitations from Christmas Shearwaters (Puffinus nativitatis) at Hōlanikū (Kure Atoll). Metabarcoding was used to bulk-process 92 water samples from regurgitations collected from 2009-2017, providing an overview of the seabird’s diet. We additionally Sanger sequenced 100 prey items from 50 randomly chosen regurgitations to verify that metabarcoding characterized key components of the diet. The metabarcoding technique identified 87 unique taxa from 29 families of fish and squid, spanning diverse taxa, including reef-associated, pelagic-oceanic, and mesopelagic species. Rare prey (frequency of occurrence < 5% of samples) constituted 66% of the species richness, demonstrating the highly diverse diet of this generalist predator. Overall, 81% of the families detected in the contemporary diet were previously documented in Christmas Shearwater diets from the Northwestern Hawaiian Islands. Our results indicate that metabarcoding the cytochrome c oxidase subunit I (COI) region is useful in identifying a wide range of taxa from highly digested regurgitations, thus facilitating this approach to study seabird diets.
Methods
These data come from dDNA metabarcoding (Illumina MiSeq) of regurgitations, and Sanger sequencing of individual specimens picked from regurgitations, of Christmas Shearwaters (Puffinus nativitatis) deposited onto the runway at Kure Atoll. The regurgitations are byproducts of a tagging program: the birds often regurgitate during the tagging process. All information on DNA extraction, PCR amplification, and sequencing can be found in the Supplementary Information of the manuscript.
Usage notes
Seven files are included here.
1. OTU-BlastMatches-SangerData-CHSH-Metabarfcoding.csv - This file contains information on OTUs (Genus Species, Phylum, Class, Order, Family, TaxID, total score, query%, Evalue, % Identity) for the top hit(s) obtained via BLAST on NCBI of the cytochrome c oxidase subunit I gene region, Sanger-sequenced from degraded but whole specimens picked out of the seabird regurgitations.
2. OTU-BlastMatches-MetabarcodeData-CHSH-Metabarfcoding.csv - This file contains information on OTUs (Genus Species, Phylum, Class, Order, Family, TaxID, total score, query%, Evalue, bitscore, size, Query %, % Identity) for the top hit(s) obtained via BLAST on NCBI of the cytochrome c oxidase subunit I gene region (from one of two primer sets covering COI), and a table listing all of the samples in which each OTU was recorded in this study.
3. Nimz-etal-CHSHDiet-SupplementaryInformation_09-28-21 - This file is the supplementary information for the manuscript, and describes the data acquisition and processing steps (lab work and post-sequencing data analysis).
4. GEOME-SampleInfo-PrimerTrial-CHSH-Metabarfcoding.xlsx - This is the metadata stored in GEOME for the bird regurgitations that were "metabarfcoded" for the initial primer trial. The fastq files are stored in the NCBI SRA.
5. GEOME-SampleInfo-DietStudy-CHSH-Metabarfcoding.xlsx - This is the metadata stored in GEOME for the bird regurgitations that were "metabarfcoded" for the full diet study. The fastq files are stored in the NCBI SRA.
6. CHSH-Metabarcode-Diet_file_list.txt - This is a list of the names of the fastq sequence files (paired-end) from the full diet study that are uploaded to the NCBI SRA, provided to make them easy to find in that database.
7. CHSH-Metabarcode-PT_file_list.txt - This is a list of the names of the fastq sequence files (paired-end) from the initial primer trial that are uploaded to the NCBI SRA, provided to make them easy to find in that database.
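As a rough illustration of how the per-sample OTU table might be summarized, the sketch below computes frequency of occurrence (the percentage of samples in which an OTU was detected) and the share of OTUs falling below a rarity threshold. It runs on a small made-up table, not the real data. The column names (otu, family, sample), the long one-row-per-detection layout, and the helper function names are all assumptions made for this example; the actual CSV headers and layout should be checked before adapting it.

```python
import csv
import io
from collections import defaultdict

# Made-up example data: one row per (OTU, sample) detection.
# Column names are hypothetical and may not match the real CSV files.
EXAMPLE_CSV = """otu,family,sample
OTU1,Exocoetidae,S1
OTU1,Exocoetidae,S2
OTU1,Exocoetidae,S3
OTU2,Ommastrephidae,S1
OTU3,Myctophidae,S4
"""


def frequency_of_occurrence(rows, n_samples):
    """Return {otu: percent of samples in which the OTU was detected}."""
    samples_by_otu = defaultdict(set)
    for row in rows:
        # A set ignores repeated detections of an OTU in the same sample.
        samples_by_otu[row["otu"]].add(row["sample"])
    return {
        otu: 100.0 * len(samples) / n_samples
        for otu, samples in samples_by_otu.items()
    }


def rare_fraction(foo, threshold=5.0):
    """Return the fraction of OTUs whose frequency of occurrence is below threshold (%)."""
    rare = [otu for otu, pct in foo.items() if pct < threshold]
    return len(rare) / len(foo)


rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
foo = frequency_of_occurrence(rows, n_samples=4)

for otu, pct in sorted(foo.items()):
    print(f"{otu}: {pct:.1f}% of samples")

# The toy table has only four samples, so a 30% threshold is used here
# purely for demonstration; the abstract defines rare prey as < 5% of samples.
print(f"Rare OTUs: {rare_fraction(foo, threshold=30.0):.2f} of all OTUs")
```

To apply this to the real files, the in-memory string would be replaced by an opened CSV file, and n_samples by the number of regurgitation samples analyzed.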