Data and code from: DNA metabarcoding reveals dietary divergence among sympatric swallows and flycatchers
Data files
Dec 18, 2025 version files (2.15 MB):
- 2022CivilTwilight.csv (2.39 KB)
- Birds2022.csv (20.50 KB)
- Diet2022.csv (2.10 MB)
- FecalSampleTransfer.csv (8.58 KB)
- LarvalHabitat.csv (7 KB)
- README.md (11.10 KB)
Abstract
Aerial insectivore (AI) populations have been in steep decline in North America since the 1970s, with swallows, swifts, and nightjars declining more rapidly than flycatchers. As AI share a common diet of flying insects, reductions in insect abundance are likely one of the major factors driving population decline. Previous studies have shown major dietary differences between swallows and flycatchers; flycatchers have exhibited more diverse, generalist diets than swallows. However, no study has directly compared the diets of sympatric swallows and flycatchers using the same method of dietary analysis. To investigate these differences, we compared the diets of six AI species living in sympatry during the breeding season. We collected fecal samples from adult Riparia riparia (Bank Swallow), Hirundo rustica (Barn Swallow), Petrochelidon pyrrhonota (Cliff Swallow), Tachycineta bicolor (Tree Swallow), Empidonax alnorum (Alder Flycatcher), and E. minimus (Least Flycatcher). We used DNA metabarcoding to identify the taxonomic composition of invertebrates in the feces and compared the richness of genera by insect order, insect family, and dipteran family between all species. Through a Bray-Curtis distance-based redundancy analysis, we identified significant differences in dietary composition between bird species at all three levels; however, the greatest amount of dissimilarity is seen in the dipterans consumed. E. alnorum, E. minimus, H. rustica, and T. bicolor had broader, more generalist diets than P. pyrrhonota and R. riparia. By comparing the diets between multiple species living in sympatry, our study improves our understanding of a possible cause of disproportionate population declines observed among AI species.
https://doi.org/10.5061/dryad.573n5tbj3
Description of the data and file structure
The data package includes five comma-separated values (CSV) files:
- Birds2022.csv
- Diet2022.csv
- FecalSampleTransfer.csv
- 2022CivilTwilight.csv
- LarvalHabitat.csv
Files and variables
File: PRN_Data_Package_MS_24_088.zip
Description:
Data file 'Birds2022.csv' contains all banding data and fecal sample bag IDs, with the following variables:
⦁ Bag_ID: Marker that, when combined with species, differentiates the fecal samples collected in paper bags. It must be combined with species when linking to the secondary fecal sample IDs in the data files 'FecalSampleTransfer.csv' and 'Diet2022.csv'.
⦁ Bander: Initials of the person who applied the CWS-issued aluminum band used to identify individual birds.
⦁ Code: Status of band application (N = New band, U = Released without band, R = Recaptured (pre-existing band recorded))
⦁ Band: The individual identifying number of the CWS-issued aluminum band, either newly applied or already on the bird.
⦁ Tag: Individual number of a Lotek radio tag, applied to some birds in collaboration with another research project.
⦁ Species: Four-letter code designating the bird's species (BANS = R. riparia, Bank Swallow; BARS = H. rustica, Barn Swallow; CLSW = P. pyrrhonota, Cliff Swallow; TRES = T. bicolor, Tree Swallow; ALFL = E. alnorum, Alder Flycatcher; LEFL = E. minimus, Least Flycatcher; TRFL = either E. alnorum or E. traillii (Willow Flycatcher)).
⦁ Age: Age code of bird (AHY = After hatch year; HY = Hatch year; SY = Second year)
⦁ How: How age was determined (PL = Plumage, BP = presence of brood patch, CA = calendar year/season, CP = presence of cloacal protuberance, ML = Molt limit)
⦁ Sex: Sex of bird (F = Female, M = Male, U = Unknown)
⦁ How: How sex was determined (EG = presence of egg in oviduct, CP = presence of cloacal protuberance, BP = presence of brood patch, X = discriminant function, WL = wing length, IC = inconclusive/conflicting evidence, ACP = Adult plumage and presence of cloacal protuberance, TCP = Tail length and presence of cloacal protuberance)
⦁ BP: score from 0-5 indicating stage of brood patch (peak breeding at BP = 3)
⦁ CP: score from 0-3 indicating stage of cloacal protuberance (peak breeding at CP = 3)
⦁ Fat: score from 0-7 of fat deposition on the body
⦁ Wing: natural wing chord in mm
⦁ Tail: length of tail in mm
⦁ Weight: weight of bird in grams
⦁ Date: Date of sample
⦁ Time: Time of sample
⦁ Site: Name of the general location at which the bird was sampled
⦁ Wing_flat: Flattened wing chord in mm
⦁ Wing-Tail: Difference between wing chord and tail length in mm
⦁ Bill_Length_N_T: Chord between the anterior nares and tip of the bill, measured in mm
⦁ Bill_Width_Anterior_Nares: Width of bill measured at the anterior-most point of the nares, in mm
⦁ Longestp-p6: distance between the tip of the longest primary and the sixth primary, in mm
⦁ p6-p10: distance between the tip of the sixth primary and the tenth primary, in mm
⦁ p10-p5: distance between the tenth primary and the fifth primary, in mm
⦁ p9-p5: distance between the ninth primary and the fifth primary, in mm
⦁ p6_emarg: Whether the sixth primary was clearly emarginated (Y), slightly emarginated (S), or not emarginated (N).
⦁ Song_evidence: Whether the bird (flycatchers only) was observed singing before capture (L = Likely, Y = Yes, N = No).
⦁ Formula R: The value of the following function: [(Longestp-p6 + p9-p5 + Wing-Tail) / (p6-p10 + Bill_Length_N_T)] as defined in Pyle 1997 for the separation of E. alnorum and E. traillii.
⦁ Sex_Discriminant: The value of the following discriminant function: [70.9 - 0.27*(Longestp-p6 - p10-p5)] as defined in Pyle 1997 for the separation of female and male E. alnorum by flattened wing chord.
⦁ Notes: Additional comments regarding the condition or sampling of each bird.
Missing values are empty cells.
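As an illustration of how the two derived columns (Formula R and Sex_Discriminant) relate to the raw measurements, the following Python sketch evaluates both expressions exactly as they are written in the variable list above. It is not taken from PRN_RScript_2024.R, which is written in R. The function names and the example measurements are hypothetical, and empty cells are assumed to be read in as empty strings.

```python
from typing import Optional


def to_float(cell: str) -> Optional[float]:
    """Convert a CSV cell to float; empty cells (missing values) become None."""
    cell = cell.strip()
    return float(cell) if cell else None


def formula_r(longestp_p6, p9_p5, wing_tail, p6_p10, bill_length_n_t):
    """(Longestp-p6 + p9-p5 + Wing-Tail) / (p6-p10 + Bill_Length_N_T)."""
    values = (longestp_p6, p9_p5, wing_tail, p6_p10, bill_length_n_t)
    if any(v is None for v in values):
        return None  # a missing measurement leaves the result undefined
    denominator = p6_p10 + bill_length_n_t
    if denominator == 0:
        return None  # avoid division by zero
    return (longestp_p6 + p9_p5 + wing_tail) / denominator


def sex_discriminant(longestp_p6, p10_p5):
    """70.9 - 0.27 * (Longestp-p6 - p10-p5), as written in this README."""
    if longestp_p6 is None or p10_p5 is None:
        return None
    return 70.9 - 0.27 * (longestp_p6 - p10_p5)


# Hypothetical measurements in mm (NOT real values from Birds2022.csv).
row = {"Longestp-p6": "5.0", "p9-p5": "8.0", "Wing-Tail": "14.0",
       "p6-p10": "3.0", "Bill_Length_N_T": "9.0", "p10-p5": ""}

r_value = formula_r(to_float(row["Longestp-p6"]), to_float(row["p9-p5"]),
                    to_float(row["Wing-Tail"]), to_float(row["p6-p10"]),
                    to_float(row["Bill_Length_N_T"]))
d_value = sex_discriminant(to_float(row["Longestp-p6"]), to_float(row["p10-p5"]))
print(r_value, d_value)
```

Returning None when any input is missing mirrors the convention that missing values are empty cells; a different convention (for example, NaN) would work equally well.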
Data file 'Diet2022.csv' contains the invertebrate DNA detections from each fecal sample, obtained via DNA metabarcoding, after algorithmic sorting to omit false detections and retain only high-confidence detections (at least 100 reads and at least a 95% reference match). Pertinent variables include:
⦁ PlateOrRun: "V4_JNocera_Fecal" denotes the collection of samples, the customer, and sample type for the purposes of the Canadian Centre for DNA Barcoding (CCDB) in Guelph, Ontario, Canada.
⦁ Sample: Identifying code for each sample and sample replicate (e.g. NGSFA-00163_100A_Rep1 represents replicate 1 of sample 100A).
⦁ ContigName: Identification of contig used in DNA sequencing process by CCDB.
⦁ ReadCount: Number of reads detected for a given DNA sequence in the sample. A greater number of reads equates to a greater confidence of detection.
⦁ Phylum: The taxonomic phylum of the identified DNA sequence.
⦁ Class: The taxonomic class of the identified DNA sequence.
⦁ Order: The taxonomic order of the identified DNA sequence.
⦁ Family: The taxonomic family of the identified DNA sequence.
⦁ Genus: The taxonomic genus of the identified DNA sequence.
⦁ Species: The taxonomic species of the identified DNA sequence.
⦁ SeqName: Unique identifier for the DNA sequence read.
⦁ Sequence: The actual base-pair sequence of the detected DNA.
⦁ TaxAssign: The full taxonomic assignment for the DNA sequence as produced by the sequence pairing algorithm.
⦁ TaxAssignFinal: The conservative taxonomic assignment for the DNA sequence adjusted to minimum confident assignment.
Missing values are given as 'NA'.
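The Sample variable packs the sample ID and the replicate number into one string. As a rough sketch of how it might be split, the Python snippet below parses the one example given above (NGSFA-00163_100A_Rep1). The regular expression assumes every code follows that same prefix_sample_RepN layout, which this README does not confirm; the function and field names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <prefix>_<sample ID>_Rep<replicate number>
SAMPLE_PATTERN = re.compile(
    r"^(?P<prefix>[^_]+)_(?P<sample_id>[^_]+)_Rep(?P<replicate>\d+)$"
)


class SampleCode(NamedTuple):
    prefix: str
    sample_id: str
    replicate: int


def parse_sample_code(code: str) -> Optional[SampleCode]:
    """Split a Sample string into its parts; return None if it does not fit."""
    match = SAMPLE_PATTERN.match(code.strip())
    if match is None:
        return None
    return SampleCode(match["prefix"], match["sample_id"], int(match["replicate"]))


def na_to_none(cell: str) -> Optional[str]:
    """Diet2022.csv marks missing values with the literal string 'NA'."""
    return None if cell == "NA" else cell


parsed = parse_sample_code("NGSFA-00163_100A_Rep1")
print(parsed)
```

Codes that do not match the assumed layout return None rather than raising an error, so unexpected formats can be inspected by hand.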
Data file 'FecalSampleTransfer.csv' gives the matching fecal sample IDs from bag to container, including species, site, date of sample collection, date of sample transfer, and the estimated amount of sample that was collected. Samples with no or too little volume did not receive a container 'Sample ID' and were not sent for DNA metabarcoding.
Missing values are empty cells.
Data file '2022CivilTwilight.csv' contains the dates and times of civil twilight, sunrise, and local noon from May 26 to August 6, 2022, for Sackville, NB (-64.390, 45.920), obtained from the National Research Council of Canada.
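One use of this file is to express each sampling time as minutes elapsed since civil twilight on the same date. The Python sketch below shows the arithmetic only. It assumes morning civil twilight, assumes both times are in the same time zone, and uses a hypothetical lookup table with made-up times rather than values from the file.

```python
from datetime import date, datetime, time


def minutes_from_twilight(sample_date, sample_time, twilight_by_date):
    """Minutes between civil twilight and the sample time on the same date.

    twilight_by_date maps a date to that day's civil twilight time.
    A negative result means the sample was taken before twilight.
    """
    twilight_time = twilight_by_date[sample_date]
    sample_dt = datetime.combine(sample_date, sample_time)
    twilight_dt = datetime.combine(sample_date, twilight_time)
    return (sample_dt - twilight_dt).total_seconds() / 60.0


# Hypothetical values for illustration (NOT read from the data file).
twilight_by_date = {date(2022, 6, 15): time(4, 50)}
elapsed = minutes_from_twilight(date(2022, 6, 15), time(7, 20), twilight_by_date)
print(elapsed)
```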
Data file 'LarvalHabitat.csv' contains the habitat category of the adult and larval forms of the invertebrate families detected in fecal samples. Specific cases, as well as the generalization of all types into either "terrestrial" or "aquatic," are handled within the R script.
All data files were prepared by Pat Nancekivell of the University of New Brunswick. Any data inquiries should be forwarded to Pat Nancekivell at pnanceki@unb.ca.
We carried out all statistical analyses using the R statistical software (v4.4.1; R Core Team 2023).
Software file 'PRN_RScript_2024.R' contains all necessary R code to derive the reported results. The packages used include:
⦁ labdsv v2.1-0 (Roberts 2023)
⦁ lubridate v1.9.3 (Grolemund and Wickham 2011)
⦁ stringr v1.5.1 (Wickham 2023)
⦁ tidyverse v2.0.0 (Wickham et al. 2019)
⦁ vegan v2.6-4 (Oksanen et al. 2022)
The script (PRN_RScript_2024.R) is separated into the following sections:
⦁ Tidy Bird Banding Data: Involves isolating important variables, calculating and amending the minutes from civil twilight variable, defining the habitat type for each sampling site, and converting all "TRFL" codes to "ALFL" for the purposes of the analysis, as described in the accompanying manuscript.
⦁ Tidy Diet Data: Involves extracting the sample ID and replicate names; removing detections of incidental/non-significant dietary items (i.e., fleas, rotifers, nematodes, and ticks); removing coarse detections (detections without genus information); combining samples that were halved during preparation; removing samples from recently recaptured birds (within 14 days); amending larval habitat information; calculating richness by replicate; and keeping the replicate with the highest richness.
⦁ Calculate Richness by Taxa Level: Calculates genus richness at three levels: order, family, and dipteran family.
⦁ Matrices: Create two matrices for each taxonomic level: "matrix_diet" includes the number of genera, with each column representing a taxon; "matrix_ex" includes the values of the explanatory variables.
⦁ dbRDAs: At each taxonomic level, a Bray-Curtis dissimilarity matrix is created using the vegdist() function. Global and intercept models are defined, and then forward selection via permutations (ordiR2step) is used to select a best-fit model by optimized adjusted-R2.
⦁ dbRDAs-no-BARS: The previous section is repeated, after removal of the Barn Swallow samples.
⦁ PCoAs: For the purposes of graphical figures, the principal coordinate scores are calculated at each taxonomic level.
⦁ Frequency of Occurrence (FOO): The frequency of occurrence is calculated for each taxonomic level for each species. Results are reported in Figures 2 and 3 of the main text and, in full, in Supplementary Table 2.
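The replicate-selection and richness steps listed above can be sketched in Python as follows. This is an independent illustration, not a translation of PRN_RScript_2024.R. The tuple layout, the function names, and the tie-breaking rule (keep the lowest-numbered replicate when richness is equal) are assumptions, and the detections are invented for the example.

```python
from collections import defaultdict


def richest_replicates(detections):
    """Return {sample_id: (replicate, richness)} for the richest replicate.

    detections: iterable of (sample_id, replicate, taxon, genus) tuples.
    Richness is the number of distinct genera detected in a replicate.
    """
    genera = defaultdict(set)
    for sample_id, replicate, _taxon, genus in detections:
        genera[(sample_id, replicate)].add(genus)
    best = {}
    for (sample_id, replicate), found in sorted(genera.items()):
        richness = len(found)
        # Strict '>' keeps the lowest-numbered replicate on ties (assumption).
        if sample_id not in best or richness > best[sample_id][1]:
            best[sample_id] = (replicate, richness)
    return best


def genus_richness_by_taxon(detections, kept):
    """Count distinct genera per taxon (e.g. order) in each kept replicate."""
    genera = defaultdict(lambda: defaultdict(set))
    for sample_id, replicate, taxon, genus in detections:
        if kept[sample_id][0] == replicate:
            genera[sample_id][taxon].add(genus)
    return {s: {t: len(g) for t, g in taxa.items()} for s, taxa in genera.items()}


# Invented detections for illustration (NOT taken from Diet2022.csv).
detections = [
    ("100A", 1, "Diptera", "Culex"),
    ("100A", 1, "Diptera", "Aedes"),
    ("100A", 1, "Coleoptera", "Agriotes"),
    ("100A", 2, "Diptera", "Culex"),
    ("101A", 1, "Diptera", "Chironomus"),
    ("101A", 2, "Diptera", "Tipula"),
]
kept = richest_replicates(detections)
matrix_rows = genus_richness_by_taxon(detections, kept)
print(kept)
print(matrix_rows)
```

Each entry of matrix_rows corresponds to one row of a diet matrix in which columns are taxa and cells are genus counts, in the spirit of the "matrix_diet" object described above.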
This R script was written by Pat Nancekivell. Any inquiries regarding the function or commenting of the code should be directed to Pat Nancekivell at pnanceki@unb.ca.
Supplementary files include 2 figures (Supplementary_Figure_1.pdf, Supplementary_Figure_2.pdf), their associated captions (Supplementary_Figure_Captions.docx), and all supplementary tables (Supplementary_Tables.docx).
⦁ Supplementary_Figure_1.pdf: Number of fecal samples obtained from each species by date of sampling, with reference to theoretical breeding phenology given by literature (Harrison 1978, Peck and James 1987, McCabe 1991, Imlay et al. 2018, Brown and Brown 2020, Brown et al. 2020). More information is given in "Supplementary_Figure_Captions.docx".
⦁ Supplementary_Figure_2.pdf: Principal coordinate analyses (scaling = 1) of the Bray-Curtis dissimilarity matrix of dipteran families detected in avian fecal samples via DNA metabarcoding. Samples are colour-scaled by (A) time in minutes from civil twilight to time of sample, and (B) ordinal date of fecal collection. More information is given in "Supplementary_Figure_Captions.docx".
⦁ Supplementary_Figure_Captions.docx: A single document containing thumbnails and figure captions for supplementary figures 1 and 2, as well as the supplementary literature cited.
⦁ Supplementary_Tables.docx: A single document containing supplementary tables 1 and 2, as well as their table captions. Supp. Table 1 describes the number of fecal samples from each consumer species by site and habitat classification. Supp. Table 2 gives the frequency of occurrence (FOO) for all invertebrate genera, families, and orders detected in the fecal samples of each consumer species. Further details are described in the table captions within the file.
We caught adult birds of Hirundo rustica (Barn Swallow), Riparia riparia (Bank Swallow), Petrochelidon pyrrhonota (Cliff Swallow), Tachycineta bicolor (Tree Swallow), Empidonax minimus (Least Flycatcher), and E. alnorum (Alder Flycatcher) during the 2022 breeding season in New Brunswick, Canada. Birds defecated in paper bags and were then banded (with morphological data such as age, sex, and wing chord also recorded) and released.
Fecal samples were stored at -20 °C until packaged in 1.5 mL microcentrifuge tubes and sent to the Canadian Centre for DNA Barcoding (CCDB) at the University of Guelph (Guelph, ON) for DNA metabarcoding analysis. DNA fragments extracted from each sample were amplified using arthropod-specific primers described by Zeale et al. (2011), which target a 157 base-pair section of the mitochondrial cytochrome c oxidase subunit 1 (COI) gene. Two polymerase chain reaction (PCR) replicates were performed for each sample. The amplified DNA was sequenced alongside negative controls via next-generation sequencing and assigned identities by comparing reads to the Barcode of Life Data System (BOLD) reference library using the Basic Local Alignment Search Tool (BLAST) algorithm. Only sequences with at least a 95% match across 100 base pairs with the reference sequence and a minimum of 100 reads were considered confident detections. We only included detections with identification to genus or species in our analyses.
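The confidence criteria in the preceding paragraph can be written as a simple filter. The Python sketch below is illustrative only: the field names (percent_match, overlap_bp, read_count, genus) are hypothetical, the published Diet2022.csv is described as already filtered, and reading "across 100 base pairs" as a minimum alignment length of 100 bp is an assumption.

```python
MIN_PERCENT_MATCH = 95.0  # at least a 95% match to the reference sequence
MIN_OVERLAP_BP = 100      # assumed reading of "across 100 base pairs"
MIN_READS = 100           # a minimum of 100 reads


def is_confident(detection):
    """Apply the thresholds and require identification to at least genus."""
    return (
        detection["percent_match"] >= MIN_PERCENT_MATCH
        and detection["overlap_bp"] >= MIN_OVERLAP_BP
        and detection["read_count"] >= MIN_READS
        and detection["genus"] is not None
    )


# Invented detections for illustration (NOT real sequencing output).
candidates = [
    {"genus": "Culex", "percent_match": 99.2, "overlap_bp": 157, "read_count": 840},
    {"genus": "Aedes", "percent_match": 93.0, "overlap_bp": 157, "read_count": 500},
    {"genus": "Tipula", "percent_match": 98.0, "overlap_bp": 157, "read_count": 42},
    {"genus": None, "percent_match": 99.0, "overlap_bp": 157, "read_count": 300},
]
confident = [d for d in candidates if is_confident(d)]
print([d["genus"] for d in confident])
```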
Statistical analyses included calculation of frequency of occurrence (FOO) and Bray-Curtis dissimilarity matrices at the level of invertebrate order, family, and dipteran family. We then used distance-based redundancy analyses to assess the effects of consumer species, Julian date of sampling, time from civil twilight, and breeding habitat type on the dietary dissimilarity matrices.
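For readers unfamiliar with the two summary measures named above, the Python sketch below computes a Bray-Curtis dissimilarity between two samples and a frequency of occurrence across samples. It uses the standard Bray-Curtis formula, the sum of absolute differences divided by the sum of all counts, and expresses FOO as a proportion of samples. The authors' own calculations were done in R with the vegan package; the function names and the example counts here are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("dissimilarity is undefined when both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def frequency_of_occurrence(samples):
    """Proportion of samples in which each taxon was detected.

    samples: list of sets, one set of detected taxa per fecal sample.
    """
    if not samples:
        return {}
    counts = {}
    for taxa in samples:
        for taxon in taxa:
            counts[taxon] = counts.get(taxon, 0) + 1
    return {taxon: n / len(samples) for taxon, n in counts.items()}


# Invented genus counts per taxon for two samples (NOT real data).
sample_a = [3, 1, 0]
sample_b = [1, 1, 2]
dissimilarity = bray_curtis(sample_a, sample_b)

# Invented detections by order for four samples (NOT real data).
foo = frequency_of_occurrence([
    {"Diptera", "Hemiptera"},
    {"Diptera"},
    {"Coleoptera"},
    {"Diptera", "Coleoptera"},
])
print(dissimilarity, foo)
```

In this example the two samples differ by 4 genus counts out of 8 in total, giving a dissimilarity of 0.5, and Diptera occurs in three of four samples, giving a FOO of 0.75.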
