Genetic confirmation of an “Uncommon Mourningthroat” (Geothlypis philadelphia x G. trichas): a rare but persistent hybrid warbler
Data files
Jun 23, 2025 version files (147.20 KB total)
- for-dryad.zip (145.56 KB)
- README.md (1.64 KB)
Abstract
For much of the history of American ornithology, hybrid birds discovered outside of known hybrid zones have presented an identification challenge, because the likely parental species are typically inferred from plumage patterns, but have also been a source of insight into evolutionary processes. With the advance of modern genetics, the tools are now available to definitively identify hybrids for which genetic samples are available and, in many cases, to determine the identity and sex of each parent species as well as F1 versus backcross status. Typically, these kinds of hybrids are rare; however, one apparently common hybrid wood-warbler (family Parulidae) is the putative cross between Mourning Warblers (Geothlypis philadelphia) and Common Yellowthroats (G. trichas), which has been reported at least thirteen times between 1955 and 2024, though never confirmed genetically. Here we describe the results of genetic and plumage analyses used to identify a putative G. philadelphia x G. trichas hybrid captured in July 2017 in New York, USA. We used genotype likelihoods derived from whole-genome sequencing data of the putative hybrid and several species in Geothlypis and Oporornis (the most likely genera of the parents) to confirm the bird’s identity as an F1 hybrid between G. philadelphia and G. trichas. We sequenced one mitochondrial gene and used publicly available sequences of several related species to identify the maternal parent as G. philadelphia. The data and code presented here and in GenBank are intended to permit reproduction of the analyses that made up the study, including processing reads, making trees, and running PCA and admixture.
Dataset DOI: 10.5061/dryad.nvx0k6f3n
Description of the data and file structure
Data
- bam.list - Example input to PCAngsd; sample order is unchanged.
- bam.2.list - Example input to NGSAdmix; sample order is unchanged. Sample names in both BAM lists follow those found in NCBI BioProject PRJNA630247.
- moye.gl1.cov - The covariance matrix output by PCAngsd. The eigenvectors can be calculated using the eigen() command in R. The order of individuals is the same as in bam.list.
- moye_maf0.2_k2_gl1.txt - Admixture proportions of each individual. The order is the same as in bam.2.list.
- sanger1.ab1 - The forward Sanger trace.
- sanger2.ab1 - The reverse Sanger trace.
- geothlypis_coi_unaligned.fasta - The COI sequences from the hybrid and each Geothlypis/Oporornis species as downloaded from GenBank.
- geothlypis_coi_aligned.fasta - The alignment of the sequences above, trimmed to the maximum common length across all samples.
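The covariance matrix and admixture proportions carry no sample labels of their own, so they need to be paired with the BAM lists by row order. The following is a minimal Python sketch of that pairing and of the eigendecomposition, offered as an alternative to the eigen() route in R rather than as the authors' own workflow. It assumes NumPy is installed, that both matrices are whitespace-delimited text, and that each BAM list holds one path per line; the function names are hypothetical.

```python
import io
import os

import numpy as np


def read_sample_names(handle):
    """Return one label per non-empty line of a BAM list (file name minus extension)."""
    names = []
    for line in handle:
        line = line.strip()
        if line:
            names.append(os.path.splitext(os.path.basename(line))[0])
    return names


def pca_from_cov(cov):
    """Eigendecompose a symmetric covariance matrix, largest eigenvalue first."""
    cov = np.asarray(cov, dtype=float)
    values, vectors = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(values)[::-1]       # reorder to descending, as R's eigen() does
    values, vectors = values[order], vectors[:, order]
    explained = values / values.sum()      # share of total variance per axis
    return values, vectors, explained


def label_rows(names, matrix):
    """Pair each row of a per-individual matrix with its sample name, by position."""
    matrix = np.atleast_2d(np.asarray(matrix, dtype=float))
    if len(names) != matrix.shape[0]:
        raise ValueError("number of samples does not match number of rows")
    return dict(zip(names, matrix.tolist()))


if __name__ == "__main__":
    # Assumed file layout: run from the directory holding the unzipped data files.
    if os.path.exists("bam.list") and os.path.exists("moye.gl1.cov"):
        with open("bam.list") as fh:
            pca_names = read_sample_names(fh)
        values, vectors, explained = pca_from_cov(np.loadtxt("moye.gl1.cov"))
        pcs = label_rows(pca_names, vectors[:, :2])  # first two axes per sample
        for name, (pc1, pc2) in pcs.items():
            print(name, round(pc1, 4), round(pc2, 4))
```

The same label_rows helper can be applied to the admixture proportions by reading bam.2.list and moye_maf0.2_k2_gl1.txt instead. Note that the sign of each eigenvector is arbitrary, so axes may appear flipped relative to a plot made in R.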
Code
(All code is run on the command line; change the directories to your own.)
- trim-align-sort-mark-index.sh - Code that takes demultiplexed raw reads and a reference genome, then trims adapters, aligns to the reference, marks duplicates, and indexes the resulting bam file.
- gls-pca-admix.sh - Code to calculate genotype likelihoods from bam files, then run the PCA and admixture analyses.
- iqtree_geothlypis_coi.sh - Code that makes a COI tree from the alignment geothlypis_coi_aligned.fasta.
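The alignment passed to the tree script is described as trimmed to the maximum common length across all samples. The archive does not say how that trimming was done, so the Python sketch below shows only one plausible reading: keep the columns between the latest-starting and earliest-ending sequence, so that no sample has leading or trailing gaps. The function names and the use of "-" as the gap character are assumptions.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into {name: sequence}, using the first word of each header."""
    parts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(p) for n, p in parts.items()}


def common_span(records, gap_chars="-"):
    """Return (start, end) of the columns covered by every aligned sequence."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned to the same length")
    start, end = 0, lengths.pop()
    for seq in records.values():
        start = max(start, len(seq) - len(seq.lstrip(gap_chars)))  # leading gaps
        end = min(end, len(seq.rstrip(gap_chars)))                 # trailing gaps
    if start >= end:
        raise ValueError("no column is covered by every sequence")
    return start, end


def trim_alignment(records, gap_chars="-"):
    """Cut every sequence down to the shared span; internal gaps are kept."""
    start, end = common_span(records, gap_chars)
    return {name: seq[start:end] for name, seq in records.items()}


if __name__ == "__main__":
    # Toy alignment for illustration only; not data from the study.
    toy = io.StringIO(">hybrid\n--ACGTAC--\n>reference\nTTACGTACGG\n")
    for name, seq in trim_alignment(read_fasta(toy)).items():
        print(">" + name)
        print(seq)
```

Trimming to a shared span avoids terminal missing data that differs among samples purely because of how much sequence was deposited in GenBank.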