
Data from: Sustained plumage divergence despite weak genomic differentiation and broad sympatry in sister species of Australian woodswallows (Artamus spp.)

Cite this dataset

Peñalba, Joshua; Peters, Jeffrey; Joseph, Leo (2022). Data from: Sustained plumage divergence despite weak genomic differentiation and broad sympatry in sister species of Australian woodswallows (Artamus spp.) [Dataset]. Dryad.


Plumage divergence can function as a strong premating barrier when species come into secondary contact. When it fails to do so, the results are often genome homogenization and phenotypic hybrids at the zone of contact. This is not the case in the largely sympatric masked woodswallow and white-browed woodswallow species complex (Passeriformes: Artamidae: Artamus spp.) in Australia, where phenotypic integrity is sustained despite no discernible mitochondrial structure in earlier work. This lack of structure may suggest recent divergence, ongoing gene flow, or both, and phenotypic hybrids are reported, albeit rarely. Here, we further assessed the population structure and differentiation across the species' nuclear genomes using ddRAD-seq. As found in the mitochondrial genome, no structure or divergence within or between the two species was detected in the nuclear genome. This coarse sampling of the genome nonetheless revealed peaks of differentiation around the genes SOX5 and Axin1. Both are involved in the Wnt/β-catenin signaling pathway, which regulates feather development. Reconstruction of demographic history and estimation of parameters support a scenario of secondary contact. Our study informs how divergent plumage morphs may arise and be sustained despite whole-genome homogenization and reveals new candidate genes potentially involved in plumage divergence.


This dataset resulted from ddRAD-seq using the restriction enzymes SbfI and EcoRI, followed by size selection between 300 and 450 bp and 150 bp single-end sequencing on an Illumina HiSeq 2500 platform. Reads were processed using the pipeline described by DaCosta and Sorenson (2014). For each sample, identical reads were combined into a single read and recoded with the number of reads and the highest quality score for each nucleotide position. Reads with an average Phred score of <20 were removed. Retained reads from all individuals were clustered into putative loci using USEARCH v. 5, with an -id setting of 0.85, and aligned using MUSCLE v. 3. Individuals were genotyped at each locus as described in DaCosta and Sorenson (2014): homozygotes were defined when >93% of the reads were identical, whereas heterozygotes were defined when a second sequence was represented by >29% of reads, or when a second sequence was represented by as few as 10% of reads and the haplotype was confirmed in other individuals. Genotypes were flagged if none of these criteria were met or if more than two haplotypes met the criteria. For these flagged genotypes, we retained the allele represented by the majority of reads and scored the second allele as missing data. Similarly, the second allele was scored as missing when the locus was represented by <5 reads. We retained all loci that contained ≤10% missing genotypes and ≤5% flagged genotypes.
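The genotyping thresholds and locus filter described above can be illustrated with a minimal Python sketch. This is not the DaCosta and Sorenson (2014) pipeline itself. The function names (`call_genotype`, `keep_locus`), the input layout (a dictionary of per-haplotype read counts for one individual at one locus), the order in which the rules are checked, and the reading of a "missing genotype" as an individual with no reads are all assumptions made for illustration.

```python
def call_genotype(read_counts, confirmed=frozenset()):
    """Return (allele1, allele2, flagged) for one individual at one locus.

    read_counts: dict mapping haplotype sequence -> read count (hypothetical layout).
    confirmed:   haplotypes already observed in other individuals (assumption:
                 supplied by the caller).
    """
    total = sum(read_counts.values())
    if total == 0:
        return (None, None, False)          # no data for this individual

    # Rank haplotypes by read count (ties broken alphabetically).
    ranked = sorted(read_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    major, major_n = ranked[0]

    # Fewer than 5 reads: keep the majority allele, second allele missing.
    if total < 5:
        return (major, None, False)

    # Candidate second alleles: >29% of reads, or >=10% and seen elsewhere.
    seconds = [h for h, n in ranked[1:]
               if n / total > 0.29 or (n / total >= 0.10 and h in confirmed)]

    if len(seconds) > 1:                    # more than two haplotypes qualify
        return (major, None, True)
    if len(seconds) == 1:                   # heterozygote
        return (major, seconds[0], False)
    if major_n / total > 0.93:              # homozygote
        return (major, major, False)
    return (major, None, True)              # no criterion met: flagged


def keep_locus(calls):
    """Keep a locus with <=10% missing and <=5% flagged genotypes.

    Assumption: a genotype counts as missing when no allele was called.
    Integer arithmetic avoids floating-point edge cases at the thresholds.
    """
    n = len(calls)
    missing = sum(1 for a1, _, _ in calls if a1 is None)
    flagged = sum(1 for _, _, f in calls if f)
    return n > 0 and missing * 10 <= n and flagged * 20 <= n
```

For example, an individual with 17 reads of one haplotype and 3 of another (15%) would be flagged under this sketch unless the minor haplotype had been confirmed in other individuals, in which case it would be called heterozygous.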