Species delimitation beyond phylogenomics: integrative approaches reveal gentoo penguin speciation
Data files
May 19, 2023 version files 49.15 MB
- morphological_data.xlsx
- phylogenies_datasets.zip
- README.md
- SDM.zip
- submission_genbank.xlsx
Abstract
Isolation and adaptation to new environments are important steps for reproductive isolation and consequently speciation. Seabirds have low phenotypic variation along their ranges in the absence of clear geographic or environmental barriers to dispersal. Despite the lack of visible phenotypic differences, the number of taxa for the gentoo penguin (Pygoscelis papua, Forster 1781) in the Southern Ocean has been under debate for the last decade, ranging from one to six different taxa. Here, we provide several lines of evidence from genomics, ecology, morphological data, and a complete systematic review that support four distinctive gentoo penguin species, including the description of a new species. We also provide future niche projections for each of these species. Gentoo penguin genomes (n = 64) recover four main lineages: the northern gentoo (from South America), the southern gentoo (Antarctic Peninsula and maritime Antarctica, south of the Antarctic Polar Front, APF), the southeastern gentoo (from Kerguelen Islands), and the eastern gentoo (colonies located at lower latitudes north of the APF). Our analysis of selection across the genome recovered between 42 and 101 genes under selection for each of the four species, demonstrating that the four species are experiencing differing selective pressures that have caused them to diverge adaptively. These genes affect traits that include reproduction, thermoregulation, osmoregulation, feed efficiency, and morphological variation. Morphological data were taken from museum individuals of all lineages, including from South Georgia gentoos, which have previously been considered a distinct taxon.
Multivariate morphological comparisons of all pairs of lineages showed that the northern, southern, southeastern, and South Georgia gentoo penguins are morphologically distinct from each other (p < 0.05 for all pairwise comparisons), while the eastern lineage is intermediate in size and overlaps in morphospace with other lineages. This result also suggests that the pattern of body size across latitudes runs counter to Bergmann’s rule. Here, we describe the southeastern gentoo penguin from Kerguelen Island and confirm the taxonomic rank of gentoos from Macquarie Island and South Georgia Island as subspecies. Species distribution modelling suggests that climate change will expand the favourable space for the southern range expansion of the southern gentoo penguin but will result in a net loss of suitable habitats for compensatory niche shift relocation for the northern and southeastern gentoos. Despite this, amongst the three subantarctic species, the northern and southeastern gentoos possess high neutral and adaptive genetic diversity, including genes related to cold and heat response. This may represent a higher potential to evolve under environmental changes compared with the eastern gentoo penguin; therefore, the future resilience of each species remains uncertain. This study reinforces the urgent need for explicit recognition and protection of the four regional gentoo species based on their genetic, morphological, and ecological distinctiveness.
Methods
Whole genomes were resequenced from 60 gentoo penguins, and four GenBank genomes (Vianna et al. 2020) were included to increase the number of samples per breeding colony. The genome data set contains individuals distributed within the four main evolutionary lineages previously identified. This study contains several lines of evidence, from genomics, ecology, morphology, population trends, and future niche projections, that support the existence of four distinctive gentoo penguin species around the Southern Ocean.
The bioinformatic processing of the genomic data included removal of low-quality reads and adapters. Trimmed FASTQ reads were mapped to the reference genome (GenBank accession GCA_010090195.1) using BWA-mem2, and the resulting BAM files were processed using GATK. Variant calling was performed using bcftools mpileup/call and, after SNP processing, a FASTA consensus file was generated using bcftools consensus. From the FASTA files, coding sequences (CDS) were extracted with gffread using genomic coordinates from the reference genome. In addition, ultraconserved elements (UCEs) were extracted using PHYLUCE. Each genomic marker was aligned using MAFFT, and gene and species trees were constructed using the RAxML-ASTRAL III pipeline. Additionally, a subset of CDS was used to infer divergence times using StarBEAST2. This subset of genes was selected using the approach described by Koch (2021), which calculates multiple gene properties for a range of phylogenomic datasets and predicts their patterns of covariance, enabling the ordering of loci by their putative rate of evolution and their relative phylogenetic informativeness.
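As an illustration only, the mapping, variant-calling, consensus, and CDS-extraction steps described above could be chained roughly as in the following Python sketch. It only assembles the shell commands for one sample and does not run them. The function name `build_commands`, the file-naming scheme, the thread count, and every command-line option are assumptions made for this example and are not the settings used in the study. The GATK processing and SNP filtering steps are omitted because the exact tools and thresholds are not stated here. Indels are skipped at the calling step so that the consensus keeps reference coordinates, which is what allows the reference annotation to be reused by gffread.

```python
from __future__ import annotations

import shlex


def build_commands(sample: str, ref: str, gff: str, r1: str, r2: str,
                   threads: int = 4) -> list[str]:
    """Return shell commands for one sample (hypothetical names and options).

    Nothing is executed here; the caller decides how to run the commands.
    """
    q = shlex.quote
    bam = f"{sample}.sorted.bam"          # assumed naming scheme
    vcf = f"{sample}.snps.vcf.gz"
    consensus = f"{sample}.consensus.fa"
    cds = f"{sample}.cds.fa"
    return [
        # Index the reference, then map trimmed paired-end reads and sort.
        f"bwa-mem2 index {q(ref)}",
        f"bwa-mem2 mem -t {threads} {q(ref)} {q(r1)} {q(r2)} "
        f"| samtools sort -@ {threads} -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # (GATK processing of the BAM file would go here; details not given.)
        # Call variants, keeping SNPs only so reference coordinates are preserved.
        f"bcftools mpileup -Ou -f {q(ref)} {q(bam)} "
        f"| bcftools call -m -v -V indels -Oz -o {q(vcf)}",
        f"bcftools index {q(vcf)}",
        # Apply the called SNPs to the reference to obtain a consensus FASTA.
        f"bcftools consensus -f {q(ref)} -o {q(consensus)} {q(vcf)}",
        # Extract spliced CDS from the consensus using the reference annotation.
        f"gffread -x {q(cds)} -g {q(consensus)} {q(gff)}",
    ]


if __name__ == "__main__":
    # Dry run: print the commands for a hypothetical sample.
    for cmd in build_commands("sample01", "reference.fa", "reference.gff",
                              "sample01_R1.fastq.gz", "sample01_R2.fastq.gz"):
        print(cmd)
```

Printing the commands first makes it easy to inspect them before passing each one to a shell, for example with `subprocess.run(cmd, shell=True, check=True)`. The per-locus alignment with MAFFT and the tree-building steps would follow on the extracted CDS and UCE files.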
Differences in morphology were explored using linear morphometric measurements taken from museum individuals from different locations around the Southern Ocean, covering all the lineages and subpopulations.
Based on previous studies of niche modelling of gentoo penguin habitats, we used occurrence data and relevant bioclimatic variables defined by Pertierra et al. (2020) to perform future niche projections for each species.
Usage notes
BEAST2, FigTree