Data from: ‘In and out of’ the Qinghai-Tibet Plateau and the Himalayas: centers of origin and diversification compared across five clades of Eurasian montane and alpine passerine birds
Data files
Aug 14, 2020 version files 1.96 MB
Abstract
Encompassing some of the major hotspots of biodiversity on Earth, large mountain systems have long held the attention of evolutionary biologists. The region of the Qinghai-Tibet Plateau (QTP) is considered a biogeographic source for multiple colonization events into adjacent areas including the northern Palearctic. The faunal exchange between the QTP and adjacent regions could thus represent a one-way street (‘out of’ the QTP). However, immigration into the QTP region has so far received little attention, despite its potential to shape faunal and floral communities of the QTP. In this study, we investigated centers of origin and dispersal routes between the QTP, its forested margins and adjacent regions for five clades of alpine and montane birds of the passerine superfamily Passeroidea (Johansson et al., 2008; Selvatti et al., 2015). We performed an ancestral area reconstruction using BioGeoBEARS and inferred a time-calibrated backbone phylogeny for 279 taxa of Passeroidea. The oldest endemic species of the QTP was dated to the early Miocene (ca. 18 Ma). Several additional QTP endemics evolved in the mid to late Miocene (12–7 Ma). The inferred centers of origin and diversification for some of our target clades matched the ‘out of Tibet hypothesis’ or the ‘out of Himalayas hypothesis’; for others, they matched the ‘into Tibet hypothesis’. Three radiations included multiple independent Pleistocene colonization events to regions as distant as the Western Palearctic and the Nearctic. We conclude that faunal exchange between the QTP and adjacent regions was bidirectional through time, and the QTP region has thus harbored both centers of diversification and centers of immigration.
Methods
Our sequence data set included four loci: the mitochondrial cytochrome-b (cytb) and NADH dehydrogenase subunit 2 (ND2), as well as the nuclear introns ornithine decarboxylase (ODC) intron 7 and myoglobin (myo) intron 2. Sequences for target taxa were compiled from previous studies. We added missing species and filled gaps in sequence coverage with newly generated sequences. To have all families of Passeroidea represented by at least one species, we included further sequences from GenBank. Newly generated sequences were joined with sequences obtained from previous studies and sorted using the R package ‘ape’. Sequences were aligned with the stand-alone version of MAFFT 7.273 with automatic selection of the appropriate alignment strategy, the scoring matrix set to '200PAM / κ=2', and the gap opening penalty set to 1.53 (with a gap extension penalty of 0.123). The resulting sequence alignments were manually checked for errors. The final data set included 281 taxa.
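The manual check of the alignments can be complemented by a simple automated sanity check. The following is a minimal sketch, assuming the alignments are available in FASTA format (the file format is not stated here); the function names `read_fasta` and `alignment_summary` and the toy records are hypothetical and only illustrate the idea of counting taxa and confirming that all aligned sequences have the same length.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: everything after '>' is used as the taxon name.
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            # Sequence line: may be wrapped over several lines.
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_summary(records):
    """Report the number of taxa and whether all sequences share one length."""
    lengths = {len(seq) for seq in records.values()}
    return {
        "n_taxa": len(records),
        "aligned": len(lengths) == 1,
        "lengths": sorted(lengths),
    }


# Toy alignment with hypothetical taxon names (not data from this study).
example = ">taxon_1\nATG-CA\n>taxon_2\nATGGCA\n>taxon_3\nAT--CA\n"
summary = alignment_summary(read_fasta(example))
print(summary)
```

Applied to each of the four alignments, a check of this kind would flag sequences of unequal length and allow the number of taxa per locus to be compared with the taxon list.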
Phylogenetic inference and divergence-time estimation were performed with BEAST v1.8.2. We partitioned our dataset in accordance with the best-fitting partitioning scheme that resulted from PartitionFinder v1.1.1, with all site models unlinked and one clock model for each gene, resulting in a total of eight site models and four clock models. We linked all tree models to one tree model, and a birth-death tree prior was applied. As initial condition for the BEAST run, we supplied a starting tree calculated with RAxML v8.2.0. The search for the best-known likelihood tree was performed with 100 replicates; bootstrap values obtained from a thorough bootstrap run using the autoMRE option were annotated onto the best-known likelihood tree. All RAxML analyses were performed using the same partitioning scheme as applied for the BEAST analyses, with the GTRGAMMA model applied to all partitions. The BEAST run was performed using the BEAGLE v2.3 library with a chain length of 1.1 × 10^8 generations, with trees being sampled every 10,000 generations. Trees were summarized with TreeAnnotator v1.8.2, where median heights were annotated to the maximum clade credibility (MCC) tree. We used eight calibration points in order to obtain estimates for node ages. Additionally, we applied a normal prior to the root age to avoid the occurrence of implausibly old root ages.
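As a quick worked example of what these MCMC settings imply, the sketch below computes the approximate number of trees in the posterior sample, reading the chain length as 1.1 × 10^8 generations with one tree sampled every 10,000 generations. The burn-in fraction is not stated in this description, so the 10% used here is purely a hypothetical value for illustration.

```python
def sampled_trees(chain_length, sample_every):
    """Number of sampling steps along the chain (initial state not counted)."""
    return chain_length // sample_every


chain_length = 110_000_000   # 1.1 x 10^8 generations, as described above
sample_every = 10_000        # one tree sampled every 10,000 generations

n_trees = sampled_trees(chain_length, sample_every)

# Hypothetical burn-in fraction; the value actually used is not given here.
burnin_fraction = 0.10
n_after_burnin = n_trees - int(n_trees * burnin_fraction)

print(n_trees, n_after_burnin)
```

Under these settings the chain yields about 11,000 sampled trees, which is the pool from which a summary tree such as the MCC tree would be built after discarding burn-in.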
We used the R package BioGeoBEARS for ancestral range reconstruction. The maximum clade credibility tree obtained from BEAST was used as the input tree (we kept the waxwing, Bombycilla garrulus, as the closest outgroup of Passeroidea and pruned all further outgroups used for fossil calibration from the input tree). We fitted the dispersal, extinction and cladogenesis (DEC) model to the time-calibrated Passeroidea tree (for further details, see Appendix S5). We used a “dispersal multipliers” matrix that allowed dispersal between all areas: dispersal between adjacent areas was favored (1.0), dispersal between non-adjacent areas interconnected by a third one (in a land continuum) was slightly penalized (0.5), and long-distance dispersal between very distant continents (e.g. Europe and the New World) was strongly penalized (0.01).
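The weighting scheme of the dispersal multipliers matrix can be illustrated with a small sketch. The code below is not the matrix used in the study (that is provided among the data files); it assumes a hypothetical set of four areas, A to D, and derives multipliers from an adjacency map: 1.0 for adjacent areas, 0.5 for areas connected through one intermediate area, and 0.01 for everything else. Treating all remaining pairs as long-distance, and setting the diagonal to 1.0, are assumptions made for this illustration.

```python
from collections import deque


def hop_distances(adjacency, start):
    """Breadth-first search: number of area-to-area steps from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        area = queue.popleft()
        for neighbour in adjacency[area]:
            if neighbour not in dist:
                dist[neighbour] = dist[area] + 1
                queue.append(neighbour)
    return dist


def dispersal_multipliers(adjacency):
    """Build a {(source, destination): multiplier} table from an adjacency map."""
    areas = sorted(adjacency)
    matrix = {}
    for src in areas:
        dist = hop_distances(adjacency, src)
        for dst in areas:
            hops = dist.get(dst)  # None if the areas are not connected by land
            if hops is not None and hops <= 1:
                value = 1.0   # same area or directly adjacent
            elif hops == 2:
                value = 0.5   # connected through one intermediate area
            else:
                value = 0.01  # assumed long-distance dispersal
            matrix[(src, dst)] = value
    return areas, matrix


# Hypothetical areas: A-B-C form a land continuum, D is a distant continent.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
areas, matrix = dispersal_multipliers(adjacency)

for src in areas:
    print(src, [matrix[(src, dst)] for dst in areas])
```

Written out as a square, whitespace-separated table with the areas in the same order as in the area matrix, such values correspond to the kind of dispersal multiplier matrix listed among the data files.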
Usage notes
This dataset includes:
- 4 sequence alignments of cytochrome-b, NADH dehydrogenase subunit 2, ornithine decarboxylase (ODC) intron 7 and myoglobin (myo) intron 2
- the list of taxa included in the study including information on origin of samples and GenBank accession numbers
- the time-calibrated maximum clade credibility tree of Passeroidea derived from Bayesian phylogenetic inference with BEAST
- the area matrices for 279 taxa used for biogeographic analyses with BioGeoBEARS (with 8 and 9 areas, respectively)
- the dispersal multiplier matrices used for biogeographic analyses with BioGeoBEARS (with 8 and 9 areas, respectively)