Effect of different types of sequence data on palaeognath phylogeny
Data files
Mar 28, 2023 version files 731.03 MB
- CDS_c12_all.tar.gz
- CDS_c12_norecomb_xgG.tar.gz
- CDS_c12_norecomb_xsC.tar.gz
- CDS_c12_norecomb_xsCxgG.tar.gz
- CDS_c12_norecomb.tar.gz
- CDS_c12_xgG.tar.gz
- CDS_c12_xsC.tar.gz
- CDS_c12_xsCxgG.tar.gz
- CDS_c123_all.tar.gz
- CDS_c123_norecomb_xgG.tar.gz
- CDS_c123_norecomb_xsC.tar.gz
- CDS_c123_norecomb_xsCxgG.tar.gz
- CDS_c123_norecomb.tar.gz
- CDS_c123_xgG.tar.gz
- CDS_c123_xsC.tar.gz
- CDS_c123_xsCxgG.tar.gz
- CDS_c3_all.tar.gz
- CDS_c3_norecomb_xgG.tar.gz
- CDS_c3_norecomb_xsC.tar.gz
- CDS_c3_norecomb_xsCxgG.tar.gz
- CDS_c3_norecomb.tar.gz
- CDS_c3_xgG.tar.gz
- CDS_c3_xsC.tar.gz
- CDS_c3_xsCxgG.tar.gz
- CNEE_all.tar.gz
- CNEE_norecomb_all.tar.gz
- CNEE_norecomb_xgG.tar.gz
- CNEE_norecomb_xsC.tar.gz
- CNEE_norecomb_xsCxgG.tar.gz
- CNEE_xgG.tar.gz
- CNEE_xsCxgG.tar.gz
- CNEEs_xsC.tar.gz
- intron_all.tar.gz
- intron_norecomb_all.tar.gz
- intron_norecomb_xgG.tar.gz
- intron_norecomb_xsC.tar.gz
- intron_norecomb_xsCxgG.tar.gz
- intron_xgG.tar.gz
- intron_xsC.tar.gz
- intron_xsCxgG.tar.gz
- README.md
- UCE_all.tar.gz
- UCE_norecomb_xgG.tar.gz
- UCE_norecomb_xsC.tar.gz
- UCE_norecomb_xsCxgG.tar.gz
- UCE_norecomb.tar.gz
- UCE_xgG.tar.gz
- UCE_xsC.tar.gz
- UCE_xsCxgG.tar.gz
Abstract
Palaeognathae consists of five groups of extant species: flighted tinamous (1) and four flightless groups: kiwi (2), cassowaries and emu (3), rheas (4), and ostriches (5). Molecular studies supported the groupings of extinct moas with tinamous and of elephant birds with kiwi, as well as ostriches as the group that diverged first among the five groups. However, phylogenetic relationships among the five groups are still controversial. Previous studies showed extensive heterogeneity in estimated gene tree topologies from conserved nonexonic elements, introns, and ultraconserved elements. Using the noncoding loci together with protein-coding loci, this study investigated the factors that affected gene tree estimation error and the relationships among the five groups. Using closely related ostrich rather than distantly related chicken as the outgroup, concatenated and gene tree–based approaches supported rheas as the group that diverged first among groups (1)–(4). Whereas gene tree estimation error increased using loci with low sequence divergence and short length, topological bias in estimated trees occurred using loci with high sequence divergence and/or nucleotide composition bias and heterogeneity, and this bias occurred more often in trees estimated from coding loci than from noncoding loci. Regarding the relationships of (1)–(4), the site patterns by parsimony criterion appeared less susceptible to the bias than tree construction assuming a stationary time-homogeneous model and suggested that the clustering of kiwi with cassowaries and emu was the most likely, with ∼40% support, rather than the clustering of kiwi with rheas or that of kiwi with tinamous, with 30% support each.
Methods
Sequence data of Cloutier et al. (2019) and Sackton et al. (2019) were downloaded from Dryad Digital Repository.
From the data of Cloutier et al. (2019), intron and UCE loci in which branch lengths from the common ancestral node of palaeognaths excluding ostrich were 5 times longer than those of the concatenated sequences were excluded.
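The branch-length filter described above can be sketched as follows. This is only an illustration, not the authors' script: it assumes the branch lengths from the common ancestral node to each species have already been extracted into dictionaries keyed by species name, and it reads "5 times longer" as a strict "greater than 5 times" comparison. The function name `exceeds_concatenated` and the `factor` parameter are hypothetical.

```python
def exceeds_concatenated(locus_lengths, concat_lengths, factor=5.0):
    """Return True if any species' branch length in this locus is more than
    `factor` times the corresponding length from the concatenated sequences.

    Both arguments are dicts mapping species name -> branch length measured
    from the common ancestral node (an assumed, pre-computed input).
    """
    return any(
        length > factor * concat_lengths[species]
        for species, length in locus_lengths.items()
        if species in concat_lengths  # species missing from the reference are skipped
    )
```

A locus for which this function returns True would be dropped under the assumptions stated above.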
From the expanded data of Sackton et al. (2019), loci containing all five groups of palaeognaths [(1) kiwi, (2) cassowary and emu, (3) rheas, (4) tinamous and moa, (5) ostrich] and chicken were extracted. Loci in which there were long branches (> 1) to species from the common ancestral node of palaeognaths excluding the outgroup ostrich (NCA; see fig. 1a) and in which the branch length of a species was 5 times longer than those of other species in the same group were also excluded. To detect positively selected loci, the likelihood ratio test was carried out for site models M1a (nearly neutral) vs. M2a (positive selection) and M8 (positive selection) vs. M8a (dN/dS = 1) by using codeml of PAML 4.9j (Yang 2007). When either the likelihood ratio test or the BEB method gave a result significant at the 1% level, the locus was excluded.
For the remaining loci from Cloutier et al. (2019) and Sackton et al. (2019), recombination was detected by using 3SEQ version 1.7 build 170612 (Lam et al. 2018). Loci in which recombination was detected between any pair of the five groups of palaeognaths and chicken were excluded.
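The likelihood ratio test between nested codeml site models can be sketched as below. The description above does not state which reference distributions were used, so this sketch assumes the commonly used chi-square distributions with 2 degrees of freedom for M1a vs. M2a and 1 degree of freedom for M8a vs. M8. The function names and the `beb_significant` flag are hypothetical, and the log-likelihoods would come from codeml output.

```python
import math

def lrt_pvalue(lnl_null, lnl_alt, df):
    """P-value of a likelihood ratio test against a chi-square distribution.

    Uses closed forms of the chi-square survival function, so only
    df = 1 and df = 2 are supported in this sketch.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))  # test statistic 2 * delta lnL
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")

def exclude_locus(lnl_m1a, lnl_m2a, lnl_m8a, lnl_m8,
                  beb_significant=False, alpha=0.01):
    """Flag a locus for exclusion if either LRT or the BEB result is significant."""
    p_m2a = lrt_pvalue(lnl_m1a, lnl_m2a, df=2)  # assumed df = 2
    p_m8 = lrt_pvalue(lnl_m8a, lnl_m8, df=1)    # assumed df = 1
    return p_m2a < alpha or p_m8 < alpha or beb_significant
```

For example, a gain of 5 log-likelihood units from M1a to M2a gives a test statistic of 10 and a p-value of exp(-5), which is below the 1% threshold.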
Cloutier A, Sackton TB, Grayson P, Clamp M, Baker AJ, Edwards SV. 2019. Whole-genome analyses resolve the phylogeny of flightless birds (Palaeognathae) in the presence of an empirical anomaly zone. Syst Biol 68:937-955.
Sackton TB, Grayson P, Cloutier A, Hu Z, Liu JS, Wheeler NE, Gardner PP, Clarke AJ, Baker AJ, Clamp M, Edwards SV. 2019. Convergent regulatory evolution and loss of flight in paleognathous birds. Science 364:74-78.
Usage notes
The archive files of the sequence data were created with the tar command.
All data are in text format and can be opened with a text editor.
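As an alternative to the tar command, a gzip-compressed archive can be unpacked with Python's standard `tarfile` module. This is a minimal sketch: the helper name `extract_archive` and the destination directory are hypothetical, and the internal layout of the archives is not assumed.

```python
import tarfile
from pathlib import Path

def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the sorted file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest, **kwargs)
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

For instance, `extract_archive("CDS_c12_all.tar.gz", "CDS_c12_all")` would unpack that archive into a directory of the same base name and list the text files it contains.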