Conflicting phylogenetic signals in genomic data of the coffee family (Rubiaceae)
Data files
Feb 20, 2020 version files (11.16 MB)
- GetSeqByGeneType.py
- S1_File.nex
- S2_File.nex
- S3_File.nex
- S4_File.nex
- translator.py
Feb 21, 2020 version files (11.21 MB)
- GetSeqByGeneType.py
- README.pdf
- S1_File.nex
- S2_File.nex
- S3_File.nex
- S4_File.nex
- translator.py
Abstract
Reconstructions of phylogenetic relationships in the flowering plant family Rubiaceae, or the coffee family, have until now relied heavily on single or multi-gene data, primarily from the plastid compartment. With the availability of cost- and time-efficient techniques for generating complete genome sequences, the opportunity arises to resolve some of the relationships that have so far proven problematic. Here we contribute new data from 58 complete plastid genome sequences representing 55 of the 65 currently recognized tribes of the Rubiaceae. Also contributed are new data from the nuclear rDNA cistrons for the corresponding taxa. Phylogenetic analyses are conducted on two plastid data sets, one including data from the protein-coding genes only, with a total of 69,828 aligned characters, and a second where the protein-coding data are combined with an additional 25,666 aligned characters from non-coding regions, and on a nuclear rDNA data set including 6,045 aligned characters. Our results clearly show that simply adopting a “more characters” approach does not resolve the relationships in the Rubiaceae. More importantly, we identify conflicting phylogenetic signals in the data. Analyses of the same plastid data, treated as nucleotides or as codon-degenerated data, resolve and support conflicting topologies in the subfamily Cinchonoideae. As these analyses use the same data, we interpret the conflict as resulting from erroneous assumptions in the models used to reconstruct our phylogenies. Conflicting signals are also identified in the analyses of the plastid vs. the nuclear rDNA data sets. These analyses use data from different genomic compartments, with different inheritance patterns, and we interpret the conflicts as representing “real” conflicts, reflecting biological processes of the past.
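The character counts quoted in the abstract can in principle be checked against the deposited alignments, since NEXUS files normally declare their size in a DIMENSIONS command. The following is a minimal sketch of such a check, not part of the deposited scripts: the function name `nexus_dimensions` and the miniature example block are hypothetical, and which of the S1 to S4 files holds which data set is not stated here, so no file is assumed. From the figures above, the combined plastid data set would be expected to have 69,828 + 25,666 = 95,494 aligned characters.

```python
import re


def nexus_dimensions(text):
    """Return (ntax, nchar) as declared in the DIMENSIONS commands of NEXUS text.

    Values may be split across blocks (NTAX in a TAXA block, NCHAR in a
    CHARACTERS block), so all DIMENSIONS commands are scanned and the first
    value seen for each key is kept. A missing key is returned as None.
    """
    fields = {}
    for command in re.finditer(r"\bdimensions\b([^;]*);", text, flags=re.IGNORECASE):
        for key, value in re.findall(r"(\w+)\s*=\s*(\d+)", command.group(1)):
            fields.setdefault(key.lower(), int(value))
    if not fields:
        raise ValueError("no DIMENSIONS command found")
    return fields.get("ntax"), fields.get("nchar")


# Hypothetical miniature NEXUS block for illustration only;
# it is not taken from the deposited files.
example = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=3 NCHAR=8;
  FORMAT DATATYPE=DNA MISSING=? GAP=-;
  MATRIX
  taxon_a ACGTACGT
  taxon_b ACGTAC-T
  taxon_c ACGAACGT
  ;
END;
"""

ntax, nchar = nexus_dimensions(example)

# Expected size of the combined plastid data set, from the abstract's figures.
expected_combined = 69828 + 25666
```

To apply this to a downloaded file, read its contents into a string and pass that string to `nexus_dimensions`. The sketch reads only the declared dimensions and does not validate the matrix itself.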