Data from: Complete genome sequences provide a case study for the evaluation of gene-tree thinking


Dikow, Rebecca B.; Smith, William Leo (2013), Data from: Complete genome sequences provide a case study for the evaluation of gene-tree thinking, Dryad, Dataset,


Complete genome sequences from a genus of Gammaproteobacteria, Shewanella, are used to generate a genome-wide exploration of the gene-tree species-tree dichotomy. A number of datasets were constructed and analyses were attempted. Single genes were chosen from 243 regions of collinear gene homology (128 of these 243 chosen genes are from the core Shewanella genome and 162 of 243 have the complete taxon sampling) from a previous study (Dikow, 2011) and subjected to phylogenetic analysis both individually and concatenated. In addition, three of the 243 sets of collinear genes from the core Shewanella genome were also chosen (comprising 15, 17, and 23 genes each) to be analysed in detail, this time to maximize the expectation of gene concordance. Analysis of these 55 genes in maximum parsimony (MP) and maximum likelihood (ML) produced 164 unique topologies (out of 166 resulting topologies). No genes from within collinear regions were congruent with one another, and none of these 164 topologies matches the result from concatenation. This result is particularly striking given that we chose collinear sets of genes. Analyses in MP and ML of 243 genes distributed across the genome produced 567 unique topologies (out of 571 resulting topologies for those 162 genes with complete taxon sampling). These results are discussed in light of recent works focused on incongruence. The gene as a phylogenetic unit is also discussed. It is our conclusion that molecular systematics has been reliant on the gene as a unit without a critical eye on the distinction between gene homology and character homology.
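
The counts of unique topologies reported above depend on deciding when two gene trees share the same branching pattern. The sketch below illustrates one common way to make that comparison and is not the authors' pipeline: each tree is reduced to its set of non-trivial bipartitions (splits) of the taxa, and two unrooted trees count as the same topology when their split sets are identical. The function names (`parse_newick`, `topology_key`, `count_unique_topologies`) and the four-taxon Newick strings are hypothetical, and the parser assumes simple, unquoted Newick labels. A real analysis would more likely rely on an established phylogenetics library.

```python
import re

PUNCT = {"(", ")", ",", ";"}


def parse_newick(text):
    """Parse a simple Newick string into nested tuples of leaf names.

    Branch lengths and internal node labels are ignored.
    Assumes unquoted labels (an assumption of this sketch).
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            children = [node()]
            while tokens[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # skip the closing ")"
            if pos < len(tokens) and tokens[pos] not in PUNCT:
                pos += 1  # skip an internal label and/or branch length
            return tuple(children)
        label = tokens[pos]
        pos += 1
        return label.split(":")[0].strip()  # drop any branch length

    return node()


def topology_key(newick):
    """Return (taxa, splits): a hashable summary of the unrooted topology."""
    clades = set()

    def leaves(n):
        if isinstance(n, str):
            return frozenset([n])
        clade = frozenset().union(*(leaves(child) for child in n))
        clades.add(clade)
        return clade

    taxa = leaves(parse_newick(newick))
    splits = set()
    for clade in clades:
        rest = taxa - clade
        # Keep only non-trivial splits: at least two taxa on each side.
        if len(clade) > 1 and len(rest) > 1:
            splits.add(frozenset([clade, rest]))
    return taxa, frozenset(splits)


def count_unique_topologies(newicks):
    """Count distinct unrooted topologies among a list of Newick strings."""
    return len({topology_key(n) for n in newicks})


# Hypothetical four-taxon example: the first three trees share one
# unrooted topology (AB|CD); the last groups the taxa differently (AC|BD).
example_trees = [
    "((A,B),(C,D));",
    "(A,(B,(C,D)));",
    "((A:0.1,B:0.2):0.05,(C:0.3,D:0.1):0.05);",
    "((A,C),(B,D));",
]
unique_count = count_unique_topologies(example_trees)
```

Comparing split sets ignores branch lengths and root placement, so trees that differ only in those respects are counted once. Including the taxon set in the key keeps trees with different taxon sampling from being treated as identical, which is why a comparison like this is only meaningful among genes with complete taxon sampling.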

Usage notes