Data from: Phylogenomic analysis of BAC-end sequence libraries in Oryza (Poaceae)
Cranston, Karen A. et al. (2010), Data from: Phylogenomic analysis of BAC-end sequence libraries in Oryza (Poaceae), Dryad, Dataset, https://doi.org/10.5061/dryad.1611
Analyses of genome-scale data sets are beginning to clarify the phylogenetic relationships of species with complex evolutionary histories. Broad sampling across many genes allows both the construction of large concatenated data sets that improve genome-scale phylogenetic resolution and the independent analysis of gene trees and detection of phylogenetic incongruence. Recent sequencing projects in Oryza sativa and its wild relatives have positioned rice as a model system for such “phylogenomic” studies. We describe the assembly of a phylogenomic data set from 800,000 bacterial artificial chromosome (BAC) end sequences, producing an alignment of 2.4 million nucleotides for 10 diploid species of Oryza. A supermatrix approach confirms the broad outline of previous phylogenetic studies, although non-phylogenetic signal and high levels of missing data must be handled carefully. Phylogenetic analysis of 12 chromosomes and nearly 2000 genes finds strikingly high levels of incongruence across different genomic scales, a result that is likely to apply to other low-level phylogenies in plants. We conclude that there is great potential for phylogenetic inference using data from next-generation sequencing protocols but that attention to the methodological issues that inevitably arise in these data sets is critical.