Dryad

Data from: Exploring tree-like and non-tree-like patterns using genome sequences: an example using the inbreeding plant species Arabidopsis thaliana (L.) Heynh.

Cite this dataset

Stenz, Noah; Larget, Bret; Baum, David A.; Ané, Cécile (2015). Data from: Exploring tree-like and non-tree-like patterns using genome sequences: an example using the inbreeding plant species Arabidopsis thaliana (L.) Heynh. [Dataset]. Dryad. https://doi.org/10.5061/dryad.q044d

Abstract

Genome sequence data contain abundant information about genealogical history, but methods for extracting and interpreting this information are not yet fully developed. We analyzed genome sequences for multiple accessions of the selfing plant, Arabidopsis thaliana, with the goal of better understanding its genealogical history. As expected from accessions of the same species, we found much discordance between nuclear gene trees. Nonetheless, we inferred the optimal population tree under the assumption that all discordance is due to incomplete lineage sorting. To cope with the size of the data (many genes and many taxa), our pipeline is based on parallel computing and divides the problem into four-taxon trees. However, just because a population tree can be estimated does not mean that the assumptions of the multispecies coalescent model hold. Therefore, we implemented a new, nonparametric test to evaluate whether a population tree adequately explains the observed quartet frequencies (the frequencies of gene trees with each resolution of each four-taxon set). This test also considers other models: panmixia and a partially resolved population tree, that is, a tree in which some nodes are collapsed into local panmixia. We found that a partially resolved population tree provides the best fit to the data, providing evidence for tree-like structure within A. thaliana, qualitatively similar to what might be expected between different, closely related species. Further, we show that the pattern of deviation from expectations can be used to identify instances of introgression and detect one clear case of reticulation among ecotypes that have come into contact in the United Kingdom. Our study illustrates how we can use genome sequence data to evaluate whether phylogenetic relationships are strictly tree-like or reticulating.
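
As an illustration of the quartet frequencies described in the abstract, the following minimal Python sketch tallies how often each of the three resolutions of a single four-taxon set appears among gene trees, and computes the frequencies expected under the standard coalescent result for four taxa (the resolution matching the population tree has probability 1 - (2/3)e^(-t), and each of the other two has probability (1/3)e^(-t), where t is the internal branch length in coalescent units). The function names, the resolution labels such as "ab|cd", and the example counts are assumptions made for illustration only. This is not the authors' pipeline, it is not taken from the dataset, and it does not implement their nonparametric test.

```python
from collections import Counter
from math import exp

# The three possible unrooted resolutions of a four-taxon set {a, b, c, d}.
# These labels are hypothetical and used only for this sketch.
RESOLUTIONS = ("ab|cd", "ac|bd", "ad|bc")


def observed_quartet_frequencies(gene_tree_resolutions):
    """Return the fraction of gene trees showing each resolution of one four-taxon set."""
    counts = Counter(gene_tree_resolutions)
    total = sum(counts[r] for r in RESOLUTIONS)
    if total == 0:
        raise ValueError("no resolved gene trees for this four-taxon set")
    return {r: counts[r] / total for r in RESOLUTIONS}


def expected_quartet_frequencies(major, t):
    """Expected frequencies under the coalescent for one four-taxon set.

    `major` is the resolution displayed by the population tree and `t` is the
    internal branch length in coalescent units. With t = 0 every resolution
    has frequency 1/3, which corresponds to panmixia for this four-taxon set.
    """
    if major not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {major}")
    if t < 0:
        raise ValueError("branch length must be non-negative")
    minor = exp(-t) / 3.0
    return {r: (1.0 - 2.0 * minor if r == major else minor) for r in RESOLUTIONS}


# Hypothetical example: 10 gene trees restricted to one four-taxon set.
example = ["ab|cd"] * 6 + ["ac|bd"] * 2 + ["ad|bc"] * 2
observed = observed_quartet_frequencies(example)
expected = expected_quartet_frequencies("ab|cd", t=0.5)
```

Comparing `observed` with `expected` across many four-taxon sets is the kind of input a goodness-of-fit test could use. Collapsing a node into local panmixia corresponds to setting `t = 0` for the four-taxon sets affected by that node.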

Usage notes