Data from: Whole genome duplication in coast redwood (Sequoia sempervirens) and its implications for explaining the rarity of polyploidy in conifers
Scott, Alison Dawn, University of Wisconsin-Madison
Stenz, Noah W. M., University of Wisconsin-Madison
Ingvarsson, Pär K., Umeå University
Baum, David A., University of Wisconsin-Madison
Published Feb 17, 2017 on Dryad.
Cite this dataset
Scott, Alison Dawn; Stenz, Noah W. M.; Ingvarsson, Pär K.; Baum, David A. (2017). Data from: Whole genome duplication in coast redwood (Sequoia sempervirens) and its implications for explaining the rarity of polyploidy in conifers [Dataset]. Dryad. https://doi.org/10.5061/dryad.7nb70
Polyploidy is common and an important evolutionary factor in most land plant lineages, but it is rare in gymnosperms. Coast redwood (Sequoia sempervirens) is one of just two polyploid conifer species and the only hexaploid. Evidence from fossil guard cell size suggests that polyploidy in Sequoia dates to the Eocene. Numerous hypotheses about the mechanism of polyploidy and parental genome donors have been proposed, based primarily on morphological and cytological data, but it remains unclear how Sequoia became polyploid and why this lineage overcame an apparent gymnosperm barrier to whole-genome duplication (WGD). We sequenced transcriptomes and used phylogenetic inference, Bayesian concordance analysis and paralog age distributions to resolve relationships among gene copies in hexaploid coast redwood and close relatives. Our data show that hexaploidy in coast redwood is best explained by autopolyploidy or, if there was allopolyploidy, it happened within the Californian redwood clade. We found that duplicate genes have more similar sequences than expected, given the age of the inferred polyploidization. Conflict between molecular and fossil estimates of WGD can be explained if diploidization occurred very slowly following polyploidization. We extrapolate from this to suggest that the rarity of polyploidy in gymnosperms may be due to slow diploidization in this clade.
This file contains filtered transcriptome assemblies for Sequoia sempervirens, Sequoiadendron giganteum, Metasequoia glyptostroboides, and Thuja occidentalis. The assemblies serve as input for our concordance analysis pipeline, located at https://github.com/nstenz/toca
This directory contains .fasta alignments and PAML (codeml) outputs. Files are split between a 2-copy directory and a 3-copy directory, according to the number of putative homeologous sequences found in hexaploid S. sempervirens.
Trinity assemblies for the four taxa, both with and without EviGene filtering.
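
For readers who want to inspect the .fasta alignments described above before running any analysis, the following is a minimal sketch of one way to load an alignment and check that all sequences have the same length. It is not part of the authors' pipeline. The function names (read_fasta, is_aligned) and the sequence labels in the toy example are hypothetical and chosen only for illustration; the actual headers and file names in the dataset may differ.

```python
from io import StringIO
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Keep only the first whitespace-delimited token as the identifier.
            header = line[1:].split()[0]
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[header] += line  # sequences may wrap across lines
    return records


def is_aligned(records: Dict[str, str]) -> bool:
    """Return True if every sequence has the same length (gaps included)."""
    return len({len(seq) for seq in records.values()}) <= 1


# Toy in-memory alignment; the labels below are hypothetical, not taken
# from the dataset.
toy = """>Ssem_copy1
ATG-CCGTA
>Ssem_copy2
ATGACCGTA
>Sgig
ATGAC
CGTA
"""
records = read_fasta(StringIO(toy))
```

To apply the same check to a file on disk, pass an open file handle instead of the StringIO object, for example `with open(path) as handle: records = read_fasta(handle)`, where `path` points to one of the alignments in the 2-copy or 3-copy directory.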