Dryad

Data from: Types, levels, and patterns of low-copy DNA sequence divergence, and phylogenetic implications, for Gossypium genome types

Data files

Oct 06, 2011 version files 436.49 KB

Abstract

To explore types, levels, and patterns of genetic divergence among diploid Gossypium (cotton) genomes, 780 cDNA, genomic DNA, and SSR loci were re-sequenced in Gossypium herbaceum (A1 genome), G. arboreum (A2), G. raimondii (D5), G. trilobum (D8), G. sturtianum (C1), and an outgroup, Gossypioides kirkii. Divergence among these genomes ranged from 7.32 polymorphic base pairs per 100 base pairs between Gossypioides kirkii and G. herbaceum (A1) to only 1.44 between G. herbaceum (A1) and G. arboreum (A2). SSR loci are the least conserved, with 12.71 polymorphic base pairs and 3.77 polymorphic sites per 100 base pairs, while ESTs are the most conserved, with 3.96 polymorphic base pairs and 2.06 polymorphic sites per 100 base pairs. SSR loci also exhibit the highest percentage of 'extended polymorphisms' (those spanning multiple consecutive nucleotides). The A genome lineage evolved particularly rapidly, and the D genome also showed accelerated evolution relative to the C genome. An unexpected asymmetry in mutation rates was found, with far more transitions than transversions in the D genome after its divergence from the ancestor it shares with the A genome. This large quantity of orthologous DNA sequence strongly supports a phylogeny in which A-C divergence is more recent than A-D divergence, a question of considerable importance because A-D polyploid formation was key to the evolution of the most productive and finest-quality cottons. Loci that are monomorphic within the A or D genome types but polymorphic between genome types may be of practical value for identifying locus-specific DNA markers in tetraploid cottons, including leading cultivars.
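The divergence measures named in the abstract can be illustrated with a minimal Python sketch. It rests on an assumption that the abstract does not state outright: that "polymorphic base pairs" counts every differing position in a pairwise alignment, while "polymorphic sites" counts each run of consecutive differing positions once (so an 'extended polymorphism' adds several base pairs but only one site). The function names and the toy sequences are hypothetical and are not taken from the dataset. Only the purine/pyrimidine rule for transitions and transversions is standard.

```python
# Illustrative sketch only; the counting rules below are assumptions,
# not the authors' documented procedure.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a, b):
    """Label a base substitution as a transition or a transversion."""
    a, b = a.upper(), b.upper()
    if a == b:
        raise ValueError("bases are identical; no substitution to classify")
    pair = {a, b}
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def divergence_summary(seq1, seq2):
    """Summarise divergence between two equal-length aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")

    polymorphic_bp = 0      # every differing position (assumed definition)
    polymorphic_sites = 0   # each run of differing positions counted once
    counts = {"transition": 0, "transversion": 0}
    in_run = False

    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y:
            in_run = False
            continue
        polymorphic_bp += 1
        if not in_run:
            polymorphic_sites += 1
            in_run = True
        # Only classify plain base substitutions; gaps or ambiguity
        # codes still count as polymorphic but are not classified.
        if x in "ACGT" and y in "ACGT":
            counts[classify_substitution(x, y)] += 1

    length = len(seq1)
    return {
        "polymorphic_bp_per_100": 100 * polymorphic_bp / length,
        "polymorphic_sites_per_100": 100 * polymorphic_sites / length,
        "transitions": counts["transition"],
        "transversions": counts["transversion"],
    }


if __name__ == "__main__":
    # Toy alignment: one single-base change and one two-base run.
    print(divergence_summary("AAAAAAAAAA", "AGAACTAAAA"))
```

In the toy alignment, three positions differ but two of them are adjacent, so under the assumed definitions the sketch reports 30 polymorphic base pairs and 20 polymorphic sites per 100 base pairs, with one transition (A to G) and two transversions (A to C, A to T).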