Data from: Patterns and evolution of nucleotide landscapes in seed plants

Cite this dataset

Serres-Giardi, Laurana; Belkhir, Khalid; David, Jacques; Glémin, Sylvain (2012). Data from: Patterns and evolution of nucleotide landscapes in seed plants [Dataset]. Dryad.


Nucleotide landscapes, that is, the way base composition is distributed along a genome, vary strongly among species. The underlying causes of this variation have been much debated. Though mutational bias and selection were initially invoked, GC-biased gene conversion (gBGC), a recombination-associated process favoring G and C over A and T bases, is increasingly recognized as a major factor. Compared with vertebrates, the evolution of GC content is less well understood in plants. Most studies have focused on the GC-poor and homogeneous Arabidopsis genome and the much more GC-rich and heterogeneous rice (Oryza sativa) genome, and this contrast has often been generalized as a dicot/monocot dichotomy. This view is clearly phylogenetically biased and does not allow the mechanisms involved in GC-content evolution in plants to be understood. To tackle these issues, we used EST data from more than 200 species to provide the most comprehensive description of gene GC content across the seed plant phylogeny available so far. In contrast to the classically assumed dicot/monocot dichotomy, we found continuous variation in GC content, from the probably ancestral GC-poor and homogeneous genomes to the more derived GC-rich and highly heterogeneous ones, with several independent enrichment episodes. Our results suggest that gBGC could play a significant role in the evolution of GC content in plant genomes.

Usage notes