Data from: Diversification of R2R3-MYB transcription factors in the tomato family Solanaceae
Gates, Daniel J. et al. (2016), Data from: Diversification of R2R3-MYB transcription factors in the tomato family Solanaceae, Dryad, Dataset, https://doi.org/10.5061/dryad.d63t5
MYB transcription factors play an important role in regulating key plant developmental processes involving defense, cell shape, pigmentation, and root formation. Within this gene family, sequences containing an R2R3 MYB domain are the most abundant type and exhibit a wide diversity of functions. In this study, we identify 559 R2R3 MYB genes using whole genome data from four species of Solanaceae and reconstruct their evolutionary relationships. We compare the Solanaceae R2R3 MYBs to the well-characterized Arabidopsis thaliana sequences to estimate functional diversity and to identify gains and losses of MYB clades in the Solanaceae. We identify numerous R2R3 MYBs that do not appear closely related to Arabidopsis MYBs, and thus may represent clades of genes that have been lost along the Arabidopsis lineage or gained after the divergence of Rosid and Asterid lineages. Despite differences in the distribution of R2R3 MYBs across functional subgroups and species, the overall size of the R2R3 subfamily has changed relatively little over the roughly 50-million-year history of Solanaceae. We added our information regarding R2R3 MYBs in Solanaceae to other data and performed a meta-analysis to trace the evolution of subfamily size across land plants. The results reveal many shifts in the number of R2R3 genes, including a 54% increase along the angiosperm stem lineage. The variation in R2R3 subfamily size across land plants is weakly positively correlated with genome size and strongly positively correlated with total number of genes. The retention of such a large number of R2R3 copies over long evolutionary time periods suggests that they have acquired new functions and been maintained by selection. Discovering the nature of this functional diversity will require integrating forward and reverse genetic approaches on an -omics scale.