Dryad

Data from: Evolution and biogeography of Memecylon

Cite this dataset

Amarasinghe, Prabha et al. (2022). Data from: Evolution and biogeography of Memecylon [Dataset]. Dryad. https://doi.org/10.5061/dryad.ksn02v72w

Abstract

Premise

The woody plant group Memecylon (Melastomataceae) is a large clade occupying diverse forest habitats in the Old World tropics and exhibiting high regional endemism. Its phylogenetic relationships have been previously studied using ribosomal DNA with extensive sampling from Africa and Madagascar. However, divergence times, biogeography, and character evolution of Memecylon remain uninvestigated. We present a phylogenomic analysis of Memecylon to provide a broad evolutionary perspective of this clade.

Methods

One hundred supercontigs of 67 Memecylon taxa were harvested from target enrichment. The data were subjected to coalescent and concatenated phylogenetic analyses. A timeline was provided for Memecylon evolution using fossils and secondary calibration. The calibrated Memecylon phylogeny was used to elucidate its biogeography and ancestral character states.

Results

Relationships recovered by the phylogenomic analyses are strongly supported in both maximum likelihood and coalescent-based species trees. Memecylon is inferred to have originated in Africa in the Eocene and subsequently dispersed predominantly eastward via long-distance dispersal (LDD), although a reverse dispersal from South Asia westward to the Seychelles was postulated. Morphological data exhibited high levels of homoplasy, but also showed that several vegetative and reproductive characters were phylogenetically informative.

Conclusions

The current distribution of Memecylon appears to be the result of multiple ancestral LDD events. Our results demonstrate the importance of the combined effect of geographic and paleoclimatic factors in shaping the distribution of this group in the Old World tropics. Memecylon includes a number of evolutionarily derived morphological features that contribute to diversity within the clade.

Methods

Silica-dried tissue samples of Memecylon leaves were collected in the field, mainly from Sri Lanka, India, Singapore, Thailand, the Philippines, Indonesia, the Seychelles, Madagascar, and South Africa. Additional samples of Memecylon and an outgroup, Mouriri, were taken from herbarium specimens. Total genomic DNA was extracted following a modified CTAB extraction protocol (Jantzen et al., 2020). Target capture was employed to enrich genomic regions of interest for the rest of the samples. All accessions were processed according to a workflow described in detail in Jantzen et al. (2020). Cleaned target-capture reads were assembled with the Burrows-Wheeler alignment version of HybPiper v.1.3.1 using the template sequences used for probe design as references (Johnson et al., 2016). Genome skims were assembled using a modified HybPiper script. The postprocessing scripts in HybPiper were used to retrieve intron regions flanking targeted exons and the supercontigs (Johnson et al., 2016). The supercontigs were individually aligned using MAFFT v7.215 (Katoh and Standley, 2013). The alignments were trimmed using trimAl v1.2 (Capella-Gutiérrez et al., 2009). Gene trees were generated for each of the individual gene alignments using maximum-likelihood (ML) analysis performed with RAxML v8.2.12 (Stamatakis, 2014), assuming a GTRGAMMA model. Genes (i.e., 100 supercontigs) were concatenated, and a partition scheme was provided using Phyx v8.2.0 (Brown et al., 2017). ML analysis of the concatenated dataset was performed with the same RAxML parameters as above. To infer a coalescent species tree, the 100 optimal gene trees generated from RAxML were input to ASTRAL-III v5.0.3 (Zhang et al., 2018). As informative branch lengths are required for downstream analyses, the resulting ASTRAL-III topology was used as a constraint in the ML analysis. A time-calibrated phylogeny was produced using two fossils and two secondary calibrations from Berger et al. (2016). The calibrated Memecylon phylogeny was used to elucidate its biogeography and ancestral character states.
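
The concatenation step can be illustrated with a minimal sketch. This is not the authors' Phyx command; it is a plain-Python stand-in, under the assumption that each locus is an aligned FASTA file and that the partition scheme uses the RAxML-style `DNA, name = start-end` format. The function names (`parse_fasta`, `concatenate`) and the toy loci are hypothetical. Taxa missing from a locus are padded with gaps so that every row of the supermatrix has the same length.

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {taxon: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the taxon label.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {taxon: "".join(chunks) for taxon, chunks in records.items()}


def concatenate(
    genes: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[str]]:
    """Concatenate per-locus alignments into one supermatrix.

    Returns the supermatrix and a list of RAxML-style partition lines.
    Taxa absent from a locus are filled with '-' for that locus.
    """
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[str] = []
    start = 1
    for gene_name, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene_name}: sequences are not aligned")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "-" * length))
        end = start + length - 1
        partitions.append(f"DNA, {gene_name} = {start}-{end}")
        start = end + 1
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy example with two hypothetical loci and three hypothetical taxa.
toy_genes = {
    "locus1": parse_fasta(">A\nACGT\n>B\nAC-T\n"),
    "locus2": parse_fasta(">A\nGGG\n>C\nGTG\n"),
}
toy_matrix, toy_partitions = concatenate(toy_genes)
for line in toy_partitions:
    print(line)
```

The partition lines record where each locus starts and ends in the supermatrix, which is what allows a partitioned ML analysis to treat each locus separately.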

Usage notes

In the alignment of low-copy nuclear loci, the sequences of M. fruticosum, M. elaeagni, M. lateriflorum, M. lanceolatum, M. magnifoliatum, and M. symplociforme contain a significant amount of missing data because target capture had low success for these samples. These samples were nevertheless retained in the analysis, because some of them showed important relationships in the resulting phylogeny. In the morphological data matrix, some data are missing because the corresponding specimens were sterile.
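
Users who want to check how much data is missing per sample before reusing the alignments could start from a sketch like the following. It assumes an alignment already loaded as a mapping from taxon label to aligned sequence, and it counts `-`, `?`, and `N` as missing; note that `-` can also represent a genuine indel, so the result is an upper bound. The function names, the 0.5 threshold, and the toy labels are hypothetical and not part of the dataset.

```python
from typing import Dict, List

# Characters treated as missing; '-' may also be a true indel (assumption).
MISSING_CHARS = set("-?Nn")


def missing_fraction(sequence: str) -> float:
    """Return the fraction of alignment columns that are missing for one taxon."""
    if not sequence:
        return 1.0
    missing = sum(1 for ch in sequence if ch in MISSING_CHARS)
    return missing / len(sequence)


def flag_sparse_taxa(alignment: Dict[str, str], threshold: float = 0.5) -> List[str]:
    """List taxa whose missing fraction exceeds the (arbitrary) threshold."""
    return sorted(
        taxon
        for taxon, sequence in alignment.items()
        if missing_fraction(sequence) > threshold
    )


# Toy alignment with hypothetical sample labels.
toy_alignment = {
    "sample_a": "ACGTACGT",
    "sample_b": "AC------",
    "sample_c": "NNNN??GT",
}
for taxon in flag_sparse_taxa(toy_alignment):
    print(taxon, round(missing_fraction(toy_alignment[taxon]), 2))
```

Flagged samples need not be discarded; as noted above, the sparse samples in this dataset were kept because they were informative about relationships.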

Funding

American Society of Plant Taxonomists

Botanical Society of America

International Association for Plant Taxonomy

University of Florida Biodiversity Institute

Mildred Mason Griffith award, Department of Biology, University of Florida

Davis Graduate Fellowships, Department of Biology, University of Florida

DBT Grant Project, Award: A3: DBT⁄PR12720⁄COE⁄34⁄21⁄2015