Rapid in situ diversification rates in Rhamnaceae explain the parallel evolution of high diversity in temperate biomes from global to local scales
Data files
Jan 16, 2024 version files 120.13 MB
Abstract
The macroevolutionary processes that have shaped biodiversity across the temperate realm remain poorly understood and may have resulted from evolutionary dynamics related to diversification rates, dispersal rates, and colonization times, closely coupled with Cenozoic climate change.
We integrated phylogenomic, environmental ordination, and macroevolutionary analyses for the cosmopolitan angiosperm family Rhamnaceae to disentangle the evolutionary processes that have contributed to high species diversity within and across temperate biomes.
Our results show independent colonization of environmentally similar but geographically separated temperate regions mainly during the Oligocene, consistent with the global expansion of temperate biomes. High global, regional, and local temperate diversity was the result of high in situ diversification rates, rather than high immigration rates or accumulation time, except for Southern China, which was colonized much earlier than other regions. The relatively common lineage dispersals out of temperate hotspots highlight strong source-sink dynamics across the cosmopolitan distribution of Rhamnaceae.
The proliferation of temperate environments since the Oligocene may have provided the ecological opportunity for rapid in situ diversification of Rhamnaceae across the temperate realm. Our study illustrates the importance of high in situ diversification rates for the establishment of modern temperate biomes and biodiversity hotspots across spatial scales.
README: Rapid in situ diversification rates in Rhamnaceae explain the parallel evolution of high diversity in temperate biomes from global to local scales
https://doi.org/10.5061/dryad.gxd2547sq
Dataset Summary:
This dataset accompanies the study "Rapid in situ diversification rates in Rhamnaceae explain the parallel evolution of high diversity in temperate biomes from global to local scales". It contains genetic data from 574 Rhamnaceae species, along with three species of Elaeagnaceae and one species each of Barbeyaceae and Dirachmaceae (Rosales).
Experimental Procedures and Results:
- Data Collection: Leaf material was gathered from multiple herbaria and the field. Total DNA extraction was performed via a modified CTAB method.
- Sequencing and Processing: Hybridization enrichment sequencing (Hyb-seq) was used to capture 100 low-copy nuclear genes. Raw sequenced reads were cleaned and filtered by trimming Illumina adapter sequence artifacts, discarding low-quality reads, and trimming low-quality read ends using TRIMMOMATIC v0.32. The processed nuclear reads were assembled using HybPiper v1.2. Genes missing from more than 75% of the sampled species were excluded, leaving 89 loci for further analysis.
- Sequence Alignment and Cleaning: The sequences of the 89 genes were aligned using MAFFT. To reduce errors in the alignments (i.e., gap-heavy and ambiguously aligned sites), the original alignment of each gene was cleaned using ‘pxclsq’ in phyx, removing alignment columns with < 30% occupancy.
Description of Data Structure:
The dataset consists of 89 cleaned alignments after processing.
Methods
We sampled 574 Rhamnaceae species, three species of Elaeagnaceae and one species each of Barbeyaceae and Dirachmaceae (Rosales). Leaf material was collected from the following herbaria: A, AD, BRI, CAS, F, KUN, MEL, MO, NY, OS, PERTH, TEX, and US as well as from the field. Total DNA was extracted using a modified CTAB method.
We used hybridization enrichment sequencing (Hyb-seq) to capture 100 low-copy nuclear genes. Raw sequenced reads were cleaned and filtered as follows: Illumina adapter sequence artifacts were trimmed, low-quality reads were discarded, and low-quality read ends were trimmed using TRIMMOMATIC v0.32. Assembly of the processed nuclear reads was performed using HybPiper v1.2. Genes missing from more than 75% of the sampled species were excluded, leaving 89 loci for further analysis.
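The locus-exclusion rule above can be illustrated with a minimal Python sketch. This is not the authors' script: the function name `filter_loci`, the input layout (a dictionary mapping each locus to the species for which a sequence was recovered), and the toy species and gene names are all assumptions made for illustration.

```python
def filter_loci(recovered, all_species, max_missing=0.75):
    """Keep loci that are missing from at most `max_missing` of the species.

    recovered   -- hypothetical dict: locus name -> iterable of species with a sequence
    all_species -- list of every sampled species
    """
    sampled = set(all_species)
    n_total = len(sampled)
    kept = {}
    for locus, species in recovered.items():
        n_missing = n_total - len(set(species) & sampled)
        # Exclude a locus only when it is missing from MORE than the threshold.
        if n_missing <= max_missing * n_total:
            kept[locus] = list(species)
    return kept


# Toy data (invented): five species, three loci.
all_species = ["sp1", "sp2", "sp3", "sp4", "sp5"]
recovered = {
    "geneA": ["sp1", "sp2", "sp3", "sp4"],  # missing from 20% of species
    "geneB": ["sp1"],                       # missing from 80% -> excluded
    "geneC": ["sp1", "sp2"],                # missing from 60% of species
}
kept = filter_loci(recovered, all_species)
print(sorted(kept))
```

In the study itself, this kind of rule reduced the 100 targeted genes to 89 loci.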
The sequences of each targeted gene region were initially aligned using MAFFT with default settings. To reduce errors in our alignments (i.e., gap-heavy and ambiguously aligned sites), we cleaned the original alignment of each gene using ‘pxclsq’ in phyx, removing alignment columns with < 30% occupancy. This yielded 89 cleaned alignments.
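To make the occupancy criterion concrete, the following Python sketch removes alignment columns in which fewer than 30% of the sequences carry a base. It only illustrates the idea and is not the phyx implementation. The set of characters treated as missing (`-`, `?`, `N`), the function names, and the toy alignment are assumptions.

```python
MISSING = set("-?Nn")  # assumed missing-data symbols; phyx may treat others differently


def column_occupancy(seqs):
    """Return, for each column, the fraction of sequences with a non-missing character."""
    rows = list(seqs.values())
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all sequences in an alignment must have the same length")
    return [sum(r[i] not in MISSING for r in rows) / len(rows) for i in range(length)]


def clean_alignment(seqs, min_occupancy=0.3):
    """Drop columns whose occupancy is below `min_occupancy`."""
    occupancy = column_occupancy(seqs)
    keep = [i for i, occ in enumerate(occupancy) if occ >= min_occupancy]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


# Toy alignment (invented): columns 3 and 4 have 25% occupancy and are removed.
toy = {
    "sp1": "ATG-C",
    "sp2": "AT--C",
    "sp3": "A---C",
    "sp4": "A--TC",
}
cleaned = clean_alignment(toy)
for name, seq in cleaned.items():
    print(name, seq)
```

A column is kept when its occupancy is at least the threshold, which matches the wording "removing alignment columns with < 30% occupancy".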