The asynchronous rise of Northern Hemisphere alpine floras reveals general responses of biotic assembly to orogeny and climate change
Data files
Dec 03, 2025 version files 3.08 GB
- Biogeographic_analysis.zip (3.07 GB)
- Connectivity_analysis.zip (5.49 MB)
- GenBank_accession.zip (145.94 KB)
- Plots.zip (20.66 KB)
- README.md (10.65 KB)
Abstract
Understanding how biotic assembly processes responded to past geo-climatic changes is key to explaining the origins of mountain biodiversity and the causes of regional disparities in species richness. Here, we jointly reconstructed geographic ranges and biome-niche evolution for 34 diverse plant clades across five major Northern Hemisphere mountain systems and quantified how late Neogene cooling increased arctic-alpine habitat connections across regions. We reveal that, while alpine floras originated asynchronously and were assembled through distinct evolutionary processes over the past 30 Ma, general biological responses to orogeny and environmental change are apparent. Across regions, in situ diversification was consistently elevated during heightened phases of tectonic activity. Over the past 5 Ma, enhanced arctic–alpine connectivity facilitated biotic interchange and positioned the boreal–arctic region as a major biogeographic crossroads linking Eurasia and North America.
Dataset DOI: 10.5061/dryad.8931zcs30
Description of the data and file structure
The data archive comprises complementary components, including
- Curated GenBank accession tables organized by clade and gene region;
- Joint biogeographic and diversification analyses (geographic‐range & biome delimitations, RevBayes scripts);
- Cold-habitat connectivity assessments (paleoclimate-informed dispersal matrices and global connectivity metrics).
Files and variables
File: Biogeographic_analysis.zip
Description: This folder contains the input data (delimitations of geographical range and biome, time-calibrated phylogenies), the RevBayes scripts for the biogeographic model that jointly estimates range and biome evolution, and the outputs from ancestral biogeographic reconstructions and bootstrap replicates.
The directory tree structure is shown below:
├── events-bootstrap1000.csv.gz
├── MaxRange3
│   ├── 2_parse-tensor-json.py
│   ├── 3_bootstrap_parse.py
│   ├── area-adjacency.csv
│   ├── bitstates.py
│   ├── clades
│   │   ├── Alchemilla
│   │   │   ├── Alchemilla.smap.log
│   │   │   └── inputs
│   │   │       ├── 01_make-revscript-tensor.py
│   │   │       ├── Alchemilla.nex
│   │   │       ├── Alchemilla_classe_tensor.rev
│   │   │       ├── area-adjacency.csv
│   │   │       ├── biorange.csv
│   │   │       ├── bitstates.py
│   │   │       ├── classe-tensor-template.rev
│   │   │       └── model4.py
│   │   ├── Allium
│   │   │   ├── Allium.smap.log
│   │   │   └── inputs
│   │   │       ├── 01_make-revscript-tensor.py
│   │   │       ├── Allium.nex
│   │   │       ├── Allium_classe_tensor.rev
│   │   │       ├── area-adjacency.csv
│   │   │       ├── biorange.csv
│   │   │       ├── bitstates.py
│   │   │       ├── classe-tensor-template.rev
│   │   │       └── model4.py
│   │   ├── ...
├── MaxRange4
│   ├── 2_parse-tensor-json.py
│   ├── 3_bootstrap_parse.py
│   ├── area-adjacency.csv
│   ├── bitstates.py
│   ├── clades
│   │   ├── Artemisia
│   │   │   ├── Artemisia.smap.log
│   │   │   └── inputs
│   │   │       ├── 01_make-revscript-tensor.py
│   │   │       ├── Artemisia.nex
│   │   │       ├── Artemisia_classe_tensor.rev
│   │   │       ├── area-adjacency.csv
│   │   │       ├── biorange.csv
│   │   │       ├── bitstates.py
│   │   │       ├── classe-tensor-template.rev
│   │   │       └── model4.py
│   │   └── Stipa
│   │       ├── Stipa.smap.log
│   │       └── inputs
│   │           ├── 01_make-revscript-tensor.py
│   │           ├── Stipa.nex
│   │           ├── Stipa_classe_tensor.rev
│   │           ├── area-adjacency.csv
│   │           ├── biorange.csv
│   │           ├── bitstates.py
│   │           ├── classe-tensor-template.rev
│   │           └── model4.py
│   └── model4.py
├── p20
│   ├── events-bootstrap1000_p20.csv.gz
│   └── selected_clades
│       ├── Campanula
│       │   ├── 01_make-revscript-tensor.py
│       │   ├── Campanula.nex
│       │   ├── Campanula_classe_tensor.rev
│       │   ├── area-adjacency.csv
│       │   ├── biorange.csv
│       │   ├── bitstates.py
│       │   ├── classe-tensor-template.rev
│       │   └── model4.py
│       └── Saxifraga
│           ├── 01_make-revscript-tensor.py
│           ├── Saxifraga.nex
│           ├── Saxifraga_classe_tensor.rev
│           ├── area-adjacency.csv
│           ├── biorange.csv
│           ├── bitstates.py
│           ├── classe-tensor-template.rev
│           └── model4.py
└── p50
    ├── events-bootstrap1000_p50.csv.gz
    └── selected_clades
        ├── Campanula
        │   ├── 01_make-revscript-tensor.py
        │   ├── Campanula.nex
        │   ├── Campanula_classe_tensor.rev
        │   ├── area-adjacency.csv
        │   ├── biorange.csv
        │   ├── bitstates.py
        │   ├── classe-tensor-template.rev
        │   └── model4.py
        └── Saxifraga
            ├── 01_make-revscript-tensor.py
            ├── Saxifraga.nex
            ├── Saxifraga_classe_tensor.rev
            ├── area-adjacency.csv
            ├── biorange.csv
            ├── bitstates.py
            ├── classe-tensor-template.rev
            └── model4.py
events-bootstrap1000.csv.gz: Compressed table containing all reconstructed biogeographic events from 1,000 stochastic mappings across 34 plant clades, including time, range states, event type, and the region-level binary encodings needed to reproduce all event-based analyses in the manuscript.
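Because the event table is large, it can be convenient to stream it rather than load it whole. The following is a minimal sketch using only the Python standard library; it assumes the file is a comma-separated table with a header row, and the column name `event_type` used in the usage note is a hypothetical placeholder, so check the actual header first.

```python
import csv
import gzip
from collections import Counter


def count_by_column(path, column):
    """Stream a gzipped CSV and tally the values found in one column."""
    counts = Counter()
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early with the real header if the assumed column is absent.
        if column not in (reader.fieldnames or []):
            raise KeyError(f"{column!r} not in header: {reader.fieldnames}")
        for row in reader:
            counts[row[column]] += 1
    return counts
```

For example, `count_by_column("events-bootstrap1000.csv.gz", "event_type")` would tally events by type, provided the table really has a column with that name.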
These subdirectories (MaxRange3/, MaxRange4/, p20/, p50/) represent different settings for the maximum biogeographic occupancy, used to ensure the computational feasibility of the tensor-based biogeographic model. Each subdirectory contains the input clades and the associated scripts used for parsing, running, and summarizing the analyses.
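To illustrate why capping the maximum range size keeps the model tractable, the sketch below enumerates presence/absence ranges with at most a given number of occupied areas. It is not the archive's `bitstates.py`; the function names and the choice of eight areas are assumptions made for illustration only, and biome states are ignored.

```python
from itertools import combinations
from math import comb


def enumerate_ranges(n_areas, max_range_size):
    """Return every non-empty presence/absence tuple with at most
    max_range_size occupied areas."""
    states = []
    for size in range(1, max_range_size + 1):
        for occupied in combinations(range(n_areas), size):
            states.append(tuple(1 if i in occupied else 0 for i in range(n_areas)))
    return states


def n_ranges(n_areas, max_range_size):
    """Number of allowed ranges without building them."""
    return sum(comb(n_areas, k) for k in range(1, max_range_size + 1))


if __name__ == "__main__":
    # n_areas=8 is a hypothetical value, not taken from the dataset.
    for k in (3, 4):
        print(k, n_ranges(8, k))
```

With eight hypothetical areas, a cap of three gives 92 ranges and a cap of four gives 162, compared with 255 when no cap is applied; the gap widens quickly as areas are added.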
Each subdirectory includes:
- clades/ folder containing the output (stochastic mapping *.smap.log file) and the required inputs for each clade:
  - the dated phylogenetic tree (*.nex)
  - the range–biome binary matrix (biorange.csv)
  - the .rev script automatically generated by 01_make-revscript-tensor.py
- Supporting files required to run the RevBayes analyses:
  - template script classe-tensor-template.rev
  - custom modules bitstates.py and model4.py
  - adjacency matrix of geographic regions (area-adjacency.csv)
Scripts for parsing and bootstrap replication
Each subdirectory also includes the scripts used to parse and process tensor outputs:
- 2_parse-tensor-json.py
  Parses the RevBayes tensor JSON outputs into standardized event tables.
- 3_bootstrap_parse.py
  Performs random sampling to obtain 1,000 replicated biogeographic histories, generating bootstrap summaries of events across clades under each range-size constraint.
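One plausible reading of the random-sampling step is that each replicate combines one randomly drawn stochastic mapping per clade. The sketch below shows that idea under this assumption; it is not the archive's `3_bootstrap_parse.py`, and the function name, data layout, and seed handling are hypothetical.

```python
import random


def bootstrap_histories(mappings_by_clade, n_replicates=1000, seed=0):
    """Build replicate histories by drawing one stochastic mapping
    (a list of events) per clade, with replacement across replicates."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    replicates = []
    for _ in range(n_replicates):
        events = []
        # Sort clade names so the result does not depend on dict order.
        for clade in sorted(mappings_by_clade):
            events.extend(rng.choice(mappings_by_clade[clade]))
        replicates.append(events)
    return replicates
```

Here `mappings_by_clade` is assumed to map a clade name to a list of mappings, each mapping being a list of event records parsed from that clade's output.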
Sensitivity analysis subdirectories: p20/ and p50/
The p20/ and p50/ folders follow the same structure described above but contain analyses in which Campanula and Saxifraga were subsampled to 20% (p20/) or 50% (p50/) of their total species diversity.
These reduced datasets were used for sensitivity analyses to evaluate the robustness of biogeographic inferences to sampling fraction.
Each of these folders includes the corresponding compressed tables of reconstructed biogeographic events from 1,000 stochastic mappings under the reduced sampling schemes.
File: Connectivity_analysis.zip
Description: This directory contains paleoclimate-based connectivity layers generated at 1-million-year intervals from 0 to 35 Ma, used to quantify long-term changes in the structural connectedness of alpine and cold-adapted habitats across the Northern Hemisphere.
Each time slice contains four types of files:
- *_temp.tif: reprojected paleotemperature raster (annual mean temperature)
- *_alpine_biome.tif: binary raster of alpine biome pixels
- *_cost_250-links.*: Graphab output files describing least-cost links between alpine-biome patches (shapefile + CSV adjacency tables)
- *_quickplot.png: quick-look diagnostic map for visual inspection
The final file, connectivity_over_time.rds, stores the probability of connectivity (PC) index calculated for all 35 million-year intervals.
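For readers unfamiliar with the PC index, the sketch below computes it in its commonly stated form, PC = Σ_i Σ_j a_i a_j p_ij / A², with p_ij = exp(-α d_ij). It assumes that patch capacities and pairwise least-cost path distances have already been extracted from the link tables; the function name, the parameter α, and the reference area A are illustrative assumptions, not the settings used in the study.

```python
import math


def probability_of_connectivity(areas, distances, alpha, total_area):
    """Compute PC = sum_i sum_j a_i * a_j * p_ij / A^2.

    areas      -- patch capacities a_i
    distances  -- square matrix of path distances d_ij between patches
                  (use math.inf for unconnected pairs)
    alpha      -- decay rate of dispersal probability with distance
    total_area -- reference area A of the study region
    """
    n = len(areas)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # A patch is always fully connected to itself.
            p_ij = 1.0 if i == j else math.exp(-alpha * distances[i][j])
            total += areas[i] * areas[j] * p_ij
    return total / total_area ** 2
```

The index rises toward 1 as patches grow or as distances between them shrink, which is why it can track changes in habitat connectedness across time slices.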
File: Plots.zip
Description: This directory contains scripts and intermediate processed files used to generate the figures included in the main text. These scripts read the biogeographic event tables, summarize temporal dynamics, and produce rate-through-time curves, cumulative dispersal plots, and event-type visualizations.
All Python scripts were run using Python 3.9, with required packages listed at the top of each script. The R script was run using R 4.4, with required packages loaded within the script. These package imports provide the information needed to reproduce the analyses; no additional environment files are required.
- Fig1_plot_nestedPieChat.py
  Generates the nested pie-chart visualizations summarizing the relative contributions of different biogeographic event types across mountain systems (Figure 1).
- Fig2B_rate_emigration.py
  Computes emigration (out-of-region) rates through time for all clades and produces the rate-through-time curves used in Figure 2B.
- Fig2B_rate_immigration.py
  Computes immigration (into-region) rates through time and generates the complementary rate-through-time curves for Figure 2B.
- Fig2C_dispersal_cumsum_A.csv
  Pre-processed summary table of dispersal events used for cumulative dispersal calculations in Figure 2C. Range and biome abbreviations shown in the columns matric, from and to follow the definitions provided in the main text and Supplementary Materials.
- Fig2C_dispersal_cumsum.py
  Calculates cumulative dispersal events over time for each region and generates the CSV file Fig2C_dispersal_cumsum_A.csv.
- Fig2C_sumBarchart.R
  R script for summarizing and visualizing net immigration or emigration as bar charts, providing the categorical summaries for Figure 2C.
- Fig3_insitu_rate_all.py
  Estimates in situ speciation rates across mountain regions, generating the rate-through-time curves displayed in Figure 3.
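The temporal summaries produced by these scripts rest on binning dated events and accumulating them toward the present. The sketch below shows that basic operation only; it is not taken from the archive's scripts, the function names and default bin settings are assumptions, and a true rate would also need normalizing (for example by the number of lineages alive in each bin).

```python
from itertools import accumulate


def events_per_bin(event_ages, max_age=35, bin_width=1):
    """Count events in [k*bin_width, (k+1)*bin_width) Ma bins.

    The returned list is ordered from the youngest bin (index 0)
    to the oldest; ages outside [0, max_age) are ignored.
    """
    n_bins = int(max_age // bin_width)
    counts = [0] * n_bins
    for age in event_ages:
        if 0 <= age < max_age:
            counts[int(age // bin_width)] += 1
    return counts


def cumulative_toward_present(counts):
    """Running total of events from the oldest bin to the youngest,
    returned in the same youngest-first order as the input."""
    return list(accumulate(reversed(counts)))[::-1]
```

Applying `cumulative_toward_present(events_per_bin(ages))` to the dispersal ages of one region gives the kind of curve that a cumulative dispersal plot displays.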
File: GenBank_accession.zip
Description: This directory contains all curated GenBank accession numbers for the molecular sequences used in the phylogenetic reconstruction for this study. For each clade, the most commonly sequenced gene regions (e.g., ITS, matK, rbcL, petD, trnL–trnF) were selected to maximize taxonomic coverage.
Each .csv file corresponds to one major clade or lineage included in the analyses. Columns represent the species names and the accession numbers for each gene region, with missing sequences indicated by “–”.
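A quick way to check taxonomic coverage in one of these tables is to compute, for each gene region, the fraction of species with an accession. The sketch below assumes the first column holds species names and every other column is a gene region; the function name is hypothetical, and empty cells and hyphens are treated as missing alongside "–" as a precaution.

```python
import csv

# Markers treated as "no sequence"; "–" is the one named in the description.
MISSING = {"", "-", "–"}


def coverage_by_region(handle):
    """Return {gene_region: fraction of species with an accession}."""
    reader = csv.reader(handle)
    header = next(reader)
    regions = header[1:]  # assumes column 0 is the species name
    present = [0] * len(regions)
    n_species = 0
    for row in reader:
        if not row:
            continue  # skip blank lines
        n_species += 1
        for k, cell in enumerate(row[1:len(header)]):
            if cell.strip() not in MISSING:
                present[k] += 1
    if n_species == 0:
        return {region: 0.0 for region in regions}
    return {region: present[k] / n_species for k, region in enumerate(regions)}
```

A typical call would be `coverage_by_region(open(path, newline="", encoding="utf-8"))`, where `path` points to one clade's table.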
