Imperiled wanderlust lichens in steppe habitats of western North America comprise geographically structured mycobiont lineages and a reversal to sexual reproduction within this asexual clade - Supplementary material
Data files
Nov 14, 2024 version files 1.90 GB
-
README.md
3.28 KB
-
S1_sample_locations_15oct2024.xlsx
23.59 KB
-
S2_idahoensis-rr3719type-SPAdes-scaffolds.fasta
260.33 MB
-
S3_individual-gene-alignment-v2.zip
10.69 MB
-
S4_HybPiper-Heatmap.pdf
439.76 KB
-
S5_busco_hybpiper_gblocks_individual_1615gene_trees.tre
6.91 MB
-
S6_BUSCO_concatenated-gblocks-supermatrix-v2.fasta
324.17 MB
-
S7_19mb_realyphy_50kbplusREF_alignment.phy
1.30 GB
-
S8_IQtree_BUSCOconcatenated_gblocks_supermatrix.txt
10.64 KB
-
S9_BUSCOastral_gblocks_output.txt
12.16 KB
-
Supplementary_file_S10_r1.pdf
833.53 KB
Abstract
The northern North American Cordillera is a globally significant center of endemism. In western North America, imperiled arid steppe habitats support a number of unique species, including several endemic lichens. However, processes driving diversification and endemism in this region remain unclear. In this study, we investigate diversity and phylogeography of the threatened wanderlust lichens (mycobiont = Rhizoplaca species) which occur unattached on calcareous soils in steppe habitats in western North America. Wanderlust lichens comprise three species of lichen-forming fungi (LFF) – Rhizoplaca arbuscula, R. haydenii, and R. idahoensis (endangered, IUCN Red List) – which occur in fragmented populations in Idaho and Wyoming, with more limited populations in southeastern Montana and northern Utah. These lichens reproduce almost exclusively via large, asexual vegetative propagules. Here, our aims were to (i) assess the evolutionary origin of this group and identify phylogeographic structure, (ii) infer ancestral geographic distributions for lineages within this clade, and (iii) use species distribution modeling to better understand the distribution of contemporary populations. Using a genome-skimming approach, we generated a 19.1 Mb alignment, spanning ca. half of the complete LFF genome, from specimens collected throughout the entire range of wanderlust lichens. Based on this phylogeny we investigated phylogeographic patterns using RASP. Finally, we used MaxEnt to estimate species distribution models for R. arbuscula and R. haydenii. We inferred a highly structured topology, with clades corresponding to distinct geographic regions and morphologies represented throughout the group’s distribution. We found that R. robusta, a sexually reproducing taxon, is clearly nested within this asexual lineage. Phylogeographic analyses suggest that both dispersal and vicariance played a significant role throughout the evolutionary history of the vagrant Rhizoplacaclade, with most of the dispersal events originating from the Salmon Basin in eastern Idaho – the center of diversity for this group. Despite the fact that wanderlust lichens are dispersal limited due to large, unspecialized vegetative propagules, we inferred multiple dispersal events crossing the Continental Divide. Comparing herbarium records with SDMs suggests that wanderlust lichens don’t fully occupy the areas of highest distribution probability. In fact, documented records often occur in areas predicted to be only marginally suitable. These data suggest a potential mismatch between contemporary habitats outside of the center of diversity in eastern Idaho with the most suitable habitat, adding to the vulnerability of this imperiled complex of endemic lichens.
https://doi.org/10.5061/dryad.g1jwstr1g
Files
File: S1_sample_locations_15oct2024.xlsx
Description: Information on “wanderlust lichens” included in this study. Taxonomic determinations; DNA and clade membership codes used in Figure 2; hydrological units at basin level (HEX_HU6), subbasins (WBD_HU8) and watersheds (WBD_HU10); locality information, latitude and longitude; voucher number for each specimen and herbarium; and the sequence read archive number.
File: S2_idahoensis-rr3719type-SPAdes-scaffolds.fasta
Description: Complete metagenomic assembly of R. idahoensis (Rosentreter 3719 [isotype, BRY-C], assembled using SPAdes v3.14.1.
File: S3_individual-gene-alignment-v2.zip
Description: Multiple sequences alignments of the 1,625 single-copy BUSCO genes representing the Rhizoplaca melanopthathalma group.
File: S4_HybPiper-Heatmap.pdf
Description: Heat map showing recovery efficiency for 1625 complete, single-copy BUSCO markers in the Rhizoplaca melanopthathalma group by HybPiper. 1615 BUSCO markers passed quality filtering and were used in downstream analyses.
File: S5_busco_hybpiper_gblocks_individual_1615gene_trees.tre
Description: Maximum likelihood (ML) phylogenetic reconstructions of 1,615 individual complete, single-copy BUSCO MSAs, inferred using IQ-TREE 2 (Minh et al., 2020). Ambiguously aligned regions in the individual BUSCO MSAs were removed using GBlocks (Talavera and Castresana, 2007), implementing default parameters.
File: S6_BUSCO_concatenated-gblocks-supermatrix-v2.fasta
Description: The concatenated alignment comprising 1,615 individual complete, single-copy BUSCO MSAs representing the mycobiont of “wanderlust lichens”, with ambiguously aligned regions removed using the default parameters in GBlocks (Talavera and Castresana, 2007).
File: S7_19mb_realyphy_50kbplusREF_alignment.phy
Description: The “REALPHY” multiple sequence alignment spanning over 19 Mb of the Rhizoplaca nuclear genome representing the mycobiont of “wanderlust lichens”. The “REALPHY” alignment was generated by mapping short reads to all contigs longer than 50 Kb assembled in SPAdes v3.14.1 and inferred to have similar coverage to BUSCO regions using RealPhy v1.12 (Bertels et al., 2014).
File: S8_IQtree_BUSCOconcatenated_gblocks_supermatrix.txt
Description: Maximum likelihood (ML) phylogenetic reconstruction representing the mycobiont of “wanderlust lichens” and inferred from the concatenated 1615 BUSCO markers – see supplementary file S6.
File: S9_BUSCOastral_gblocks_output.txt
Description: The ASTRAL species tree representing the mycobiont of “wanderlust lichens” and inferred from 1,615 individual BUSCO gene trees – see supplementary file S5.
File: Supplementary_file_S10_r1.pdf
Description: Optimal regularization multiplier and feature class combinations, variable importance rankings, and variable response curves for MaxEnt models created from the nine preselected environmental variables as well as those created using the data-driven process to select environmental variables.
##
2.1. Taxon sampling
Our study focused on the monophyletic Rhizoplaca haydenii/R. idahoensis + R. arbusculaclade sensu Leavitt et al. (2019) – the wanderlust clade – in the Rhizoplaca melanophthalma species complex. Vagrant members of this clade (R. arbuscula, R. haydenii, and R. idahoensis) are restricted to steppe habitats in western North America (in dry valleys, high plains, and occasionally above tree line (Rosentreter, 1993). Recently, a fertile, saxicolous species – R. robusta – was described from alpine steppe habitat on the Beartooth Plateau in southern Montana/northern Wyoming, and its overall distribution includes only a single other known population in central Montana (Jones et al., 2022, Keuler et al., 2020). We note that an earlier study reported R. haydenii from the Tianshan Mountains in China (Zheng et al., 2007) – beyond the currently known distribution in western North America. However, the specimen was not available for loan, and we were unable to confirm the identification of this specimen. Based on comparisons of sequence similarity of the internal transcribed spacer region (ITS) and exploratory phylogenetic analyses, this specimen was not recovered within the wanderlust lichen clade. We selected 64 specimens representing populations throughout the known distribution of members of the wanderlust clade for high-throughput sequencing (supplementary file S1; Fig. 1); and our sampling is broadly representative of available herbarium records. Our sampling included all formally described species within this lineage – R. arbuscula(n = 29), R. haydenii (21), R. idahoensis (2), and R. robusta (12). Determinations were made following (McCune and Rosentreter, 2007), and for R. robusta, we followed (Jones et al., 2022). We compiled Illumina short read data from previous studies representing 70 specimens – 41 representing outgroup taxa and 29 representing members of the wanderlust clade (Leavitt et al., 2016, Leavitt et al., 2019). For this study, 37 additional specimens representing members of the wanderlust clade from diverse populations were selected for short-read shotgun sequencing and subsequent genome skimming (Zhang et al., 2022). All specimens are housed at the Herbarium of Non-Vascular Cryptogams at the M.L. Bean Life Science Museum at Brigham Young University (BRY-C), Provo, Utah, USA. Rhizoplaca arbuscula and R. haydenii co-occurred at multiple sampled sites including at the vicinity of Mud Creek in southwest Idaho (Owyhee Co.); sites near the Lost River and Lemhi ranges in east-central Idaho; east of the Monte Cristo Range in northern Utah (Rich Co.); sites in the rolling sagebrush steppe ecoregion in central Wyoming (Carbon Co.); and on the Beartooth Plateau, Montana (Carbon Co.). For each sample, the coinciding hydrologic subbasin was determined using shapefiles of the 6th level hydrologic units from the United States Geologic Survey (Berelson et al., 2004). Members of the R. melanophthalma complex were selected as the outgroups, with the R. portericomplex identified as sister to the wanderlust clade based on results of previous studies (Leavitt et al., 2016, Leavitt et al., 2019).
2.2. DNA extraction and sequencing
For newly collected specimens, total genomic DNA was extracted from lichen thalli using the E.Z.N.A. Plant DNA DS Mini Kit (Omega Bio-Tek, Inc., USA) and following the manufacturer’s protocol. We prepared total genomic DNA following the standard Illumina whole genome sequencing (WGS) library preparation process with Adaptive Focused Acoustics for shearing (Covaris), followed by an AMPure cleanup process. The DNA was then processed with the NEBNext Ultra™ II End Repair/dA-Tailing Module end-repair, together with the NEBNext Ultra™ II Ligation Module (New England Biolabs), while using standard Illumina index primers. Libraries were pooled and sequenced using the HiSeq 2500 sequencer in high output mode, with 125 cycle paired-end (PE) reads at the DNA Sequencing Center at Brigham Young University, Provo, Utah, USA.
2.3. Assembling phylogenomic datasets
The use of large-scale data matrices in phylogenetic analyses have the potential to provide unprecedented resolution for longstanding evolutionary questions, although different phylogenomic datasets may results in conflicting topologies (Reddy et al., 2017). To help resolve relationships among the closely-related and recently diverged members of the wanderlust clade, we prepared two nuclear genome-scale datasets for phylogenomic analyses – one comprising 1615 Benchmarking Universal Single-Copy Orthologs (BUSCO; (Simão et al., 2015) from the nuclear genome and the second spanning over 19 megabase pairs (Mb) of the nuclear genome – ca. half of the estimated 38.68 Mb genome of a close relative, R. melanophthalma (Leavitt et al., 2016). These two datasets facilitated multiple approaches to reconstruct relationships under a range of distinct assumptions. For the “BUSCO targets”, we opted to generate a draft genome de novo for a member of the wanderlust clade – R. idahoensis (Rosentreter 3719 [isotype, BRY-C]), rather than using the currently available draft metagenome from R. melanophthalma (Leavitt et al., 2016). Rhizoplaca idahoensis was recently assessed as endangered according to Red List criteria (Root et al., 2021), and the value of a draft metagenome assembly will likely extend beyond this study. For R. idahoensis (Rosentreter 3719), paired-end reads were assembled using SPAdes v3.14.1 (Bankevich et al., 2012), with default parameters and assigned k-mer values at 21, 33, 55, and 77. The resulting metagenomic contigs were not annotated, although single-copy nuclear markers from the mycobiont were subsequently inferred.
contigsMetagenomic scontigs from the SPAdes assembly were assessed to identify mycobiont single-copy nuclear genes for downstream phylogenomic reconstructions using the BUSCO v5.2.2 (Manni et al., 2021) and the “ascomycota_odb10” dataset for comparison. Mycobiont BUSCO genes were extracted from a draft assembly of R. idahoensis assembled for this study. BUSCO markers passing the quality filters were used as targets in the HybPiper v1.2 pipeline (Johnson et al., 2016), implementing the “reads_first.py” function. We drew the HybPiper results with the get_seq_lengths.py function and visualized the coverage heatmap using R 4.1.2 (Team, 2012) with the geom_raster function, from the ggplot2 package. Genes with less than 50 % coverage were excluded. The remaining contigs from all BUSCO genes across all samples were assembled using the “retrieve_sequences.py” function, and multiple sequence alignments (MSAs) were generated for individual BUSCO regions with MAFFT v7 (Katoh and Toh, 2008, Rozewicki et al., 2017), implementing the default parameters. Ambiguously aligned regions in the individual BUSCO MSAs were removed using GBlocks (Talavera and Castresana, 2007), implementing default parameters. These analyses were performed at the supercomputational facility of Faculty of Science, Kasetsart University (SciKU HPC) in Bangkok, Thailand.
For the second nuclear phylogenomic dataset, short reads were mapped contigs to contigs representing single copy gene regions > 20 kb (506 contigs – total length of scaffold targets spanned over 25.6 Mb.) from a previously published draft genome from R. melanophthalma (Leavitt et al., 2016). We then used RealPhy v1.12 (Bertels et al., 2014) to align short reads from all samples to the contigs to generate an alignment implementing the following parameters: −readLength 100 −perBaseCov 5 −gapThreshold 0.2.
For the “BUSCO” dataset, maximum likelihood (ML) phylogenetic reconstructions of individual BUSCO MSAs were inferred using IQ-TREE 2 (Minh et al., 2020), with 1,000 ultrafast bootstrap replications. The resulting individual gene trees were analyzed to estimate a species tree using the coalescent-based summary method ASTRAL v5.7.8 (Mirarab and Warnow, 2015). Subsequently, we used FASconCAT-G v1.05 (Kück and Longo, 2014) to concatenate all BUSCO alignments into a single supermatrix (Tonini et al., 2015). A ML topology was inferred from this supermatrix using IQ-TREE 2.1.3, with the ‘-p’ option (Minh et al., 2020) and analyzed with 1000 ultrafast bootstraps and an edge-linked proportional partition model to account for different evolutionary rates between partitions (Chernomor et al., 2016). The substitution rate of each locus was independently modeled by ModelFinder (−m MFP) (Kalyaanamoorthy et al., 2017). The rest of IQ-tree parameters were left as default.
For the “RealPhy” dataset, exploratory ML topologies were inferred using IQ-TREE 2 (Minh et al., 2020), with 1,000 ultrafast bootstrap replicates under a variety of substitution models and simple partitioning strategies. Topologies and nodal support values were congruent across analyses, and the final ML topology was inferred using a GTR + I + G substitution model without partitions, with 1,000 ultrafast bootstrap replicates. To place the evolutionary history within a relative temporal context, the resulting ML tree was converted to an ultrametric tree with a clock model using the chronos function from the ape package in R (Paradis and Schliep, 2018). We note, that while previous studies have placed the origin and diversification for members of the wanderlust clade in the Pleistocene (Leavitt et al., 2013), this estimate remains largely speculative, and we opted to consider diversification only within a relative time scale, rather than unsettled divergence estimates (Graur and Martin, 2004).
2.4. Ancestral range estimation
To infer the phylogeographic history of the wanderlust clade, we used Reconstruct Ancestral State Phylogenies (RASP) (Yu et al., 2019, Yu et al., 2015). Ancestral range estimation in RASP is meant for use on populations or species, rather than individual specimens. Therefore, we collapsed the phylogeny of individual samples inferred from the 19 Mb phylogenomic dataset – see Results – into 18 subclades, each representing a “population”, or lineage of closely related specimens in the wanderlust clade. The phylogeny was collapsed based on the following criteria: first, we combined tips with extremely short branch lengths. Next, we collapsed any remaining clades that have distinct morphology. We did not use population genetics to collapse clades due to two main concerns: First, our phylogeny contains both sexual and asexual lineages. Reproductive mode can have a great effect on gene flow and population structure, so we do not expect the same criteria for defining populations to apply to each lineage. Second, several lineages have distinct and interesting morphology. We chose some populations that reflect morphological clades to specifically investigate the biogeography of these lineages. We ensured that each population was monophyletic. Thus, if two separate clades contained similar morphology and would be non-monophyletic when combined, they were left as two separate populations. For each “population”, its current range was defined as the combination of all hydrologic subbasins where its constituents were sampled. For each “population”, its current range was defined as the combination of all hydrologic subbasins where its constituents were sampled.
Using this phylogeny of 18 subclades, we estimated the ancestral range of members of the wanderlust clade at each node and the number of dispersal events between basins. Due to the large size of the vegetative propagules for vagrant Rhizoplaca taxa (Rosentreter, 1993), we assumed that divides separating major drainages likely provide potential barriers to dispersal because of the relatively large size of vegetative propagules. We tested several models — DEC, DIVALIKE, and BAYAREALIKE — plus their corresponding + J alternatives. Critics of the alternative + J models point out potential statistical problems with comparing alternative + J models to their null counterparts (Ree and Sanmartin, 2018). However, these criticisms fall short, as there is contrasting evidence that these comparisons are indeed statistically valid (Matzke, 2022). We do not attempt to settle this debate about the statistical validity of comparing null models with their + J alternatives. However, we include the alternative + J models in our model selection because we suspect jump dispersal to have played a prominent role in the diversification of vagrant Rhizoplaca. The best fitting model (BAYAREALIKE + J) was selected using the weighted Akaike Information Criteria (AICwt) and was used for estimating ancestral ranges and dispersal events.
2.5. Niche modeling
Occurrence Records and Environmental Layers. We investigated the environmental tolerances and niches for members of the wanderlust clade in environmental and geographic space. Occurrence data for vagrant Rhizoplaca species were downloaded from the Consortium of North American Lichen Herbaria (CNALH; https://lichenportal.org/; accessed 30 October 2022). Specimens with coordinate data and that were collected from 1970 and beyond were retained and, where possible, assigned to either R. arbuscula or R. haydenii. Due to recent taxonomic revisions, specimens identified as R. haydenii ssp. arbuscula in CNALH were assigned to R. arbuscula, recently elevated to species (Leavitt et al., 2019), and specimens identified as R. haydenii spp. haydenii in CNALH were assigned to R. haydenii in preparation for investigating how their niches compare in environmental and geographic space. Additionally, all remaining occurrences, including those of R. idahoensis, were combined with the R. arbuscula and R. haydenii occurrences to infer a composite distribution model of all vagrant members of the wanderlust clade. Our initial datasets consisted of 72 confirmed occurrence points for R. arbuscula, 25 for R. haydenii, and 259 for all vagrant Rhizoplaca (R. arbuscula, R. haydenii, and R. idahoensis combined). Potentially suitable habitats for wanderlust lichen populations in the Intermountain West have been well sampled over the past decades (R. Rosentreter, personal communication), including intense collection efforts by the author SDL. The lack of recorded populations of R. arbuscula and R. haydenii in the states of Colorado, Nevada, and Utah (apart from populations in northeastern Utah) is most likely indicative that these species do not occur in these states, or occur in very small, unobserved, isolated populations (SDL, personal observation). Thus, we assume that the current herbarium records largely reflect the actual distribution of vagrant Rhizoplaca species with little sampling bias. Thus, to account for any small amount of sampling bias present in these occurrence data, we then used the thinData function in the ‘SDMtune’ R package (Vignali et al., 2020) to thin our input data to one occurrence per 30 arc-second raster cell – matching the resolution of our environmental variables. This resulted in 42 unique occurrences for R. arbuscula, 17 for R. haydenii and 116 for all vagrant Rhizoplaca used in our analyses.
The region of interest was defined as the states of Nevada, Idaho, Wyoming, Montana, Utah, and Colorado as members of the wanderlust clade are unlikely to occur beyond this geographic region, based on previous sampling efforts. For the environmental layers, we acquired and trimmed the bioclimatic variables and elevation datasets from WorldClim (Fick and Hijmans, 2017) and The Global Aridity and Potential Evapotranspiration (Zomer et al., 2022) datasets, all at a resolution of 30 arc-seconds. To account for correlation between these environmental layers, we calculated the pairwise Pearson correlation coefficient between all layers and used the function findCorrelation in the caret R package (Kuhn 2008) to select a representative variable from groups of highly correlated layers (0.70 or greater). These representative layers were selected by removing layers from pairwise comparisons with the largest mean absolute correlations ensuring the remaining environmental layers were as independent as possible. From this, elevation, mean diurnal range (Bio2), isothermality (Bio3), temperature seasonality (Bio4), min temperature of coldest month (Bio6), mean temperature of wettest quarter (Bio8), mean temperature of driest quarter (Bio9), precipitation of wettest month (Bio13), and precipitation seasonality (Bio15), were retained for further analysis. To help characterize the environment in our analyses we also prepared 10,000 background points randomly distributed across our study area using the CRAN package ‘dismo’ v 1.9 (Hijmans et al., 2022) in preparation for modeling.
Environmental Space. To compare niches in environmental space, we first analyzed the niches of the R. arbuscula and R. haydenii clades using the PCA-env approach (Broennimann et al., 2012) in the R ecospat package (Di Cola et al., 2017). Considering our entire study area for the environmental space of both species, we calibrated a PCA on the nine preselected environmental variables using their values from the 10,000 background points. For both species, we then projected the PCA scores for their occurrences from the first two principal components onto a 100 x 100 PCA grid bounded by the minimum and maximum background PCA scores. A kernel density function was then applied to this grid to estimate smoothed occurrence densities. We then calculated the niche overlap between the species, measured by Schoener’s D (Schoener, 1968) and Hellinger’s I(Hellinger, 1909) and tested for niche equivalency (Broennimann et al., 2012, Warren et al., 2008). This compares the niche overlap values from our actual data and PCA methods to a null distribution of 1000 niche overlap values derived from randomizing the occurrences between the two species, while maintaining their respective sample sizes, and calculating the niche overlap values for each permutation in the same manner. Doing this, we tested for both niche divergence, i.e., if the niche overlap is less equivalent/similar than expected by chance, and niche conservatism, i.e., if the niche overlap is more equivalent/similar than expected by chance. A significant p-value for either test was < 0.05 and was the proportion of randomized niche overlap values lower than the actual niche overlap in the case of the divergence test and higher than the actual niche overlap in the conservatism test. Lastly, to further characterize the differences and similarities between the niches of the two species, we calculated the additional measures of overlap provided by the ecospat package of unfilling, expansion, and stability (Di Cola et al., 2017). In this case, unfilling meant the proportion of the R. arbuscula niche not occupied by R. haydenii and expansion vice versa. Related, stability is the proportion of the niche occupied by the other species, and we calculated this for both species.
Geographic Space and SDM Model Tuning. We investigated the geographic niche space of the clades using species distribution models (SDMs). To create our SDMs for R. arbuscula, R. haydenii, and a composite SDM for all vagrant taxa, we used the MaxEnt algorithm implemented through the ‘SDMtune’ and ‘maxnet’ R packages (Phillips et al., 2017, Phillips et al., 2006, Vignali et al., 2020). We choose MaxEnt due to its wide range of customizability (Elith et al., 2006, Merow et al., 2013), high predictive accuracy compared to other modeling methods, even ensemble models (Elith et al., 2006, Kaky et al., 2020), and is robust with small sample sizes (Pearson et al., 2007). To optimize model complexity and performance, we used the SDMtune R package (Vignali et al., 2020) to select the optimal values for hyperparameters for the SDMs. For this, each of the three occurrence datasets were divided into folds. The R. arbuscula and combined vagrant taxa were divided into geographically structured folds to account for any effects from spatial autocorrelation using ‘ENMeval’ (Muscarella et al., 2014) and the get.block function. With only 17 localities, the R. haydenii dataset was divided into jackknife folds in which folds are composed of a single locality point and one exists for each locality. Cross validation was then used for model tuning and evaluation using the area under the ROC curve (AUC) as our metric for evaluating model performance and model selection. AUC scores range from 0 to 1 with 0.50 being no better than random, 0.70–0.80 as an adequate model, 0.80–0.90 as a good model, and 0.90–1 as an excellent model (Swets, 1988).
We tuned the using the iterative genetic algorithm within the optimizeModel function and exploried the configurations of six feature class combinations, namely “l”, “lq”, “lh”, “lp”, “lqp”, and “lqph”, as well as values for the regularization multiplier ranging from 0.2 to 5, with increments of 0.2. For this, we started with an initial random population of 20 models and repeated the algorithm 5 times, i.e., 5 generations. To avoid getting caught in local optima, we not only kept 40 % of the best models at each iteration, but also retained 20 % of suboptimal models and set the probability of mutation to 40 % to introduce variation through previously unused hyperparameters. Final models were then trained using their optimal hyperparameter configurations, final AUC scores were calculated, and these models were used to predict the probability of the species being present across our study area using a cloglog output where occurrence probability ranges from 0 to 1. These final predictions were then visualized with state boundaries in QGIS (QGIS.org 2024).
Niche Overlap, Variable Importance, and Response Curves. The composite model for all vagrant taxa was created to only visualize the potential distribution and suitable habitat for vagrant Rhizoplaca lichens. We compared it to the predicted distributions of R. arbuscula and R. haydenii, however, we further explored the niches and environmental preferences of the two species. We first calculated the niche overlap of the SDM predictions made using the nine preselected environmental variables for the R. arbuscula, R. haydenii, and vagrant datasets measured by Schoener’s D (Schoener, 1968) and Hellinger’s I (Hellinger, 1909), using calc.niche.overlap in ‘ENMeval’ (Muscarella et al., 2014). The niche overlap values here were high and as we tested for niche equivalency in environmental space, thus we opted instead to explore the variable importance and response curves for these R. arbuscula and R. haydenii models to gain insight into the environmental variables influencing where these lichens occur. We calculated the contribution of each of the nine environmental variables used for fitting these models measured by their mean permutation importance using the SDMtune varImp function with 10 permutations (Vignali et al., 2020). We then created response curves for each variable. These represent each a model created using only one of the variables and reflect the dependence of the predicted suitable habitat with that variable. To further explore the environmental preferences of these two species, we used the SDMtune varSel function (Vignali et al., 2020) to create additional SDMs in the same manner as described above, but using a data-driven process to select the environmental variables used for each model. This similarly ensures that the environmental variables used within the models are not highly correlated but that the optimal combination of them are chosen to represent the data. Additionally, we reduced model complexity using the SDMtune reduceVar function (Vignali et al., 2020) with 10 permutations to remove any variables that contributed less than 2 % to the model if their removal did not decrease the AUC score, measured by a jackknife test. The resulting predictions from these models are not likely to differ greatly from using the preselected variables and we measured their similarity by measuring their niche overlap. However, the variables chosen through this process may differ between the two species’ models, thus helping to identify differences in important environmental factors. We therefore identified any such differences and calculated the mean permutation importance of the data-driven selected variables, and investigated their response curves as described above.