Suturing fragmented landscapes: Mosaic hybrid zones in plants may facilitate landscape restoration
Data files
Feb 08, 2025 version files 121.79 MB
- Dryad_archive.zip (121.79 MB)
- README.md (6.34 KB)
Abstract
Many widespread plant taxa of western North America have diversified into phenotypically and genetically divergent lineages due to complex biogeographic histories across heterogeneous landscapes. Mosaic hybrid zones can form when geographically co-occurring yet environmentally distinct lineages cross-pollinate and form hybrids that occupy unique environmental niches in the absence of a geographic cline. This expands the total environmental space across which parental and hybrid individuals grow, resulting in larger, less fragmented geographic distributions. Here, we highlight hybridization mosaics across three study systems containing taxa critical to widespread plant communities in western North America: Ericameria nauseosa, Artemisia tridentata, and Sphaeralcea fendleri. The systems contain diverged taxa that co-occur geographically across the landscape, though not environmentally, and hybridize readily. Hybridization among taxa has facilitated niche expansion into intermediate environments consistent with unique combinations of adaptive genetic variation, creating more continuity within each study system (study systems occupy ~820–270,000 km² more geographic space by virtue of hybridization). Furthermore, hybrids are predicted to play important roles in future climates (they occupy 8–475% more geographic area than at present under a high-emission climate scenario). This convergent pattern signals mosaic hybridization as an underappreciated mechanism with broad ecological and evolutionary ramifications and suggests understanding the consequences of this process may be critical to predict responses of taxa to changing climates. Moreover, leveraging mosaic hybridization may assist the creation of restoration management plans that aim to mitigate the deleterious effects of habitat fragmentation on ecosystems in the context of climate change.
https://doi.org/10.5061/dryad.1g1jwsv6r
Description of the data and file structure
Directories are associated with each of the three taxa.
R_files:
- Rmd for analyses for each of the three taxa
- Imports.R file with custom R functions
- Rmd for final figure generation for main text and SI
Genotype:
- Filtered VCF file
- genotype probability file from Entropy
- Pop_ID file: individual IDs, lineage assignments, ancestry coefficients, and PC axes, with rows in the same order as the filtered VCF file and the genotype probability file
Files and variables
Folder: R_files
File: Imports.R
Description: Custom R functions to be called by other R scripts
File: hybrid_figures.R
Description: R file to make figures for the main text
File: hybrid_figures_SI.R
Description: R file to make figures for the SI text.
File: ARTR_hybrid.R
Description: All analyses for Artemisia tridentata datasets.
File: ERNA_hybrid.R
Description: All analyses for Ericameria nauseosa datasets.
File: Sph_hybrid.R
Description: All analyses for Sphaeralcea species datasets.
Folder(s): A. tridentata, E. nauseosa, Sphaeralcea
Description: Each folder contains the same files in the same format for each taxon.
File: XXX.vcf.gz
Description: Filtered VCF file
File: gprobAll.txt
Description: Genotype probability file created with Entropy (rows = individuals, columns = loci)
File: Pop_ID.csv
Description: Population ID file with rows in the same order as the VCF and genotype probability files
Variables
- Sp: Species
- Pop: Population
- Sp_Pop: 'Species_Population'
- ID: Individual
- All: Full population name 'Species_Pop_ID'
- long: Longitude
- lat: Latitude
- A1/A3: Ancestry coefficients as estimated by Entropy
- Sp_Anc: Species and estimated ancestral classification (parental species or hybrid)
- Anc: Estimated ancestral classification (parental species or hybrid)
- Label2 (for Sph only): Estimated ancestral classification (S. fendleri, Other Sph species, or hybrid)
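- PC1: Score on principal component axis 1
- PC2: Score on principal component axis 2
- PC3: Score on principal component axis 3
- PC4: Score on principal component axis 4
- PC5: Score on principal component axis 5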
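
Because Pop_ID.csv is described as sharing its row order with gprobAll.txt, a quick consistency check before any analysis can catch a mismatched or truncated file. The archive's own analyses are R scripts; the following is only a minimal Python sketch, not the authors' code. It assumes that Pop_ID.csv is comma-delimited with a header row using the variable names listed above, and that gprobAll.txt is whitespace-delimited with no header row. Neither assumption is confirmed by this README. The function names and the two-individual demo data are hypothetical.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path


def load_pop_ids(path):
    """Read Pop_ID.csv into a list of dicts, one per individual, keeping row order."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def load_gprob(path):
    """Read a genotype probability matrix (rows = individuals, columns = loci).

    Assumption: whitespace-delimited numeric values, no header row.
    """
    rows = []
    with open(path) as handle:
        for line in handle:
            if line.strip():
                rows.append([float(value) for value in line.split()])
    return rows


def check_alignment(pop_ids, gprob):
    """Return (n_individuals, n_loci), or raise ValueError if the files disagree."""
    if len(pop_ids) != len(gprob):
        raise ValueError(
            f"{len(pop_ids)} individuals in Pop_ID but {len(gprob)} rows in gprob"
        )
    locus_counts = {len(row) for row in gprob}
    if len(locus_counts) > 1:
        raise ValueError("gprob rows have differing numbers of loci")
    n_loci = locus_counts.pop() if locus_counts else 0
    return len(pop_ids), n_loci


def ancestry_counts(pop_ids):
    """Tally individuals per (Sp_Pop, Anc) pair."""
    return Counter((row["Sp_Pop"], row["Anc"]) for row in pop_ids)


if __name__ == "__main__":
    # Synthetic two-individual example; all values are made up for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        pop_path = Path(tmp) / "Pop_ID.csv"
        gprob_path = Path(tmp) / "gprobAll.txt"
        pop_path.write_text("Sp_Pop,ID,Anc\nERNA_P1,1,hybrid\nERNA_P1,2,parental\n")
        gprob_path.write_text("0.0 1.0 2.0\n1.5 0.2 0.0\n")
        pop_ids = load_pop_ids(pop_path)
        gprob = load_gprob(gprob_path)
        print(check_alignment(pop_ids, gprob))
        print(ancestry_counts(pop_ids))
```

The check compares only row counts and row widths. It cannot confirm that individuals appear in the same order, which this README states but the files themselves may not encode.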
Code/software
| | E. nauseosa | A. tridentata | Sphaeralcea sp. |
|---|---|---|---|
| Reduced representation libraries | ddRADseq | ddRADseq | ddRADseq |
| restriction enzymes | EcoRI & MseI | EcoRI & MseI | EcoRI & MspI |
| Pippin prep size selection | 350-450bp | 350-450bp | 400-600bp |
| sequencing type | single-end | single-end | single-end |
| sequencing platform | Illumina HiSeq 4000 | Illumina NovaSeq 6000 | Illumina NovaSeq 6000 |
| sequencing facility | University of Wisconsin | UTGSAF | University of Oregon |
| decontamination / | github.com/ncgr/tapioca | github.com/ncgr/tapioca | *process_radtags* (STACKS) |
| demultiplexing | custom Perl script | custom Perl script | *fastq-multx* (Aronesty 2011) |
| avg. reads per indv. | 2276267 | 2276703 | 1872931 |
| reference assembly | de novo | genome | de novo |
| reference assembly method | cd-hit-est v4.8.1 | Melton *et al.*, 2022 | STACKS v2.6 |
| alignment method | bwa-mem v0.7.17 | bwa-mem v0.7.17 | STACKS v2.6 |
| variant calling method | bcftools v1.9 | bcftools v1.9 | STACKS v2.6 |
| num. of indv. | 600 | 407 | 366 |
| num. of loci | 1613057 | 2766918 | 88140 |
| filtering method | vcftools v0.1.16 | vcftools v0.1.16 | vcftools v0.1.16 |
| minor allele freq. (MAF) | 0.02 | 0.01 | 0.02 |
| max. missing data (loci) | 30% | 40% | 40% |
| thinned | 100bp | 100bp | 100bp |
| max. missing data (indv.) | 40% | 50% | 50% |
| min. mean read depth | 3 | 2 | 10 |
| max. mean read depth | 25 | 25 | 50 |
| min. quality | 750 | 100 | 999 |
| min. Fis | -0.5 | -0.5 | -0.5 |
| final mean read depth | 7.25x | 3.18x | 24.69x |
| final num. of indv. | 586 | 397 | 366 |
| final num. of loci | 22917 | 13003 | 7271 |
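
The site-level thresholds in the table can be written out as a vcftools argument list, which makes the unit conversions explicit. This is a hedged Python sketch, not the authors' pipeline. The mapping of table rows to flags (`--maf`, `--max-missing`, `--thin`, `--min-meanDP`, `--max-meanDP`, `--minQ`) is an assumption, as are the input and output names and the idea that all filters were applied in a single call. The per-individual missing-data and Fis filters are left out because they do not correspond to a single vcftools site-filter flag. Note that vcftools defines `--max-missing` as the minimum proportion of non-missing data, so a 30% maximum of missing data becomes 0.70.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteFilters:
    """Site-level thresholds copied from the table above."""
    maf: float               # minor allele frequency
    max_missing_loci: float  # maximum fraction of missing data per locus
    thin_bp: int             # minimum distance between retained sites
    min_mean_depth: float
    max_mean_depth: float
    min_quality: float


FILTERS = {
    "E. nauseosa": SiteFilters(0.02, 0.30, 100, 3, 25, 750),
    "A. tridentata": SiteFilters(0.01, 0.40, 100, 2, 25, 100),
    "Sphaeralcea sp.": SiteFilters(0.02, 0.40, 100, 10, 50, 999),
}


def vcftools_args(filters, vcf_in, out_prefix):
    """Build a vcftools argument list (assumed flag mapping, single pass)."""
    return [
        "vcftools",
        "--gzvcf", vcf_in,
        "--maf", str(filters.maf),
        # vcftools expects the proportion of data that must be present.
        "--max-missing", f"{1 - filters.max_missing_loci:.2f}",
        "--thin", str(filters.thin_bp),
        "--min-meanDP", str(filters.min_mean_depth),
        "--max-meanDP", str(filters.max_mean_depth),
        "--minQ", str(filters.min_quality),
        "--recode", "--recode-INFO-all",
        "--out", out_prefix,
    ]


if __name__ == "__main__":
    # Hypothetical file names; prints the commands without running vcftools.
    for taxon, filters in FILTERS.items():
        print(taxon, " ".join(vcftools_args(filters, "input.vcf.gz", "filtered")))
```

Building the command as a list rather than a single string keeps each threshold next to its flag and lets the list be passed directly to `subprocess.run` if vcftools is installed.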
Access information
Other publicly accessible locations of the data:
