Data from: Genetic diversity and ecogeographical niche overlap among hybridizing ox-eye daisies (Leucanthemum, Asteraceae) in the Carpathian Mountains: the impact of anthropogenic disturbances
Data files
Nov 22, 2024 version files 168.66 GB
- 01_filtered_reads.csv (6.42 MB)
- 01_filtered_reads.RData (2.28 MB)
- 01_genetic_distances.nex (1.86 MB)
- 01_structure_all_individuals_and_in_silico_hybrids.str (5.18 MB)
- 01_structure_all_individuals.str (4.69 MB)
- 01_vcf_files.zip (686.58 KB)
- 02_all_points.csv (31.14 KB)
- 02_gau_absences.csv (1.29 MB)
- 02_irc_absences.csv (1.22 MB)
- 02_points_with_env_vars.csv (127.62 KB)
- 02_rasterPCA_1.tif (6.96 MB)
- 02_rasterPCA_2.tif (6.95 MB)
- 02_rasterPCA_3.tif (7.06 MB)
- 02_rasterPCA_4.tif (6.98 MB)
- 02_rasterPCA_gau_rescaled.tif (7.54 MB)
- 02_rasterPCA_irc_rescaled.tif (7.54 MB)
- 02_rasterPCA_rot_rescaled.tif (7.87 MB)
- 02_rot_absences.csv (1.25 MB)
- 03_distance_to_nearest_point_buildings_masked.tif (20.85 GB)
- 03_distance_to_nearest_point_landuse_area_masked.tif (10.22 GB)
- 03_distance_to_nearest_point_place_of_worship_masked.tif (20.41 GB)
- 03_distance_to_nearest_point_point_of_interest_area_masked.tif (20.70 GB)
- 03_distance_to_nearest_point_point_of_interest_masked.tif (20.75 GB)
- 03_distance_to_nearest_point_roads_masked.tif (16.40 GB)
- 03_distance_to_nearest_point_traffic_point.tif (20.51 GB)
- 03_distance_to_nearest_point_transport_masked.tif (20.56 GB)
- 03_Human_Disturbance_Index_Carpathians.tif (18.18 GB)
- README.md (8.04 KB)
Abstract
This study examines the ecogeographical niche overlap and genetic diversity among three Leucanthemum species in the Carpathian Mountains: the lowland L. ircutianum (4x), the montane L. rotundifolium (2x), and the alpine L. gaudinii (2x). Previous research noted hybridization between L. rotundifolium and L. gaudinii, but our analysis reveals more extensive hybridization across all three species, with additional potential hybridization events among the hybrids. Over 600 individuals were genotyped using SNP analysis, followed by Principal Coordinate Analysis (PCoA), Neighbor-Net Network, and Structure clustering. These analyses uncovered distinct genetic groups corresponding to particular taxa and supported the formation of hybrids. Within species-specific clusters, L. rotundifolium is divided into western and south-eastern populations, while L. gaudinii and L. ircutianum demonstrate close genetic relationships. Our analyses suggest hybridization between all species pairs and the potential for triple hybrids. Genetic admixture is further supported by environmental background. Niche overlap analyses reveal substantial overlap among species, particularly in line with their vertical distribution. Climate envelope plots indicate a likely increase in competition due to climate change, leading to a reduction in available habitat for mountainous species and an intensification of hybridization. Anthropogenic influences are intensifying these hybridization trends. Among the studied species, L. gaudinii is most at risk of overwhelming hybridization, whereas L. ircutianum may experience habitat expansion. This study provides insights into the intricate ecological and genetic interrelations of the Carpathian ox-eye daisies, emphasizing the role of environmental changes and genetic diversity in understanding and responding to these dynamics.
https://doi.org/10.5061/dryad.cvdncjtcd
Summary
This dataset supports a study on the genetic diversity and ecogeographical niche overlap among three species of ox-eye daisies (Leucanthemum ircutianum, L. rotundifolium, and L. gaudinii) and their hybrids in the Carpathian Mountains. The study focuses on how anthropogenic disturbances and climate change affect these species' distributions and hybridization. The dataset includes genetic data, environmental data, and human disturbance indices.
Description of the Data and File Structure
Genetic Data
- Files: File names begin with 01. The files are:
  - 01_filtered_reads.csv - filtered reads in CSV format
  - 01_filtered_reads.RData - filtered reads as an R object
  - 01_vcf_files.zip - filtered reads in VCF format
  - 01_genetic_distances.nex - genetic distances among individuals in NEXUS format
  - 01_structure_all_individuals.str - Structure input file containing all individuals
  - 01_structure_all_individuals_and_in_silico_hybrids.str - Structure input file containing all individuals and in silico hybrids
- Contents: SNP data for 606 individuals across 131 populations. Each file contains genetic markers used for analyses such as Principal Coordinate Analysis (PCoA) and Bayesian clustering. The genetic distances were used for the network analysis in SplitsTree. The two Structure files served as input for Structure: one contains all individuals, and the other additionally contains in silico hybrids.
Environmental Data
- Files: File names begin with 02. The files are:
  - 02_all_points.csv - species occurrence data in CSV format
  - 02_points_with_env_vars.csv - species occurrence data with values from bioclimatic variable layers in CSV format
  - 02_rot_absences.csv - absence data for L. rotundifolium in CSV format
  - 02_gau_absences.csv - absence data for L. gaudinii in CSV format
  - 02_irc_absences.csv - absence data for L. ircutianum in CSV format
  - 02_rasterPCA_1.tif - first axis of the PCA derived from environmental data in georeferenced GeoTIFF format
  - 02_rasterPCA_2.tif - second axis of the PCA derived from environmental data in georeferenced GeoTIFF format
  - 02_rasterPCA_3.tif - third axis of the PCA derived from environmental data in georeferenced GeoTIFF format
  - 02_rasterPCA_4.tif - fourth axis of the PCA derived from environmental data in georeferenced GeoTIFF format
  - 02_rasterPCA_rot_rescaled.tif - result of niche modeling for L. rotundifolium in georeferenced GeoTIFF format
  - 02_rasterPCA_gau_rescaled.tif - result of niche modeling for L. gaudinii in georeferenced GeoTIFF format
  - 02_rasterPCA_irc_rescaled.tif - result of niche modeling for L. ircutianum in georeferenced GeoTIFF format
- Contents: Environmental data for each sampling site, including bioclimatic variables and geographic coordinates. Three additional files hold the absences for each species that were used for modeling. The PCA axes derived from the environmental data are included, as are the niche models for each of the three species. All .tif files are georeferenced TIFF images saved in the EPSG:4326 (WGS 84) coordinate system, with coordinates in decimal degrees. The pixel size (x, y) is 0.008333333300319489823, -0.008333333299999998861 degrees (ca. 0.6 km^2 per pixel). These files should be automatically recognized and correctly positioned by most GIS software.
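As a rough check on the pixel footprint quoted above, the following sketch converts a cell size in decimal degrees to an approximate area in km^2. It assumes a spherical Earth with a radius of 6371 km and uses 47°N as a stand-in latitude for the Carpathians; both values and the function name `cell_area_km2` are illustrative assumptions, not part of the dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def cell_area_km2(res_x_deg, res_y_deg, lat_deg):
    """Approximate area of one raster cell centred at latitude lat_deg."""
    km_per_deg = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    height_km = abs(res_y_deg) * km_per_deg
    # East-west extent shrinks with the cosine of the latitude.
    width_km = abs(res_x_deg) * km_per_deg * math.cos(math.radians(lat_deg))
    return width_km * height_km


# 47 degrees north is an assumed, representative latitude.
area = cell_area_km2(0.0083333333, -0.0083333333, 47.0)
print(round(area, 2))
```

Under these assumptions the result is close to 0.6 km^2, and it grows toward the equator, so the quoted figure is best read as an approximation for the study region.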
Human Disturbance Data
- Files: File names begin with 03. The files are:
  - 03_distance_to_nearest_point_transport_masked.tif - distance in meters to the nearest point of transport in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_traffic_point.tif - distance in meters to the nearest point of traffic in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_point_of_interest_area_masked.tif - distance in meters to the nearest point of interest area in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_point_of_interest_masked.tif - distance in meters to the nearest point of interest in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_place_of_worship_masked.tif - distance in meters to the nearest place of worship in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_landuse_area_masked.tif - distance in meters to the nearest human-related land-use area in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_buildings_masked.tif - distance in meters to the nearest building in georeferenced GeoTIFF format
  - 03_distance_to_nearest_point_roads_masked.tif - distance in meters to the nearest road in georeferenced GeoTIFF format
  - 03_Human_Disturbance_Index_Carpathians.tif - Human Disturbance Index for the Carpathians in georeferenced GeoTIFF format
- Contents: Raster files representing distances to various human infrastructure elements, plus the Human Disturbance Index (HDI) itself. The HDI represents the impact of human infrastructure on the landscape and is derived from the distances to the points and areas listed above. All .tif files are georeferenced TIFF images saved in the EPSG:4326 (WGS 84) coordinate system, with coordinates in decimal degrees. The pixel size (x, y) is 0.008333333300319489823, -0.008333333299999998861 degrees (ca. 0.6 km^2 per pixel). These files should be automatically recognized and correctly positioned by most GIS software.
Relationships Between Data Files
- Genetic Data: The genetic data files are linked to specific individuals and populations, which can be cross-referenced with the general data file for locality information and codes.
- Environmental Data: Each occurrence in the environmental data can be matched with genetic data based on geographic coordinates.
- Human Disturbance Data: HDI files can be used to assess the impact of human infrastructure on each population's distribution, which can also be correlated with genetic and environmental data.
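Matching records on geographic coordinates, as described in the list above, could be sketched as follows. The column names (`id`, `pop`, `lon`, `lat`), the sample rows, and the rounding to four decimal places are hypothetical; the actual headers in the CSV files should be checked before adapting this.

```python
import csv
import io

# Hypothetical excerpts; the real column names in the CSV files may differ.
occurrences_csv = """id,lon,lat,bio1
occ1,24.5312,47.6021,3.1
occ2,25.1000,45.4000,5.6
"""
populations_csv = """pop,lon,lat
P001,24.53120,47.60210
P002,22.0000,49.0000
"""


def coord_key(row, digits=4):
    """Round coordinates so that small formatting differences still match."""
    return (round(float(row["lon"]), digits), round(float(row["lat"]), digits))


def match_by_coordinates(occ_rows, pop_rows, digits=4):
    """Return (occurrence id, population code) pairs sharing rounded coordinates."""
    pops = {coord_key(r, digits): r["pop"] for r in pop_rows}
    return [
        (r["id"], pops[coord_key(r, digits)])
        for r in occ_rows
        if coord_key(r, digits) in pops
    ]


occ_rows = list(csv.DictReader(io.StringIO(occurrences_csv)))
pop_rows = list(csv.DictReader(io.StringIO(populations_csv)))
matches = match_by_coordinates(occ_rows, pop_rows)
print(matches)
```

Rounding before comparison avoids missed matches caused by differing numbers of trailing digits, at the cost of merging points that lie closer together than the chosen precision.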
Missing Data Codes and Abbreviations
- Missing Data Codes: Missing data in the CSV files are denoted by "NA".
- Abbreviations:
  - irc: Leucanthemum ircutianum
  - rot: Leucanthemum rotundifolium
  - gau: Leucanthemum gaudinii
  - paw: Leucanthemum x pawlowskii
  - HDI: Human Disturbance Index
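When reading the CSV files programmatically, the "NA" code and the taxon abbreviations above can be handled in one pass. This is a minimal sketch using only the Python standard library; the sample rows and the column names `taxon` and `elevation` are hypothetical and stand in for whatever the real files contain.

```python
import csv
import io

# Abbreviations as listed in this README.
TAXA = {
    "irc": "Leucanthemum ircutianum",
    "rot": "Leucanthemum rotundifolium",
    "gau": "Leucanthemum gaudinii",
    "paw": "Leucanthemum x pawlowskii",
}

# Hypothetical sample; the real files have their own columns.
sample = """taxon,elevation
rot,1450
gau,NA
"""


def read_rows(handle):
    """Yield rows with "NA" replaced by None and the taxon code expanded."""
    for row in csv.DictReader(handle):
        clean = {key: (None if value == "NA" else value) for key, value in row.items()}
        # Unknown or missing codes map to None rather than raising an error.
        clean["taxon_name"] = TAXA.get(clean["taxon"])
        yield clean


rows = list(read_rows(io.StringIO(sample)))
print(rows)
```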
Sharing/Access Information
Data Was Derived From the Following Sources:
- Field sampling across the Carpathian Mountains
- Genomic sequencing provided by DArT-Seq technology
- Environmental and climate data from CHELSA 2.1 dataset (https://chelsa-climate.org/downloads/; Karger, D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, P., Kessler, M. (2017): Climatologies at high resolution for the Earth land surface areas. Scientific Data. 4 170122. https://doi.org/10.1038/sdata.2017.122)
- Human infrastructure data from OpenStreetMap (www.openstreetmap.org and www.geofabrik.de/data/download.html)
Code/Software
- Scripts: Analysis scripts for genetic data (R, Python), environmental niche modeling (R), and human disturbance assessment (QGIS, R).
- Software Versions:
  - R: 4.1.0
  - QGIS: 3.32.3
- Packages: dartR, adegenet, ggplot2, raster, sp, phyloclim
Workflow
- Genetic Analysis:
  - Preprocessing of raw sequencing data using R scripts.
  - Genetic clustering and diversity assessment.
- Environmental Niche Modeling:
  - Analysis of niche overlap using PCA and niche metrics.
  - Future climate scenario modeling using bioclimatic layers.
- Human Disturbance Assessment:
  - Generation of TIFF files in R based on the distance of objects.
  - Calculation of HDI using georeferenced TIFF files in QGIS.
  - Statistical analysis and visualization in R.
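This README does not state which niche metrics were used. As an illustration of what such a metric computes, the sketch below implements Schoener's D, one commonly used overlap measure, for two suitability vectors defined over the same set of grid cells. The function name and the toy inputs are assumptions for illustration only and do not reproduce the study's analysis.

```python
def schoeners_d(suit_a, suit_b):
    """Schoener's D: 1 for identical distributions, 0 for no overlap."""
    if len(suit_a) != len(suit_b):
        raise ValueError("both inputs must cover the same cells")
    total_a = sum(suit_a)
    total_b = sum(suit_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("suitability values must sum to a positive number")
    # Normalise each vector to sum to 1, then compare cell by cell.
    diff = sum(abs(a / total_a - b / total_b) for a, b in zip(suit_a, suit_b))
    return 1.0 - 0.5 * diff


# Toy example: half of species A's suitability falls where B has none.
overlap = schoeners_d([0.5, 0.5], [1.0, 0.0])
print(overlap)
```

Because each vector is normalised first, the measure compares the shape of the two suitability surfaces rather than their absolute values.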
Data Collection
The dataset was gathered through extensive fieldwork across the Carpathian Mountains, focusing on three species of ox-eye daisies (Leucanthemum ircutianum, L. rotundifolium, and L. gaudinii) and their hybrids. Over 600 individual plants were sampled from various elevations and habitats to capture the genetic diversity and distribution patterns of these species.
Genetic Analysis
DNA was extracted from dried leaf samples using a DNA extraction kit. The ploidy levels of the samples were determined using flow cytometry. For genetic analysis, DArTSeq technology was employed to sequence the DNA, providing a comprehensive overview of the genetic makeup of the collected samples. The raw genetic data were filtered to remove low-quality and monomorphic loci, resulting in a dataset with high-confidence genetic markers.
Environmental Data
Environmental data were collected for each sampling site, including bioclimatic variables and geographical coordinates. These data were used to model the ecological niches of the species and assess the overlap between them. Future climate scenarios were also incorporated to predict potential changes in species distributions.
Human Disturbance
Data on human infrastructure were sourced from detailed maps (openstreetmap.org), and a Human Disturbance Index (HDI) was created. This index quantifies the impact of human activities on the habitats of the sampled species, considering factors such as proximity to roads, buildings, and other man-made structures.
Data Processing
The genetic data underwent processing to ensure quality and accuracy. Loci with low call rates and reproducibility were excluded, and individuals with high levels of missing data were removed from the analysis. Clustering algorithms were used to identify genetic groups, and simulations were conducted to validate the presence of hybrids. Environmental niche models were constructed using both current and future climate data to understand the ecological dynamics of the species.
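The locus filtering described above can be illustrated with a small, self-contained sketch. It drops loci whose call rate falls below a threshold and loci that are monomorphic. The 0.9 threshold, the genotype coding (0, 1, 2, with None for a missing call), and all names are hypothetical; the README does not give the thresholds used, and the actual filtering was done with R scripts.

```python
def call_rate(calls):
    """Fraction of individuals with a non-missing genotype call."""
    return sum(c is not None for c in calls) / len(calls)


def is_monomorphic(calls):
    """True when all non-missing calls share a single genotype."""
    return len({c for c in calls if c is not None}) <= 1


def filter_loci(genotypes, min_call_rate=0.9):
    """Keep loci that pass the call-rate threshold and are polymorphic."""
    return {
        locus: calls
        for locus, calls in genotypes.items()
        if call_rate(calls) >= min_call_rate and not is_monomorphic(calls)
    }


# Hypothetical genotypes: one list of calls per locus, one entry per individual.
genotypes = {
    "locus_a": [0, 1, 2, 0],        # complete and polymorphic
    "locus_b": [0, None, None, 1],  # call rate 0.5
    "locus_c": [1, 1, 1, 1],        # monomorphic
}
kept = filter_loci(genotypes)
print(sorted(kept))
```

A filter on reproducibility or on per-individual missingness would follow the same pattern, with the score computed per locus or per individual instead.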
