Data from: Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris
Data files
Mar 15, 2024 version files 74.43 KB
- Koguva_raster_landscape.tiff
- Lepiku_raster_landscape.tiff
- Primula_veris_gen_diff_landscape_variables.csv
- Primula_veris_gen_div.csv
- README.md
Abstract
This dataset contains genetic and landscape data for 32 Primula veris populations on Muhu island, Estonia. The study populations are located in two 2x2 km study landscapes. Genetic samples were collected in 2014. Landscape data were extracted from maps dated 2015. The data are divided into node-based and link-based data. Node-based data contain genetic diversity measures for the P. veris populations. Link-based data contain genetic differentiation between population pairs and landscape data in buffers surrounding a straight line between the populations of each pair.
README
This Primula_veris_landscape_genetics_Readme.txt file was generated on 2024-02-28 by Iris Reinula
GENERAL INFORMATION
Title of Dataset: Data from: Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris
Author Information
A. Principal Investigator Contact Information
Name: Iris Reinula
Institution: University of Tartu, Institute of Ecology and Earth Sciences
Address: Liivi 2, 50409 Tartu, Estonia
Email: iris.reinula@ut.ee
B. Associate or Co-investigator Contact Information
Name: Tsipe Aavik
Institution: University of Tartu, Institute of Ecology and Earth Sciences
Address: Liivi 2, 50409 Tartu, Estonia
Email: tsipe.aavik@ut.ee
C. Alternate Contact Information
Name: Sabrina Träger
Institution: Martin-Luther-University Halle-Wittenberg, Institute of Biology/Geobotany and Botanical Garden
Address: Große Steinstr. 79/80, 06108 Halle (Saale), Germany
Email: sabrina.traeger@botanik.uni-halle.de
Date of data collection (single date, range, approximate date):
genetic data: 2014
map data originally created: 2015
map data modified for this dataset: 2022-2024
Geographic location of data collection: Muhu island, Estonia
Information about funding sources that supported the collection of the data:
Financial support was obtained from
the Estonian Research Council (PRG1751, MOBJD427, PUT589 and PRG874),
the European Regional Development Fund (Centre of Excellence EcolChange),
the European Commission LIFE+ Nature program (LIFE13NAT/EE/000082),
and ERA-NET program Biodiversa+ through the Estonian Ministry of Environment (Biodiversa2021-943).
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: -
Links to publications that cite or use the data:
Reinula, I., Träger, S., Järvine, H-T., Kuningas, V-M., Kaldra, M., Aavik, T. (2024).
Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris. Biological Conservation, xx.
Links to other publicly accessible locations of the data:
Links/relationships to ancillary data sets: Sequence data used to generate this data will be made available at the European Nucleotide Archive (ENA)
Was data derived from another source? yes
A. If yes, list source(s):
Map data:
Estonian Basic Map (1:10000; Estonian Land Board, 2015)
aerial photos of the study areas (Estonian Land Board, 2015)
Recommended citation for this dataset:
Reinula, I., Träger, S., Järvine, H-T., Kuningas, V-M., Kaldra, M., Aavik, T. (2024).
Data from: Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris, Estonia. Dryad, Dataset.
DATA & FILE OVERVIEW
File List:
Primula_veris_gen_div.csv - Genetic diversity data for Primula veris populations
Primula_veris_gen_diff_landscape_variables.csv - Genetic differentiation data for Primula veris and landscape data between the populations
Koguva_raster_landscape.tiff - Landscape data in raster format in one study landscape (Koguva)
Lepiku_raster_landscape.tiff - Landscape data in raster format in one study landscape (Lepiku)
Relationship between files, if important:
Additional related data collected that was not included in the current data package: n/a
Are there multiple versions of the dataset? no
METHODOLOGICAL INFORMATION
1. Description of methods used for collection/generation of data:
To generate the genetic data, leaves of Primula veris were collected from the study populations in the two study landscapes (Koguva, Lepiku), and DNA was extracted from the leaves.
Libraries were prepared from the extracted DNA using the ddRAD method (Peterson, Weber, Kay, Fisher, & Hoekstra, 2012) and sequenced.
We obtained landscape data (grasslands, shrubs, forest, agricultural land, quarry) from the Estonian Basic Map (1:10000) and modified and categorised it based on aerial photos.
Map data are from the Estonian Land Board.
See Reinula et al. 2024 for more info.
References:
Peterson, B. K., Weber, J. N., Kay, E. H., Fisher, H. S., & Hoekstra, H. E. (2012). Double digest RADseq: An inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE, 7(5), e37135. https://doi.org/10.1371/journal.pone.0037135
Reinula, I., Träger, S., Järvine, H-T., Kuningas, V-M., Kaldra, M., Aavik, T. (2024). Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris. Biological Conservation, xx.
2. Methods for processing the data:
Genetic data were filtered bioinformatically (Träger et al. 2021). Population-based genetic diversity indices (unbiased expected and observed heterozygosity, uHe and Ho, respectively) were calculated using GENALEX version 6.503 (Peakall & Smouse, 2005, 2012), and mean nucleotide diversity (π) was calculated for each population using vcftools v0.1.12b (Danecek et al., 2011) within a window of 125 bp over all loci. Inbreeding coefficients (FIS) and genetic differentiation (FST) were calculated using the package `genepop` (Rousset, 2008) in R version 3.4.2 (R Core Team, 2017).
Pairwise mean assignment probability (MAP) was calculated from assignment tests with the package assignPOP (Chen et al., 2018). For these tests, we filtered out loci with low variance (threshold at 0.95) and used Monte-Carlo cross-validation. All loci (100%) were used as training data. The classification method for prediction was linear discriminant analysis. The resulting pairwise probabilities (membership accuracies across all individuals) were directional (e.g. 1 to 2, 2 to 1). We added the two directional values of each pair together and divided the sum by two, resulting in one value per population pair (MAP; following van Strien et al., 2014).
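The averaging step for MAP can be illustrated with a minimal Python sketch. It is not the authors' R workflow; the input layout (a dictionary keyed by source and target population) and the function name `symmetric_map` are assumptions made for illustration, and the numbers are invented.

```python
from itertools import combinations


def symmetric_map(directional):
    """Average the two directional assignment probabilities of each pair.

    `directional` maps (source_population, target_population) to a
    probability. This layout is an assumption made for the sketch.
    """
    populations = sorted({pop for pair in directional for pop in pair})
    result = {}
    for a, b in combinations(populations, 2):
        # Only pairs with both directions present get a MAP value.
        if (a, b) in directional and (b, a) in directional:
            result[(a, b)] = (directional[(a, b)] + directional[(b, a)]) / 2
    return result


# Invented values for illustration only, not taken from the dataset.
example = {(1, 2): 0.10, (2, 1): 0.30, (1, 3): 0.00, (3, 1): 0.20}
pairwise_map = symmetric_map(example)
```

Each population pair ends up with a single undirected value, which matches the one-row-per-pair layout of the link-based file.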
Study populations were sampled in two 2x2 km study landscapes (Koguva, Lepiku). A 250 m buffer was added around each 2x2 km landscape, resulting in two 2.5x2.5 km squares. We calculated the proportional amount of each landscape element surrounding the straight line between the populations of each pair, in a buffer with a width of 100 m.
We calculated this only for population pairs within the same landscape. For resistance surface analysis, we converted the landscape data from vector format to raster format with 10x10 m cells.
See Reinula et al. 2024 for more info.
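The landscape geometry described above can be sketched in plain Python. This is an illustration only, not the GIS procedure used by the authors. It works on a rasterised grid rather than the original vector data, and it assumes that 100 m is the maximum distance from the line to a cell centre (the README gives the buffer as 100 m without stating whether this is a total width or a distance per side). The function names and the toy grid are hypothetical.

```python
import math
from collections import Counter

CELL_SIZE_M = 10        # raster resolution stated in the README
CORE_SIDE_M = 2000      # side of a 2x2 km study landscape
OUTER_BUFFER_M = 250    # buffer added around each landscape

side_m = CORE_SIDE_M + 2 * OUTER_BUFFER_M   # buffered side length in metres
cells_per_side = side_m // CELL_SIZE_M      # number of raster cells per side


def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the projection of P on AB, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def class_proportions(grid, start, end, max_distance_m, cell_size_m=CELL_SIZE_M):
    """Proportion of each cell value among cells whose centre lies within
    `max_distance_m` of the straight line from `start` to `end`.

    `grid` is a list of rows; `start` and `end` are (x, y) positions in
    metres measured from the top-left corner (an assumed layout).
    """
    counts = Counter()
    for row, values in enumerate(grid):
        cy = (row + 0.5) * cell_size_m
        for col, value in enumerate(values):
            cx = (col + 0.5) * cell_size_m
            if distance_to_segment(cx, cy, *start, *end) <= max_distance_m:
                counts[value] += 1
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


# Toy grid: left half value 4, right half value 3 (invented, not real data).
toy = [[4 if col < 125 else 3 for col in range(cells_per_side)]
       for _ in range(cells_per_side)]
props = class_proportions(toy, start=(1000, 1250), end=(1500, 1250),
                          max_distance_m=100)
```

The arithmetic at the top also shows why the raster files are 250 x 250 cells: a 2.5 km side divided into 10 m cells.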
References:
Chen, K.-Y., Marschall, E. A., Sovic, M. G., Fries, A. C., Gibbs, H. L., & Ludsin, S. A. (2018). assignPOP: An r package for population assignment using genetic, non-genetic,
or integrated data in a machine-learning framework. Methods in Ecology and Evolution, 9(2), 439–446. https://doi.org/10.1111/2041-210X.12897
Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., Handsaker, R. E., Lunter, G., Marth, G. T., Sherry, S. T., McVean, G., & Durbin, R. (2011).
The variant call format and VCFtools. Bioinformatics, 27(15), 2156–2158. https://doi.org/10.1093/bioinformatics/btr330
Peakall, R., & Smouse, P. E. (2005). genalex 6: Genetic analysis in Excel. Population genetic software for teaching and research.
Molecular Ecology Notes, 6(1), 288–295. https://doi.org/10.1111/j.1471-8286.2005.01155.x
Peakall, R., & Smouse, P. E. (2012). GenAlEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research—an update.
Bioinformatics, 28(19), 2537–2539. https://doi.org/10.1093/bioinformatics/bts460
Rousset, F. (2008). genepop’007: A complete re-implementation of the genepop software for Windows and Linux.
Molecular Ecology Resources, 8(1), 103–106. https://doi.org/10.1111/j.1471-8286.2007.01931.x
R Core Team. (2017). R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Reinula, I., Träger, S., Järvine, H-T., Kuningas, V-M., Kaldra, M., Aavik, T. (2024).
Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris. Biological Conservation, xx.
Träger, S., Rellstab, C., Reinula, I., Zemp, N., Helm, A., Holderegger, R., Aavik, T. (2021).
Genetic diversity at putatively adaptive but not neutral loci in Primula veris responds to recent habitat change in semi-natural grasslands. bioRxiv 2021.05.12.442254. https://doi.org/10.1101/2021.05.12.442254
van Strien, M. J., Keller, D., Holderegger, R., Ghazoul, J., Kienast, F., & Bolliger, J. (2014). Landscape genetics as a tool for conservation planning:
Predicting the effects of landscape change on gene flow. Ecological Applications, 24(2), 327–339. https://doi.org/10.1890/13-0442.1
3. Instrument- or software-specific information needed to interpret the data: n/a
4. Standards and calibration information, if appropriate: n/a
5. Environmental/experimental conditions: Experimental conditions do not apply.
Environmental conditions: samples for genetic analysis were collected during summer in mostly dry and sunny weather.
6. Describe any quality-assurance procedures performed on the data:
7. People involved with sample collection, processing, analysis and/or submission:
Iris Reinula, Sabrina Träger, Hanna-Triinu Järvine, Vete-Mari Kuningas, Marianne Kaldra, Tsipe Aavik, Marge Thetloff, Liis Kasari-Toussaint
DATA-SPECIFIC INFORMATION FOR: Primula_veris_gen_div.csv
Number of variables: 9
Number of cases/rows: 32
Delimiter: semicolon tab
Decimal operator: full stop (.)
Variable List:
- Population_ID: identification number of the population
- Region: the study landscape, either Koguva or Lepiku
- Latitude
- Longitude
- Samples: number of samples in each population
- Ho: observed heterozygosity (genetic diversity index)
- He: unbiased expected heterozygosity (genetic diversity index)
- FIS: inbreeding coefficient (genetic diversity index)
- π: nucleotide diversity (genetic diversity index)
Missing data codes: n/a
Specialized formats or other abbreviations used: n/a
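A minimal Python sketch for reading this file follows. Because the delimiter line above names more than one character, the sketch picks the delimiter from the header line instead of hard-coding it. The helper names and the two sample rows are invented placeholders that only mimic the documented columns.

```python
import csv
import io


def detect_delimiter(header_line, candidates=(";", "\t", ",")):
    """Pick the candidate delimiter that occurs most often in the header."""
    return max(candidates, key=header_line.count)


def read_rows(text):
    """Parse delimited text into a list of dicts keyed by column name."""
    header = text.splitlines()[0]
    reader = csv.DictReader(io.StringIO(text),
                            delimiter=detect_delimiter(header))
    return list(reader)


# Invented sample that mimics the documented columns; not real data.
sample = (
    "Population_ID;Region;Latitude;Longitude;Samples;Ho;He;FIS;π\n"
    "1;Koguva;58.6;23.1;10;0.20;0.22;0.05;0.003\n"
    "2;Lepiku;58.6;23.2;9;0.21;0.23;0.04;0.003\n"
)
rows = read_rows(sample)

# Convert the numeric columns; the decimal operator is a full stop.
float_columns = ("Latitude", "Longitude", "Ho", "He", "FIS", "π")
for row in rows:
    for key in float_columns:
        row[key] = float(row[key])
    row["Samples"] = int(row["Samples"])
```

For the real file, the text could be obtained with `open(path, encoding="utf-8", newline="").read()`; the UTF-8 encoding is an assumption, and it matters because one column name is the non-ASCII character π.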
DATA-SPECIFIC INFORMATION FOR: Primula_veris_gen_diff_landscape_variables.csv
Number of variables: 10
Number of cases/rows: 244
Delimiter: tab
Decimal operator: full stop (.)
Variable List:
- Population_ID_1: identification number of the first population in a pair
- Population_ID_2: identification number of the second population in a pair
- Geographical_distance_m: geographical distance in a straight line between two populations (m)
- MAP: pairwise mean assignment probability, a genetic distance index
- FST: pairwise genetic differentiation index
- Grassland_proportion: proportional amount of grassland within the buffer zone (d = 100 m) surrounding the straight corridor between two populations.
- Shrubs_proportion: proportional amount of shrubs within the buffer zone (d = 100 m) surrounding the straight corridor between two populations.
- Agricultural_alnd_proportion: proportional amount of agricultural land within the buffer zone (d = 100 m) surrounding the straight corridor between two populations.
- Forest_proportion: proportional amount of forest within the buffer zone (d = 100 m) surrounding the straight corridor between two populations.
- Quarry_proportion: proportional amount of quarry within the buffer zone (d = 100 m) surrounding the straight corridor between two populations.
Missing data codes: NA
Specialized formats or other abbreviations used: n/a
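The link-based table has one row per population pair, so a lookup that works in both directions is often convenient. The Python sketch below shows one way to build it while mapping the NA code to NaN. The helper names and sample rows are invented. The sketch also checks the row count: pairs were calculated only within a landscape, so fewer than the 496 possible pairs among 32 populations are expected. A split of 18 and 14 populations, for instance, would give the stated 244 rows, but the README does not state the actual split, so treat that split as an assumption to verify against the Region column of Primula_veris_gen_div.csv.

```python
import math


def n_pairs(n_populations):
    """Number of unordered pairs among n populations."""
    return n_populations * (n_populations - 1) // 2


def parse_value(text):
    """Convert a field to float, mapping the missing data code NA to NaN."""
    return math.nan if text == "NA" else float(text)


def to_symmetric_lookup(rows, column):
    """Build a lookup in which (a, b) and (b, a) return the same value."""
    lookup = {}
    for row in rows:
        a, b = row["Population_ID_1"], row["Population_ID_2"]
        value = parse_value(row[column])
        lookup[(a, b)] = value
        lookup[(b, a)] = value
    return lookup


# Invented placeholder rows, not real data.
rows = [
    {"Population_ID_1": "1", "Population_ID_2": "2", "FST": "0.05"},
    {"Population_ID_1": "1", "Population_ID_2": "3", "FST": "NA"},
]
fst = to_symmetric_lookup(rows, "FST")

all_pairs = n_pairs(32)                      # pairs if landscapes were pooled
hypothetical_rows = n_pairs(18) + n_pairs(14)  # assumed 18/14 split
```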
DATA-SPECIFIC INFORMATION FOR: Koguva_raster_landscape.tiff
Dimensions: 250 x 250
Cell values:
1 - arable land
2 - quarry
3 - forest
4 - semi-natural grassland
5 - shrubs
6 - other landscape elements
DATA-SPECIFIC INFORMATION FOR: Lepiku_raster_landscape.tiff
Dimensions: 250 x 250
Cell values:
1 - arable land
2 - other landscape elements
3 - forest
4 - semi-natural grassland
5 - shrubs
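Note that the two rasters use different legends: value 2 means quarry in Koguva but other landscape elements in Lepiku, and only Koguva has a value 6. The Python sketch below keeps one legend per landscape so that cell values are never interpreted with the wrong file's codes. It operates on an in-memory grid; loading the TIFF files is left out, and the function name and toy grid are hypothetical.

```python
from collections import Counter

# Legends copied from the cell value lists in this README.
LEGENDS = {
    "Koguva": {
        1: "arable land",
        2: "quarry",
        3: "forest",
        4: "semi-natural grassland",
        5: "shrubs",
        6: "other landscape elements",
    },
    "Lepiku": {
        1: "arable land",
        2: "other landscape elements",
        3: "forest",
        4: "semi-natural grassland",
        5: "shrubs",
    },
}


def label_proportions(grid, landscape):
    """Share of each land-cover label in a grid given as a list of rows."""
    legend = LEGENDS[landscape]
    counts = Counter(value for row in grid for value in row)
    total = sum(counts.values())
    return {
        legend.get(value, f"unlisted value {value}"): n / total
        for value, n in counts.items()
    }


# Toy 2 x 2 grid for illustration only.
toy = [[1, 2], [2, 4]]
koguva = label_proportions(toy, "Koguva")
lepiku = label_proportions(toy, "Lepiku")
```

The same toy grid yields a quarry share under the Koguva legend and an "other landscape elements" share under the Lepiku legend, which is why the legend has to be chosen per file.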
Methods
To generate the genetic data, leaves of Primula veris were collected from the study populations, and DNA was extracted from the leaves. Libraries were prepared from the extracted DNA using the ddRAD method (Peterson, Weber, Kay, Fisher, & Hoekstra, 2012) and sequenced.
Genetic data were filtered bioinformatically (see Träger et al. 2021). Population-based genetic diversity indices (unbiased expected and observed heterozygosity, uHe and Ho, respectively) were calculated using GENALEX version 6.503 (Peakall & Smouse, 2005, 2012), and mean nucleotide diversity (π) was calculated for each population using vcftools v0.1.12b (Danecek et al., 2011) within a window of 125 bp over all loci. Inbreeding coefficients (FIS) and genetic differentiation (FST) were calculated using the package `genepop` (Rousset, 2008) in R version 3.4.2 (R Core Team, 2017).
Pairwise mean assignment probability (MAP) was calculated from assignment tests with the package assignPOP (Chen et al., 2018). For these tests, we filtered out loci with low variance (threshold at 0.95) and used Monte-Carlo cross-validation. All loci (100%) were used as training data. The classification method for prediction was linear discriminant analysis. The resulting pairwise probabilities (membership accuracies across all individuals) were directional (e.g. 1 to 2, 2 to 1). We added the two directional values of each pair together and divided the sum by two, resulting in one value per population pair (MAP; following van Strien et al., 2014).
Study populations were sampled in two 2x2 km study landscapes (Koguva, Lepiku). A 250 m buffer was added around each 2x2 km landscape, resulting in two 2.5x2.5 km squares. We calculated the proportional amount of each landscape element surrounding the straight line between the populations of each pair, in a buffer with a width of 100 m. We calculated this only for population pairs within the same landscape. For resistance surface analysis, we converted the landscape data from vector format to raster format with 10x10 m cells.
References:
- Chen, K.-Y., Marschall, E. A., Sovic, M. G., Fries, A. C., Gibbs, H. L., & Ludsin, S. A. (2018). assignPOP: An r package for population assignment using genetic, non-genetic, or integrated data in a machine-learning framework. Methods in Ecology and Evolution, 9(2), 439–446. https://doi.org/10.1111/2041-210X.12897
- Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., Handsaker, R. E., Lunter, G., Marth, G. T., Sherry, S. T., McVean, G., & Durbin, R. (2011). The variant call format and VCFtools. Bioinformatics, 27(15), 2156–2158. https://doi.org/10.1093/bioinformatics/btr330
- Peakall, R., & Smouse, P. E. (2005). genalex 6: Genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6(1), 288–295. https://doi.org/10.1111/j.1471-8286.2005.01155.x
- Peakall, R., & Smouse, P. E. (2012). GenAlEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research—an update. Bioinformatics, 28(19), 2537–2539. https://doi.org/10.1093/bioinformatics/bts460
- Rousset, F. (2008). genepop’007: A complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources, 8(1), 103–106. https://doi.org/10.1111/j.1471-8286.2007.01931.x
- R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
- Reinula, I., Träger, S., Järvine, H-T., Kuningas, V-M., Kaldra, M., Aavik, T. (2024). Beware of the impact of land use legacy on genetic connectivity: A case study of the long-lived perennial Primula veris. Biological Conservation, xx.
- Träger, S., Rellstab, C., Reinula, I., Zemp, N., Helm, A., Holderegger, R., Aavik, T. (2021). Genetic diversity at putatively adaptive but not neutral loci in Primula veris responds to recent habitat change in semi-natural grasslands. bioRxiv 2021.05.12.442254. https://doi.org/10.1101/2021.05.12.442254
- van Strien, M. J., Keller, D., Holderegger, R., Ghazoul, J., Kienast, F., & Bolliger, J. (2014). Landscape genetics as a tool for conservation planning: Predicting the effects of landscape change on gene flow. Ecological Applications, 24(2), 327–339. https://doi.org/10.1890/13-0442.1