Genotypes and geographic positions of 5797 European white oaks from 636 locations genotyped at 355 nuclear SNPs and 28 maternally inherited SNPs of the chloroplast and mitochondria
Data files
Nov 24, 2023 version files 10.03 MB
- coding_genotypes.csv
- genotypes.csv
- README.md
Abstract
The data set is the result of a genetic inventory of 5797 white oaks collected at 636 locations all over Europe. The oak trees were assigned in forest inventories as Quercus robur L. (3342), Quercus petraea Matt. (2090), Quercus pubescens Willd. (170) or as unspecified Quercus spp. (195). The sampling focused on central and eastern Europe as well as the Black Sea and Caucasus region. All individuals were genotyped at 355 nuclear SNPs and 28 maternally inherited SNPs of the chloroplast and mitochondria. The combination of the maternally inherited SNPs resulted in 26 different haplotypes.
The genotype of each individual is one row in the csv-file “genotypes”. The genotypes at the nuclear markers are diploid and represented by two columns per gene marker. The genetic information of the organelle genomes is haploid, so one column is used for each of these gene markers. Genotypes are coded by Arabic numbers. The meaning of the numbers is explained in the table “coding genotypes” in a second csv-file. Each individual has a unique "Genotype_ID" and a "Thuenen_Sample_ID". The "Thuenen_Sample_ID" is a unique ID that serves to identify the sample in our depository at the Thuenen Institute of Forest Genetics. Each individual has data on its geographic origin given as "Longitude" and "Latitude" in decimal degrees. For each individual, the putative oak species ("Putative species") as assigned in the forest inventories is given. The numbers of the "Haplotype" represent the multilocus combination of the mitochondrial and chloroplast SNPs of that individual.
README
The genotype of each individual is one row in the csv-table “genotypes”.
Each individual has a unique "Genotype_ID" and a "Thuenen_Sample_ID".
The "Thuenen_Sample_ID" is a unique ID that serves to identify the sample in our depository at the Thuenen Institute of Forest Genetics.
Each individual has data on the geographic origin given as "Longitude" and "Latitude" in decimal degrees.
For each individual the putative oak species ("Putative species") as it has been assigned in the forest inventories is given.
The numbers of the "Haplotype" represent the multilocus combination of the mitochondrial and chloroplast SNPs of that individual.
The genotypes at the nuclear markers are diploid and represented by two columns (a + b) per gene marker.
The genetic information at the organelle genome is haploid.
For each of these gene markers one column is used. Genotypes are coded by Arabic numbers.
The meaning of the numbers is explained in the csv-table “coding genotypes”.
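The layout described above (one row per individual, two columns per nuclear marker, one column per organelle marker) could be read along the following lines. This is only a minimal sketch under stated assumptions: the README says nuclear markers use two columns "(a + b)", but the exact column naming is not given here, so the `_a`/`_b` suffixes, the marker names `nSNP1` and `cpSNP1`, the comma delimiter and all values in the inline sample are hypothetical.

```python
import csv
import io


def split_marker_columns(header):
    """Split column names into diploid markers (paired _a/_b columns) and all other columns.

    Assumption: the two columns of a nuclear marker share a name and end in
    "_a" and "_b". Adjust the suffixes to match the real file header.
    """
    columns = set(header)
    diploid = [
        name[:-2]
        for name in header
        if name.endswith("_a") and name[:-2] + "_b" in columns
    ]
    paired = {marker + suffix for marker in diploid for suffix in ("_a", "_b")}
    other = [name for name in header if name not in paired]
    return diploid, other


def read_genotypes(handle):
    """Return one dict per individual; diploid markers become (allele_a, allele_b) tuples."""
    reader = csv.DictReader(handle)
    diploid, other = split_marker_columns(reader.fieldnames)
    records = []
    for row in reader:
        record = {name: row[name] for name in other}
        for marker in diploid:
            record[marker] = (row[marker + "_a"], row[marker + "_b"])
        records.append(record)
    return records


# Hypothetical two-row sample standing in for genotypes.csv.
SAMPLE = """Genotype_ID,Thuenen_Sample_ID,Longitude,Latitude,Putative species,Haplotype,nSNP1_a,nSNP1_b,cpSNP1
G1,T100,10.5,52.1,Quercus robur,3,1,2,4
G2,T101,44.8,41.7,Quercus petraea,7,2,2,1
"""

records = read_genotypes(io.StringIO(SAMPLE))
print(records[0]["nSNP1"], records[0]["cpSNP1"])
```

For the real file, the same function could be called with `open("genotypes.csv", newline="")` instead of the in-memory sample. The numeric codes are kept as strings so that they can be looked up in the “coding genotypes” table without conversion.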
Methods
We collected samples (cambium or leaves) from 5797 white oak trees at 636 locations all over Europe. The majority of the samples came from central and eastern Europe as well as from the Black Sea and Caucasus region. As putative species, the samples included 3342 Q. robur, 2090 Q. petraea, 170 Q. pubescens and 195 unspecified white oak samples. The presumed species of the samples were derived from the species classification of the stands of origin made by forest experts in the frame of forest inventories. The sample size varied between 1 and 48 trees at each location. The vast majority (95%) of the samples were collected in forest stands, whereas a few samples were taken from provenance trials. Most material is from adult trees with a diameter at breast height above 20 cm.
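The figures in this paragraph can be checked against the data with a short tally. The sketch below first confirms that the reported species counts add up to the total, then counts trees per location. It assumes that a location can be identified by an identical "Longitude"/"Latitude" pair, which the description does not state explicitly, and the three rows are hypothetical stand-ins for rows of the genotypes table.

```python
from collections import Counter

# Counts per putative species as reported in the text.
REPORTED = {
    "Q. robur": 3342,
    "Q. petraea": 2090,
    "Q. pubescens": 170,
    "unspecified": 195,
}
total_reported = sum(REPORTED.values())


def trees_per_location(rows):
    """Count individuals per location.

    Assumption: trees sampled at the same location share identical
    "Longitude" and "Latitude" values.
    """
    return Counter((row["Longitude"], row["Latitude"]) for row in rows)


# Hypothetical rows; real rows would come from genotypes.csv.
ROWS = [
    {"Genotype_ID": "G1", "Longitude": "10.5", "Latitude": "52.1"},
    {"Genotype_ID": "G2", "Longitude": "10.5", "Latitude": "52.1"},
    {"Genotype_ID": "G3", "Longitude": "44.8", "Latitude": "41.7"},
]

counts = trees_per_location(ROWS)
smallest, largest = min(counts.values()), max(counts.values())
print(total_reported, len(counts), smallest, largest)
```

Applied to the full table, `len(counts)`, `smallest` and `largest` could be compared with the 636 locations and the range of 1 to 48 trees per location given above, provided the coordinate assumption holds.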
For all samples, 355 nuclear SNPs and 28 maternally inherited chloroplast and mitochondrial SNPs were analysed by targeted genotyping by sequencing. Details of the gene markers used are given by Degen et al. (2021).