Data from: Cannabis labelling is associated with genetic variation in terpene synthase genes
Data files
Jul 30, 2021 version, files total 178.24 MB
20191209_bedrocan_gen_filtered_pruned.raw (23.38 MB)
20191209_bedrocan_gen_filtered.genome (819.90 KB)
20191209_bedrocan_gen_filtered.hmp.txt (53.34 MB)
20191209_bedrocan_gen_filtered.map (2.99 MB)
20191209_bedrocan_gen_filtered.mdist (165.92 KB)
20191209_bedrocan_gen_filtered.mdist.id (1 KB)
20191209_bedrocan_gen_filtered.nosex (1 KB)
20191209_bedrocan_gen_filtered.ped (63.73 MB)
202008011_bedrocan_gen_filtered.raw (33.59 MB)
2020080607_bedro_kinship.txt (222.40 KB)
README_dryad.md (1.34 KB)
Abstract
This dataset contains genetic data consisting of >100,000 single nucleotide polymorphisms (SNPs), collected using genotyping-by-sequencing from 137 drug-type Cannabis samples from the Netherlands. These genetic data, along with terpene and cannabinoid content data collected with GC-FID, were used to analyze Cannabis labelling and to perform a genome-wide association study (GWAS).
The DNA sequence data are available as NCBI BioProject PRJNA713792. SNPs were called in TASSEL (version 5.0) by aligning reads to the CBDRx reference genome. The SNP data were filtered using PLINK to exclude SNPs with a minor allele frequency (MAF) < 0.05 and SNPs with excess heterozygosity. The final SNP data set used for GWAS consisted of 116,296 SNPs from 137 samples. For principal component analysis (PCA), 1,257 unanchored SNPs were removed, and the remaining 115,039 SNPs were pruned for linkage disequilibrium (LD) using PLINK, resulting in 80,939 SNPs.
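To illustrate the MAF filter described above, the following is a minimal Python sketch that reads a PLINK additive-coded .raw file and computes the minor allele frequency per SNP. It assumes the standard .raw layout (six leading columns FID, IID, PAT, MAT, SEX, PHENOTYPE, followed by one 0/1/2/NA dosage column per SNP). The function names and the toy data are hypothetical, and this is not the authors' pipeline, which used PLINK itself.

```python
import io

# Columns that PLINK writes before the genotype columns in a .raw file
# (assumed standard additive-recode layout).
META_COLUMNS = ("FID", "IID", "PAT", "MAT", "SEX", "PHENOTYPE")


def read_raw(handle):
    """Parse a PLINK .raw file into (sample_ids, snp_names, dosages).

    dosages[i][j] is the counted-allele dosage (0, 1 or 2) of sample i
    at SNP j, or None where the file contains NA.
    """
    header = handle.readline().split()
    if tuple(header[:6]) != META_COLUMNS:
        raise ValueError("unexpected .raw header")
    snp_names = header[6:]
    sample_ids, dosages = [], []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        sample_ids.append(fields[1])  # IID column
        dosages.append([None if g == "NA" else int(g) for g in fields[6:]])
    return sample_ids, snp_names, dosages


def minor_allele_frequencies(dosages):
    """Return the MAF of each SNP column, ignoring missing genotypes."""
    mafs = []
    for column in zip(*dosages):
        called = [g for g in column if g is not None]
        if not called:
            mafs.append(None)  # no genotype calls at this SNP
            continue
        freq = sum(called) / (2 * len(called))  # counted-allele frequency
        mafs.append(min(freq, 1 - freq))
    return mafs


# Toy input (hypothetical samples and SNP names), not taken from the dataset.
toy = io.StringIO(
    "FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G snp3_T\n"
    "s1 s1 0 0 0 -9 0 1 2\n"
    "s2 s2 0 0 0 -9 0 1 NA\n"
    "s3 s3 0 0 0 -9 1 2 2\n"
    "s4 s4 0 0 0 -9 0 0 2\n"
)
samples, snps, dosages = read_raw(toy)
mafs = minor_allele_frequencies(dosages)

# Keep SNPs at or above the 0.05 MAF threshold mentioned in the description.
kept = [name for name, maf in zip(snps, mafs) if maf is not None and maf >= 0.05]
print(kept)
```

To try this on a deposited file, replace the toy `io.StringIO` object with `open(path)` for one of the .raw files. According to the description, the number of samples returned should be 137. Because the deposited data were already filtered, few or no SNPs should fall below the threshold.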