Data from: Cannabis labelling is associated with genetic variation in terpene synthase genes
Cite this dataset
Watts, Sophie et al. (2021). Data from: Cannabis labelling is associated with genetic variation in terpene synthase genes [Dataset]. Dryad. https://doi.org/10.5061/dryad.gqnk98smm
Genetic data consisting of >100,000 single nucleotide polymorphisms (SNPs), collected using genotyping-by-sequencing from 137 drug-type Cannabis samples from the Netherlands. These genetic data, along with terpene and cannabinoid content data collected with GC-FID, were used to analyze Cannabis labelling and to perform a genome-wide association study (GWAS).
The DNA sequence data are available as NCBI BioProject PRJNA713792. Single nucleotide polymorphisms (SNPs) were called in TASSEL (version 5.0) by aligning to the CBDRx reference genome. The SNP data were filtered using PLINK to exclude SNPs with a minor allele frequency (MAF) below 0.05 and SNPs with excess heterozygosity. The final SNP data set used for the genome-wide association study (GWAS) consisted of 116,296 SNPs from 137 samples. For principal component analysis (PCA), 1,257 unanchored SNPs were removed, and the remaining 115,039 SNPs were pruned for linkage disequilibrium (LD) using PLINK, resulting in 80,939 SNPs.
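The MAF filter and the SNP counts described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used TASSEL and PLINK. The genotype encoding (alternate-allele counts of 0, 1, or 2, with `None` for a missing call), the function names, and the toy SNPs are all assumptions made for illustration. Only the 0.05 threshold and the SNP counts come from the description.

```python
from typing import Optional, Sequence

# Assumed encoding: alternate-allele count per sample (0, 1, or 2); None = missing call.
Genotype = Optional[int]


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> Optional[float]:
    """Return the minor allele frequency of one SNP, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no usable calls, so the frequency is undefined
    # Each diploid sample contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf(genotypes: Sequence[Genotype], threshold: float = 0.05) -> bool:
    """Keep a SNP only if its MAF is at or above the threshold."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf >= threshold


# Hypothetical toy SNPs, not taken from the dataset.
toy_snps = {
    "snp_common": [0, 1, 2, 1, 0, None, 1, 2],
    "snp_rare": [0] * 39 + [1],
}
kept = [name for name, calls in toy_snps.items() if passes_maf(calls)]

# Counts reported in the description: GWAS set minus unanchored SNPs.
GWAS_SNPS = 116_296
UNANCHORED_SNPS = 1_257
anchored_snps = GWAS_SNPS - UNANCHORED_SNPS  # SNPs entering LD pruning for PCA
```

Here `kept` retains only the toy SNP whose minor allele is common enough, and `anchored_snps` reproduces the 115,039 SNPs reported before LD pruning. The excess-heterozygosity filter and the LD pruning step are left out because the description does not state their thresholds.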