Gene copy number is associated with phytochemistry in Cannabis sativa

Cite this dataset

Vergara, Daniela (2019). Gene copy number is associated with phytochemistry in Cannabis sativa [Dataset]. Dryad.


Gene copy number variation is known to be important in nearly every species where it has been examined. Alterations in gene copy number may provide a fast way of acquiring diversity, allowing rapid adaptation under strong selective pressures, and may also be a key component of standing genetic variation within species. Cannabis sativa plants produce a distinguishing set of secondary metabolites, the cannabinoids, many of which have medicinal utility. Two major cannabinoids, THCA and CBDA, are products of a three-step biochemical pathway. Using whole genome shotgun sequence data for 69 Cannabis cultivars from diverse lineages within the species, we found that genes encoding the synthases in this pathway vary in copy number. Transcriptome sequence data show that the cannabinoid paralogs are differentially expressed among lineages within the species. We also found that copy number partially explains variation in cannabinoid content levels among Cannabis plants. Our results demonstrate that biosynthetic genes found at multiple points in the pathway could be useful for breeding purposes, and suggest that natural and artificial selection have shaped copy number variation. Truncations in specific paralogs are associated with lack of production of particular cannabinoids, showing how phytochemical diversity can evolve through a complex combination of processes.