Dryad

Genome sequencing of four culinary herbs reveals terpenoid genes underlying chemodiversity in the Nepetoideae

Cite this dataset

Bornowski, Nolan et al. (2020). Genome sequencing of four culinary herbs reveals terpenoid genes underlying chemodiversity in the Nepetoideae [Dataset]. Dryad. https://doi.org/10.5061/dryad.jwstqjq6t

Abstract

Species within the mint family, Lamiaceae, are widely used for their culinary, cultural, and medicinal properties due to production of a wide variety of specialized metabolites, especially terpenoids. To further our understanding of genome diversity in the Lamiaceae and to provide a resource for mining biochemical pathways, we generated high-quality genome assemblies of four economically important culinary herbs, namely, sweet basil (Ocimum basilicum L.), sweet marjoram (Origanum majorana L.), oregano (Origanum vulgare L.), and rosemary (Rosmarinus officinalis L.), and characterized their terpenoid diversity through metabolite profiling and genomic analyses. A total of 25 monoterpenes and 11 sesquiterpenes were identified in leaf tissue from the four species. Genes encoding enzymes responsible for the biosynthesis of precursors for mono- and sesquiterpene synthases were identified in all four species. Across all four species, a total of 235 terpene synthases were identified, ranging from 27 in O. majorana to 137 in the tetraploid O. basilicum. This study provides valuable resources for further investigation of the genetic basis of chemodiversity in these important culinary herbs.

Usage notes

- aa.min_10k_final.fa

FASTA genome assembly with a minimum scaffold size of 10k and no N scaffolds

- aa.working_models.cdna.fa

cDNA sequences of all isoforms of the working gene set

- aa.working_models.cds.fa

CDS of all isoforms of the working gene set

- aa.working_models.gff3

GFF of all isoforms of the working gene set

- aa.working_models.pep.fa

Peptide sequence of all isoforms of the working gene set

- aa.working_models.pep.func_anno.txt

Functional annotation of the working gene set

- aa.gene_models.hc.cdna.fa

cDNA sequences of all high confidence gene models

- aa.gene_models.hc.cds.fa

CDS of all high confidence gene models

- aa.gene_models.hc.gff3

GFF of all high confidence gene models

- aa.gene_models.hc.pep.fa

Peptide sequences of all high confidence gene models

- aa.gene_models.hc.pep.func_anno.txt

Functional annotation of the high confidence gene set

- aa.gene_models.hc.repr.gff3

GFF of high confidence representative gene models

- aa.gene_models.hc.repr.pep.fa

Peptide sequences of high confidence representative gene models

- Cufflinks_FPKMs_functions_all_hc_genes.txt

Expression abundances of the high confidence gene model set

- Cufflinks_FPKMs_functions_working_genes.txt

Expression abundances of the working gene model set

- Readme_culinary_herbs_DataDryad_files.docx

Readme
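
The assembly file is described above as containing only scaffolds of at least 10k. The following is a minimal sketch of how a user might check that after download. It assumes "10k" means 10,000 bases; the function names (read_fasta_lengths, scaffolds_below) and the in-memory demo sequences are hypothetical and not part of the dataset. To run it on a real file, pass an open file handle instead of the demo stream.

```python
import io


def read_fasta_lengths(handle):
    """Return {sequence_id: length} for a FASTA text stream."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def scaffolds_below(lengths, min_len=10_000):
    """List IDs of sequences shorter than min_len (assumed threshold: 10,000 bases)."""
    return sorted(name for name, n in lengths.items() if n < min_len)


# Hypothetical demo data standing in for a downloaded assembly file.
demo = io.StringIO(
    ">scaf1 demo\n" + "ACGT" * 3000 + "\n"
    ">scaf2\n" + "ACGT" * 10 + "\n"
)
demo_lengths = read_fasta_lengths(demo)
short_ids = scaffolds_below(demo_lengths)
```

For an assembly that matches the description, scaffolds_below should return an empty list.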

Funding

National Science Foundation, Award: IOS-1444499

United States Department of Agriculture, Award: 177845