High-value plant metabolite production in synthetic biosystems

Cite this dataset

Li, Qiushi; Morris, Jeremy; Facchini, Peter; Yeaman, Sam (2021). High-value plant metabolite production in synthetic biosystems [Dataset]. Dryad.


Ephedra sinica is a high-value medicinal plant that produces the important phenylpropylamino alkaloids pseudoephedrine and ephedrine. Few genomic resources exist for E. sinica, which has been characterized as a tetraploid with a monoploid genome size of 8.56 Gb. Here we report a partial genome assembly of E. sinica (12.8 Gb) based on low-coverage Illumina short-read sequencing.
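To make "low coverage" concrete, the short sketch below divides the raw base count reported in the methods (105.5 Gb) by two reference sizes: the monoploid genome (8.56 Gb) and a total nuclear content approximated as ploidy times the monoploid size. That approximation, and all variable names, are assumptions made for illustration and are not part of the dataset description.

```python
# Values quoted from the dataset description.
MONOPLOID_GB = 8.56    # monoploid genome size
PLOIDY = 4             # E. sinica is described as a tetraploid
ASSEMBLY_GB = 12.8     # size of the partial assembly
RAW_BASES_GB = 105.5   # raw Illumina base count from the methods

# Assumption: total nuclear content ~ ploidy x monoploid size.
total_gb = MONOPLOID_GB * PLOIDY

# Sequencing depth relative to each reference size (raw, before filtering).
depth_monoploid = RAW_BASES_GB / MONOPLOID_GB
depth_total = RAW_BASES_GB / total_gb

# Assembly size expressed as a multiple of the monoploid genome.
assembly_vs_monoploid = ASSEMBLY_GB / MONOPLOID_GB

print(f"depth vs monoploid genome: {depth_monoploid:.1f}x")
print(f"depth vs approx. total content: {depth_total:.1f}x")
print(f"assembly / monoploid size: {assembly_vs_monoploid:.2f}")
```

Under these assumptions the raw data amount to roughly 12x depth over the monoploid genome and roughly 3x over the approximated total content. Depth after fastp filtering would be somewhat lower.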


Total genomic DNA was extracted using a standard CTAB protocol modified by the addition of RNase A treatment and polysaccharide removal. gDNA greater than 20 kb in length was obtained. For shotgun sequencing on the Illumina HiSeq X, we constructed four PCR-free libraries with an insert size of 550 bp and obtained 349 million 150 bp paired-end reads (base count of 105.5 Gb). We first filtered the raw Illumina data with fastp (-q 20 -u 40 -l 51). The SOAPdenovo-63mer module of SOAPdenovo2 was used for contig-level assembly with the best k-mer size of 25. The same dataset was used for the scaffolding step.
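The fastp options above set three per-read thresholds: -q 20 treats a base as qualified when its Phred score is at least 20, -u 40 allows at most 40% unqualified bases per read, and -l 51 discards reads shorter than 51 bp. The sketch below is a minimal Python illustration of that logic only, assuming Phred+33 quality strings. The function names are hypothetical, and the sketch omits fastp's other default processing (for example adapter trimming and N-base filtering), so it is not a replacement for the tool.

```python
def phred33(qual: str) -> list[int]:
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(c) - 33 for c in qual]


def passes_filter(qual: str, min_q: int = 20,
                  max_unqualified_pct: int = 40,
                  min_len: int = 51) -> bool:
    """Return True if a read would survive the three thresholds.

    min_q               ~ fastp -q (qualified quality threshold)
    max_unqualified_pct ~ fastp -u (allowed percent of unqualified bases)
    min_len             ~ fastp -l (required read length)
    """
    scores = phred33(qual)
    if len(scores) < min_len:
        return False
    unqualified = sum(1 for s in scores if s < min_q)
    # The read fails only when unqualified bases exceed the allowed percent.
    return unqualified * 100 <= max_unqualified_pct * len(scores)


# Hypothetical quality strings: 'I' encodes Q40, '#' encodes Q2.
high_quality = "I" * 150
too_short = "I" * 50
too_noisy = "#" * 61 + "I" * 89   # about 40.7% unqualified bases

print(passes_filter(high_quality))
print(passes_filter(too_short))
print(passes_filter(too_noisy))
```

For paired-end data, fastp keeps a pair only when both mates pass, so in practice the check would be applied to each mate of a pair.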