High-value plant metabolite production in synthetic biosystems
Data files
Jun 17, 2021 version files 31.26 GB
- E.sinica_contig.fasta, 17.73 GB
- E.sinica_scaf.fasta, 13.52 GB
Abstract
Ephedra sinica is a high-value medicinal plant that produces the important phenylpropylamino alkaloids pseudoephedrine and ephedrine. Few genomic resources exist for E. sinica, which has been characterized as a tetraploid with a monoploid genome size of 8.56 Gb. Here we report a partial genome assembly of E. sinica (12.8 Gb) based on low-coverage Illumina short-read sequencing.
Total genomic DNA was extracted using a standard CTAB protocol, modified by the addition of RNase A treatment and polysaccharide removal. gDNA of more than 20 kb in length was obtained. For shotgun sequencing on the Illumina HiSeq X, we constructed four PCR-free libraries with an insert size of 550 bp and obtained 349 million 150 bp paired-end reads (base count of 105.5 Gb). We first filtered the raw Illumina data with fastp (-q 20 -u 40 -l 51). The SOAPdenovo-63mer module of SOAPdenovo2 was then used for contig-level assembly with the best-performing k-mer size of 25. The same dataset was used for the scaffolding step.
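To put "low coverage" in numbers, the following minimal sketch estimates average sequencing depth as sequenced bases divided by genome size, using only figures quoted above. It rests on two assumptions that the record does not state: that the full tetraploid complement is roughly four times the monoploid size, and that the raw base count of 105.5 Gb (before fastp filtering) is an acceptable stand-in for the bases actually assembled. The function and variable names are illustrative only.

```python
# Figures quoted in the text above.
TOTAL_BASES_GB = 105.5     # reported raw base count of the read set
MONOPLOID_SIZE_GB = 8.56   # reported monoploid genome size
PLOIDY = 4                 # tetraploid; assumes 4 monoploid copies in total


def sequencing_depth(total_bases_gb: float, genome_size_gb: float) -> float:
    """Average depth = sequenced bases / genome size (same units)."""
    return total_bases_gb / genome_size_gb


# Depth if all reads are spread over a single monoploid copy.
depth_monoploid = sequencing_depth(TOTAL_BASES_GB, MONOPLOID_SIZE_GB)

# Depth if reads are spread over all four copies (assumption: 4 x monoploid).
depth_all_copies = sequencing_depth(TOTAL_BASES_GB, MONOPLOID_SIZE_GB * PLOIDY)

print(f"Depth per monoploid genome: {depth_monoploid:.1f}x")     # about 12.3x
print(f"Depth over all four copies: {depth_all_copies:.1f}x")    # about 3.1x
```

Under these assumptions the depth is roughly 12x per monoploid genome and roughly 3x across all four copies. Filtering removes some reads, so the effective depth is somewhat lower than either estimate.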