Chloroplast genome of the critically endangered ginger Zingiber odoriferum Blume from Java, Indonesia: Characterization, comparison, and conservation insights
Data files
Aug 19, 2025 version files (167.36 KB)
- README.md (1.75 KB)
- Zingodor_cpGenome_fasta.fasta (165.60 KB)
Abstract
Zingiber odoriferum (Zingiberaceae), a critically endangered ginger species endemic to Java, Indonesia, is ecologically vital yet genomically understudied. We present the first complete chloroplast (cp) genome (163,538 bp), revealing a quadripartite structure with a large single-copy region (88,033 bp), a small single-copy region (15,885 bp), and a pair of inverted repeat regions (29,810 bp each), with a total GC content of 36.04%. The genome encodes 133 genes, including protein-coding genes, tRNA genes, rRNA genes, and pseudogenized ycf1 – a conserved Zingiberaceae feature. Leucine (10.33%) and cysteine (1.14%) are the most and least abundant amino acids, respectively. Codon usage favors A/T-rich codons (e.g., AGA [Arg], TTA [Leu]), while C/G-ending codons (e.g., AGC [Ser]) are underutilized. Hypervariable loci in non-coding regions and repeats are identified as molecular markers. Phylogenetic analysis places Z. odoriferum sister to Z. teres and Z. smilesianum (100% bootstrap support). Despite high genome-wide conservation, intergenic spacers exhibit 5–12% divergence, aiding species delineation. However, the limited cp diversity across Zingiber underscores the need for nuclear genomic integration to resolve cryptic diversity and enhance conservation strategies. This study advances genomic resources for Z. odoriferum, highlights cp genome evolutionary constraints, and provides actionable markers for biodiversity monitoring. It emphasizes the urgency of multi-omics approaches to safeguard this species and its threatened tropical habitat.
Dataset DOI: 10.5061/dryad.pk0p2nh23
Description of the data and file structure
Data source: Fresh leaf samples of Zingiber odoriferum were collected from Bogor Botanic Gardens, Indonesia. Total genomic DNA was extracted using the CTAB method and sequenced on an Illumina platform. The chloroplast genome was then assembled de novo, and the resulting sequence is provided as a FASTA file.
Data description: This dataset contains a FASTA file of the complete chloroplast genome sequence of Zingiber odoriferum.
The total length of the chloroplast genome is 163,538 bp with a typical quadripartite structure: a large single-copy (LSC) region, a small single-copy (SSC) region, and two inverted repeats (IRa and IRb). A total of 133 unique genes were annotated, including protein-coding genes, tRNAs, and rRNAs.
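As a quick consistency check on the quadripartite structure, the region lengths reported in the abstract should add up to the total genome length, with the inverted repeat counted twice. The sketch below is only illustrative: the numbers are copied from the abstract, and the function name `quadripartite_total` is a hypothetical helper, not part of the dataset.

```python
# Region lengths in bp, as reported in the abstract of this record.
LSC = 88_033             # large single-copy region
SSC = 15_885             # small single-copy region
IR = 29_810              # each of the two inverted repeats (IRa, IRb)
REPORTED_TOTAL = 163_538


def quadripartite_total(lsc, ssc, ir):
    """Total plastome length: LSC + SSC + two copies of the IR."""
    return lsc + ssc + 2 * ir


total = quadripartite_total(LSC, SSC, IR)
print(f"computed total: {total} bp")
print(f"matches reported total: {total == REPORTED_TOTAL}")
```

The same arithmetic can be reused to sanity-check region boundaries if the genome is re-annotated.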
Files and variables
File: Zingodor_cpGenome_fasta.fasta
Description: This FASTA file contains the complete chloroplast genome sequence of Zingiber odoriferum (family Zingiberaceae). The genome was de novo assembled from Illumina paired-end sequencing data. The assembled plastome has a typical quadripartite structure composed of a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeats (IRs). The sequence is circular and provided as a single continuous entry in standard FASTA format with nucleotide bases (A, T, C, G).
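A minimal sketch of how the file might be loaded and checked against the reported length (163,538 bp) and GC content (36.04%), using only the Python standard library. It assumes the file sits in the current working directory under the name listed above and contains a single record. The helper names `parse_single_fasta` and `gc_percent` are hypothetical and not part of the dataset.

```python
from pathlib import Path


def parse_single_fasta(lines):
    """Return (header, sequence) from an iterable of FASTA lines.

    Assumes exactly one record; raises ValueError otherwise.
    """
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                raise ValueError("expected a single FASTA record")
            header = line[1:]
        else:
            chunks.append(line.upper())
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


def gc_percent(seq):
    """GC content as a percentage of all bases in seq."""
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Assumed location: the FASTA file in the current working directory.
    path = Path("Zingodor_cpGenome_fasta.fasta")
    if path.exists():
        with open(path) as handle:
            header, seq = parse_single_fasta(handle)
        print(header)
        print(f"length: {len(seq)} bp (reported: 163,538 bp)")
        print(f"GC content: {gc_percent(seq):.2f}% (reported: 36.04%)")
    else:
        print(f"{path} not found; download it from the dataset first")
```

Because the sequence is circular but stored as one linear entry, the first and last bases in the file are adjacent in the actual molecule. Any analysis that scans across the sequence (for example, repeat or motif searches) may need to account for that junction.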
Access information
Other publicly accessible locations of the data:
Both the nucleotide sequence data and the associated SRA data have been submitted to NCBI and are currently awaiting release.
