Alignments used for phylogenetic analysis of capirona (Calycophyllum spruceanum, Rubiaceae)
Data files
Sep 24, 2023 version files 4.84 MB
- aligned_mafft_allfastafiles.phy
- README.md
Abstract
Capirona (Calycophyllum spruceanum Benth.) belongs to subfamily Ixoroideae, one of the major lineages of the family Rubiaceae. It is an important timber tree that originated in the Amazon Basin and is widely distributed in Bolivia, Peru, Colombia, and Brazil. In this study, we obtained the first complete chloroplast (cp) genome of capirona, from the department of Madre de Dios in the Peruvian Amazon. High-quality genomic DNA was used to construct libraries. Paired-end clean reads were obtained from a PE150 library sequenced on the Illumina HiSeq 2500 platform. The complete cp genome of C. spruceanum is 154,480 bp in length and has a typical quadripartite structure, containing a large single-copy (LSC) region (84,813 bp) and a small single-copy (SSC) region (18,101 bp), separated by two inverted repeat (IR) regions (25,783 bp each). Annotation of the C. spruceanum cp genome predicted 87 protein-coding genes (CDS), eight ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and one pseudogene. A total of 41 simple sequence repeats (SSRs) were identified in this cp genome: 29 mononucleotides, 5 dinucleotides, 3 trinucleotides, and 4 tetranucleotides. Most of these repeats were located in noncoding regions. Whole chloroplast genome comparison with six other Ixoroideae species revealed that the SSC and LSC regions were more divergent than the IR regions. Finally, phylogenetic analysis resolved C. spruceanum as sister to Emmenopterys henryi and confirmed its position within subfamily Ixoroideae. This study reports for the first time the genome organization, gene content, and structural features of the chloroplast genome of C. spruceanum, providing valuable information for genetic and evolutionary studies in the genus Calycophyllum and beyond.
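The region lengths and SSR counts reported above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch: it takes 25,783 bp as the length of each of the two IR copies, and the variable names are illustrative.

```python
# Region lengths (bp) as reported in the abstract.
LSC = 84_813
SSC = 18_101
IR = 25_783  # taken as the length of EACH inverted repeat copy
REPORTED_TOTAL = 154_480

# A quadripartite plastome consists of LSC + SSC + two IR copies.
computed_total = LSC + SSC + 2 * IR

# SSR counts by motif length, as reported in the abstract.
ssr_counts = {"mono": 29, "di": 5, "tri": 3, "tetra": 4}
ssr_total = sum(ssr_counts.values())

print("genome length consistent:", computed_total == REPORTED_TOTAL)
print("SSR total:", ssr_total)
```

Both checks agree with the reported figures (154,480 bp and 41 SSRs).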
README
This file contains the complete chloroplast sequence alignment of 19 species that belong to the family Rubiaceae.
Lonicera hispida (Caprifoliaceae) was included as an outgroup.
In this work, we sequenced the complete chloroplast genome of Calycophyllum spruceanum.
The overall length of the C. spruceanum chloroplast genome is 154,480 bp, and it exhibits the circular quadripartite structure characteristic of most angiosperms. After annotation and modification, the entire chloroplast (cp) genome sequence was submitted to the GenBank database under accession number OK326865 (https://www.ncbi.nlm.nih.gov/nuccore/OK326865.1/). The associated BioProject, BioSample, and SRA numbers are PRJNA760977, SAMN21240132, and SRR15725575, respectively.
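As a starting point for inspecting the alignment, the sketch below parses PHYLIP text and reports the gap fraction per sequence. It assumes the sequential "relaxed" PHYLIP layout (a header with the number of taxa and sites, then one `name sequence` record per line); the actual layout of `aligned_mafft_allfastafiles.phy` is not stated here, so this is an assumption. The function names and the toy data are hypothetical and are not taken from the deposited file.

```python
def parse_relaxed_phylip(text):
    """Parse sequential relaxed PHYLIP text.

    Expects a 'ntax nchar' header followed by one 'name sequence'
    record per line. Returns (names, sequences).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(x) for x in lines[0].split())
    names, seqs = [], []
    for ln in lines[1:]:
        name, seq = ln.split(None, 1)      # name, then the rest of the line
        names.append(name)
        seqs.append("".join(seq.split()))  # drop any spacing inside the sequence
    if len(names) != ntax:
        raise ValueError(f"header says {ntax} taxa, found {len(names)}")
    for name, seq in zip(names, seqs):
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} sites, found {len(seq)}")
    return names, seqs


def gap_fraction(seq):
    """Fraction of alignment columns that are gaps ('-') in one sequence."""
    return seq.count("-") / len(seq)


# Toy alignment (made-up data for illustration only).
toy = """3 8
taxon_A ACGT-ACG
taxon_B ACGTTACG
taxon_C AC--TACG
"""
names, seqs = parse_relaxed_phylip(toy)
for name, seq in zip(names, seqs):
    print(name, len(seq), gap_fraction(seq))
```

To try it on the deposited file, pass the file contents (for example `open("aligned_mafft_allfastafiles.phy").read()`) to `parse_relaxed_phylip`. If the file turns out to be interleaved rather than sequential, the function raises `ValueError`, and a different reader is needed, such as Biopython's `AlignIO`.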
Methods
A single capirona tree was selected for sequencing at the San Bernardo Research Station of INIA, located in the department of Madre de Dios (2°41´8.66" N / 69°22´49.8" E / 227.2 m.a.s.l.) in the Peruvian Amazon. A flowering branch was collected and deposited in the Scientific Collection of the Herbarium of the Universidad Nacional Mayor de San Marcos (UNMSM) under voucher number 324323. Total genomic DNA was extracted from fresh leaves using the CTAB method with minor modifications; its quality was evaluated on a 1% agarose gel, and it was quantified by fluorescence.
DNA Sequencing and Genome Assembly
High-quality genomic DNA was used to construct libraries. Paired-end clean reads were obtained from a PE150 library sequenced on the Illumina HiSeq 2500 platform. Adapters and low-quality reads were removed using Trim Galore. The clean data were used to assemble the chloroplast genome with the GetOrganelle v1.7.2 pipeline, which employed SPAdes v3.11.1, bowtie2 v2.4.2, and BLAST+ v2.11.
To gain insight into the phylogenetic position of Calycophyllum spruceanum, a maximum-likelihood (ML) tree was constructed with 1,000 nonparametric bootstrap replicates using RAxML v8.2.11 under the GTR+GAMMA nucleotide substitution model. The complete chloroplast genome of C. spruceanum was aligned with 19 other chloroplast genomes obtained from GenBank using MAFFT v7.475. Seven species from Rubioideae, five species from Cinchonoideae, and six species from Ixoroideae were included in the analysis. Lonicera hispida (Caprifoliaceae) was included as an outgroup.
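The exact RAxML command line is not given here, so the following is only a sketch of one common way to run a GTR+GAMMA analysis with 1,000 standard (nonparametric) bootstrap replicates in RAxML 8. It builds the commands without running them. The seed, the run names, and the outgroup label `Lonicera_hispida` are hypothetical; the outgroup label must match the sequence name actually used in the alignment file.

```python
import shlex


def raxml_bootstrap_commands(alignment, run_name, outgroup,
                             seed=12345, replicates=1000):
    """Build three RAxML 8 commands: ML search, bootstrap, support mapping."""
    common = ["raxmlHPC", "-s", alignment, "-m", "GTRGAMMA", "-o", outgroup]
    # 1) Search for the best-scoring ML tree.
    best = common + ["-p", str(seed), "-n", f"{run_name}_best"]
    # 2) Standard nonparametric bootstrap (-b seed, -# number of replicates).
    boot = common + ["-p", str(seed), "-b", str(seed),
                     "-#", str(replicates), "-n", f"{run_name}_boot"]
    # 3) Draw bootstrap support values onto the best ML tree (-f b).
    support = ["raxmlHPC", "-f", "b", "-m", "GTRGAMMA",
               "-t", f"RAxML_bestTree.{run_name}_best",
               "-z", f"RAxML_bootstrap.{run_name}_boot",
               "-n", f"{run_name}_support"]
    return [best, boot, support]


commands = raxml_bootstrap_commands(
    "aligned_mafft_allfastafiles.phy",
    run_name="capirona",           # hypothetical run name
    outgroup="Lonicera_hispida",   # hypothetical label; check the alignment
)
for cmd in commands:
    print(shlex.join(cmd))
```

Building the commands as lists keeps the arguments explicit and lets them be passed unchanged to `subprocess.run` once RAxML is installed and the labels have been checked.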