Molecular phylogenetics of sessile Dolium sedentarium, a petalomonad euglenid
Data files
Aug 28, 2023 version (files: 299.30 MB)
- cell_images.zip
- Dolium_sedentarium_SMS11_transcripts.fasta
- Dolium_sedentarium_SMS11_transcripts.pep
- multigene.zip
- multiqc_report.html
- multiqc-data.zip
- README.md
- SSU.zip
Abstract
The euglenids are a species-rich group of flagellates with varying modes of nutrition that can be found in diverse habitats. Phagotrophic members of this group gave rise to phototrophs and hold the key to understanding the evolution of euglenids as a whole, including the evolution of complex morphological characters like the euglenid pellicle. Yet to understand the evolution of these characters, a comprehensive sampling of molecular data is needed to correlate morphological and molecular data and to estimate a basic phylogenetic backbone of the group. While the availability of SSU rDNA and, more recently, multigene data from phagotrophic euglenids has improved, several ‘orphan’ taxa remain without any molecular data whatsoever. Dolium sedentarium is one such taxon: it is a rarely observed phagotrophic euglenid that inhabits tropical benthic environments and is one of the few known sessile euglenids. Based on morphological characters, it has been considered part of the earliest branch of euglenids, the Petalomonadida. We report the first molecular sequencing data for Dolium, obtained using single-cell transcriptomics. Both SSU rDNA and multigene phylogenies confirm it as a solitary branch within Petalomonadida.
Methods
A single cell of Dolium sedentarium was collected from the Spaanse Water mangrove in Curaçao (12.070621, -68.860269) in April 2022 via manual isolation with a micropipette, imaged at 630x on a Leica DLIM inverted microscope with a Sony a7RIII camera, and deposited in lysis buffer (see Picelli et al. 2014). After 4 freeze-thaw cycles for lysis, a single-cell transcriptome was generated with the SmartSeq2 protocol (Picelli et al. 2014) and sequenced on an Illumina MiSeq (2x250 bp), multiplexed with other libraries.
Raw reads were error-corrected with Rcorrector (version 1.0.4), adapter- and quality-trimmed with Trimmomatic (version 0.39), and assembled with rnaSPAdes (version 3.15.1). Coding regions were predicted with TransDecoder (version 5.5.0).
Additional assembly metrics shown in the MultiQC report were generated with QUAST (version 5.0.2) and BUSCO (version 5.4.3).
SSU rDNA sequences were extracted from the assembly with barrnap (version 0.9) and searched with BLAST against the NCBI nr/nt database (GenBank) to identify the euglenid SSU rDNA. This sequence was then appended to an existing dataset (Lax et al. 2023), aligned with MAFFT E-INS-i (version 7.481), and trimmed with trimAl (version 1.2rev59, -gt 0.5 -st 0.001). A phylogeny was estimated with RAxML-NG (version 1.1.0) under the GTR+GAMMA model with 1000 non-parametric bootstraps.
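The gap-threshold part of the trimming step above can be illustrated with a short Python sketch. This is not trimAl itself: it only mimics the idea behind `-gt 0.5` (keep a column when at least half of the sequences have a residue there), ignores the `-st` similarity threshold entirely, and may differ from trimAl in edge cases. The function name and the toy alignment are hypothetical.

```python
def filter_columns_by_gap_threshold(alignment, min_ungapped_fraction=0.5, gap_chars="-"):
    """Keep columns where the fraction of sequences without a gap is at least
    min_ungapped_fraction. Illustrative stand-in for a gap-threshold filter;
    this is NOT trimAl and does not apply any similarity threshold.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same aligned length")

    kept_columns = []
    for col in range(length):
        ungapped = sum(1 for seq in seqs if seq[col] not in gap_chars)
        if ungapped / len(seqs) >= min_ungapped_fraction:
            kept_columns.append(col)

    return {
        name: "".join(seq[col] for col in kept_columns)
        for name, seq in zip(names, seqs)
    }


# Hypothetical toy alignment: the third column is a gap in 3 of 4 sequences.
toy = {"a": "AC-T", "b": "A--T", "c": "ACGT", "d": "A--T"}
trimmed = filter_columns_by_gap_threshold(toy, min_ungapped_fraction=0.5)
```

In this toy case only the third column falls below the threshold and is dropped; the second column, with exactly half of the sequences ungapped, is kept under the `>=` comparison assumed here.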
The phylogenomic dataset was based on a 19-gene dataset (Lax et al. 2023), and homologs were extracted as described previously (Lax et al. 2023). Each of the 19 genes went through 2 rounds of single-gene tree checking to exclude paralogous and contaminant sequences. After concatenation of all 19 protein alignments, 9 orthologs from Dolium sedentarium were retained. Two analyses were run on this base 19-gene, 63-taxon dataset:
1) CAT_63S19F_UFB.treefile: IQ-TREE (version 2.2.0) under LG+C60+F+G with 1,000 ultrafast bootstraps
2) CAT_63S19F_PMSF.treefile: IQ-TREE (version 2.2.0) under LG+C60+F+G+PMSF with 500 non-parametric bootstraps (using the tree from analysis 1 as the guide tree)
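The concatenation step described above, in which a taxon such as Dolium sedentarium is present in only some of the single-gene alignments, can be sketched in Python as follows. This is a minimal illustration and not the pipeline actually used for the dataset: the function name, the taxon labels, and the choice of `-` as the fill character for missing genes are all assumptions.

```python
def concatenate_alignments(gene_alignments, missing_char="-"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: list of dicts, each mapping taxon name -> aligned sequence
    for a single gene. Taxa missing from a gene are filled with missing_char
    over that gene's full length (an assumption made for this sketch).
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}

    for index, aln in enumerate(gene_alignments):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {index} is empty or not a proper alignment")
        gene_length = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing_char * gene_length))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical example: taxonA lacks the second gene.
genes = [
    {"taxonA": "MK", "taxonB": "ML"},
    {"taxonB": "AAA"},
]
supermatrix = concatenate_alignments(genes)
```

Every taxon ends up with a sequence of the same total length, which is what concatenated-alignment tree inference expects; genes a taxon lacks simply contribute gap-only blocks.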
Usage notes
Alignment files in .fasta format can be opened with any text editor (e.g. SublimeText) or with alignment viewers like AliView or SeaView.
Assembly files (.fasta and .pep) can be opened with any text editor, such as SublimeText.
Tree files in Newick format (ending in .treefile or .tre) can be opened with a tree viewer like FigTree, Archaeopteryx, or iTOL.
The MultiQC report (ending in .html) can be opened with any browser.
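Besides a text editor, the .fasta and .pep files can also be read programmatically. The following Python sketch parses FASTA-formatted text with the standard library only; the function name is hypothetical, and the in-memory example stands in for a real file such as Dolium_sedentarium_SMS11_transcripts.fasta, which could be passed in via `open(...)` instead.

```python
import io


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {identifier: sequence}.

    The identifier is the first whitespace-delimited token after '>'.
    If an identifier occurs more than once, the last record wins.
    """
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Hypothetical in-memory example; replace with open("path/to/file.fasta").
example = io.StringIO(">seq1 some description\nATG\nCCC\n>seq2\nMKV*\n")
records = read_fasta(example)
sequence_count = len(records)
```

For large assemblies, a dedicated library such as Biopython may be more convenient, but a small reader like this is enough to count sequences or look up a single transcript.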