Data from: Morphological identification and single-cell genomics of marine diplonemids
Gawryluk, Ryan M. R. et al. (2017), Data from: Morphological identification and single-cell genomics of marine diplonemids, Dryad, Dataset, https://doi.org/10.5061/dryad.d19j0
Recent global surveys of marine biodiversity have revealed that a group of organisms known as “marine diplonemids” constitutes one of the most abundant and diverse planktonic lineages. Though discovered over a decade ago [2 and 3], their potential importance was unrecognized, and our knowledge remains restricted to a single gene amplified from environmental DNA, the 18S rRNA gene (small subunit [SSU]). Here, we use single-cell genomics (SCG) and microscopy to characterize ten marine diplonemids, isolated from a range of depths in the eastern North Pacific Ocean. Phylogenetic analysis confirms that the isolates reflect the entire range of marine diplonemid diversity, and comparisons to environmental SSU surveys show that sequences from the isolates range from rare to superabundant, including the single most common marine diplonemid known. SCG generated a total of ∼915 Mbp of assembled sequence across all ten cells and ∼4,000 protein-coding genes with homologs in the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology database, distributed across categories expected for heterotrophic protists. Models of highly conserved genes indicate a high density of non-canonical introns, lacking conventional GT-AG splice sites. Mapping metagenomic datasets to SCG assemblies reveals virtually no overlap, suggesting that nuclear genomic diversity is too great for representative SCG data to provide meaningful phylogenetic context to metagenomic datasets. This work provides an entry point to the future identification, isolation, and cultivation of these elusive yet ecologically important cells. The high density of nonconventional introns, however, also portends difficulty in generating accurate gene models and highlights the need for the establishment of stable cultures and transcriptomic analyses.
Off of California coastline