Dryad

Data from: A phased chromosome-level genome of the annelid tubeworm Galeolaria caespitosa

Data files

Sep 11, 2025 version files 511.52 MB

Abstract

Haplotype-resolved (phased) genome assemblies are emerging as important assets for genomic studies of species with high heterozygosity, but remain lacking for key animal lineages. Here, we use PacBio HiFi and Omni-C technologies to assemble the first phased, annotated, chromosome-level genome for any annelid: the reef-building tubeworm Galeolaria caespitosa (Serpulidae). The assembly is 803.5 Mbp long (scaffold N50 = 76.5 Mbp) for haplotype 1 and 789.3 Mbp long (scaffold N50 = 75.4 Mbp) for haplotype 2. Both haplotypes are arranged into 11 pairs of chromosomes, with no sign of sex chromosomes. By comparison, cytological analyses report 12–13 pairs in G. caespitosa’s closest relatives, including species that are protandrous hermaphrodites. We combined long-read and short-read transcriptome sequencing to annotate both haplotypes, resulting in 30,495 predicted proteins for haplotype 1 and 27,423 for haplotype 2, with 79.5% of proteins having at least one functional annotation. We also assembled a mitochondrial genome 23 Kbp long, annotating all genes typically found in mitochondrial DNA apart from those encoding the 16S ribosomal subunit (rrnL) and the protein atp8, a short, fast-evolving mitochondrial gene that is also missing in other metazoans. Comparing G. caespitosa’s genome to those of three other annelids reveals limited collinearity even though 36.0% of orthologous gene clusters are shared (4,238 of the 11,763 clusters counted in G. caespitosa), suggesting extensive chromosomal rearrangements among lineages. New high-quality annelid genomes may help resolve the genetic and evolutionary basis of this diversity.
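
For readers who want to check summary statistics like those quoted above, the following is a minimal sketch in Python of how a scaffold N50 is conventionally computed and how the shared-cluster percentage follows from the two counts in the abstract. The function name n50 and the toy scaffold lengths are hypothetical and are not taken from the dataset. Actual lengths would have to be read from the assembly files, and the authors' own tooling is not described here.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mbp, for illustration only
# (NOT the real G. caespitosa scaffolds).
toy_lengths = [90, 80, 70, 60, 50, 10, 5]
toy_n50 = n50(toy_lengths)

# Share of orthologous gene clusters, using the counts from the abstract.
shared_pct = round(100 * 4238 / 11763, 1)

print(f"toy N50 = {toy_n50} Mbp")
print(f"shared clusters = {shared_pct}%")
```

With real data, the lengths list would hold one entry per scaffold in a haplotype assembly, in base pairs.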