
Data from: Is Hydroides dianthus (Verrill, 1873) really a Mediterranean native? Increased sampling in the eastern United States reveals enhanced genetic diversity

Cite this dataset

Davinack, Andrew (2024). Data from: Is Hydroides dianthus (Verrill, 1873) really a Mediterranean native? Increased sampling in the eastern United States reveals enhanced genetic diversity [Dataset]. Dryad.


The introduction of non-indigenous species (NIS) is a significant threat to marine biodiversity, facilitated by vectors such as shipping and aquaculture. Hydroides dianthus, a tubicolous polychaete worm, is known for its biofouling capabilities, impacting both shipping and aquaculture. Traditionally, the east coast of the United States has been considered the native range of H. dianthus. However, previous studies have suggested the Mediterranean region as the species' true native range based on higher genetic diversity. This study aims to re-evaluate the genetic diversity patterns of H. dianthus on the east coast of the United States by expanding the cytochrome c oxidase I (COI) dataset currently available for the species. Samples were collected from various locations on the east coast and analyzed using DNA barcoding. The results revealed a three-fold increase in haplotype diversity on the east coast compared to previous findings. A hierarchical AMOVA indicated significant genetic structuring between the Mediterranean and U.S. populations (ϕST = 0.51, P < 0.05). Despite a higher genetic diversity in the Mediterranean, this study highlights the variability of genetic diversity estimates and the challenges in using such metrics to delineate native ranges. Factors such as multiple introductions, genetic drift, and sampling bias can significantly alter genetic variability within populations. The findings suggest that the east coast's genetic diversity is likely underestimated and that more comprehensive data, including high-throughput genomic analyses and ecological studies, are needed to determine the native range of H. dianthus conclusively. This study underscores the complexity of using genetic data to trace the biogeography and invasion pathways of marine species.

README: Is Hydroides dianthus (Verrill, 1873) really a Mediterranean native? Increased sampling in the eastern United States reveals enhanced genetic diversity.

This dataset contains the raw DNA sequence files for the cytochrome c oxidase I (COI) gene of Hydroides dianthus collected from various localities along the eastern United States, including both the .seq files and the .ab1 trace files. A cleaned and edited DNA sequence alignment that combines these sequences with sequences from the Mediterranean Sea (obtained from Sun et al. (2017) Mar Biol 164: 28) is also provided, along with the Python script used to remove gaps in the alignment to produce the cleaned sequence file. The results of the analyses were submitted to the journal Marine Biology Research (submission ID: 240351363).

Description of the data and file structure

The .ab1 files can be used to examine sequence quality, while the .seq files can be reused for other phylogenetic or phylogeographic analyses. The cleaned, aligned sequence file can be used as is or parsed for population genetic analyses. The Python script can simply be copied and pasted into a Python environment and run (assuming the Biopython package is installed).
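As an illustration of parsing the aligned file for population genetic analyses, the following is a minimal sketch that computes Nei's haplotype diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies), from FASTA-formatted text. It assumes the alignment is in FASTA format and treats any character difference (including gaps or ambiguity codes) as a distinct haplotype. The helper names and the toy sequences are hypothetical and are not taken from the dataset; the published analyses may have used different software.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def haplotype_diversity(sequences):
    """Nei's haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(sequences)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Toy alignment (made-up sequences, not from the dataset)
toy = """>ind1
ACGTACGT
>ind2
ACGTACGT
>ind3
ACGAACGT
>ind4
ACGTACGA
"""
records = read_fasta(toy)
h = haplotype_diversity(list(records.values()))
print(f"{len(set(records.values()))} haplotypes, h = {h:.3f}")
```

To apply this to the real alignment, the file contents would be read with `open(path).read()` and passed to `read_fasta`; sequences would typically be grouped by region (for example, U.S. versus Mediterranean) before computing h for each group.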


The Python script was written using the Biopython package and executed in the Spyder IDE distributed with Anaconda.
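For readers who want a sense of what gap removal involves before opening the provided script, here is a minimal sketch. It is not the author's script: it uses only the Python standard library rather than Biopython, and it assumes that "removing gaps" means dropping every alignment column in which at least one sequence has a gap character ("-"). The function name and the toy sequences are hypothetical.

```python
def remove_gap_columns(alignment, gap_chars="-"):
    """Drop every alignment column in which any sequence has a gap.

    `alignment` maps sequence names to equal-length aligned strings.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Indices of columns that are gap-free across all sequences
    keep = [i for i in range(length)
            if all(s[i] not in gap_chars for s in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical names and sequences)
toy = {
    "US_01": "ACG-TACGT",
    "US_02": "ACGATACG-",
    "MED_01": "ACGATACGT",
}
cleaned = remove_gap_columns(toy)
for name, seq in cleaned.items():
    print(name, seq)
```

Dropping whole columns keeps all sequences the same length, which most population genetic tools require. If the provided script instead strips gap characters from each sequence individually, its output will differ from this sketch.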


A section of the posterior end of each worm (~1 mm) was removed and digested in a Proteinase K and lysis buffer solution (Qiagen, Hilden, Germany). Genomic DNA was extracted using the DNeasy Blood and Tissue Kit following the manufacturer’s protocol (Qiagen, Hilden, Germany). DNA quality of all samples was checked on a NanoDrop (Thermo Fisher); concentrations ranged from 10 to 240 ng/µl.

We attempted to amplify a fragment of the cytochrome c oxidase I (COI) gene using the species-specific primers and PCR protocols of Sun et al. (2017), but numerous attempts were unsuccessful, even after optimizing cycling conditions using a gradient PCR. Consequently, a new forward and reverse primer pair was designed for Hydroides dianthus: DavF: 5’-GCCTGTATTGATTGGTGGTTTC-3’ and DavR: 5’-AAAGCAACAAAGTTGTCACC-3’. For the PCR master mix, the following reagents were used per reaction: 1 µl gDNA, 1 µl each of forward and reverse primer (5 µM each), 0.5 µl of dNTP (10 mM), 0.1 µl of KAPA2G Fast DNA polymerase, 5 µl of 5X KAPA2G Buffer A, and 11.4 µl of deionized water. PCR conditions were as follows: one cycle at 95°C for 5 min; five cycles of 10 s at 95°C, 15 s at 58°C, and 15 s at 72°C; 35 cycles of 10 s at 95°C, 15 s at 55°C, and 15 s at 72°C; and a final cycle of 5 min at 72°C. PCR products were sequenced directly by Azenta LLC (Plainfield, NJ) using both forward and reverse primers and BigDye Terminator Cycle Sequencing.

All sequences obtained were verified using the BLASTn tool on the NCBI database and then translated using the ExPASy online translation tool to determine gene functionality. Despite high-quality chromatograms and successful alignment with existing GenBank sequences for H. dianthus, many of the full-length sequences contained frameshifts and/or stop codons, even after several resequencing efforts, and this prevented their direct submission to the GenBank database. To resolve this, we used the NCBI ORF Finder to identify open reading frames (ORFs) within our sequences. The identified ORFs, typically around 153 bp in length, did not contain frameshifts or stop codons and were submitted to GenBank.
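The idea of locating a stretch of sequence free of stop codons can be sketched as follows. This is a simplified illustration, not a reimplementation of NCBI ORF Finder: it scans only the three forward reading frames, does not require a start codon, and applies no minimum length. The stop codon set (TAA and TAG) assumes the invertebrate mitochondrial genetic code (NCBI translation table 5); the source does not state which genetic code was selected. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG"}  # assumption: NCBI translation table 5


def longest_stop_free_stretch(seq, stops=STOP_CODONS):
    """Return (frame, start, end) of the longest stop-free codon run.

    Scans the three forward reading frames only; `start` and `end` are
    0-based, end-exclusive nucleotide coordinates.
    """
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in stops:
                # A stop codon ends the current run
                if pos - run_start > best[2] - best[1]:
                    best = (frame, run_start, pos)
                run_start = pos + 3
            pos += 3
        # Close the final run at the end of the last complete codon
        if pos - run_start > best[2] - best[1]:
            best = (frame, run_start, pos)
    return best


# Toy sequence (made up): frame 0 is interrupted by TAA at position 6
toy_seq = "ATGGCCTAAGGTTGATTTGGC"
frame, start, end = longest_stop_free_stretch(toy_seq)
print(f"frame {frame}: {toy_seq[start:end]} ({end - start} bp)")
```

A frameshift caused by a sequencing indel shows up in this kind of scan as the longest stop-free run switching from one frame to another partway along the sequence, which is consistent with recovering only short clean ORFs from otherwise full-length reads.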

For transparency and reproducibility, the full-length sequences, including aligned and trimmed data, and the raw trace files are provided here with their respective GenBank accession identifiers. The Python script used to remove gaps from the alignment is also provided.