Data from: SNP discovery and gene annotation in the surf clam Mesodesma donacium

Cite this dataset

Gallardo-Escárate, Cristian; Valenzuela-Muñoz, Valentina; Núñez-Acuña, Gustavo; Haye, Pilar (2014). Data from: SNP discovery and gene annotation in the surf clam Mesodesma donacium [Dataset]. Dryad.


The main objective of this research was to identify single-nucleotide polymorphisms (SNPs) from an expressed sequence tag (EST) data set generated by 454 pyrosequencing in the surf clam Mesodesma donacium. A total of 180 159 ESTs were obtained from an M. donacium cDNA library. De novo assembly was performed using stringent calling parameters, producing 10 178 contigs and 41 765 singletons. A total of 2594 SNPs were discovered in 613 consensus sequences, a frequency of 1 SNP per 260 bp. A/G, A/T and C/T were the most abundant variants among the identified polymorphisms. We validated a total of 12 SNP loci by high-resolution melting analysis (HRMA) for annotated genes such as heat shock protein-70 and the translation elongation factor 1-alpha. Gene Ontology analysis at the molecular function level revealed that sequences with SNPs were mainly classified under protein and nucleotide binding, as well as hydrolase activity, ion binding and oxidoreductase activity. Further, biological processes such as cellular and metabolic processes, biogenesis, localization and biological regulation were frequently annotated. The most highly expressed genes were related to the mitochondrial electron transport chain, senescence-associated protein, ubiquitin and actin. Interestingly, some relevant genes related to immune response and biomineralization showed high abundance, such as tumor necrosis factor (TNF)-alpha-receptor-like protein, serine protease inhibitor, heat shock protein, aragonite-binding protein and ferritin. This study contributes relevant genes associated with functional polymorphisms and provides an overview for future genetic investigations.

Usage notes


29°54’S - 71°13’W