7848 alignments with Gblocks of cetaceans

Cite this dataset

Sun, Di (2021). 7848 alignments with Gblocks of cetaceans [Dataset]. Dryad.


Cetaceans (whales, dolphins, and porpoises) have undergone a radical transformation from the typical terrestrial mammalian body plan to a streamlined one, while exhibiting dramatic inter-specific ranges in body size. However, the molecular mechanisms underlying the diversifying evolution of cetacean body size are largely unknown. Here, using genome and phenotypic data from 22 cetaceans, we investigate genome-wide gene-phenotype correlations and explore the genetic basis underlying the high diversity of body size in cetaceans. Functional enrichment analysis showed that body size-related genes in cetaceans were enriched in pathways associated with immunity, cell growth, and metabolism, suggesting potential roles for these genes in the diversifying evolution of cetacean body size. A series of genes was also found to have coevolved with body size; these genes are mainly involved in immune surveillance, tumor suppression, and the development of ‘cheater’ tumors. This in turn suggests that these genes play a role in tumor control and thus help resolve Peto’s paradox, the observation that the expansion in body size, and thereby in cell number, does not correlate with increased cancer incidence in larger whales. The present study could provide novel insights into the evolution of the great variation in body size among cetaceans.


All gene sets were aligned at the codon level using the Prank program (Löytynoja 2014) with the option ‘-codon’. All alignments have been deposited at Dryad. After the alignments were generated, the Gblocks program (Castresana 2000) was used to trim potentially unreliable regions and gap regions. The parameters were relatively strict while still retaining as many bases as possible, with the sequence type set to codon (‘-t=c -b1=5 -b2=6 -b3=8 -b4=5 -b5’). The *htm files show the trimming details for each gene.
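
Because the alignments were built and trimmed at the codon level, a reader reusing the files may want to confirm that each trimmed alignment is still codon-consistent before running downstream analyses. The following is a minimal sketch of such a check, not part of the original workflow. It assumes the deposited alignments are FASTA-formatted, that sequence lines may contain spaces (which are stripped), and that a codon-level alignment should have equal-length sequences whose length is a multiple of 3. The function names and the example records are hypothetical.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into {name: sequence}, removing whitespace inside sequences."""
    parts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            parts[name] = []
        elif name is not None:
            # Drop any spaces that break the sequence into blocks.
            parts[name].append("".join(line.split()))
    return {n: "".join(chunks) for n, chunks in parts.items()}


def check_codon_alignment(records):
    """Return a list of problems; an empty list means no problem was detected."""
    problems = []
    if not records:
        return ["no sequences found"]
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        problems.append("sequences differ in length")
    if any(n % 3 for n in lengths):
        problems.append("length is not a multiple of 3")
    return problems


# Hypothetical two-sequence alignment standing in for one trimmed gene file.
example = StringIO(">speciesA\nATGGCT AAA\n>speciesB\nATG---AAA\n")
aln = read_fasta(example)
problems = check_codon_alignment(aln)
```

To apply the check to a downloaded file, open it with `open(path)` and pass the file object to `read_fasta` in place of the `StringIO` example.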