
Phylogenetics identifies two eumetazoan TRPM clades and an eighth TRP family, TRP soromelastatin (TRPS)

Cite this dataset

Himmel, Nathaniel J; Gray, Thomas R; Cox, Daniel N (2020). Phylogenetics identifies two eumetazoan TRPM clades and an eighth TRP family, TRP soromelastatin (TRPS) [Dataset]. Dryad. https://doi.org/10.5061/dryad.kwh70rz03

Abstract

Transient receptor potential melastatins (TRPMs) are best known as cold and menthol sensors, but are in fact broadly critical for life, from ion homeostasis to reproduction. Yet the evolutionary relationships among TRPM channels remain largely unresolved, particularly with respect to the placement of several highly divergent members. To characterize the evolution of TRPM and TRPM-like channels, we performed a large-scale phylogenetic analysis of >1,300 TRPM-like sequences from 14 phyla (Annelida, Arthropoda, Brachiopoda, Chordata, Cnidaria, Echinodermata, Hemichordata, Mollusca, Nematoda, Nemertea, Phoronida, Priapulida, Tardigrada, and Xenacoelomorpha), including sequences from a variety of recently sequenced genomes that fill what would otherwise be substantial taxonomic gaps. Our findings suggest that: 1) the previously recognized TRPM family is in fact two distinct families, comprising canonical TRPM channels and an eighth major, previously undescribed family of animal TRP channels, TRP soromelastatin; 2) two TRPM clades predate the last bilaterian–cnidarian ancestor; and 3) the vertebrate-centric trend of categorizing TRPM channels as 1–8 is inappropriate for most phyla, including other chordates.

Methods

Data Collection & Curation

Starting with previously characterized TRPM sequences from human (NCBI CCDS), mouse (NCBI CCDS), Drosophila melanogaster (FlyBase), and Caenorhabditis elegans (WormBase), we assembled a TRPM-like protein sequence database by running BLASTp against the NCBI collections of non-redundant protein sequences, with D. melanogaster Trpm (isoform RE, FlyBase ID: FBtr0339077) serving as the bait sequence. Only BLAST hits longer than 300 amino acids with an E-value below 1E-30 were retained. Because we were interested in the origins of TRPM channels and in less-studied taxa, only three tetrapod sequence sets were included: human, mouse, and chicken.
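The retention rule for BLAST hits can be illustrated with a minimal Python sketch. This is not the authors' code: the `BlastHit` record, its field names, and the `retain_hits` helper are hypothetical, and the sketch assumes that "length" refers to the length of the hit (subject) sequence rather than the alignment length. Only the two thresholds (more than 300 amino acids, E-value below 1E-30) come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds stated in the methods text.
MIN_LENGTH_AA = 300      # hits must be longer than this
MAX_EVALUE = 1e-30       # hits must have an E-value below this


@dataclass(frozen=True)
class BlastHit:
    """Hypothetical record for one BLAST hit (field names are assumptions)."""
    subject_id: str
    subject_length: int   # assumed: length of the hit sequence in amino acids
    evalue: float


def retain_hits(hits: Iterable[BlastHit]) -> List[BlastHit]:
    """Keep hits that satisfy both the length and the E-value criterion."""
    return [
        hit for hit in hits
        if hit.subject_length > MIN_LENGTH_AA and hit.evalue < MAX_EVALUE
    ]


# Made-up example records, for illustration only.
example_hits = [
    BlastHit("seq_long_strong", 1200, 1e-80),   # passes both criteria
    BlastHit("seq_short_strong", 250, 1e-60),   # too short
    BlastHit("seq_long_weak", 900, 1e-10),      # E-value too high
    BlastHit("seq_boundary", 300, 1e-40),       # exactly 300 aa, not longer
]
kept_ids = [hit.subject_id for hit in retain_hits(example_hits)]
```

Both comparisons are strict, matching the wording ">300 amino acids" and "less than 1E-30"; a sequence of exactly 300 residues is therefore excluded in this sketch.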

To expand the taxa sampled, tBLASTn and BLASTp were used to search genomically informed gene models for 11 cnidarians, 2 xenacoelomorphs, 1 hemichordate, 1 nemertean, 1 phoronid, 2 agnathans, and 4 chondrichthyans (Table S1).

We used several methods to validate and improve the quality of the initial database. First, CD-HIT (threshold 90% similarity) was used to identify and remove duplicate sequences and predicted isoforms, retaining the longest isoform (27-29). Phobius was then used to predict transmembrane topology (30, 31); sequences without at least 6 predicted transmembrane (TM) segments were removed. Sequences with more than 6 predicted TM segments were analyzed via InterProScan (32), and those with more than 1 ion-transport domain were removed. More than 90% of the remaining sequences contained a highly conserved glycine residue in the predicted TM domain (corresponding to D. melanogaster G-1049); the vast majority of those missing this residue had large gaps in an initial alignment and were subsequently removed.
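The topology-based part of this curation can be sketched as a simple decision rule. The example below is an illustration under stated assumptions, not the authors' pipeline: the `Candidate` record, its field names, and `passes_topology_filter` are hypothetical, and the TM-segment and ion-transport-domain counts are assumed to have been extracted beforehand from Phobius and InterProScan output, respectively.

```python
from dataclasses import dataclass

MIN_TM_SEGMENTS = 6           # from the text: at least 6 predicted TM segments
MAX_ION_TRANSPORT_DOMAINS = 1 # from the text: more than 1 leads to removal


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-sequence summary (field names are assumptions)."""
    seq_id: str
    tm_segments: int            # assumed: count parsed from Phobius predictions
    ion_transport_domains: int  # assumed: count parsed from InterProScan hits


def passes_topology_filter(candidate: Candidate) -> bool:
    """Apply the two topology rules described in the methods text."""
    # Rule 1: fewer than 6 predicted TM segments -> removed.
    if candidate.tm_segments < MIN_TM_SEGMENTS:
        return False
    # Rule 2: only sequences with MORE than 6 TM segments are checked for
    # ion-transport domains; more than one such domain -> removed.
    if (candidate.tm_segments > MIN_TM_SEGMENTS
            and candidate.ion_transport_domains > MAX_ION_TRANSPORT_DOMAINS):
        return False
    return True


# Made-up example records, for illustration only.
candidates = [
    Candidate("typical_channel", 6, 1),   # kept
    Candidate("truncated_model", 4, 1),   # removed: too few TM segments
    Candidate("possible_fusion", 12, 2),  # removed: two ion-transport domains
    Candidate("extra_tm_single", 7, 1),   # kept: extra TM, one domain
]
kept = [c.seq_id for c in candidates if passes_topology_filter(c)]
```

The second rule plausibly guards against gene models that fuse two adjacent channel genes, although the text does not state this rationale explicitly.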

Searches for TRPS (ced-11-like), TRPN, and TRPC sequences followed the same protocol. For TRPS, sequences from Caenorhabditis elegans, Strigamia maritima, and Octopus vulgaris were used as bait. For the TRPN and TRPC datasets, Drosophila melanogaster nompC (isoform PA, FlyBase ID: FBpp0084879) and Trp (isoform PA, FlyBase ID: FBpp0084879) served as bait sequences, respectively.

Funding

National Institute of Neurological Disorders and Stroke, Award: R01NS115209

National Institute of General Medical Sciences, Award: R25GM109442-01A1