Data from: Predicting function from sequence in a large multifunctional toxin family
Cite this dataset
Malhotra, Anita et al. (2013). Data from: Predicting function from sequence in a large multifunctional toxin family [Dataset]. Dryad. https://doi.org/10.5061/dryad.16pg7
Venoms contain active substances with highly specific physiological effects and are increasingly being used as sources of novel diagnostic, research and treatment tools for human disease. Experimental characterisation of individual toxin activities is a severe rate-limiting step in the discovery process, and in-silico tools that allow function to be predicted from sequence information are essential. Toxins are typically members of large multifunctional families of structurally similar proteins that can have different biological activities, and minor sequence divergence can have significant consequences. Thus, existing predictive tools tend to have low accuracy. We investigated a classification model based on physico-chemical attributes that can easily be calculated from amino-acid sequences, using over 250 (mostly novel) viperid phospholipase A2 toxins. We also clustered proteins by sequence profiles, and carried out in-vitro tests for four major activities on a selection of isolated novel toxins, or crude venoms known to contain them. The majority of detected activities were consistent with predictions, in contrast to the poor performance of several existing predictive methods that we tested. Our results provide a framework for comparison of active sites among different functional sub-groups of toxins that will allow a more targeted approach for identification of potential drug leads in the future.