Cophylogeny reconstruction allowing for multiple associations through approximate Bayesian computation
Sinaimeri, Blerina; Urbini, Laura; Sagot, Marie-France; Matias, Catherine (2022), Cophylogeny reconstruction allowing for multiple associations through approximate Bayesian computation, Dryad, Dataset, https://doi.org/10.5061/dryad.5x69p8d6v
The most widely used method in studies of the coevolution of hosts and symbionts is currently phylogenetic tree reconciliation. A crucial issue with this method is that, from a biological point of view, reasonable cost values for an event-based parsimonious reconciliation are not easily chosen. Different approaches have been developed to infer such cost values for a given pair of host and symbiont trees. However, a major limitation of these approaches is their inability to model the invasion of different host species by the same symbiont species (referred to as a spread event), which is thought to happen in symbiotic relations. For example, the same species of insect may pollinate different species of plants. This results in multiple associations being observed between symbionts and their hosts (meaning that a symbiont is no longer specific to a single host), which are not compatible with current methods for studying coevolution. In this paper, we propose a method called AmoCoala (a more realistic version of a previous tool called Coala) which, for a given pair of host and symbiont trees, estimates the probabilities of the cophylogeny events in the presence of spread events, relying on an approximate Bayesian computation (ABC) approach. By including spread events, the proposed algorithm takes multiple associations into account more accurately, which gives more confidence in the estimated sets of costs and thus in the reconciliation of a given pair of host and symbiont trees. Because it builds on Coala, it can estimate the probabilities of the events even for large datasets. We evaluate our method on synthetic and real datasets.
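To make the ABC idea concrete, the following is a minimal sketch of generic ABC rejection sampling in Python. It is not the AmoCoala algorithm: the tree-based simulation and comparison are replaced here by a toy simulator that draws event types and compares their frequencies with observed ones. All function names, the flat prior, the L1 distance, and the example numbers are assumptions made for illustration only.

```python
import random


def sample_prior(n_types, rng):
    """Draw a probability vector uniformly from the simplex (flat Dirichlet prior)."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n_types)]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_summary(probs, n_events, rng):
    """Toy simulator (assumption): draw event types, return their observed frequencies."""
    events = rng.choices(range(len(probs)), weights=probs, k=n_events)
    return [events.count(i) / n_events for i in range(len(probs))]


def abc_rejection(observed, n_events, n_draws, tolerance, seed=0):
    """Keep the candidate probability vectors whose simulated summary
    lies within `tolerance` (L1 distance) of the observed summary."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        candidate = sample_prior(len(observed), rng)
        simulated = simulate_summary(candidate, n_events, rng)
        distance = sum(abs(s - o) for s, o in zip(simulated, observed))
        if distance <= tolerance:
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    # Hypothetical observed frequencies for four event types (made-up numbers).
    observed = [0.4, 0.3, 0.2, 0.1]
    kept = abc_rejection(observed, n_events=50, n_draws=2000, tolerance=0.3)
    print(f"accepted {len(kept)} of 2000 candidate vectors")
```

The accepted vectors approximate a posterior over the event probabilities; a smaller tolerance gives a closer approximation at the price of fewer accepted draws.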
Four biological datasets were extracted from the literature and are provided as NEXUS files.
AP - Acacia & Pseudomyrmex. This dataset was extracted from Gomez-Acevedo et al. Neotropical mutualism between Acacia and Pseudomyrmex: Phylogeny and divergence times. Molecular Phylogenetics and Evolution, 2010. It describes the interaction between Acacia plants and Pseudomyrmex, a genus of ants. The host and symbiont trees include 9 and 7 leaves, respectively. The dataset has 22 multiple associations.
MP - Myrmica & Phengaris. This dataset was extracted from Jansen et al. A phylogenetic test of the parasite-host associations between Maculinea butterflies (Lepidoptera: Lycaenidae) and Myrmica ants (Hymenoptera: Formicidae). European Journal of Entomology, 2011. It consists of a pair of host and symbiont trees, each with 8 leaves. The dataset has 8 multiple associations.
SBL - Seabirds & Lice. This dataset was extracted from Paterson et al. Host-parasite co-speciation, host switching, and missing the boat. In D. H. Clayton and J. Moore, editors, Host-parasite evolution: General principles and avian models, Oxford University Press, 1997. The host and symbiont trees include 15 and 8 leaves, respectively. The dataset has 15 multiple associations.
SFC - Smut Fungi & Caryophyllaceous plants. This dataset was extracted from Refrégier et al. Cophylogeny of the anther smut fungi and their caryophyllaceous hosts: Prevalence of host shifts and importance of delimiting parasite species for inferring cospeciation. BMC Evolutionary Biology, 2008. The host and symbiont trees include 15 and 16 leaves, respectively. The dataset has 4 multiple associations.
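As an illustration of what a multiple association is, the short Python sketch below finds the symbionts linked to more than one host in a list of (symbiont, host) pairs. The taxon names are placeholders, not taken from the NEXUS files, and the helper names are hypothetical. The exact convention used to obtain the counts reported above is not stated here, so this sketch only identifies the symbionts concerned and does not try to reproduce those numbers.

```python
from collections import defaultdict


def hosts_per_symbiont(associations):
    """Group (symbiont, host) pairs into a symbiont -> set of hosts mapping."""
    hosts = defaultdict(set)
    for symbiont, host in associations:
        hosts[symbiont].add(host)
    return dict(hosts)


def symbionts_with_multiple_hosts(associations):
    """Return only the symbionts associated with more than one host."""
    grouped = hosts_per_symbiont(associations)
    return {s: h for s, h in grouped.items() if len(h) > 1}


if __name__ == "__main__":
    # Placeholder associations (assumption): s1 is found on two hosts.
    pairs = [("s1", "h1"), ("s1", "h2"), ("s2", "h3"), ("s3", "h3")]
    for symbiont, hosts in symbionts_with_multiple_hosts(pairs).items():
        print(symbiont, sorted(hosts))
```

Note that two symbionts sharing one host (s2 and s3 here) is not a multiple association in this sense; only a symbiont found on several hosts is.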
The documentation for using the Java code of AmoCoala can be found at https://sites.google.com/view/blerinasinaimeri/software/amocoala/documentation