Dryad

Cophylogeny reconstruction allowing for multiple associations through approximate Bayesian computation

Cite this dataset

Sinaimeri, Blerina; Urbini, Laura; Sagot, Marie-France; Matias, Catherine (2023). Cophylogeny reconstruction allowing for multiple associations through approximate Bayesian computation [Dataset]. Dryad. https://doi.org/10.5061/dryad.5x69p8d6v

Abstract

Phylogenetic tree reconciliation is widely used to study coevolution between host and symbiont species. An important concern is the need for dependable cost values when selecting event-based parsimonious reconciliations. Although some approaches infer event probabilities specific to each pair of host and symbiont trees, which can then be converted into cost values, a significant limitation is that they cannot model the invasion of different host species by the same symbiont species (termed a spread event), which is believed to occur in symbiotic relationships. Invasions lead to multiple associations between symbionts and their hosts (a symbiont is no longer exclusive to a single host), which are incompatible with existing coevolution methods.
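As an illustration of how event probabilities might be turned into cost values, the sketch below uses the negative natural logarithm, so that more probable events receive lower costs. This mapping is an assumption made for illustration (the abstract does not state which conversion is used), and the function name and example probabilities are hypothetical.

```python
import math


def probabilities_to_costs(probs):
    """Map event probabilities to non-negative costs via cost = -ln(p).

    Assumption: the negative-log mapping is one common convention,
    not necessarily the conversion used by any particular tool.
    """
    total = sum(probs.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("event probabilities must sum to 1")
    costs = {}
    for event, p in probs.items():
        if p < 0:
            raise ValueError(f"negative probability for {event}")
        # An impossible event gets an infinite cost.
        costs[event] = math.inf if p == 0 else -math.log(p)
    return costs


# Hypothetical probabilities for the four classical events.
example_probs = {
    "cospeciation": 0.5,
    "duplication": 0.2,
    "host_switch": 0.2,
    "loss": 0.1,
}
example_costs = probabilities_to_costs(example_probs)
```

Under this mapping, a parsimonious reconciliation that minimizes total cost is also one that maximizes the product of the event probabilities, which is the usual motivation for the logarithm.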

Here, we present a method called AmoCoala (an enhanced version of the tool Coala) that provides a more realistic estimation of cophylogeny event probabilities for a given pair of host and symbiont trees, even in the presence of spread events. We expand the classical 4-event coevolutionary model to include 2 additional spread events (vertical and horizontal spreads) that lead to multiple associations. In the initial step, we estimate the probabilities of spread events using heuristic frequencies. Subsequently, in the second step, we employ an approximate Bayesian computation (ABC) approach to infer the probabilities of the remaining 4 classical events (cospeciation, duplication, host switch, and loss) based on these values.
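To make the ABC step more concrete, the following is a minimal sketch of generic ABC rejection sampling for a four-event probability vector. It is not the procedure AmoCoala implements: the flat Dirichlet prior, the toy simulator (independent event draws instead of symbiont-tree generation), the L1 distance on event frequencies, and the observed counts are all assumptions chosen to keep the example self-contained.

```python
import random

EVENTS = ("cospeciation", "duplication", "host_switch", "loss")


def sample_prior(rng):
    """Draw a probability vector from a flat Dirichlet(1, 1, 1, 1) prior."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in EVENTS]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_counts(probs, n_events, rng):
    """Toy simulator (assumption): draw n_events independent events."""
    counts = [0] * len(EVENTS)
    for idx in rng.choices(range(len(EVENTS)), weights=probs, k=n_events):
        counts[idx] += 1
    return counts


def distance(a, b):
    """L1 distance between the two vectors of event frequencies."""
    na, nb = sum(a), sum(b)
    return sum(abs(x / na - y / nb) for x, y in zip(a, b))


def abc_rejection(observed, n_draws, tolerance, seed=0):
    """Keep the prior draws whose simulated data fall within the tolerance."""
    rng = random.Random(seed)
    n_events = sum(observed)
    accepted = []
    for _ in range(n_draws):
        probs = sample_prior(rng)
        simulated = simulate_counts(probs, n_events, rng)
        if distance(simulated, observed) <= tolerance:
            accepted.append(probs)
    return accepted


def posterior_mean(accepted):
    """Average the accepted vectors component by component."""
    if not accepted:
        raise ValueError("no accepted samples; increase tolerance or n_draws")
    n = len(accepted)
    return [sum(v[i] for v in accepted) / n for i in range(len(EVENTS))]


# Hypothetical observed event counts, in the order given by EVENTS.
observed_counts = [12, 3, 4, 6]
accepted = abc_rejection(observed_counts, n_draws=5000, tolerance=0.2, seed=1)
```

The accepted vectors approximate the posterior distribution of the event probabilities; a smaller tolerance gives a closer approximation at the price of fewer accepted samples.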

By incorporating spread events, our reconciliation model enables a more accurate consideration of multiple associations. This improvement enhances the precision of estimated cost sets, paving the way to a more reliable reconciliation of host and symbiont trees. To validate our method, we conducted experiments on synthetic datasets and demonstrated its efficacy using real-world examples. Our results show that AmoCoala produces biologically plausible reconciliation scenarios, further emphasizing its effectiveness. The software is accessible at https://github.com/sinaimeri/AmoCoala.

Methods

Four biological datasets were extracted from the literature and are provided as NEXUS files.

AP - Acacia & Pseudomyrmex. This dataset was extracted from Gomez-Acevedo et al. Neotropical mutualism between Acacia and Pseudomyrmex: Phylogeny and divergence times. Molecular Phylogenetics and Evolution, 2010. It displays the interaction between Acacia plants and Pseudomyrmex, a genus of ants. The host and symbiont trees include 9 and 7 leaves, respectively. The dataset has 22 multiple associations.

MP - Myrmica & Phengaris. This dataset was extracted from Jansen et al. A phylogenetic test of the parasite-host associations between Maculinea butterflies (Lepidoptera: Lycaenidae) and Myrmica ants (Hymenoptera: Formicidae). European Journal of Entomology, 2011. It consists of a pair of host and symbiont trees, each with 8 leaves. The dataset has 8 multiple associations.

SBL - Seabirds & Lice. This dataset was extracted from Paterson et al. Host-parasite co-speciation, host switching, and missing the boat. In D. H. Clayton and J. Moore, editors, Host-parasite evolution: General principles and avian models, Oxford University Press, 1997. The host and symbiont trees include 15 and 8 leaves, respectively. The dataset has 15 multiple associations.

SFC - Smut Fungi & Caryophyllaceous plants. This dataset was extracted from Refrégier et al. Cophylogeny of the anther smut fungi and their caryophyllaceous hosts: Prevalence of host shifts and importance of delimiting parasite species for inferring cospeciation. BMC Evolutionary Biology, 2008. The host and symbiont trees include 15 and 16 leaves, respectively. The dataset has 4 multiple associations.
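The notion of a multiple association used in the dataset descriptions above can be illustrated with a short sketch that flags symbionts linked to more than one host. The association list and leaf names below are invented for illustration and are not taken from any of the four datasets. The counting convention behind the numbers reported above is not stated here, so this sketch only identifies which symbionts are non-exclusive.

```python
from collections import defaultdict


def hosts_by_symbiont(associations):
    """Group host leaves by symbiont leaf from (symbiont, host) pairs."""
    grouped = defaultdict(set)
    for symbiont, host in associations:
        grouped[symbiont].add(host)
    return dict(grouped)


def multiple_associations(associations):
    """Return the symbionts associated with more than one host."""
    return {
        symbiont: sorted(hosts)
        for symbiont, hosts in hosts_by_symbiont(associations).items()
        if len(hosts) > 1
    }


# Hypothetical leaf associations (symbiont leaf, host leaf).
toy_associations = [
    ("s1", "h1"),
    ("s1", "h2"),
    ("s2", "h3"),
    ("s3", "h3"),
    ("s3", "h4"),
    ("s3", "h5"),
]
multi = multiple_associations(toy_associations)
```

In this toy input, s2 is exclusive to a single host, while s1 and s3 each occupy several hosts and would require spread events to be explained by a reconciliation.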

Usage notes

The Java code can be found at https://github.com/sinaimeri/AmoCoala.

Funding