Data from: HyDe: a Python package for genome-scale hybridization detection

Cite this dataset

Blischak, Paul D.; Chifman, Julia; Wolfe, Andrea D.; Kubatko, Laura S. (2018). Data from: HyDe: a Python package for genome-scale hybridization detection [Dataset]. Dryad.


The analysis of hybridization and gene flow among closely related taxa is a common goal for researchers studying speciation and phylogeography. Many methods for hybridization detection use simple site pattern frequencies from observed genomic data and compare them to null models that predict an absence of gene flow. The theory underlying the detection of hybridization using these site pattern probabilities exploits the relationship between the coalescent process for gene trees within population trees and the process of mutation along the branches of the gene trees. For certain models, site patterns are predicted to occur with equal frequency (i.e., their difference is 0), producing a set of functions called phylogenetic invariants. In this paper we introduce HyDe, a software package for detecting hybridization using phylogenetic invariants arising under the coalescent model with hybridization. HyDe is written in Python and can be used interactively or through the command line using pre-packaged scripts. We demonstrate the use of HyDe on simulated data, as well as on two empirical data sets from the literature. We focus in particular on identifying individual hybrids within population samples and on distinguishing between hybrid speciation and gene flow. HyDe is freely available as an open-source Python package under the GNU GPL v3 on both GitHub and the Python Package Index (PyPI).
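To make the idea of a site pattern difference concrete, the following minimal sketch counts biallelic site patterns across four aligned sequences and takes the difference between two patterns that are expected to be equally frequent when there is no gene flow. It uses the familiar ABBA/BABA pair for a tree of the form (((P1, P2), P3), outgroup) purely as an illustration. It is not HyDe's API and does not reproduce the specific invariants or test statistic that HyDe computes; the function names and the toy sequences are hypothetical.

```python
from collections import Counter


def site_pattern_counts(p1, p2, p3, outgroup):
    """Count biallelic site patterns over four aligned sequences.

    Each pattern is written in the order (P1, P2, P3, outgroup), with
    'A' marking the outgroup allele and 'B' marking the other allele.
    Hypothetical helper for illustration; not part of HyDe.
    """
    counts = Counter()
    for column in zip(p1, p2, p3, outgroup):
        # Skip columns containing gaps, missing data, or ambiguity codes.
        if any(base not in "ACGT" for base in column):
            continue
        # Keep only sites with exactly two alleles.
        if len(set(column)) != 2:
            continue
        out_allele = column[3]
        pattern = "".join("A" if base == out_allele else "B" for base in column)
        counts[pattern] += 1
    return counts


def pattern_difference(counts, first="ABBA", second="BABA"):
    """Difference between two pattern counts; expected near 0 without gene flow."""
    return counts[first] - counts[second]


# Toy alignment (made-up data): two ABBA sites, one BABA site,
# one invariant site, and one site with missing data.
p1 = "ATGAN"
p2 = "GCGCA"
p3 = "GTGCA"
out = "ACGAA"

counts = site_pattern_counts(p1, p2, p3, out)
diff = pattern_difference(counts)
```

On real data the raw difference would be judged against its sampling variance rather than read directly, since a handful of sites, as in this toy alignment, says nothing about gene flow.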

Funding

National Science Foundation, Awards: DEB-1455399, DMS-1106706