Data and code from: Using phylogenetic network methods for genomic data exploration and hypothesis generation fails to untangle a confusing history of hybridization in New Zealand cicadas
Data files
Feb 10, 2026 version files 24.61 GB
- data_snaq_2_mrbayes_output_Kikihia_wMito.zip: 11.55 GB
- data_snaq_2_mrbayes_output_Maoricicada_wMito.zip: 11.65 GB
- data.zip: 1.42 GB
- output.zip: 2.41 MB
- README.md: 2.62 KB
Abstract
Rapid species radiations make hybridization among species more likely. Detecting and reconstructing hybridization is, therefore, critical for understanding species relationships in many cases. We explored the relative performance of two phylogenetic network methods, SNaQ, a gene tree-based method, and PhyNEST, a site pattern-based method, in evaluating the plausibility of proposed past hybridization hypotheses. As our study system, we used the New Zealand cicada genera Kikihia and Maoricicada. Previous phylogenomic work on these two species' radiations suggested multiple hybridization events. We generated hypotheses for specific hybridization events based on observed hybrid mating songs and patterns of mito-nuclear discordance. We inferred phylogenetic networks using SNaQ and PhyNEST and a phylogenomic dataset of over 500 nuclear Anchored Hybrid Enrichment genes along with mitochondrial genomes to determine whether the two methods recovered plausible networks with respect to our hypothesized hybridization events. We also examined how the D-statistic performed in identifying our predicted hybridization events. We found that both SNaQ and PhyNEST, along with the D-statistic, recovered an extensive history of reticulate evolution in New Zealand cicadas which broadly matched our predictions. We found differences between networks inferred by the two network programs that may reflect differences in the inference methods or result from using site patterns versus gene trees as input data. Finally, we discuss considerations for users applying these methods to targeted enrichment data and suggest improvements for network method developers.
https://doi.org/10.5061/dryad.x0k6djhwf
Description of the data and file structure
These data were collected and generated as part of phylogenetic network analysis of various species in the New Zealand cicada genera Kikihia and Maoricicada.
Files and variables
File: output.zip
Description: This compressed folder contains output files for all analyses used in this study. The folder contains its own README detailing the included files and folders.
File: data.zip
Description: This compressed folder contains input sequence data as well as data generated by intermediate steps of the TICR pipeline. The folder contains its own README detailing the included files and folders.
File: data_snaq_2_mrbayes_output_Maoricicada_wMito.zip
Description: This compressed folder is an overflow folder too large to include in data.zip. It contains the compressed output files for all Maoricicada nuclear genes and mitochondrial genomes generated by the MrBayes step of the TICR pipeline used to prepare the SNaQ analyses. The output files are sorted into compressed folders based on file extension; the "runs_t.zip" compressed folder contains the sampled gene trees used as input for subsequent steps in the TICR pipeline.
File: data_snaq_2_mrbayes_output_Kikihia_wMito.zip
Description: This compressed folder is an overflow folder too large to include in data.zip. It contains the compressed output files for all Kikihia nuclear genes and mitochondrial genomes generated by the MrBayes step of the TICR pipeline used to prepare the SNaQ analyses. The output files are sorted into compressed folders based on file extension; the "runs_t.zip" compressed folder contains the sampled gene trees used as input for subsequent steps in the TICR pipeline.
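Because the sampled gene trees sit in a compressed folder nested inside each overflow archive, the sketch below shows one way to pull out only "runs_t.zip" without unpacking the other MrBayes output by hand. It is a minimal sketch, not part of the deposited scripts: the function name `extract_nested_zip` and the destination folder are hypothetical, and the exact path of "runs_t.zip" inside the outer archive is an assumption, which is why the code matches on the file name suffix rather than on a fixed path.

```python
import zipfile
from pathlib import Path


def extract_nested_zip(outer_zip, inner_suffix, dest):
    """Unpack nested archives whose member name ends with inner_suffix.

    The nested archive is first written to disk (these files are large),
    then unpacked into a sibling folder named after it. Returns the paths
    of the files extracted from the nested archive(s).
    """
    dest = Path(dest)
    extracted = []
    with zipfile.ZipFile(outer_zip) as outer:
        for name in outer.namelist():
            # Assumption: the inner archive is identified by its file name only.
            if not name.endswith(inner_suffix):
                continue
            inner_path = Path(outer.extract(name, dest))
            target = inner_path.with_suffix("")  # e.g. .../runs_t
            with zipfile.ZipFile(inner_path) as inner:
                inner.extractall(target)
                extracted.extend(
                    str(target / member)
                    for member in inner.namelist()
                    if not member.endswith("/")  # skip directory entries
                )
    return extracted
```

A call such as `extract_nested_zip("data_snaq_2_mrbayes_output_Kikihia_wMito.zip", "runs_t.zip", "mrbayes_Kikihia")` would then leave the gene tree files under the hypothetical folder `mrbayes_Kikihia`. Each overflow archive is over 11 GB compressed, so check the available disk space before extracting.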
Code/software
File: scripts.zip
Description: This compressed folder contains the data processing and analysis scripts used to generate the intermediate data files and analysis results from the input sequence alignments. The folder contains its own README file with information about each script.
Access information
Data were derived from the following sources:
- Input sequence alignments were obtained and re-formatted from the following Dryad repository: https://datadryad.org/dataset/doi:10.5061/dryad.t1g1jwt7v
