Data from: dispersal sweepstakes: biotic interchange propelled air-breathing fishes across the globe
Data files
Jan 10, 2024 version files 270.69 MB
- BEAST_divtimes_set1.tre (323.06 KB)
- BEAST_divtimes_set1.xml (7.84 MB)
- BEAST_divtimes_set2.tre (322.01 KB)
- BEAST_divtimes_set2.xml (7.56 MB)
- BEAST_divtimes_set3.tre (322.32 KB)
- BEAST_divtimes_set3.xml (8.18 MB)
- README.md (2.23 KB)
- UCE_composite_matrix_IQTree.tre (15.79 KB)
- UCE_composite_matrix_partitions.txt (47.63 KB)
- UCE_composite_matrix.phylip (132.04 MB)
- UCE_loci_75p_complete.gzip (8.36 MB)
- UCE_loci_trees_for_ASTRAL.trees (8.22 MB)
- UCE_only_75percent_matrix_partitions.txt (16.04 KB)
- UCE_only_75percent_matrix.phylip (97.43 MB)
- UCE_only_matrix_IQTree.tre (8.68 KB)
Abstract
Synbranchiformes is a phenotypically diverse and species-rich clade of freshwater acanthomorph fishes that includes eel- and perch-like, air-breathing and non-air-breathing fishes. The ability to breathe out of water has presumably aided lineages of Synbranchiformes in dispersing across all southern continents except Antarctica. The lack of a well-resolved, time-calibrated phylogeny of Synbranchiformes limits our understanding of the timing and geographic patterns of diversification of these anatomically and ecologically diverse fishes. As a consequence, contemporary interpretations of synbranchiform biogeography invoke scenarios as disparate as Gondwanan vicariance and pan-global rafting to explain their modern-day geographic distribution. In this study, we use high-throughput sequencing of ultra-conserved elements (UCEs) to infer a phylogeny for all major synbranchiform lineages. We combine this dataset with existing Sanger-sequenced genes and fossil calibrations to infer a comprehensive time-calibrated phylogeny of Synbranchiformes. Then, we use Bayesian methods of biogeographical reconstruction to document the history of dispersal of synbranchiforms, finding support for Southeast Asia as the likely ancestral area of all major lineages. Our results reject the hypothesis that Gondwanan vicariance explains synbranchiform biogeography; instead, the historical biogeographic analyses support a hypothesis of independent continental invasions by snakeheads, anabantids, and spiny eels. However, there is no signal of elevated lineage diversification rates after these invasions. Instead, higher rates of lineage diversification in spiny eels predate their arrival in Africa, while the high levels of lineage diversification observed in Betta were initiated prior to the flooding of insular Sundaland in SE Asia.
Included are concatenated sequence data files, files that denote the partitions of UCE loci, and the output tree files from analyses using IQTree and BEAST.
Description of the Data and file structure
UCE_loci_75p_complete.gzip > Gzip-compressed directory of NEXUS-formatted alignments for individual UCE loci. Every alignment in this directory contains at least 75% of the taxa in our total dataset.
UCE_only_matrix_IQTree.tre > The output tree from IQTree analysis of concatenated UCE_only sequence data matrix.
UCE_only_75percent_matrix.phylip > The alignment for UCE sequence data (all loci 75% taxonomically complete), concatenated in phylip format.
UCE_only_75percent_matrix_partitions.txt > The partitioning scheme for the IQTree analysis of concatenated UCE sequence data (all loci 75% taxonomically complete).
UCE_composite_matrix_IQTree.tre > The output tree from IQTree analysis of the concatenated UCE-composite matrix (95% complete UCE loci plus the data obtained from GenBank).
UCE_composite_matrix.phylip > The alignment for the UCE-GenBank-composite sequence data, concatenated in phylip format.
UCE_composite_matrix_partitions.txt > The partitioning scheme for the IQTree analysis of concatenated UCE-GenBank-composite sequence data.
UCE_loci_trees_for_ASTRAL.trees > Individual UCE loci trees obtained by maximum likelihood analysis in IQTree.
BEAST_divtimes_set1.xml > XML file for BEAST divergence time analysis of 30 UCE loci (set 1) and three GenBank loci (cytb, coi, rag1).
BEAST_divtimes_set2.xml > XML file for BEAST divergence time analysis of 30 UCE loci (set 2) and three GenBank loci (cytb, coi, rag1).
BEAST_divtimes_set3.xml > XML file for BEAST divergence time analysis of 30 UCE loci (set 3) and three GenBank loci (cytb, coi, rag1).
BEAST_divtimes_set1.tre > Time tree obtained from BEAST analysis of 30 UCE loci (set 1) and three GenBank loci (cytb, coi, rag1).
BEAST_divtimes_set2.tre > Time tree obtained from BEAST analysis of 30 UCE loci (set 2) and three GenBank loci (cytb, coi, rag1).
BEAST_divtimes_set3.tre > Time tree obtained from BEAST analysis of 30 UCE loci (set 3) and three GenBank loci (cytb, coi, rag1).
UCE DNA sequence data were obtained through hybrid enrichment of genomic libraries. Library preparation of sheared genomic DNA samples was performed using Kapa HyperPrep kits; UCE probes were purchased from Daicel Arbor Biosciences to target ~1300 ultraconserved elements from acanthomorph fishes. Sequencing was performed on an Illumina HiSeq 4000. Demultiplexed data were processed using the Phyluce software package, which applies other programs to: trim tag sequences from raw reads and remove low-quality sequence data; match raw reads to UCE probes; and construct alignments that meet various thresholds of completeness (75% or 95% taxonomically complete). Trees were inferred using the program IQTree; divergence times were estimated using BEAST; lineage diversification dynamics were estimated using BAMM and MiSSE; biogeographic history was inferred using BioGeoBEARS.
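The taxonomic completeness thresholds described above can be illustrated with a short Python sketch. This is not Phyluce's implementation: the function name, the input layout (a mapping from locus name to the set of taxa present in that locus alignment), and the choice to round the minimum taxon count up are all assumptions made for illustration.

```python
import math


def filter_loci_by_completeness(locus_taxa, total_taxa, threshold=0.75):
    """Return the sorted names of loci that meet a completeness threshold.

    Hypothetical helper for illustration; it is not part of Phyluce.
    `locus_taxa` maps each locus name to the set of taxa in its alignment,
    `total_taxa` is the number of taxa in the whole dataset, and
    `threshold` is the required fraction of taxa (0.75 for 75%).
    """
    # Assumption: the minimum count is rounded up, so 75% of 10 taxa
    # requires at least 8 taxa to be present.
    min_taxa = math.ceil(threshold * total_taxa)
    return sorted(
        locus for locus, taxa in locus_taxa.items() if len(taxa) >= min_taxa
    )


# Toy data (invented, not drawn from this dataset).
example = {
    "uce-1": {"A", "B", "C", "D"},
    "uce-2": {"A", "B", "C"},
    "uce-3": {"A", "B"},
}
kept = filter_loci_by_completeness(example, total_taxa=4, threshold=0.75)
```

In this toy case a locus needs at least three of the four taxa to be retained, so a locus sampled for only two taxa is dropped.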
Alignment files and their associated character set/partition delimitation files can be opened with any text editor. Tree files can be opened with FigTree or any other program that reads Newick- or NEXUS-formatted phylogenetic files.
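Because these are plain-text files, they can also be inspected programmatically. The following is a minimal sketch using only the Python standard library. It assumes that a PHYLIP alignment begins with a header line of two whitespace-separated integers (number of taxa, number of sites), that a NEXUS file begins with `#NEXUS`, and that a Newick file begins with an opening parenthesis. The function names are hypothetical, and the exact layout of the files in this dataset has not been verified here.

```python
def read_phylip_header(lines):
    """Return (n_taxa, n_sites) from the first line of a PHYLIP alignment.

    `lines` is any iterable of text lines, such as an open file handle.
    Only the first line is consumed, so large matrices are not loaded.
    """
    first = next(iter(lines), "").split()
    if len(first) != 2:
        raise ValueError("expected 'n_taxa n_sites' on the first line")
    return int(first[0]), int(first[1])


def guess_tree_format(lines):
    """Guess 'nexus', 'newick', or 'unknown' from the first non-blank line."""
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.upper().startswith("#NEXUS"):
            return "nexus"
        if stripped.startswith("("):
            return "newick"
        break  # first non-blank line matched neither pattern
    return "unknown"


# Example usage with files from this dataset (paths assume the files are
# in the current working directory):
#
# with open("UCE_only_75percent_matrix.phylip") as handle:
#     n_taxa, n_sites = read_phylip_header(handle)
#
# with open("BEAST_divtimes_set1.tre") as handle:
#     tree_format = guess_tree_format(handle)
```

Reading only the header avoids loading the concatenated matrices, which are roughly 100 MB each, into memory just to check their dimensions.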