Cavefish genomes resolve the ages of North American subterranean ecosystems
Data files
Aug 28, 2025 version, files 7.75 GB
- IQ-TREE_ASTRAL.zip (217.16 MB)
- R_plotting.zip (2.21 KB)
- README.md (4.76 KB)
- SCANS.zip (2.53 GB)
- Time_Calibration.zip (5.01 GB)
Abstract
Genomes provide tools for reconstructing organismal evolution and larger Earth-system processes. Here, we reconstruct the genomic evolution of cave-adapted amblyopsid fishes. Although microcomputed tomography reveals the strikingly similar skeletons of cave-adapted lineages, analyses of the genomes of all species suggest that amblyopsids independently colonized caves and degenerated their eyes at least four times after descending from populations that already possessed adaptations to low-light environments. By examining pseudogenization through loss-of-function mutations in amblyopsids, we infer that the genomic bases of their vision degenerated over millions of years. We leverage these data to pinpoint the ages of subterranean karstic ecosystems in eastern North America, which are difficult to date using traditional geochronologic techniques. Our results demonstrate how genomes can be used to reconstruct the timescale of Earth system evolution.
https://doi.org/10.5061/dryad.jwstqjqk9
Description of the data and file structure
CT scans were conducted at the Yale High Resolution Computed Tomography Scanning Facility.
All other data were collected from the literature and from published genome and UCE sequences.
Files and variables
File: R_plotting.zip
Description: CSV files used as input for plotting in R, including ancestral state reconstructions.
- Cavefish_habitat.csv #input csv file for ancestral state reconstructions of habitat and relevant morphological features in amblyopsid cavefishes. Empty cells represent data not available.
- Comparison_of_Ages.csv #input csv file to plot ages of key percopsiform lineages found in this study and previous ones, in millions of years.
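Because empty cells in these CSV files stand for unavailable data, it helps to convert them to an explicit missing value on load. The following is a minimal Python sketch of that step, not the authors' R workflow. The column names and values in the toy table are hypothetical stand-ins, since the actual headers of Cavefish_habitat.csv are not listed here.

```python
import csv
import io


def clean_cell(value):
    """Return None for an empty or missing cell, else the stripped text."""
    if value is None:
        return None
    text = value.strip()
    return text if text else None


def read_csv_with_missing(handle):
    """Read CSV rows as dicts, mapping empty cells to None."""
    reader = csv.DictReader(handle)
    return [
        {key: clean_cell(value) for key, value in row.items() if key is not None}
        for row in reader
    ]


# Toy stand-in for Cavefish_habitat.csv; these column names and values are
# hypothetical and are not taken from the archive.
toy_csv = io.StringIO(
    "taxon,habitat,eyes\n"
    "taxon_A,cave,\n"
    "taxon_B,surface,present\n"
)
rows = read_csv_with_missing(toy_csv)

# Taxa whose (hypothetical) "eyes" cell was empty, i.e. data not available.
missing_eyes = [row["taxon"] for row in rows if row["eyes"] is None]
```

To apply this to an extracted file, pass a handle opened with `open(path, newline="")` in place of the toy table.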
File: IQ-TREE_ASTRAL.zip
Description: Maximum likelihood trees from IQ-TREE and multispecies coalescent trees from ASTRAL, with output log files. All folders are standard output from these programs. For additional information, please consult https://iqtree.github.io/doc/Tutorial and https://github.com/smirarab/ASTRAL
- iqtree_concat #output directory from IQ-TREE analysis where all sequences were concatenated and treated as a single partition
- iqtree_gene_trees #output directory from IQ-TREE analysis where all sequences were individually analyzed to create gene trees
- Astral #output directory from ASTRAL containing the multispecies coalescent tree estimated from the IQ-TREE gene trees
- iqtree_partitioned #output directory from IQ-TREE analysis where all sequences were concatenated and treated as multiple partitions
- iqtree_concordance_factors_2 #output directory from IQ-TREE concordance factor analysis
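IQ-TREE and ASTRAL both write their trees in Newick format, so a quick check of which taxa a tree contains needs only the tree string. The sketch below pulls tip labels out of a Newick string with a regular expression. It assumes unquoted labels without spaces, and the toy tree is a placeholder, not a topology from this dataset.

```python
import re


def tip_labels(newick):
    """Return tip labels from a Newick string, in order of appearance.

    A tip label directly follows "(" or ",". Internal node labels such as
    support values follow ")" and are therefore skipped. Quoted labels are
    not handled by this sketch.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Toy tree with placeholder taxon names and a support value of 90.
toy_tree = "((taxon_A:0.1,taxon_B:0.2)90:0.3,taxon_C:0.4);"
labels = tip_labels(toy_tree)
```

For anything beyond a quick inventory, a dedicated phylogenetics library is a safer choice than a regular expression.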
File: SCANS.zip
Description: CT Scan segmentation files.
- agassizi #CT scan stl and image captures for Forbesichthys agassizi
- Amblyopsis_hoosieri #CT scan stl and image captures for Amblyopsis hoosieri
- Amblyopsis_spelea_V2 #CT scan stl and image captures for Amblyopsis spelaea
- Aphredoderus #CT scan stl and image captures for Aphredoderus sayanus
- Chologaster #CT scan stl and image captures for Chologaster cornuta
- papilliferous #CT scan stl and image captures for Forbesichthys papilliferus
- Percopsis_omiscomycus #CT scan stl and image captures for Percopsis omiscomaycus
- Percopsis_transmontana #CT scan stl and image captures for Percopsis transmontana
- Speoplatyrhinus #CT scan stl and image captures for Speoplatyrhinus poulsoni
- Troglichthys #CT scan stl and image captures for Troglichthys rosae
- Typhlichthys_eigenmanni #CT scan stl and image captures for Typhlichthys eigenmanni
- Typhlichthys_subterraneus_lineage_2 #CT scan stl and image captures for Typhlichthys subterraneus Southern Lineage
- Typhlichthys_subterraneus_lineage_type #CT scan stl and image captures for Typhlichthys subterraneus Northern Lineage
- whole_body_renders #whole-body CT scan renders
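Since SCANS.zip is 2.53 GB, it can be useful to inventory the archive before extracting it. The sketch below counts file entries under each top-level folder using Python's standard `zipfile` module. It is demonstrated on a small in-memory archive whose file names are hypothetical. Whether the real archive wraps these folders in an additional parent directory is not stated here, so the grouping level may need adjusting.

```python
import io
import zipfile
from collections import Counter


def count_by_top_level(archive):
    """Count file entries under each top-level folder of a zip archive.

    `archive` may be a path or a binary file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            top = info.filename.split("/", 1)[0]
            counts[top] += 1
    return counts


# Build a tiny in-memory archive; the file names inside are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Chologaster/model.stl", b"")
    zf.writestr("Chologaster/capture.png", b"")
    zf.writestr("Troglichthys/model.stl", b"")
buffer.seek(0)

counts = count_by_top_level(buffer)
```

Passing the path to the downloaded archive instead of `buffer` reads only the zip index, not the scan data itself.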
File: Time_Calibration.zip
Description: Input and output for time-calibration analyses of both morphological and molecular data in BEAST2, including input XML files and output log and tree files.
- Morph_Bayes #analysis input and output files from the Bayesian tip-dating analysis of the morphological character dataset, including the summary tree (Cavefish_morph_12sum.tree), combined posterior tree set (Cavefish_morph_12sum.trees), input xml (Cavefish_morph.xml), raw nexus file (Murray_morph_data.nex), and tree and log files from both independent runs (run_1_morph, run_2_morph)
- Molec_Bayes #analysis input and output files from the Bayesian tip- and node-dating analyses of the UCE sequence datasets. The tip-dating files (Tip_DATE) include the input xml (xml), sequence files (set1_subsample_tipdate, set2_subsample_tipdate, set3_subsample_tipdate), output log and tree files for each run (run1, run2, run3), the posterior tree set (UCE_SUM_123.trees), and the summary tree (TIP_DATE_123SUM.tre). The node-dating files (NODE_DATE) include the input xml (xml), sequence files (set1_subsample_nodedate, set2_subsample_nodedate, set3_subsample_nodedate), output log and tree files for each run (run1cave, run2cave, run3cave), the posterior tree set (Cave_Node_SUM123.trees), and the summary tree (CAVE_NODE_123SUM.tre).
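BEAST2 log files are tab-delimited text with `#` comment lines ahead of a header row, so the per-run logs can be inspected without special software. The sketch below reads such a log, discards an initial burn-in fraction, and averages one column. The 10% burn-in and the `posterior` column name are assumptions for illustration, not settings reported for these analyses, and the toy log is made up.

```python
import io


def read_beast_log(handle, burnin_fraction=0.1):
    """Parse a tab-delimited MCMC log and drop the leading burn-in samples.

    Lines starting with "#" and blank lines are ignored; the first remaining
    line is treated as the header. The burn-in fraction is an assumed default.
    """
    header = None
    samples = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if header is None:
            header = fields
            continue
        samples.append(dict(zip(header, fields)))
    n_burn = int(len(samples) * burnin_fraction)
    return samples[n_burn:]


def column_mean(samples, column):
    """Mean of one numeric column across the retained samples."""
    values = [float(sample[column]) for sample in samples]
    return sum(values) / len(values)


# Toy log: one poor early sample followed by nine stable ones.
toy_log = io.StringIO(
    "# toy comment line\n"
    "Sample\tposterior\n"
    "0\t-100.0\n"
    + "".join(f"{i * 1000}\t-10.0\n" for i in range(1, 10))
)
kept = read_beast_log(toy_log, burnin_fraction=0.1)
mean_posterior = column_mean(kept, "posterior")
```

This handles one run at a time; it does not combine independent runs or assess convergence.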
Code/software
All code is included in the XML files or the R file.
Access information
Other publicly accessible locations of the data:
- NCBI GenBank
Data was derived from the following sources:
- NCBI GenBank
- Brownstein, Chase D; Policarpo, Maxime; Harrington, Richard C et al. (2025). Convergent Evolution in Amblyopsid Cavefishes and the Age of Eastern North American Subterranean Ecosystems. Molecular Biology and Evolution. https://doi.org/10.1093/molbev/msaf185
