
Dataset for: Conservation genomics of federally endangered Texella harvester species (Arachnida, Opiliones, Phalangodidae) from cave and karst habitats of central Texas

Cite this dataset

Hedin, Marshal; Derkarabetian, Shahan; Reddell, James; Paquin, Pierre (2021). Dataset for: Conservation genomics of federally endangered Texella harvester species (Arachnida, Opiliones, Phalangodidae) from cave and karst habitats of central Texas [Dataset]. Dryad.


Genomic-scale data for non-model taxa are providing new insights into landscape genomic structuring and species limits, leading to more informed conservation decisions, particularly in taxa with extremely restricted microhabitat preferences and small geographic distributions. This study applied sequence capture of ultraconserved elements (UCEs) to gather genomic-scale data for two federally endangered Texella harvester species distributed in Edwards Formation cave and karst habitats of central Texas, near Austin. We gathered UCE data for 51 T. reyesi specimens from 46 different caves, seven T. reddelli specimens from five caves, and relevant outgroup species. For these UCE data we applied a combination of phylogenomic, multispecies coalescent phylogenetic, and single-nucleotide polymorphism (SNP) machine-learning analyses. We found that samples of T. reddelli and T. reyesi together form a single clade in phylogenetic analyses, but that T. reddelli samples are not recovered as monophyletic. Instead, T. reddelli samples from three northern caves are embedded within a larger T. reyesi genetic clade. Significantly, the genetic structuring of all samples closely follows geologic barriers defined for the region and formalized as karst fauna regions (KFRs). One exception is the Jollyville Plateau KFR, which includes two divergent, non-sister genetic lineages. Levels of troglomorphy, here assessed by a simple scoring of corneal and retinal development, also closely follow clade (and geographic) boundaries, implying that divergent genetic lineages might also have distinct ecologies. Overall, our study has important taxonomic implications, is the first to explore (and validate) regional KFR boundaries using intraspecific genetic data, and provides essential data for future management decisions involving these federally endangered species.


We used the Arachnida 1.1Kv1 probe set (Arbor Biosciences) to capture UCE loci, using standard methods of library preparation. Sequence reads were trimmed and assembled with TRINITY version 2.1.1 (Grabherr et al. 2011) using default settings (trimmomatic = full_cleanup, kmer = 25), then processed using Phyluce (Faircloth 2016). Assembled contigs were matched to probes using liberal minimum coverage and minimum identity values (both 65). UCE loci were aligned with MAFFT (Katoh and Standley 2013) and trimmed with Gblocks (Castresana 2000, Talavera and Castresana 2007) using relaxed settings (--b1 0.5 --b2 0.5 --b3 10 --b4 4). Resulting Phyluce alignments with at least 50% sample occupancy were imported into Geneious 11.0.4 (Biomatters), where ragged ends of individual sequences (typical of “standard” museum specimens with degraded DNA) were trimmed from the alignments.
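The 50% sample-occupancy criterion can be illustrated with a minimal Python sketch. It is not the Phyluce implementation, and it assumes that occupancy means the fraction of all sampled specimens present in a locus alignment. The function name, the locus names, and the specimen labels are hypothetical.

```python
from typing import Dict, List, Set


def filter_by_occupancy(
    loci: Dict[str, Set[str]],
    n_samples: int,
    min_occupancy: float = 0.5,
) -> List[str]:
    """Return the names of loci whose sample occupancy meets the threshold.

    `loci` maps a locus name to the set of specimens present in that
    locus alignment; occupancy is assumed to be len(specimens) / n_samples.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return sorted(
        name
        for name, specimens in loci.items()
        if len(specimens) / n_samples >= min_occupancy
    )


# Hypothetical example: four specimens, three loci.
example_loci = {
    "uce-1": {"spec_A", "spec_B", "spec_C"},  # 3/4 = 0.75, kept
    "uce-2": {"spec_A"},                      # 1/4 = 0.25, dropped
    "uce-3": {"spec_A", "spec_B"},            # 2/4 = 0.50, kept
}
kept = filter_by_occupancy(example_loci, n_samples=4)
print(kept)
```

A locus sitting exactly at the threshold is retained here because the text says "at least 50%".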

Maximum likelihood phylogenetic analyses were conducted using IQ-TREE version 2.0-rc2. Initial partitions corresponded to individual loci, then ModelFinder (Kalyaanamoorthy et al. 2017) was used to find best-fit models and merge partitions (-s -p -m TESTMERGE -rcluster 10); the relaxed hierarchical clustering algorithm (Lanfear et al. 2014) was used to reduce computational burden. Support was assessed via 1000 ultrafast bootstrap replicates (Hoang et al. 2018). An SVDQuartets analysis (Chifman and Kubatko 2014, 2015) was conducted on a concatenated matrix using PAUP* 4.0a (Swofford 2002), implementing the multispecies coalescent tree model with exhaustive quartet sampling and 1000 bootstrap replicates.
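The IQ-TREE options listed above can be assembled into a full command line, sketched here in Python. Only -s, -p, -m TESTMERGE and -rcluster 10 come from the text. The executable name `iqtree2`, the input file names, and the use of `-B 1000` for the ultrafast bootstrap replicates are assumptions, so this is not necessarily the exact invocation used for the dataset. The sketch only builds and prints the command and does not run it.

```python
import shlex
from typing import List


def iqtree_command(
    alignment: str,
    partitions: str,
    rcluster: int = 10,
    ufboot: int = 1000,
    binary: str = "iqtree2",  # assumed executable name
) -> List[str]:
    """Build an IQ-TREE argument list for a partitioned analysis."""
    return [
        binary,
        "-s", alignment,             # concatenated alignment
        "-p", partitions,            # one initial partition per locus
        "-m", "TESTMERGE",           # model selection plus partition merging
        "-rcluster", str(rcluster),  # relaxed clustering: top 10% of schemes
        "-B", str(ufboot),           # ultrafast bootstrap replicates (assumed flag)
    ]


# Hypothetical input file names.
cmd = iqtree_command("concatenated.phy", "loci.nex")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the analysis if IQ-TREE is installed under that name.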


United States Fish and Wildlife Service, Award: Sponsor ID F18AC00913