Data from: Phylogenomics of North American cybaeid spiders (Araneae, F. Cybaeidae), including the description of new taxa from the Klamath Mountains Geomorphic Province
Data files
Jan 17, 2025 version files 12.90 MB
- 18S_MH.nex (59.48 KB)
- 28S_MH.nex (66.94 KB)
- Astral_50pFiltered.log (17.14 KB)
- Astral_80pFiltered.log (22.45 KB)
- concatenated_partitions.txt (59 B)
- concatenated.nex (151.95 KB)
- Cybaeidae_IQTREE_50pFiltered.gcf.tree (2.32 KB)
- Cybaeidae_IQTREE_50pFiltered.scf.tree (2.33 KB)
- Cybaeidae_IQTREE_50pFiltered.tre (2.15 KB)
- Cybaeidae_IQTREE_80pFiltered.gcf.tree (2.32 KB)
- Cybaeidae_IQTREE_80pFiltered.scf.tree (2.32 KB)
- Cybaeidae_IQTREE_80pFiltered.tre (2.14 KB)
- Cybaeidae_IQTREE_PhyIN50p.tre (2.15 KB)
- Cybaeidae_IQTREE_PhyIN80p.tre (2.15 KB)
- Cybaeidae_jAstral-50pFiltered.tre (1.67 KB)
- Cybaeidae_jAstral-80pFiltered.tre (1.67 KB)
- Cybaeidae_Sanger_18s.tre (1.71 KB)
- Cybaeidae_Sanger_28s.tre (1.62 KB)
- Cybaeidae_Sanger_H3.tre (1.92 KB)
- Cybaeidae_Sanger_zConcat.tre (2.04 KB)
- Cybaeidae50p_filtered_Archive.zip (1.60 MB)
- Cybaeidae50p_PhyIN_Archive.zip (2.02 MB)
- Cybaeidae80p_filtered_Archive.zip (763.88 KB)
- Cybaeidae80p_PhyIN_Archive.zip (896.26 KB)
- H3_MH.nex (36.54 KB)
- IQTree_50p_PhyIN.log (2.48 MB)
- IQTree_50pFiltered.log (2.24 MB)
- IQTree_80p_PhyIN.log (1.13 MB)
- IQTree_80pFiltered.log (1.20 MB)
- IQTree_Sanger_concat.log (17.89 KB)
- IQTree_Sanger_H3.nex.log (21.53 KB)
- IQTree_Sanger18S.nex.log (16.72 KB)
- IQTree_Sanger28S.fasta.log (129.51 KB)
- README.md (4.35 KB)
Abstract
The systematics of humble-in-appearance brown spiders (“marronoids”), within a larger group of spiders with a modified retrolateral tibial apophysis (the RTA Clade), has long vexed arachnologists. Although not yet fully settled, recent phylogenomics has allowed the delimitation and phylogenetic relationships of families within marronoids to come into focus. Understanding relationships within these families still awaits more comprehensive generic-level sampling, as the bulk of described marronoid genera remain unsampled for phylogenomic data. Here we conduct such an analysis in the family Cybaeidae Banks, 1892. We greatly increase generic-level sampling, assembling ultraconserved element (UCE) data for 18 of 22 described cybaeid genera, including all North American genera, and rigorously test family monophyly using a comprehensive outgroup taxon sample. We also conduct analyses of traditional Sanger loci, allowing curation of some previously published data. Our UCE phylogenomic results support the monophyly of recognized cybaeids, with strongly supported internal relationships, and evidence for five primary molecular subclades. We hypothesize potential morphological synapomorphies for several of these subclades, bringing a robust phylogenomic underpinning to cybaeid classification. We discover and describe a new cybaeid genus (Siskiyu gen. nov.) and species (Siskiyu armilla sp. nov.) from far northern California and adjacent southern Oregon and describe a new species in the elusive genus Cybaeozyga (C. furtiva sp. nov.) from far northern California.
README: Phylogenomics of North American cybaeid spiders (Araneae, F. Cybaeidae), including the description of new taxa from the Klamath Mountains Geomorphic Province
https://doi.org/10.5061/dryad.2v6wwpzz4
Description of the data and file structure
Files and variables
File: Cybaeidae80p_PhyIN_Archive.zip
Description: individual locus alignments for 80p_PhyIN matrices
File: concatenated_partitions.txt
Description: partitions file for concatenated Sanger matrix
Variables
- DNA:
- 18s = 1-1554:
File: Cybaeidae50p_filtered_Archive.zip
Description: individual locus alignments for 50p_Filtered matrices
File: Cybaeidae50p_PhyIN_Archive.zip
Description: individual locus alignments for 50p_PhyIN matrices
File: H3_MH.nex
Description: H3 Sanger data matrix
File: Astral_80pFiltered.log
Description: phylogenetic output log files from Astral_80p analysis
File: concatenated.nex
Description: concatenated Sanger matrix
File: Astral_50pFiltered.log
Description: phylogenetic output log files from Astral_50p analysis
File: Cybaeidae80p_filtered_Archive.zip
Description: individual locus alignments for 80p_Filtered matrices
File: IQTree_Sanger18S.nex.log
Description: phylogenetic output log files from Sanger 18S matrix analysis
File: 28S_MH.nex
Description: 28S Sanger matrix
File: 18S_MH.nex
Description: 18S Sanger matrix
File: IQTree_Sanger_H3.nex.log
Description: phylogenetic output log files from Sanger H3 matrix analysis
File: IQTree_80p_PhyIN.log
Description: phylogenetic output log files from 80p_PhyIN matrix analysis
File: Cybaeidae_jAstral-50pFiltered.tre
Description: .tre file from Astral_50pFiltered analysis
File: Cybaeidae_Sanger_H3.tre
Description: .tre file from Sanger H3 matrix analysis
File: Cybaeidae_Sanger_zConcat.tre
Description: .tre file from Sanger Concatenated matrix analysis
File: Cybaeidae_Sanger_18s.tre
Description: .tre file from Sanger 18S matrix analysis
File: Cybaeidae_Sanger_28s.tre
Description: .tre file from Sanger 28S matrix analysis
File: IQTree_80pFiltered.log
Description: phylogenetic output log files from 80p_Filtered matrix analysis
File: IQTree_Sanger28S.fasta.log
Description: phylogenetic output log files from Sanger 28S matrix analysis
File: Cybaeidae_IQTREE_80pFiltered.tre
Description: .tre file from 80pFiltered analysis
File: Cybaeidae_IQTREE_PhyIN50p.tre
Description: .tre file from 50pPhyIN analysis
File: Cybaeidae_IQTREE_PhyIN80p.tre
Description: .tre file from 80pPhyIN analysis
File: IQTree_50p_PhyIN.log
Description: phylogenetic output log files from 50p_PhyIN matrix analysis
File: Cybaeidae_IQTREE_50pFiltered.tre
Description: .tre file from 50pFiltered analysis
File: Cybaeidae_IQTREE_80pFiltered.scf.tree
Description: .tree file from 80pFiltered site concordance factor analysis
File: Cybaeidae_IQTREE_50pFiltered.scf.tree
Description: .tree file from 50pFiltered site concordance factor analysis
File: Cybaeidae_IQTREE_50pFiltered.gcf.tree
Description: .tree file from 50pFiltered gene concordance factor analysis
File: Cybaeidae_IQTREE_80pFiltered.gcf.tree
Description: .tree file from 80pFiltered gene concordance factor analysis
File: Cybaeidae_jAstral-80pFiltered.tre
Description: .tre file from Astral_80pFiltered analysis
File: IQTree_Sanger_concat.log
Description: phylogenetic output log files from Sanger loci concatenated matrix analysis
File: IQTree_50pFiltered.log
Description: phylogenetic output log files from 50pFiltered matrix analysis
Code/software
Matrix and log files can be viewed using a basic text editor.
Tree files (.tre and .tree) can be viewed with a text editor or with FigTree.
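Because the tree files are plain text, they can also be inspected programmatically. The following is a minimal sketch, assuming the trees are stored as standard Newick strings without quoted labels; the example tree and its tip names are invented for illustration and are not taken from the dataset.

```python
import re


def tip_labels(newick: str) -> list[str]:
    """Return tip labels from a Newick string, in order of appearance.

    Assumption: labels are unquoted. A tip label directly follows "(" or ",",
    whereas internal node labels (e.g. support values) follow ")".
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


# Hypothetical tree with support values and branch lengths.
example = "((tipA:0.1,tipB:0.2)100:0.05,(tipC:0.3,tipD:0.1)98:0.02);"
print(tip_labels(example))
```

To apply it to one of the deposited files, read the file contents into a string first (for example with `pathlib.Path(...).read_text()`).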
Sanger loci were harvested from UCEs, low-coverage genomes, and transcriptome data using the custom script “loci_byCatch.sh”. Harvesting relied on a reference fasta file (“reference_cyb.fasta”) containing the target loci, which was created from previously published ingroup data.
Methods
Input Matrices:
50p_filtered and 80p_filtered matrices - UCE loci were aligned, trimmed, and filtered using FUSe. This included aligning with the MAFFT globalpair option and trimming with the trimAl -automated1 option. We removed highly divergent sequences (60% divergence; --remove-div -d 0.6) and sequences shorter than 70% of the total alignment length (--remove-short -s 0.7), retaining alignments with a minimum of 4 sequences and longer than 50 bp. Subsequent 50% and 80% completeness matrices were created (--get-completeness -e 0.5 and -e 0.8). We manually curated the above alignments in Geneious Prime 2023, removing any remaining obviously divergent individual sequences, and matrices with an average pairwise identity below 80%.
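The length and completeness thresholds described above can be illustrated with a small sketch. This is not the FUSe implementation: it assumes that sequence length is counted as non-gap characters, that "-" and "?" mark missing data, and that completeness means the fraction of all sampled taxa present in a locus. Function names, taxon names, and locus names are hypothetical.

```python
GAP_CHARS = set("-?")  # assumption: these characters count as missing data


def ungapped_length(seq: str) -> int:
    """Number of non-gap characters in an aligned sequence."""
    return sum(1 for c in seq if c not in GAP_CHARS)


def drop_short(aln, min_frac=0.7, min_seqs=4, min_len=50):
    """Drop sequences shorter than min_frac of the alignment length.

    Returns the reduced alignment, or None if it then has fewer than
    min_seqs sequences or is not longer than min_len columns.
    """
    if not aln:
        return None
    aln_len = len(next(iter(aln.values())))
    kept = {name: seq for name, seq in aln.items()
            if ungapped_length(seq) >= min_frac * aln_len}
    if len(kept) < min_seqs or aln_len <= min_len:
        return None
    return kept


def completeness_subset(loci, n_taxa, min_completeness):
    """Keep loci that contain at least min_completeness of all taxa."""
    return {name: aln for name, aln in loci.items()
            if len(aln) / n_taxa >= min_completeness}


# Toy data: a 60-column alignment in which taxon t5 has only 20 bases.
full = "ACGT" * 15
short = "ACGT" * 5 + "-" * 40
aln = {"t1": full, "t2": full, "t3": full, "t4": full, "t5": short}
kept = drop_short(aln)
loci = {"locus_1": kept, "locus_2": {"t1": full, "t2": full}}
print(sorted(kept))
print(sorted(completeness_subset(loci, n_taxa=5, min_completeness=0.5)))
```

In this toy case t5 is removed from the first locus, and the second locus (2 of 5 taxa) fails the 50% completeness threshold.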
50p_PhyIN and 80p_PhyIN matrices - Here we combined PhyIN and FUSe. This workflow consisted of aligning with MAFFT using the globalpair option, followed by removal of highly divergent sequences (70% divergence) (--remove-div -d 0.7), trimming of gaps using a Simple Gappiness Filter (sgp.py) with options -gS -gB -t -1 in combination with PhyIN using options -b 10 -d 2 -p 0.5 -e. After trimming, remaining short sequences were removed using FUSe (--remove-short -s 0.7) and 50% and 80% completeness matrices were created.
Sanger matrices - Sanger loci were harvested from UCEs, low-coverage genomes and transcriptome data using custom scripts (“loci_byCatch.sh”). A reference fasta file (“reference_cyb.fasta”) containing target loci was created from previously published ingroup data. Cleaned fastq files were mapped against this reference using BWA. Resulting BAM files were used for calling consensus sequences for each sample and locus using the consensus function of SAMTOOLS v1.16. For each mapped sample and locus we retained the longest sequence for downstream analysis. Consensus sequences were merged, aligned and trimmed with FUSe using the MAFFT globalpair option for aligning and trimAl -automated1 option for trimming, followed by alignment inspection in Geneious Prime 2023.
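The step of retaining the longest sequence for each sample and locus can be sketched as follows. This is an illustration, not the “loci_byCatch.sh” script itself: it assumes the consensus sequences are already available as (sample, locus, sequence) tuples, that length is the raw string length, and that ties are resolved in favour of the first sequence seen. The sample names are hypothetical.

```python
def longest_per_sample_locus(records):
    """Keep the longest sequence for each (sample, locus) pair.

    records: iterable of (sample, locus, sequence) tuples.
    Ties keep the first sequence encountered (an assumption).
    """
    best = {}
    for sample, locus, seq in records:
        key = (sample, locus)
        if key not in best or len(seq) > len(best[key]):
            best[key] = seq
    return best


# Hypothetical consensus sequences for two samples.
records = [
    ("sample1", "18S", "ACGTAC"),
    ("sample1", "18S", "ACGTACGTAC"),
    ("sample1", "H3", "ACG"),
    ("sample2", "18S", "ACGT"),
]
for (sample, locus), seq in sorted(longest_per_sample_locus(records).items()):
    print(sample, locus, len(seq))
```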
Phylogenetic Analyses:
IQ-TREE Analyses - For all UCE concatenated, Sanger concatenated, and individual Sanger loci matrices we conducted maximum likelihood analyses with IQ-TREE 2. Concatenated analyses included individual loci as separate partitions, with 1000 replicates of ultrafast bootstrapping and optimal model search using ModelFinder.
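Treating each locus as a separate partition requires the column range that each locus occupies in the concatenated matrix. The sketch below derives those ranges from locus lengths and writes them as RAxML-style partition lines, a format that IQ-TREE accepts. The line format of concatenated_partitions.txt is an assumption here, and apart from the 18S length of 1554 given in the file description, the locus lengths are invented.

```python
def partition_lines(locus_lengths):
    """Build RAxML-style partition lines from (name, length) pairs.

    Coordinates are 1-based and inclusive, in concatenation order.
    """
    lines = []
    start = 1
    for name, length in locus_lengths:
        end = start + length - 1
        lines.append(f"DNA, {name} = {start}-{end}")
        start = end + 1
    return lines


# 1554 comes from the deposited description; 100 and 50 are placeholders.
for line in partition_lines([("18s", 1554), ("28s", 100), ("H3", 50)]):
    print(line)
```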
Concordance Factor Analyses - For 50p_filtered and 80p_filtered matrices, we estimated gene trees from individual UCE alignments and calculated gene (gCF) and site (sCF) concordance factors using IQ-TREE; sCF values were estimated with the likelihood option --scfl.
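As a rough illustration of what a gene concordance factor measures, the sketch below computes, for each internal branch of a reference tree, the percentage of gene trees that contain the same bipartition. It is a simplification rather than the IQ-TREE procedure: it assumes every gene tree contains every taxon (so all gene trees are decisive for every branch), that labels are unquoted, and that trees are compared as unrooted. The trees are toy examples.

```python
import re


def clades(newick):
    """Return the set of tip sets found below each internal node."""
    tokens = re.findall(r"[(),;]|[^(),;]+", newick)
    stack, found, prev = [], set(), None
    for tok in tokens:
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            done = stack.pop()
            found.add(frozenset(done))
            if stack:
                stack[-1] |= done
        elif tok not in ",;" and prev != ")":
            # A label right after ")" is a support value, not a tip.
            name = tok.split(":")[0].strip()
            if name:
                stack[-1].add(name)
        prev = tok
    return found


def splits(newick):
    """Non-trivial bipartitions, stored as the side lacking a reference tip."""
    found = clades(newick)
    taxa = frozenset().union(*found)
    ref = min(taxa)
    out = set()
    for clade in found:
        side = taxa - clade if ref in clade else clade
        if 1 < len(side) < len(taxa) - 1:
            out.add(side)
    return out


def gene_concordance(reference_tree, gene_trees):
    """Percentage of gene trees containing each reference bipartition."""
    gene_splits = [splits(g) for g in gene_trees]
    return {s: 100.0 * sum(s in gs for gs in gene_splits) / len(gene_trees)
            for s in splits(reference_tree)}


reference = "((A,B),(C,D),E);"
genes = ["((A,B),(C,D),E);", "(A,(B,((C,D),E)));",
         "((A,C),(B,D),E);", "((A,B),(C,E),D);"]
for split, value in gene_concordance(reference, genes).items():
    print(sorted(split), value)
```

With real data, gene trees usually lack some taxa, and only trees that are decisive for a branch enter the denominator, so values from this sketch should not be expected to match the deposited .gcf.tree files.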
ASTRAL Analyses - For 50p_filtered and 80p_filtered matrices, species trees were estimated under a multispecies coalescent model using weighted ASTRAL (wASTRAL). Input gene trees for wASTRAL were estimated using IQ-TREE 2 with 1000 replicates of ultrafast bootstrapping and treated as unrooted. We used the wASTRAL hybrid scheme to weight gene trees using both long terminal branches and weakly supported nodes. Internal ASTRAL branch lengths were estimated in coalescent units, with branch support measured as local posterior probability values.
References for all methods used above can be found in the main manuscript text.
Taxon Labels:
Identification labels for two specimens found in input matrices, output files, and output .tre files have been changed, as follows:
SDSU_G4110_Willisus_sp_nov = SDSU_G4110_Willisus_gertschi
SDSU_G1150_Cryphoecine_nov_gen = SDSU_G1150_ Cybaeina dixoni
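Users who want consistent labels across files can apply the label changes with a simple text substitution. The sketch below assumes that the left-hand side of each pair above is the label as it appears in a given file and the right-hand side is the label to use instead; swap the dictionary direction if the opposite holds. Only the first pair is included, since the exact spelling of the second replacement label should be checked against the files before use. The example tree string is invented.

```python
# Assumed direction: label found in a file -> label to use instead.
RENAMES = {
    "SDSU_G4110_Willisus_sp_nov": "SDSU_G4110_Willisus_gertschi",
}


def relabel(text: str, renames: dict[str, str]) -> str:
    """Replace every occurrence of each old label with its new label."""
    for old, new in renames.items():
        text = text.replace(old, new)
    return text


# Hypothetical Newick fragment containing one of the changed labels.
tree = "((SDSU_G4110_Willisus_sp_nov:0.1,other_tip:0.2):0.05,outgroup:0.3);"
print(relabel(tree, RENAMES))
```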