Systematics and biogeography of the Holarctic dragonfly genus Somatochlora (Anisoptera: Corduliidae)
Data files
Jan 17, 2025 version files 1.16 GB
- README.md (39.77 KB)
- Supplemental_Information.zip (1.16 GB)
Abstract
The Striped Emeralds (Somatochlora) are a Holarctic group of medium-sized metallic green dragonflies that mainly inhabit bogs and seepages, alpine streams, lakes, channels and lowland brooks. With 42 species they are the most diverse genus within Corduliidae (Odonata: Anisoptera). Systematic, taxonomic, and biogeographic resolution within Somatochlora remains unclear, with numerous hypotheses of relatedness based on wing veins, male claspers (epiproct and paraprocts), and nymphs. Furthermore, Somatochlora borisi was recently described as a new genus (Corduliochlora) based on 17 morphological characters, but its position with respect to Somatochlora is unclear. We present a phylogenetic reconstruction of Somatochlora using Anchored Hybrid Enrichment (AHE) sequences of 40/42 Somatochlora species (including C. borisi). Our data recover the monophyly of Somatochlora, with C. borisi recovered as sister to the remaining Somatochlora. We also recover three highly supported clades and one of mixed support; this lack of resolution is most likely due to incomplete lineage sorting, third codon position saturation based on iterative analyses run on variations of our dataset, and hybridization. Furthermore, we constructed a dataset for all species based on 20 morphological characters from the literature, which were used to evaluate phylogenetic groups recovered with molecular data; the data support the validity of Corduliochlora as a genus distinct from Somatochlora. Finally, divergence time estimation and biogeographic analysis indicate Somatochlora originated in the Western North Hemisphere during the Miocene, with three dispersal events to the Eastern North Hemisphere (11, 7, and 5 Ma respectively) across the Beringian Land Bridge.
README: Systematics and biogeography of the Holarctic dragonfly genus Somatochlora (Anisoptera: Corduliidae): Open Source Data
We have submitted:
Supplemental_Table_S1.csv: Specimen provenance data including locality, date, author, collector, determiner, and GenBank accession code of species used in Phylogeny
Supplemental_Table_S2.csv: Scored wing venational characters between S. brisaci, Somatochlora, Antipodochlora braueri, and Guadalca insularis taken from diagnostic characters of Somatochlora brisaci from Nel et al. 1996
1_Phylogenetic_Analysis: Supplemental files, figures, and code of phylogenetic analyses conducted within our manuscript
2_Fossil_Morphology_Scorings: Output files from different morphological scoring analyses conducted in order to infer the phylogenetic placement of S. brisaci and utility as a fossil calibration node within our time-divergence analysis
3_Time_Divergence_Analysis: Output from MCMCtree analysis placing S. brisaci fossil at different nodes throughout our tree to determine degree of divergence time variation
4_Biogeographic_Analysis: Outputs (txt, pdf) from BioGeoBEARS analysis of Somatochlora species
5_S_borisi_Scorings: Morphological scorings of 40 Somatochlora species using a subset of characters from Marinov and Seidenbusch 2007 to determine the validity of Corduliochlora
Data Formats
Excel/csv files:
- .csv: Utilized for organizing metadata of specimens used in phylogenetic sequencing, including taxonomy, author, locality, longitude, latitude, institution, and collection
Text Files:
- .txt: Contains output from parameter settings/output files from our post-hoc phylogenetic analyses, divergence analyses, biogeographic analyses (combined_branching_pattern_results.txt, mcmc.txt, Geography_File_Wallace.txt, and restable_AIC_rellike_formatted.txt)
Python/Shell/perl Scripts:
- .py/.sh/.pl: Custom-made scripts for changing reading frames, recoding nucleotide bases, or filtering loci within phylogenetic analyses (list.py, omit.sh)
Cluster Job Files:
- .job: SLURM job script submitted to the cluster; contains parameters of various phylogenetic and Bayesian analyses
Phylogenetic Files:
- .tre: Output phylogenetic trees generated from IQ-TREE; can be viewed using FigTree (Unaltered_ML_tree.tre)
- .newick: Output phylogenetic trees generated from IQ-TREE; can be viewed using FigTree (Somatochlora_Pruned.newick)
- .phy: Input alignment file of AHE sequences in PHYLIP format; can be opened using AliView or Mesquite
Time-Divergence Files:
- .ctl: Control files which define parameters for creating time-divergence analyses using the MCMCtree package in PAML (mcmctree.ctl)
- .BV: Pre-computed branch lengths and their associated variances for the specified tree topology in our time-divergence analysis (out.BV)
Additional Files:
- .out: Standard output file generated by jobs run on a SLURM-managed cluster. Provides inputted parameters for phylogenetic analyses and produced output files
- .nex: Nexus format data generated to store both molecular and morphological-data in text format
- .bionj/.gz/.iqtree/.log/.mldist/.parstree/.ufboot/.ckp/.mcmc/.parts/.p/.t/.probs/.tstat/.vstat/.txt/.emf: Various output files generated from Parsimony-based (TNT), Maximum Likelihood (IQ-TREE), and Bayesian (MrBayes) phylogenetic analyses
- .pdf: Portable Document Formats of output graphics and figures within our biogeographic analysis
Recommended Software for Data Analysis
Running or reproducing the analyses in 1_Phylogenetic_Analysis and 2_Fossil_Morphology_Scorings requires iqtree2, MrBayes, and TNT. NOTE: TNT is not compatible with the MACOSX user interface and requires the use of CYPRES or Microsoft Parallels
Running or reproducing the analyses in 3_Time_Divergence_Analysis requires the PAML command line package, using the MCMCtree command
Running or reproducing the analyses in 4_Biogeographic_Analysis requires the statistical program R and RStudio, using the BioGeoBEARS package
Usage, Compatibility, and Accessibility
This dataset encompasses a wide range of file formats because of the span of analyses conducted in the manuscript: phylogenetic, morphological, divergence-time, and biogeographic.
The majority of these files can be viewed using standard text-editing software such as BBEdit or free graphics software such as Inkscape (https://inkscape.org/). All programs, code, and data are publicly available and free to use.
Researchers are encouraged to use the original data formats, to repeat analyses using our data to check reliability and accuracy, and to email the corresponding author with any additional questions.
We are committed to providing support, clarity, and guidance to facilitate the effective use of these data, in the hope of encouraging curiosity and future analyses not only with our data but with phylogenomic datasets in general.
Descriptions
Supplemental_Table_S1.csv
- Sequencing Code: Sequencing code used for specimen upon extraction of DNA from the hind leg and subsequent submission to RapidGenomics
- Family: Taxonomic rank of specimen at the family level
- Genus: Taxonomic rank of specimen at the genus level
- Species: Taxonomic rank of specimen at the species level (species epithet)
- Author: Author who first described the species
- Country: Country where specimen was captured, NA indicates that information was not present on label
- State/Province/Region: State/Province/Region where specimen was captured, NA indicates that information was not present on label
- County/Parish/Prefecture: County/Parish/Prefecture where specimen was captured, NA indicates that information was not present on label
- Locality: Descriptive locality where specimen was captured, NA indicates that information was not present on label
- Latitude: Latitude coordinate where specimen was captured, NA indicates that information was not present on label
- Longitude: Longitude coordinate where specimen was captured, NA indicates that information was not present on label
- Altitude: Altitude where specimen was captured, NA indicates that information was not present on label
- Sex: Sex of specimen (Male/Female), NA indicates that information was not present on label
- Day: Calendar day of specimen capture (1 - 31), NA indicates that information was not present on label
- Month: Calendar month of specimen capture (Roman numeral notation, I - XII), NA indicates that information was not present on label
- Year: Calendar year of specimen origin (1939 - 2018), NA indicates that information was not present on label
- Collector: Collector of specimen, NA indicates that information was not present on label
- Det: Determiner of species identification, NA indicates that information was not present on label
- Institution: Institution where specimen is housed (FSCA, BYU, RMNH), NA indicates that information was not present on label
- Collection: Collection of specimen (If researcher donated specimens to museum, that specimen is part of that collection, D.R. Paulson Collection), NA indicates that information was not present on label
- Processed by: Processor of specimen into database of museum, NA indicates that information was not present on label
- Type: Specimen type status (Holotype, Paratype, Lectotype etc.), NA indicates that information was not present on label
- Notes: Additional notes on label for specimen which does not fit into aforementioned categories, NA indicates that information was not present on label
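The column conventions above (NA for missing label data, Roman numerals for months) can be handled at load time. The following is a minimal sketch, assuming the header names in the CSV match the field names listed in this README exactly; the two sample rows and their species names are hypothetical placeholders, not records from the table.

```python
import csv
import io

# Roman numeral month notation used in the Month column (I - XII).
ROMAN_MONTHS = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6,
                "VII": 7, "VIII": 8, "IX": 9, "X": 10, "XI": 11, "XII": 12}


def clean(value):
    """Return None for the 'NA' placeholder or an empty cell, else the stripped text."""
    value = value.strip()
    return None if value in ("NA", "") else value


def collection_date(row):
    """Build a (year, month, day) tuple; any part missing on the label stays None."""
    year, month, day = clean(row["Year"]), clean(row["Month"]), clean(row["Day"])
    return (int(year) if year else None,
            ROMAN_MONTHS.get(month.upper()) if month else None,
            int(day) if day else None)


def read_specimens(handle):
    """Read rows into dictionaries and attach a parsed 'date' entry to each."""
    return [dict(row, date=collection_date(row)) for row in csv.DictReader(handle)]


# Hypothetical sample standing in for Supplemental_Table_S1.csv.
SAMPLE = """Genus,Species,Day,Month,Year
Somatochlora,placeholder_a,14,VII,1987
Somatochlora,placeholder_b,NA,NA,NA
"""

specimens = read_specimens(io.StringIO(SAMPLE))
for specimen in specimens:
    print(specimen["Species"], specimen["date"])
```

To use it on the real table, pass an open file handle for Supplemental_Table_S1.csv to `read_specimens` instead of the in-memory sample.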
Supplemental_Table_S2.csv
- Diagnostic characters of S. brisaci from Nel et al. 1996: List of diagnostic characters of the fossil Somatochlora brisaci described by Nel et al. 1996 which separates the fossil Somatochlora brisaci from Antipodochlora braueri, and Guadalca insularis; all three species formed a polytomy within parsimony-based phylogenetic analyses of wing characters using MacClade
- Antipodochlora bauri: Diagnostic characters exhibited by the species Antipodochlora braueri
- Somatochlora: Diagnostic characters exhibited by the genus Somatochlora
- Guadalca insularis: Diagnostic characters exhibited by the species Guadalca insularis
1_Phylogenetic_Analysis
RY_Coding_Code: Code used to change reading frames of loci to minimize stop codons, change all base pairs to R's and Y's or change third codon position to R's and Y's
- frame.py: Code used to change reading frames of loci to minimize stop codons*
- RY_code.py: Code used to convert all nucleotides of alignments of loci to R's and Y's *
- third.py: Code used to convert third codon position of nucleotides of all loci to R's and Y's*
*Code was created using the assistance of ChatGPT (OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.openai.com/chat)
*NOTE: Code was custom-built with names and formats of data based on naming schemes from Breinholt et al. 2018. As such, adapting the code to other datasets might require alteration
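The three operations described for frame.py, RY_code.py, and third.py can be illustrated with a short sketch. This is not the code shipped in RY_Coding_Code; it assumes the standard nuclear stop codons (TAA, TAG, TGA), picks the first frame with the fewest stops, and leaves gaps and ambiguity codes unchanged, all of which are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
PURINES = set("AG")
PYRIMIDINES = set("CT")


def count_stops(seq, frame):
    """Count stop codons when reading complete codons from offset 0, 1, or 2."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return sum(codon in STOP_CODONS for codon in codons)


def best_frame(seq):
    """Return the reading-frame offset with the fewest stop codons (ties: lowest offset)."""
    return min(range(3), key=lambda frame: count_stops(seq, frame))


def ry_base(base):
    """Recode one base: purines to R, pyrimidines to Y, anything else unchanged."""
    upper = base.upper()
    if upper in PURINES:
        return "R"
    if upper in PYRIMIDINES:
        return "Y"
    return base  # gaps and ambiguity codes are kept as-is


def ry_code_all(seq):
    """Recode every position of the sequence to R/Y."""
    return "".join(ry_base(base) for base in seq)


def ry_code_third(seq, frame=0):
    """Recode only third codon positions, counted from the given frame offset."""
    return "".join(
        ry_base(base) if i >= frame and (i - frame) % 3 == 2 else base
        for i, base in enumerate(seq)
    )


print(best_frame("ATGTAACCG"))   # frame 0 contains the stop codon TAA
print(ry_code_all("ATGC-N"))
print(ry_code_third("ATGCCA"))
```

Recoding only third positions keeps first- and second-position signal intact while reducing the effect of saturation at the fastest-evolving codon position.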
S_georgiana_Loci_Pruned: Here you will find all of the maximum likelihood (IQTREE) and Coalescent (ASTRAL) and RY-coding phylogenetic analyses conducted after retaining sequences of S. georgiana from loci where gene trees recover it within Somatochlora with high bootstrap support. In total, we retained 37/92 loci for S. georgiana within this dataset
No_Coded_Astral_.tre: Coalescent-based (ASTRAL) tree of alignment using nucleotides without any RY-coding or reading frame changes to minimize stop codons
No_Coded_Bootstrap.tre: Maximum Likelihood (IQTREE) tree using nucleotides without any RY-coding or reading frame changes to minimize stop codons
No_Coded_concordance.tre: Maximum Likelihood (IQTREE) tree with gene concordance factors (gCF) and site concordance factors (sCF) without any RY-coding or reading frame changes to minimize stop codons
RY_all_Astral.tre: Coalescent-based (ASTRAL) tree of alignment after changing reading frame of all loci to minimize stop codons, and replacing all nucleotides to R's and Y's
RY_all_Bootstrap.tre: Maximum Likelihood (IQTREE) tree of alignment after changing reading frame of all loci to minimize stop codons, and replacing all nucleotides to R's and Y's
RY_all_concordance.tre: Maximum Likelihood (IQTREE) tree with gene concordance factors (gCF) and site concordance factors (sCF) after changing reading frame of all loci to minimize stop codons, and replacing all nucleotides to R's and Y's
RY_third_Astral.tre: Coalescent-based (ASTRAL) tree of alignment after changing reading frame of all loci to minimize stop codons, and replacing purine nucleotides to R's and pyrimidines to Y's at the third codon position
RY_third_Bootstrap.tre: Maximum Likelihood (IQTREE) tree of alignment after changing reading frame of all loci to minimize stop codons, and replacing purine nucleotides to R's and pyrimidines to Y's at the third codon position
RY_third_concordance.tre: Maximum Likelihood (IQTREE) tree with gene concordance factors (gCF) and site concordance factors (sCF) after changing reading frame of all loci to minimize stop codons, and replacing purine nucleotides to R's and pyrimidines to Y's at the third codon position
S_georgiana_trimming_files: Python and bash scripts used to identify loci whose gene trees recover Somatochlora georgiana as sister to another Somatochlora species, and to remove S. georgiana sequences from the aligned loci whose gene trees do not
- list.py: Custom-built python script which 1. Parses gene trees (.contree) using a list of loci names (locus_names.txt), 2. uses the name GEODE17841_Corduliidae_Somatochlora_georgiana from a txt file (Somatochlora.txt), 3. cycles through the gene trees to find which trees show S. georgiana being sister to Somatochlora species or not
- combined_branching_pattern_results.txt: Output file from list.py which shows the sister taxa of S. georgiana in the gene trees of all loci, as well as the bootstrap support values. The bottom of the file lists which loci have gene trees that recover S. georgiana as sister to a Somatochlora species and which do not
- omit.sh: bash script to cycle through loci in a file (GEODE17841_Corduliidae_Somatochlora_georgiana_loci.txt), and to remove sequences of S. georgiana from a series of loci within the directory
- singleline.pl: Perl script which converts multiline fasta to singleline (used in omit.sh), from Breinholt et al. (2018)
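The trimming step performed by omit.sh and singleline.pl (unwrap a multiline FASTA, then drop one taxon's sequence) can be sketched in Python as follows. This is an illustrative re-implementation rather than the shipped scripts; the second record name and both sequences are hypothetical, and only the S. georgiana header comes from this README.

```python
def read_fasta(lines):
    """Parse FASTA text lines, joining wrapped sequence lines into one string per record."""
    records, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def omit_taxon(records, taxon):
    """Return the records without the one whose header equals the taxon name."""
    return [(header, seq) for header, seq in records if header != taxon]


def to_single_line_fasta(records):
    """Write each record as a header line followed by one unwrapped sequence line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


TARGET = "GEODE17841_Corduliidae_Somatochlora_georgiana"

# Hypothetical wrapped alignment for one locus.
LOCUS = f""">{TARGET}
ATGC
ATGC
>hypothetical_other_taxon
ATGA
ATGA
"""

trimmed = omit_taxon(read_fasta(LOCUS.splitlines()), TARGET)
print(to_single_line_fasta(trimmed), end="")
```

Applied in a loop over the loci flagged in combined_branching_pattern_results.txt, this reproduces the general idea of removing S. georgiana only from loci whose gene trees place it outside Somatochlora.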
Unaltered_Analysis: Here you will find all the maximum likelihood (IQTREE) phylogenetic analyses conducted before retaining sequences of S. georgiana from loci where it is recovered within Somatochlora with high bootstrap support. Consider this the unaltered phylogenetic analysis
- Unaltered_ML_tree.tre: Maximum Likelihood (IQTREE) tree of alignment
- Blast_Files_S_georgiana: BLAST files (.out format) using the 'blastn' command of aligned extracted sequences of S. georgiana from our loci. We blasted 92 sequences -- L6_blast_georgiana.out: Output BLAST file for locus #6 of aligned extracted sequences of S. georgiana, the remaining files are in this same format
2_Fossil_Morphology_Scorings
1_Morphology_Scorings: Wing trait characters scored for representative taxa within our phylogenetic analysis, including S. brisaci
- Combined_Scorings.nex: Combined Wing trait characters scored from Ware 2008 and Nel 1996 for representative taxa within our phylogenetic analysis, including S. brisaci
- Nel_1996_Scorings.nex: Wing trait characters scored from Nel 1996 for representative taxa within our phylogenetic analysis, including S. brisaci
- Ware_2008_Scorings.nex: Wing trait characters scored from Ware 2008 for representative taxa within our phylogenetic analysis, including S. brisaci
2_Morphology_Scorings_Ancestral_Reconstruction: Parsimony-based ancestral state reconstructions of Wing trait characters scored for representative taxa within our phylogenetic analysis, excluding S. brisaci
- Combined_Scorings.nex: Parsimony-based ancestral state reconstructions of Wing trait characters scored for representative taxa within our phylogenetic analysis, excluding S. brisaci using both matrices from Ware 2008 and Nel 1996
- Nel_1996_Scorings.nex: Parsimony-based ancestral state reconstructions of Wing trait characters scored for representative taxa within our phylogenetic analysis, excluding S. brisaci, using the matrix from Nel 1996
- Ware_2008_Scorings.nex: Parsimony-based ancestral state reconstructions of Wing trait characters scored for representative taxa within our phylogenetic analysis, excluding S. brisaci, using the matrix from Ware 2008
- Tree_Renamed.newick: Maximum Likelihood (IQTREE) tree of representative taxa (Same as No_Coded_ML_Tree.tre, See 1_Phylogenetic_Analysis) in newick format.
3_Homologuous_Trait_Scorings: Maximum Likelihood (IQTREE), Bayesian (MrBayes), and Parsimony (TNT) based phylogenetic analyses of wing trait data which possessed homology within our Parsimony-based ancestral state reconstructions (See 2_Morphology_Scorings_Ancestral_Reconstruction); S. brisaci is included
ML_Analysis: Output from Maximum Likelihood (IQTREE) analysis of Homologous wing trait characters, including S. brisaci
- Homologue_Scorings_No_Geode_Names_ML.phy.uniqueseq.phy: The alignment file containing only unique sequences (character states), created if IQ-TREE detected duplicate sequences in the original input
- Homologue_Scorings_No_Geode_Names_ML.phy: Full character state alignment of all taxa, including duplicate sequences
- Homologue_Scorings_No_Geode_Names_ML.phy.bionj: Contains the BioNJ tree, which is a neighbor-joining tree used as the starting point for the maximum likelihood search
- Homologue_Scorings_No_Geode_Names_ML.phy.ckp.gz: A compressed checkpoint file that allows you to resume a stopped or interrupted analysis from the last saved state
- Homologue_Scorings_No_Geode_Names_ML.phy.iqtree: The model and summary statistics file, containing information about the best-fitting model, the likelihood score, and branch support values
- Homologue_Scorings_No_Geode_Names_ML.phy.log: The log file, which records the progress of the analysis, model selection, tree search, and other details. Useful for checking the steps IQ-TREE performed
- Homologue_Scorings_No_Geode_Names_ML.phy.mldist: A pairwise maximum likelihood distance matrix, showing evolutionary distances between sequences in the alignment
- Homologue_Scorings_No_Geode_Names_ML.phy.parstree: The parsimonious tree, inferred using a parsimony-based algorithm as an alternative method or starting tree
- Homologue_Scorings_No_Geode_Names_ML.phy.splits.nex: A splits network file in NEXUS format, used for visualizing conflicting signal in the alignment or supporting splits
- Homologue_Scorings_No_Geode_Names_ML.phy.tre: An additional tree file in Newick format, often the same as the treefile but generated for specific downstream compatibility (opened in FigTree)
- Homologue_Scorings_No_Geode_Names_ML.phy.treefile: The maximum likelihood (ML) tree, which is the main result of the analysis. This file contains the best-scoring tree inferred by IQ-TREE, including branch lengths and support values
- Homologue_Scorings_No_Geode_Names_ML.phy.ufboot: Contains the ultrafast bootstrap replicates. If you ran a bootstrap analysis, this file stores the replicates used to calculate branch support values (1000)
Mr_Bayes Analysis: Output from Bayesian (MrBayes) analysis of Homologous wing trait characters, including S. brisaci
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex: This is your NEXUS input file, which contains the data (e.g., character states) and commands to run the MrBayes analysis
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.ckp: A checkpoint file that stores the state of the MCMC run at the last saved point. If the run is interrupted, you can resume from this file instead of restarting the analysis
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.ckp~: A backup of the previous checkpoint file, useful in case the most recent checkpoint file becomes corrupted
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.con.tre: This file contains the consensus tree, typically a majority-rule tree summarizing the posterior distribution of sampled trees. It includes branch lengths and posterior probabilities
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.mcmc: A log of the MCMC run, including details about acceptance rates, likelihood values, and parameter sampling. Useful for diagnostics
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.parts: A file summarizing the partitions used in the analysis if you specified partitioned models (e.g., MK-model of discrete character state evolution)
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.run1.p: Sampled parameter values for the first run of the MCMC analysis. Used together with the second run to check convergence and summarize posterior distributions
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.run1.t: Sampled trees from the first run of the MCMC analysis. Used together with the second run to compute the posterior distribution of trees
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.run2.p: Sampled parameter values for the second run of the MCMC analysis. Used together with the first run to check convergence and summarize posterior distributions
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.run2.t: Sampled trees from the second run of the MCMC analysis. Used together with the first run to compute the posterior distribution of trees
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.trprobs: A summary of the posterior probabilities for the sampled trees. This file ranks trees based on their posterior probabilities and is helpful for evaluating the posterior tree space
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.tstat: Tree statistics file, summarizing statistics like average branch lengths or tree likelihoods from the posterior sample of trees
- Homologue_Scorings_No_Geode_Names_Mr_Bayes.nex.vstat: Variance statistics file, summarizing the variance in the posterior distribution for the parameters and branch lengths
TNT_Analysis: Output from Parsimony (TNT) analysis of Homologous wing trait characters, including S. brisaci
- Homologue_Scorings_No_Geode_Names_TNT.tnt: Input file for TNT generated from Mesquite, containing the character matrix for phylogenetic analysis, as well as commands or scripts to specify analysis settings (e.g., search algorithms, constraints, and output options)
- TNT_Output.emf: Graphical output file in Enhanced Metafile Format (EMF) containing a graphical representation of the TNT results (e.g., cladograms or consensus trees); can be opened in Adobe Photoshop or Inkscape.
3_Time_Divergence_Analysis
Antipodochlora: Output from MCMCtree analysis placing the S. brisaci fossil at the node of Antipodochlora
approx01: First run of the MCMCtree analysis placing the S. brisaci fossil at the node of Antipodochlora (using the Hessian matrix of branch lengths)
- Antipodochlora_divergence.tre: Dated output tree with the S. brisaci fossil placed at the node of Antipodochlora, with estimated divergence times for nodes (posterior means or medians of node ages) and branch lengths scaled according to these times
- Antipodochlora.newick: Input tree with the S. brisaci fossil placed at the node of Antipodochlora; the tree is ultrametric and nodes are labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65488987.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
approx02: Second run of the MCMCtree analysis placing the S. brisaci fossil at the node of Antipodochlora (using the Hessian matrix of branch lengths)
- Antipodochlora_divergence.tre: Dated output tree with the S. brisaci fossil placed at the node of Antipodochlora, with estimated divergence times for nodes (posterior means or medians of node ages) and branch lengths scaled according to these times
- Antipodochlora.newick: Input tree with the S. brisaci fossil placed at the node of Antipodochlora; the tree is ultrametric and nodes are labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65490015.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
Guadalca: Output from MCMCtree analysis placing the S. brisaci fossil at the node of Guadalca insularis
approx01: First run of the MCMCtree analysis placing the S. brisaci fossil at the node of Guadalca (using the Hessian matrix of branch lengths)
- Guadalca_divergence.tre: Dated output tree with the S. brisaci fossil placed at the node of Guadalca, with estimated divergence times for nodes (posterior means or medians of node ages) and branch lengths scaled according to these times
- Guadalca.newick: Input tree with the S. brisaci fossil placed at the node of Guadalca; the tree is ultrametric and nodes are labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65488995.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
- slurm-65490016.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
approx02: Second run of the MCMCtree analysis placing the S. brisaci fossil at the node of Guadalca (using the Hessian matrix of branch lengths)
- Guadalca_divergence.tre: Dated output tree with the S. brisaci fossil placed at the node of Guadalca, with estimated divergence times for nodes (posterior means or medians of node ages) and branch lengths scaled according to these times
- Guadalca.newick: Input tree with the S. brisaci fossil placed at the node of Guadalca; the tree is ultrametric and nodes are labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65683648.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
No_Fossil: Output from MCMCtree analysis not including S. brisaci as a fossil calibration point (using the Hessian matrix of branch lengths)
approx01: First run of the MCMCtree analysis not including S. brisaci as a fossil calibration point (using the Hessian matrix of branch lengths)
- No_Fossil_divergence.tre: Dated output tree not including S. brisaci as a fossil calibration point, with estimated divergence times for nodes (posterior means or medians of node ages) and branch lengths scaled according to these times
- No_Fossil.newick: Input tree not including S. brisaci as a fossil calibration point; the tree is ultrametric and nodes are labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65488985.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
- slurm-65488989.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
approx02: Second run of the MCMCtree analysis not including S. brisaci as a fossil calibration point (using Hessian matrix of branch lengths)
- No_Fossil_divergence.tre: Dated tree output not including S. brisaci as a fossil calibration point, with estimated divergence times for nodes, posterior means or medians of node ages, with branch lengths scaled according to these times
- No_Fossil.newick: Input tree not including S. brisaci as a fossil calibration point, tree is ultrametric where nodes are labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65490017.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
- slurm-65683647.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
Somatochlora: Output from the MCMCtree analysis placing the S. brisaci fossil on the node of Somatochlora
approx01: First run of the MCMCtree analysis placing the S. brisaci fossil on the node of Somatochlora (using Hessian matrix of branch lengths)
- Somatochlora_divergence.tre: Dated tree output placing the S. brisaci fossil on the node of Somatochlora, with estimated divergence times for nodes, posterior means or medians of node ages, and branch lengths scaled according to these times
- Somatochlora.newick: Input tree placing the S. brisaci fossil on the node of Somatochlora; the tree is ultrametric, with nodes labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65488997.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
- slurm-65490018.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
approx02: Second run of the MCMCtree analysis placing the S. brisaci fossil on the node of Somatochlora (using Hessian matrix of branch lengths)
- Somatochlora_divergence.tre: Dated tree output placing the S. brisaci fossil on the node of Somatochlora, with estimated divergence times for nodes, posterior means or medians of node ages, and branch lengths scaled according to these times
- Somatochlora.newick: Input tree placing the S. brisaci fossil on the node of Somatochlora; the tree is ultrametric, with nodes labeled with fossil ages
- FcC_smatrix_N.phy: Input alignment file of AHE sequences used for ML tree (No_Coded_Bootstrap.tre)
- in.BV: Pre-computed branch lengths and their associated variances for the specified tree topology. This information is derived from the maximum likelihood estimation of branch lengths and rate parameters based on the sequence data
- mcmc.txt: Raw MCMC samples from the posterior distribution
- mcmctree.ctl: Control file for the MCMCTree analysis, containing the tree topology or starting tree, MCMC parameters (e.g., burn-in, chain length, sample frequency), and priors for divergence times, clock models, and rates
- mcmctree.job: SLURM job script submitted to the cluster, Contains: Job configuration (e.g., wall time, memory, CPUs), and the command to run MCMCTree.
- out: main output log summarizing the MCMC run, includes details on priors, posterior estimates, acceptance rates, and summary statistics (means, medians, 95% HPDs for parameters like node ages)
- SeedUsed: A log of the random seed used for the analysis
- slurm-65683645.out: Standard output logs from SLURM, the job scheduler, contains messages from the MCMCTree run including command-line arguments, progress information (e.g., burn-in completion, sampling progress), and potential errors or warnings
4_Biogeographic_Analysis:
- Geography_File_Wallace.txt: Geography file of scored locality data of Somatochlora species for biogeographic analysis (A = Oriental, B = Palearctic, C = Nearctic)
- restable_AIC_rellike_formatted.txt: Formatted table of AIC values from biogeographic analysis using DIVA, DEC, and BAYAREA like models, with jump parameter (+j)
- restable_AICc_rellike_formatted.txt: Formatted table of AICc values from biogeographic analysis using DIVA, DEC, and BAYAREA like models, with jump parameter (+j)
- teststable.txt: Formatted table of p-values between constrained and unconstrained (+j) models in biogeographic analysis
- Somatochlora_BAYAREALIKE_vs_BAYAREALIKE+J_M0_unconstrained_v1.pdf: PDF of BAYAREALIKE and BAYAREALIKE + J reconstructions, shown as pie charts and as consensus reconstructions
- Somatochlora_DEC_vs_DEC+J_M0_unconstrained_v1.pdf: PDF of DEC and DEC + J reconstructions, shown as pie charts and as consensus reconstructions
- Somatochlora_DIVALIKE_vs_DIVALIKE+J_M0_unconstrained_v1.pdf: PDF of DIVALIKE and DIVALIKE + J reconstructions, shown as pie charts and as consensus reconstructions
- Somatochlora_pruned.newick: Time-calibrated tree of Somatochlora species, with outgroups pruned (the tree still retains the mean and HPD of divergence times of taxa)
5_S_borisi_Scorings:
- Marinov_Scorings.nex: Parsimony-based ancestral reconstructions of morphological scorings of 40 Somatochlora species using a subset of characters from Marinov and Seidenbusch 2007 to determine the validity of Corduliochlora
- Somatochlora_ML_tree.newick: Maximum Likelihood tree of Somatochlora species pruned from larger dataset in newick format
Methods
Taxon Sampling
We acquired specimens of Somatochlora from natural history collections. Specimens were sampled from collections in the American Museum of Natural History (AMNH), Florida State Collection of Arthropods (FSCA), Naturalis Biodiversity Center (RMNH), National Museum of Natural History (NMNH), and Monte L. Bean Life Science Museum at Brigham Young University (BYU). In total, we sampled 40 of the 42 current species of Somatochlora. We selected a large number of outgroups to establish a robust phylogenetic placement for Somatochlora. We sampled outgroups from Corduliidae (Hemicordulia, Procordulia, Guadalca, Paracordulia, Neurocordulia, Navicordulia, Helocordulia, Cordulia, Epitheca, Metaphya, Pentathemis, Aeschnosoma, and Antipodochlora), other established families within the superfamily Libelluloidea (Libellulidae: Pantala, Libellula, Orthetrum; Macromiidae: Macromia, Epophthalmia; Synthemistidae: Eusynthemis, Choristhemis, Gomphomacromia), and families within Cavilabiata (Chlorogomphidae: Chlorogomphus; Cordulegastridae: Cordulegaster, Anotogaster; Neopetaliidae: Neopetalia punctata). Specimen provenance data, including locality, date, author, collector, and determiner, are listed in Supplemental Table S1.
DNA Extraction and Sequencing:
We removed the hind leg from individual specimens of each species using sterilized forceps, and extracted DNA using ZYMOBIOMICS DNA miniprep kits (Irvine, CA). We quantified DNA yield using a Qubit 4 fluorometer, and sent DNA extracts to RAPID Genomics (Gainesville, Florida) for library preparation and sequencing. Loci were amplified using Anchored Hybrid Enrichment (AHE) probes modified from Bybee et al. (2021), consisting of 1,306 loci (Goodman et al. 2023). Probe sets were originally created by scanning for 941 exons commonly shared across insects using published data from 24 odonate transcriptomes (Futahashi et al. 2015, Suvorov et al. 2017) as well as two assembled genomes from Bybee et al. (2021). An additional 211 functional loci were sequenced, focusing on vision, flight, and immunity (Bybee et al. 2021, Goodman et al. 2023). We sequenced loci of representatives of each genus using the full 1,306 probe set (500 kb), while a subset of 92 loci (20 kb) was sequenced for the remaining species. Raw AHE reads can be obtained from the Dryad digital repository (https://datadryad.org/stash/share/nII28qPnpxLswmsOQC0Nxj6JP_HXeF1DmaI0H8vgZlU), while locus coverage for each species can be obtained from Supplemental Table S1.
AHE Assembly and Analysis:
We trimmed adaptors from raw reads using fastp (Tang and Wong 2001) and checked read quality using MultiQC (Ewels et al. 2016). We followed the methods outlined in Breinholt et al. (2018) to assemble and assign orthology to each target capture locus, with a few modifications. In brief, we assembled each locus individually using iterative baited assembly with SPAdes (Prjibelski et al. 2020) and reference loci from the chromosome-length genome assembly of Tanypteryx hageni (Petaluridae) (Tolman et al. 2023b). We then screened each locus for orthology, first by ensuring that the locus did not have BLAST hits to multiple places in the genome, and second by ensuring reciprocal best hits between the reference and the query sequence. We performed subsequent analyses using assemblies of the probe region of our loci only: preliminary analyses using probe + flanking regions recovered increased noise and reduced phylogenetic support when a low number of loci was recovered from any taxon, and the flanking regions were highly variable in alignment.
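The two-step orthology screen described above can be illustrated with a minimal Python sketch. This is not the pipeline of Breinholt et al. (2018); the function name, the input dictionaries, and the toy locus and contig identifiers are all hypothetical, and the BLAST searches themselves are assumed to have been run beforehand.

```python
def screen_orthologs(genome_hit_counts, forward_best, reverse_best):
    """Return loci that pass both checks: a single genomic hit and a reciprocal best hit.

    genome_hit_counts: locus -> number of distinct genomic regions hit (hypothetical input)
    forward_best:      reference locus -> best-hit assembled sequence
    reverse_best:      assembled sequence -> best-hit reference locus
    """
    kept = []
    for locus, query in forward_best.items():
        single_copy = genome_hit_counts.get(locus, 0) == 1
        reciprocal = reverse_best.get(query) == locus
        if single_copy and reciprocal:
            kept.append(locus)
    return sorted(kept)


# Toy, made-up inputs: L2 hits several genomic regions, L3 fails the reciprocal check.
genome_hit_counts = {"L1": 1, "L2": 3, "L3": 1}
forward_best = {"L1": "contig_a", "L2": "contig_b", "L3": "contig_c"}
reverse_best = {"contig_a": "L1", "contig_b": "L2", "contig_c": "L9"}
print(screen_orthologs(genome_hit_counts, forward_best, reverse_best))
```

In this toy example only L1 survives the screen.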
Phylogenetic Analysis:
We generated multiple sequence alignments for each locus using the ‘MAFFT-linsi’ algorithm in MAFFT v.7.475 (Katoh and Standley 2013), and trimmed alignments with a 0.75 threshold cutoff using trimAl v1.2 (Capella-Gutiérrez et al. 2009). We concatenated the alignments using FASconCAT v1.11 (Kück and Meusemann 2010), and generated an initial partitioning scheme using relaxed clustering with the model fixed to GTR + G for each subset in IQtree v2.1.3 (Minh et al. 2020b). We then selected the best nucleotide substitution model for each subset in the partitioning scheme using ModelFinder and estimated a maximum likelihood (ML) tree. We estimated branch support using SH-like approximate likelihood ratio tests (SH-aLRT) and 1,000 ultrafast bootstrap replicates (UFBoot) in IQtree v2.1.3 (Guindon et al. 2010, Kalyaanamoorthy et al. 2017). To assess the degree of incomplete lineage sorting (ILS), we first reconstructed ML trees for each locus with 1,000 ultrafast bootstrap replicates and then performed a coalescent-based species tree estimation in ASTRAL2 v5.6.1 using local posterior probabilities (LPP) (Mirarab and Warnow 2015). As an additional metric to assess the degree of ILS as well as introgression (hybridization), we calculated gene concordance factors (gCF) and site concordance factors (sCF) using our concatenated ML tree and our locus trees (Minh et al. 2020a); gCF and sCF give the proportion of genes and informative sites, respectively, that support the bipartition (split) defined by each branch of our ML tree (Minh et al. 2020a). We consider nodes highly supported when bootstrap (BS) and SH-aLRT values are > 90 and LPP values are > 0.90, and we consider gene and site concordance (gCF and sCF, respectively) high when values are > 0.7. We rooted the tree using Neopetaliidae.
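As a small illustration of the support thresholds stated above, the following Python sketch classifies a node from its support values. The function names are hypothetical, and gCF and sCF are assumed here to be expressed as proportions between 0 and 1 (some software reports them as percentages, in which case they would need rescaling first).

```python
def is_highly_supported(ufboot, sh_alrt, lpp):
    """True when bootstrap and SH-aLRT are > 90 and LPP is > 0.90."""
    return ufboot > 90 and sh_alrt > 90 and lpp > 0.90


def is_highly_concordant(gcf, scf, cutoff=0.7):
    """True when both concordance factors (as proportions) exceed the cutoff."""
    return gcf > cutoff and scf > cutoff


# Support values of the kind reported in this README for S. georgiana,
# before and after locus pruning.
print(is_highly_supported(ufboot=100, sh_alrt=100, lpp=0.39))
print(is_highly_supported(ufboot=100, sh_alrt=100, lpp=1.0))
```

The first call fails on the LPP criterion alone, which is the "mixed support" pattern described for the preliminary placement of S. georgiana.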
Post-Hoc Modifications of Phylogeny
Preliminary phylogenetic analyses recovered S. georgiana (Coppery Emerald) as sister to Libellulidae with mixed support (SH-aLRT: 100, UFBoot: 100, LPP: 0.39). We checked for contaminants among the aligned loci sequenced for S. georgiana by querying them against the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) database using the blastn command (Chen et al. 2015). The top blast hits (by percent identity) for our loci belonged to annotated assemblies of Odonata, including Ischnura elegans (Zygoptera) (Price et al. 2022) and Pantala flavescens (Anisoptera) (Liu et al. 2022), as well as other arthropod genomes including Megalopta genalis (Hymenoptera: Halictidae), Zootermopsis nevadensis (Blattodea: Archotermopsidae), Cryptotermes secundus (Blattodea: Kalotermitidae), and Neodiprion fabricii (Hymenoptera: Tenthredinoidea) (Terrapon et al. 2014, Jones et al. 2015, Kapheim et al. 2020, Lin et al. 2021, Herrig et al. 2023). Furthermore, 24 of our loci did not recover any blast hits, most likely because the locus fragment motifs are unique among Anisoptera (Bybee et al. 2021, Goodman et al. 2023) and lack arthropod homologues within GenBank. However, S. hineana was recovered as the top blast hit for two legacy genes (Cytochrome Oxidase 1: Locus 2001, and Cytochrome b: Locus 2000), suggesting that the genes are still usable to place S. georgiana within the genus. To utilize S. georgiana within our analysis, we first reconstructed ML trees for each locus with 1,000 ultrafast bootstrap replicates. Using a custom-built python script, we (1) rooted each gene tree with Neopetaliidae, (2) determined the taxon most closely related to S. georgiana using branch length as a proxy, (3) calculated the bootstrap support value of S. georgiana + sister taxon, and (4) retained the sequences of S. georgiana from loci where it is sister to a Somatochlora species with high bootstrap support. We then (5) realigned, trimmed, and concatenated our revised locus list as outlined in our methods and (6) reran our ML phylogenetic tree in IQtree. We verified the results of our code through visual inspection of our gene trees. In total, we retained 37/92 loci for S. georgiana, recovering it within Somatochlora with high support across metrics (SH-aLRT: 100, UFBoot: 100, LPP: 1.0). As this method of locus pruning is new and heavily dependent on the correct taxonomy of species, we verified our specimen of S. georgiana using dichotomous keys from Garrison et al. (2006) and Walker (1925). All subsequent analyses utilize this new pruned-loci dataset, which herein will be referred to as our ‘no-coded’ dataset.
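The retention step of the locus-pruning procedure can be sketched in Python as follows. This is not the authors' script (which is provided in the Supplemental Information); it assumes that the sister taxon and its bootstrap support have already been extracted from each rooted gene tree, that tip labels begin with the genus name followed by an underscore, and that "high" support means a bootstrap value above 90. The record type, function name, and toy records are hypothetical.

```python
from typing import NamedTuple


class SisterRecord(NamedTuple):
    locus: str          # locus identifier
    sister_taxon: str   # closest taxon to S. georgiana in that gene tree
    bootstrap: float    # support for the (S. georgiana + sister) grouping


def retain_loci(records, genus="Somatochlora", min_bootstrap=90.0):
    """Keep loci where the sister taxon belongs to `genus` with support above the cutoff."""
    return [
        r.locus
        for r in records
        if r.sister_taxon.startswith(genus + "_") and r.bootstrap > min_bootstrap
    ]


# Toy, made-up records for three loci.
records = [
    SisterRecord("L0001", "Somatochlora_hineana", 98.0),
    SisterRecord("L0002", "Pantala_flavescens", 100.0),   # wrong genus: dropped
    SisterRecord("L0003", "Somatochlora_arctica", 62.0),  # weak support: dropped
]
print(retain_loci(records))
```

Only the sequences of S. georgiana at the retained loci would then be carried forward to realignment, trimming, and concatenation.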
Preliminary analyses also recovered discrepancies in the topology of Somatochlora between our no-coded ML and ASTRAL trees, suggesting ILS and hybridization, as also indicated by the low gCF and sCF scores across Somatochlora (Fig. 1). However, we also hypothesize that these differences result from 3rd codon site saturation, where the signal-to-noise ratio is skewed, reducing resolution (Yang 1996, Simmons et al. 2006, Parvathy et al. 2022). Codon usage bias has been commonly observed in high-throughput molecular datasets as well as insect genomes (Behura and Severson 2012, 2013, Breinholt and Kawahara 2013, Sharma and Uddin 2014, Galtier et al. 2018). To test for 3rd codon site saturation, we conducted two analyses using another custom-built python script. We first set the reading frame of the no-coded alignments of all our loci to reduce the number of stop codons. Next, we (1) recoded third codon positions as ‘R’ for purine nucleotides (A and G) and ‘Y’ for pyrimidines (C and T) across all our loci (herein referred to as the ‘RY-third’ analysis), and (2) recoded all nucleotides as R’s or Y’s for all of our loci (herein referred to as the ‘RY-all’ analysis). For each of these two datasets, we (3) realigned, trimmed, and concatenated our loci as outlined in our methods and (4) reconstructed new ML and ASTRAL trees. RY-coding of nucleotides is common practice to reduce noise in data, as it retains the signal from transversions while masking transitions (Woese et al. 1991, Phillips et al. 2004, Harshman et al. 2008, Kück and Meusemann 2010, White et al. 2011, Chen et al. 2014, Timmermans et al. 2016, Simmons 2017). Custom-built python and bash scripts were created with the assistance of ChatGPT (OpenAI 2023) and are provided in the Supplemental Information.
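The two recoding schemes can be sketched in a few lines of Python. This is an illustrative sketch rather than the script in the Supplemental Information: the function names are hypothetical, each sequence is assumed to be in frame with its first character at codon position 1, and gaps, N, and ambiguity codes are left unchanged.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def ry_code(base):
    """Map a purine to 'R' and a pyrimidine to 'Y'; leave anything else untouched."""
    upper = base.upper()
    if upper in PURINES:
        return "R"
    if upper in PYRIMIDINES:
        return "Y"
    return base


def ry_third(seq):
    """'RY-third': recode only third codon positions (0-based indices 2, 5, 8, ...)."""
    return "".join(ry_code(b) if i % 3 == 2 else b for i, b in enumerate(seq))


def ry_all(seq):
    """'RY-all': recode every nucleotide."""
    return "".join(ry_code(b) for b in seq)


example = "ATGGCC-TA"  # made-up in-frame fragment
print(ry_third(example))
print(ry_all(example))
```

After recoding, a transition at a recoded site (A to G, or C to T) no longer appears as a difference between sequences, so only transversions contribute signal at those sites.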
Fossil Selection and Time Divergence Analysis:
Taxonomy of odonate fossils relies predominantly on wing characters due to their high preservation potential and plethora of venational traits (Fraser and Tillyard 1957). However, most researchers acknowledge that wing venation is highly prone to convergence and should be used in conjunction with other traits if available (Fraser and Tillyard 1957, Gloyd 1959, Hennig 1981, Fleck et al. 2008). Amber fossils of adult and nymphal Odonata are rare in the fossil record (Wighton and Wilson 1986, Bechly 1996, Karr and Clapham 2015, Schädel and Bechly 2016, Zheng and Jarzembowski 2020, Boudet et al. 2023; see Table 1 in Schaedel et al. (2020); paleodb.com), limiting analyses pertaining to accessory genitalic, thoracic, penile, or nymphal traits. Kohli et al. (2016) published a list of vetted fossil calibrations for Odonata as part of the Fossil Calibration Database (fossilcalibrations.org), providing recommendations for fossil selection covering the breadth of taxa in our phylogeny. We chose fossil calibrations for the crown nodes of Cavilabiata, Macromiidae, Corduliidae, Libellulidae, and Corduliidae + Libellulidae. Phylogenetic and age justifications for divergence time estimation of our fossils are outlined in Kohli et al. (2016) and Kohli et al. (2021b) (Table 1).
Fossil Validation
We surveyed two additional tentative Somatochlora fossils as calibration points, extending our sampling beyond Kohli et al. (2021b). We used the five principles outlined by Parham et al. (2012) and Ksepka et al. (2015) as best calibration practices. In brief, the five criteria are as follows: (1) fossil accession number for fossil and referrals, (2) apomorphy-based or phylogenetic analysis, (3) reconciliation of morphological and molecular data, (4) locality and stratigraphic data for fossil taxa, and (5) radioisotopic age or numeric age references for fossils.
Two putative fossils of Somatochlora exist, the older being S. oregonica Cockerell, 1927 from Central Oregon, which is estimated to be Oligocene in origin (33.9 – 28.4 Ma). However, we are skeptical of this identification since the fossil is only of the upper half of either the forewing or hindwing, from the nodus to the pterostigma, possessing the first radial vein (R1), and the first and second medial veins (M1 & M2) (Needham et al. 2000). Within his diagnosis, Cockerell (1927) draws similarities of the fossil to other species of Somatochlora by the first two postnodal cross veins distal to the pterostigma being obliquely angled, while the subnodal veins are perpendicular. Cockerell (1927) also states that doubling of cells occurs at the 7th cross vein between R1 and M2. However, these traits are quite variable among North American Somatochlora (Walker 1925, Needham 1930, Walker and Corbet 1975, Needham et al. 2000, Garrison et al. 2006). Cockerell (1927) also states that the first subnodal cell is very long, roughly equal to three postnodal cells. Although this is the case with some Somatochlora species, other North American corduliid genera also possess this trait, including Dorocordulia, Epitheca, Helocordulia, and Neurocordulia (Needham et al. 2000, Garrison et al. 2006). Finally, Cockerell (1927) states that the second cell below the pterostigma possesses an oblique cross vein, which can be found in S. arctica. However, no Somatochlora species, including S. arctica, possesses a second cross vein below the pterostigma. Overall, we exclude this fossil from calibration, as the traits mentioned do not provide strong enough synapomorphies to place the fossil in Somatochlora.
The second fossil, Somatochlora brisaci (Nel et al. 1996), is significantly younger, with a Miocene origin (8.7 – 5.3 Ma), and was discovered in a deposit from Southeastern France. The fossil is a near-complete hindwing except for a few posterior regions of the wing margin missing near the third and fourth medial veins, the cubital (C) and anal (A) veins. Within the description, Nel et al. (1996) performed a parsimony analysis on the fossil, comparing it to other corduliid, synthemistid, and macromiid genera. The authors conclude that S. brisaci does seem to be related to Somatochlora, but forms an unresolved trichotomy with the genera Antipodochlora and Guadalca. Somatochlora brisaci was temporarily attributed to Somatochlora because the arculus is midway between the first two antenodal crossveins, but the authors acknowledge several key traits which separate it from recent species, the most noticeable being the triangle possessing four cells (see Supplemental Table S2 and Appendix I).
Using our no-coded ML phylogeny, we performed several additional morphological analyses to verify the fossil placement of S. brisaci. Using two independent lists of wing trait characters from Nel et al. (1996) and Ware (2008), we scored the wings of all extant species within our phylogeny. We traced each trait onto our ML phylogeny in Mesquite v3.81 (Maddison and Maddison 2007) using a parsimony-based reconstruction of character history, retaining characters with high phylogenetic signal within families and genera (Retention Index > 0.70) and excluding those that exhibited homoplasy (Farris 1989, Kälersjö et al. 1999). After removing homoplasious characters, we combined the revised Ware et al. and Nel et al. datasets; where the two datasets had similar characters, such redundancies were removed. We then scored the wing traits of S. brisaci using this morphological trait set and performed phylogenetic analyses using parsimony in TNT v1.6 (Goloboff et al. 2008), Bayesian Inference (BI) in MrBayes v3.2 (Huelsenbeck and Ronquist 2001), and Maximum Likelihood (ML) in IQtree v2.1.3 (see Supplemental Information). We applied an Mk model of discrete character evolution for our BI and ML analyses (Lewis 2001).
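For readers unfamiliar with the retention index used as the filtering criterion above, the following Python sketch computes it from step counts using the standard definition RI = (g - s) / (g - m) (Farris 1989), where m is the minimum possible number of steps for a character, s the number of steps observed on the tree, and g the maximum possible number of steps. The function name and the example step counts are made up; in the analysis itself these values were obtained in Mesquite.

```python
def retention_index(min_steps, observed_steps, max_steps):
    """Retention index RI = (g - s) / (g - m); None when g == m (uninformative character)."""
    if max_steps == min_steps:
        return None
    return (max_steps - observed_steps) / (max_steps - min_steps)


# Made-up binary character: m = 1, observed s = 2 steps on the tree, g = 5.
ri = retention_index(min_steps=1, observed_steps=2, max_steps=5)
print(ri, ri > 0.70)
```

A character that changes only the minimum number of times has RI = 1, while a fully homoplasious character has RI = 0.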
Since fossil choice and placement can drastically affect the outcome of divergence time analysis (Kohli et al. 2021b), we ran divergence time estimation under four different scenarios. In the first scenario, designated ‘No Fossil’, we exclude S. brisaci from our analysis, for a total of six remaining fossil calibrations. In the second scenario, designated ‘Somatochlora node’, we place S. brisaci on the node of Somatochlora. In the third scenario, designated ‘Guadalca node’, we place S. brisaci on the node of Guadalca. In the fourth scenario, designated ‘Antipodochlora node’, we place S. brisaci on the node of Antipodochlora (Table 1).
All divergence time analyses were conducted on the nucleotide dataset in MCMCtree as implemented in the software package PAML v.4.7a (Yang 2007) using an ultrametric version (all tips equidistant from the root) of our no-coded ML tree. We used our full unpartitioned dataset due to computational limits, since our dataset consists of over 1000 loci. Fossil calibrations were set using uniform prior distributions with hard upper and lower bounds (Table 1). Our root maximum age was set at 158.1 million years, based on the earliest fossil within Cavilabiata (Juralibellula ningchengensis) (Huang and Nel 2007). We used default parameters for defining prior distributions and used the General Time Reversible (GTR) nucleotide substitution model for calculating the Hessian matrix for our dataset. For each scenario, we performed two independent MCMC runs with 500,000 iterations, sampling every 100 trees with a 2000 tree burn-in, and checked for convergence using Tracer v. 1.6 (Drummond and Rambaut 2007). Finally, we examined prior distributions of each run to ensure reasonable fossil choices and placement on the tree (Warnock et al. 2012). Divergence time estimates and outputs from all four scenarios are provided in our Supplemental Information.
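Alongside Tracer, a quick agreement check between the two independent runs (the approx01 and approx02 folders) can be sketched in Python by comparing posterior means of each sampled parameter. This sketch assumes that mcmc.txt is a whitespace-delimited table with a header row whose first column is named Gen; that layout, the function names, and the toy samples below are assumptions to be checked against the actual files.

```python
def column_means(mcmc_text):
    """Posterior mean of every column except the generation counter."""
    lines = [ln for ln in mcmc_text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = [[float(x) for x in ln.split()] for ln in lines[1:]]
    return {
        name: sum(row[i] for row in rows) / len(rows)
        for i, name in enumerate(header)
        if name != "Gen"
    }


def max_relative_difference(means_a, means_b):
    """Largest relative gap between two runs, over parameters present in both."""
    shared = [k for k in means_a if k in means_b and means_a[k] != 0]
    return max(abs(means_a[k] - means_b[k]) / abs(means_a[k]) for k in shared)


# Toy, made-up samples standing in for approx01/mcmc.txt and approx02/mcmc.txt.
run1 = "Gen\tt_n2\tt_n3\n0\t10.0\t4.0\n100\t12.0\t6.0\n"
run2 = "Gen\tt_n2\tt_n3\n0\t11.0\t5.5\n100\t11.0\t5.5\n"
print(max_relative_difference(column_means(run1), column_means(run2)))
```

With real output, the two strings would be replaced by the contents of the two mcmc.txt files, and a small maximum difference would be consistent with (though not proof of) convergence.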
Biogeographic Analysis:
We estimated the ancestral range of Somatochlora, excluding outgroups, with the maximum likelihood R package BioGeoBEARS v1.1.2 (Matzke 2013) using our no-coded Bayesian-estimated time-calibrated phylogeny from MCMCtree. We chose BioGeoBEARS due to its customization of dispersal rates, time stratification events, and comparison of different likelihood and Bayesian dispersal models. Furthermore, BioGeoBEARS incorporates a founder-event speciation parameter (J), which allows for the possibility that a new population could colonize a new area via a ‘jumping dispersal event’ (Matzke 2013). We conducted the analysis three ways: (1) using the Dispersal Extinction Cladogenesis (DEC, DEC + j) model (Ree and Smith 2008), (2) using a likelihood implementation of the Dispersal-vicariance analysis (DIVA, DIVA + j) (Ronquist 1997), and (3) using a Bayesian-like implementation of area estimation (BayArea-like, BayArea-like + j) (Matzke 2013). Although Ree and Sanmartín (2018) highlight conceptual and statistical issues with the DEC + j model, recent work has supported the validity of +j models in AICc comparisons (Matzke 2022). Wallace’s biogeographic regions are commonly used when estimating ancestral ranges for insects (Lohman et al. 2011, Toussaint et al. 2019a, Toussaint et al. 2019b, Toussaint et al. 2021a, Toussaint et al. 2021b, Tseng et al. 2022, Kawahara et al. 2023), but delineations among biogeographic regions may vary across studies for many reasons, such as variable dispersal among taxa and the fact that many insect taxa have evolutionary histories that predate modern continents (Olson et al. 2001, Holt et al. 2013). As such, we chose modified Wallacean biogeographic regions which are not only hypothesized to be the broad geographic ranges of Somatochlora (Walker 1925, Walker and Corbet 1975, Allen et al. 1985), but have also been used previously in inferring biodiversity of Odonata in the Nearctic and Palearctic regions (Abbott et al. 2022, Kalkman et al. 2022).
We used three geographic ranges: A. IndoMalay region (referred to as Indo-Malay in Olson et al. (2001), including the Chinese provinces of Sichuan, Hubei, Anhui, and Jiangsu), B. Eastern North Hemisphere (defined as Palearctic by Olson et al. (2001)), and C. Western North Hemisphere (defined as Nearctic by Olson et al. (2001), which does not include the three mountain ranges of northern and central Mexico).
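To show how the three areas translate into a presence/absence coding, the Python sketch below builds the lines of a PHYLIP-like geography file (a header giving the number of taxa, the number of areas, and the area letters, followed by one 0/1 string per taxon). The exact layout should be checked against Geography_File_Wallace.txt; the function name, the placeholder taxon names, and their ranges are made up and do not reflect the scored data.

```python
AREAS = ["A", "B", "C"]  # A = IndoMalay, B = Eastern North Hemisphere, C = Western North Hemisphere


def geography_lines(ranges):
    """Build header plus one presence/absence line per taxon (assumed PHYLIP-like layout)."""
    lines = [f"{len(ranges)}\t{len(AREAS)} ({' '.join(AREAS)})"]
    for taxon, occupied in ranges.items():
        code = "".join("1" if area in occupied else "0" for area in AREAS)
        lines.append(f"{taxon}\t{code}")
    return lines


# Placeholder taxa with made-up ranges.
toy_ranges = {
    "taxon_1": {"C"},
    "taxon_2": {"B"},
    "taxon_3": {"A", "B"},
}
print("\n".join(geography_lines(toy_ranges)))
```

Each digit corresponds to one area in the order given in the header, so a taxon scored 110 occupies areas A and B but not C.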
We tested for statistical differences between constrained and unconstrained models (e.g., DIVA vs. DIVA + j) using the p-value of a likelihood ratio test (LRT). We determined the optimal model for explaining our data using the criteria of the highest log-likelihood (LnL) and the lowest Akaike information criterion (AIC). Furthermore, we performed an unstratified analysis (without areas allowed/adjacency files, dispersal matrices, maximum number of areas, and dispersal multipliers) due to the wide geographic range of Somatochlora.
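The model-comparison quantities mentioned above follow standard formulas, sketched here in Python: AIC = 2k - 2 lnL, AICc = AIC + 2k(k + 1)/(n - k - 1), and a likelihood ratio test with one degree of freedom for a base model against its +j counterpart. The function names are hypothetical, and the log-likelihoods, the parameter counts (2 for the base model, 3 with +j), and the sample size below are illustrative values rather than results from this dataset, which are in the restable and teststable files.

```python
import math


def aic(lnl, k):
    """Akaike information criterion for log-likelihood lnl and k free parameters."""
    return 2 * k - 2 * lnl


def aicc(lnl, k, n):
    """Small-sample corrected AIC; n is the sample size (requires n > k + 1)."""
    return aic(lnl, k) + (2 * k * (k + 1)) / (n - k - 1)


def lrt_pvalue_1df(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test with 1 degree of freedom.

    Uses the identity P(chi2_1 > D) = erfc(sqrt(D / 2)).
    """
    statistic = max(0.0, 2 * (lnl_alt - lnl_null))
    return math.erfc(math.sqrt(statistic / 2))


# Made-up log-likelihoods for a base model and its +j variant.
lnl_base, lnl_plus_j = -50.0, -46.0
print(aic(lnl_base, 2), aic(lnl_plus_j, 3))
print(aicc(lnl_base, 2, n=40), aicc(lnl_plus_j, 3, n=40))
print(lrt_pvalue_1df(lnl_base, lnl_plus_j))
```

A lower AIC or AICc favors a model, and a small LRT p-value indicates that the extra j parameter improves the fit more than expected by chance.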
Abbott, J. C., C. A. Bota-Sierra, R. Guralnick, V. Kalkman, E. González-Soriano, R. Novelo-Gutiérrez, S. Bybee, J. Ware, and M. W. Belitz. 2022. Diversity of Nearctic dragonflies and damselflies (Odonata). Diversity 14: 575.
Allen, D., L. Davies, and P. Tobin. 1985. The dragonflies of the world: A systematic list of the extant species of Odonata. Vol. 2 Anisoptera. Rapid communications 5: 8-151.
Bechly, G. 1996. Morphologische Untersuchungen am Flügelgeäder der rezenten Libellen und deren Stammgruppenvertreter (Insecta; Pterygota; Odonata) unter besonderer Berücksichtigung der phylogenetischen Systematik und des Grundplanes der Odonata. Petalura 1-402.
Behura, S. K., and D. W. Severson. 2012. Comparative analysis of codon usage bias and codon context patterns between dipteran and hymenopteran sequenced genomes.
Behura, S. K., and D. W. Severson. 2013. Codon usage bias: causative factors, quantification methods and genome‐wide patterns: with emphasis on insect genomes. Biological Reviews 88: 49-61.
Boudet, L., A. Nel, and D. Huang. 2023. A new basal hawker dragonfly from the earliest Late Jurassic of Daohugou, northeastern China (Odonata: Anisoptera: Mesuropetalidae). Hist. Biol. 35: 1267-1273.
Breinholt, J. W., and A. Y. Kawahara. 2013. Phylotranscriptomics: saturated third codon positions radically influence the estimation of trees based on next-gen data. Genome Biol. Evol. 5: 2082-2092.
Breinholt, J. W., C. Earl, A. R. Lemmon, E. M. Lemmon, L. Xiao, and A. Y. Kawahara. 2018. Resolving relationships among the megadiverse butterflies and moths with a novel pipeline for anchored phylogenomics. Syst. Biol. 67: 78-93.
Bybee, S. M., V. J. Kalkman, R. J. Erickson, P. B. Frandsen, J. W. Breinholt, A. Suvorov, K.-D. B. Dijkstra, A. Cordero-Rivera, J. H. Skevington, and J. C. Abbott. 2021. Phylogeny and classification of Odonata using targeted genomics. Mol. Phylogenet. Evol. 160: 107115.
Capella-Gutiérrez, S., J. M. Silla-Martínez, and T. Gabaldón. 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25: 1972-1973.
Chen, J.-N., J. A. López, S. Lavoué, M. Miya, and W.-J. Chen. 2014. Phylogeny of the Elopomorpha (Teleostei): evidence from six nuclear and mitochondrial markers. Mol. Phylogenet. Evol. 70: 152-161.
Chen, Y., W. Ye, Y. Zhang, and Y. Xu. 2015. High speed BLASTN: an accelerated MegaBLAST search tool. Nucleic Acids Res. 43: 7762-7768.
Cockerell, T. 1927. Tertiary fossil insects from eastern Oregon, pp. 65-138. In R. Kellogg, J. C. Merriam, C. Stock, R. W. Chaney, and H. L. Mason (eds.), Additions to the palaeontology of the Pacific coast and Great Basin regions of North America.
Drummond, A. J., and A. Rambaut. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7: 214.
Ewels, P., M. Magnusson, S. Lundin, and M. Käller. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32: 3047-3048.
Farris, J. S. 1989. The retention index and the rescaled consistency index. Cladistics 5: 417-419.
Fleck, G., B. Ullrich, M. Brenk, C. Wallnisch, M. Orland, S. Bleidissel, and B. Misof. 2008. A phylogeny of anisopterous dragonflies (Insecta, Odonata) using mtRNA genes and mixed nucleotide/doublet models. Journal of Zoological Systematics and Evolutionary Research 46: 310-322.
Fraser, F. C., and R. J. Tillyard. 1957. Reclassification of the order Odonata. Royal Zoological Society of New South Wales.
Futahashi, R., R. Kawahara-Miki, M. Kinoshita, K. Yoshitake, S. Yajima, K. Arikawa, and T. Fukatsu. 2015. Extraordinary diversity of visual opsin genes in dragonflies. Proceedings of the National Academy of Sciences 112: E1247-E1256.
Galtier, N., C. Roux, M. Rousselle, J. Romiguier, E. Figuet, S. Glémin, N. Bierne, and L. Duret. 2018. Codon usage bias in animals: disentangling the effects of natural selection, effective population size, and GC-biased gene conversion. Mol. Biol. Evol. 35: 1092-1103.
Garrison, R. W., N. von Ellenrieder, and J. A. Louton. 2006. Dragonfly genera of the New World: an illustrated and annotated key to the Anisoptera. JHU Press.
Gloyd, L. K. 1959. Elevation of the Macromia group to family status (Odonata). Entomol. News 70: 197-205.
Goloboff, P. A., J. S. Farris, and K. C. Nixon. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24: 774-786.
Goodman, A., E. Tolman, R. Uche-Dike, J. Abbott, J. W. Breinholt, S. Bybee, P. B. Frandsen, J. S. Gosnell, R. Guralnick, and V. J. Kalkman. 2023. Assessment of targeted enrichment locus capture across time and museums using odonate specimens. Insect Systematics and Diversity 7: 5.
Guindon, S., J.-F. Dufayard, V. Lefort, M. Anisimova, W. Hordijk, and O. Gascuel. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59: 307-321.
Harshman, J., E. L. Braun, M. J. Braun, C. J. Huddleston, R. C. Bowie, J. L. Chojnowski, S. J. Hackett, K.-L. Han, R. T. Kimball, and B. D. Marks. 2008. Phylogenomic evidence for multiple losses of flight in ratite birds. Proceedings of the National Academy of Sciences 105: 13462-13467.
Hennig, W. 1981. Insect phylogeny. John Wiley & Sons Ltd.
Herrig, D. K., K. L. Vertacnik, R. D. Ridenbaugh, K. M. Everson, S. B. Sim, S. M. Geib, D. W. Weisrock, and C. R. Linnen. 2023. Whole genomes reveal evolutionary relationships and mechanisms underlying gene-tree discordance in Neodiprion sawflies. bioRxiv 2023.01.05.522922.
Holt, B. G., J.-P. Lessard, M. K. Borregaard, S. A. Fritz, M. B. Araújo, D. Dimitrov, P.-H. Fabre, C. H. Graham, G. R. Graves, and K. A. Jønsson. 2013. An update of Wallace’s zoogeographic regions of the world. Science 339: 74-78.
Huang, D.-y., and A. Nel. 2007. Oldest 'libelluloid' dragonfly from the Middle Jurassic of China (Odonata: Anisoptera: Cavilabiata). Neues Jahrbuch für Geologie und Paläontologie-Abhandlungen 63-68.
Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754-755.
Jones, B. M., W. T. Wcislo, and G. E. Robinson. 2015. Developmental transcriptome for a facultatively eusocial bee, Megalopta genalis. G3: Genes, Genomes, Genetics 5: 2127-2135.
Källersjö, M., V. A. Albert, and J. S. Farris. 1999. Homoplasy increases phylogenetic structure. Cladistics 15: 91-93.
Kalkman, V. J., J.-P. Boudot, R. Futahashi, J. C. Abbott, C. A. Bota-Sierra, R. Guralnick, S. M. Bybee, J. Ware, and M. W. Belitz. 2022. Diversity of Palaearctic dragonflies and damselflies (Odonata). Diversity 14: 966.
Kalyaanamoorthy, S., B. Q. Minh, T. K. F. Wong, A. Von Haeseler, and L. S. Jermiin. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods 14: 587-589.
Kapheim, K. M., B. M. Jones, H. Pan, C. Li, B. A. Harpur, C. F. Kent, A. Zayed, P. Ioannidis, R. M. Waterhouse, and C. Kingwell. 2020. Developmental plasticity shapes social traits and selection in a facultatively eusocial bee. Proceedings of the National Academy of Sciences 117: 13615-13625.
Karr, J. A., and M. E. Clapham. 2015. Taphonomic biases in the insect fossil record: shifts in articulation over geologic time. Paleobiology 41: 16-32.
Katoh, K., and D. M. Standley. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30: 772-780.
Kawahara, A. Y., C. Storer, A. P. S. Carvalho, D. M. Plotkin, F. L. Condamine, M. P. Braga, E. A. Ellis, R. A. St Laurent, X. Li, and V. Barve. 2023. A global phylogeny of butterflies reveals their evolutionary history, ancestral hosts and biogeographic origins. Nature Ecology & Evolution 1-11.
Kohli, M., H. Letsch, C. Greve, O. Béthoux, I. Deregnaucourt, S. Liu, X. Zhou, A. Donath, C. Mayer, and L. Podsiadlowski. 2021. Evolutionary history and divergence times of Odonata (dragonflies and damselflies) revealed through transcriptomics. iScience 24.
Kohli, M. K., J. L. Ware, and G. Bechly. 2016. How to date a dragonfly: Fossil calibrations for odonates. Palaeontologia Electronica 19: 1-14.
Ksepka, D. T., J. F. Parham, J. F. Allman, M. J. Benton, M. T. Carrano, K. A. Cranston, P. C. Donoghue, J. J. Head, E. J. Hermsen, and R. B. Irmis. 2015. The fossil calibration database—a new resource for divergence dating. Syst. Biol. 64: 853-859.
Kück, P., and K. Meusemann. 2010. FASconCAT: convenient handling of data matrices. Mol. Phylogenet. Evol. 56: 1115-1118.
Lewis, P. O. 2001. A Likelihood Approach to Estimating Phylogeny from Discrete Morphological Character Data. Syst. Biol. 50: 913-925.
Lin, S., J. Werle, and J. Korb. 2021. Transcriptomic analyses of the termite, Cryptotermes secundus, reveal a gene network underlying a long lifespan and high fecundity. Communications Biology 4: 384.
Liu, H., F. Jiang, S. Wang, H. Wang, A. Wang, H. Zhao, D. Xu, B. Yang, and W. Fan. 2022. Chromosome-level genome of the globe skimmer dragonfly (Pantala flavescens). GigaScience 11: giac009.
Lohman, D. J., M. De Bruyn, T. Page, K. Von Rintelen, R. Hall, P. K. L. Ng, H.-T. Shih, G. R. Carvalho, and T. Von Rintelen. 2011. Biogeography of the Indo-Australian Archipelago. Annu. Rev. Ecol. Evol. Syst. 42: 205-226.
Maddison, W., and D. Maddison. 2007. Mesquite 2. A modular system for evolutionary analysis 3.
Matzke, N. J. 2013. Package ‘BioGeoBEARS’.
Matzke, N. J. 2022. Statistical comparison of DEC and DEC+J is identical to comparison of two ClaSSE submodels, and is therefore valid. J. Biogeogr. 49: 1805-1824.
Minh, B. Q., M. W. Hahn, and R. Lanfear. 2020a. New methods to calculate concordance factors for phylogenomic datasets. Mol. Biol. Evol. 37: 2727-2733.
Minh, B. Q., H. A. Schmidt, O. Chernomor, D. Schrempf, M. D. Woodhams, A. von Haeseler, and R. Lanfear. 2020b. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37: 1530-1534.
Mirarab, S., and T. Warnow. 2015. ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics 31: i44-i52.
Needham, J. G. 1930. Manual of the Dragonflies of China. A monographic study of the Chinese Odonata. Zool. Sinica Ser. A 11: 1-399.
Needham, J. G., M. J. Westfall Jr, and M. L. May. 2000. Dragonflies of North America. Scientific Publishers, Inc.
Nel, A., A. Arillo, and X. Martínez-Delclòs. 1996. New fossil Odonata (Insecta) from the Upper Miocene of France and Spain (Anisoptera and Zygoptera). Neues Jahrbuch für Geologie und Paläontologie-Abhandlungen 167-219.
Olson, D. M., E. Dinerstein, E. D. Wikramanayake, N. D. Burgess, G. V. Powell, E. C. Underwood, J. A. D'amico, I. Itoua, H. E. Strand, and J. C. Morrison. 2001. Terrestrial Ecoregions of the World: A New Map of Life on Earth: A new global map of terrestrial ecoregions provides an innovative tool for conserving biodiversity. Bioscience 51: 933-938.
OpenAI. 2023. ChatGPT (Mar 14 version) [Large language model].
Parham, J. F., P. C. Donoghue, C. J. Bell, T. D. Calway, J. J. Head, P. A. Holroyd, J. G. Inoue, R. B. Irmis, W. G. Joyce, and D. T. Ksepka. 2012. Best practices for justifying fossil calibrations. Syst. Biol. 61: 346-359.
Parvathy, S. T., V. Udayasuriyan, and V. Bhadana. 2022. Codon usage bias. Mol. Biol. Rep. 49: 539-565.
Phillips, M. J., F. Delsuc, and D. Penny. 2004. Genome-scale phylogeny and the detection of systematic biases. Mol. Biol. Evol. 21: 1455-1458.
Price, B. W., M. Winter, S. J. Brooks, Natural History Museum Genome Acquisition Lab, Wellcome Sanger Institute Tree of Life, and Darwin Tree of Life Consortium. 2022. The genome sequence of the blue-tailed damselfly, Ischnura elegans (Vander Linden, 1820). Wellcome Open Research 7.
Prjibelski, A., D. Antipov, D. Meleshko, A. Lapidus, and A. Korobeynikov. 2020. Using SPAdes de novo assembler. Current Protocols in Bioinformatics 70: e102.
Ree, R. H., and S. A. Smith. 2008. Maximum Likelihood Inference of Geographic Range Evolution by Dispersal, Local Extinction, and Cladogenesis. Syst. Biol. 57: 4-14.
Ree, R. H., and I. Sanmartín. 2018. Conceptual and statistical problems with the DEC+J model of founder‐event speciation and its comparison with DEC via model selection. J. Biogeogr. 45: 741-749.
Ronquist, F. 1997. Dispersal-Vicariance Analysis: A New Approach to the Quantification of Historical Biogeography. Syst. Biol. 46: 195-203.
Schädel, M., and G. Bechly. 2016. First record of Anisoptera (Insecta: Odonata) from mid-Cretaceous Burmese Amber. Zootaxa 4103: 537-549.
Schaedel, M., P. Mueller, and J. T. Haug. 2020. Two remarkable fossil insect larvae from Burmese amber suggest the presence of a terminal filum in the direct stem lineage of dragonflies and damselflies (Odonata). Rivista Italiana di Paleontologia e Stratigrafia 126.
Sharma, J., and A. Uddin. 2014. Comparative analysis of codon usage bias between two lepidopteran insect species: Bombyx mandarina and Ostrinia furnacalis. Biotechnology 3.
Simmons, M. P. 2017. Relative benefits of amino‐acid, codon, degeneracy, DNA, and purine‐pyrimidine character coding for phylogenetic analyses of exons. Journal of Systematics and Evolution 55: 85-109.
Simmons, M. P., L.-B. Zhang, C. T. Webb, and A. Reeves. 2006. How can third codon positions outperform first and second codon positions in phylogenetic inference? An empirical example from the seed plants. Syst. Biol. 55: 245-258.
Suvorov, A., N. O. Jensen, C. R. Sharkey, M. S. Fujimoto, P. Bodily, H. M. C. Wightman, T. H. Ogden, M. J. Clement, and S. M. Bybee. 2017. Opsins have evolved under the permanent heterozygote model: insights from phylotranscriptomics of Odonata. Mol. Ecol. 26: 1306-1322.
Tang, X., and D. Wong. 2001. FAST-SP: A fast algorithm for block placement based on sequence pair, pp. 521-526. In Proceedings of the 2001 Asia and South Pacific Design Automation Conference. Association for Computing Machinery, New York, NY, United States.
Terrapon, N., C. Li, H. M. Robertson, L. Ji, X. Meng, W. Booth, Z. Chen, C. P. Childers, K. M. Glastad, and K. Gokhale. 2014. Molecular traces of alternative social organization in a termite genome. Nature Communications 5: 3636.
Timmermans, M. J., C. Barton, J. Haran, D. Ahrens, C. L. Culverwell, A. Ollikainen, S. Dodsworth, P. G. Foster, L. Bocak, and A. P. Vogler. 2016. Family-level sampling of mitochondrial genomes in Coleoptera: compositional heterogeneity and phylogenetics. Genome Biol. Evol. 8: 161-175.
Tolman, E. R., C. D. Beatty, J. Bush, M. Kohli, C. M. Moreno, J. L. Ware, K. S. Weber, R. Khan, C. Maheshwari, and D. Weisz. 2023. A Chromosome-length Assembly of the Black Petaltail (Tanypteryx hageni) Dragonfly. Genome Biol. Evol. 15: evad024.
Toussaint, E. F., E. A. Ellis, R. J. Gott, A. D. Warren, K. M. Dexter, C. Storer, D. J. Lohman, and A. Y. Kawahara. 2021a. Historical biogeography of Heteropterinae skippers via Beringian and post‐Tethyan corridors. Zoologica Scripta 50: 100-111.
Toussaint, E. F. A., H. Chiba, M. Yago, K. M. Dexter, A. D. Warren, C. Storer, D. J. Lohman, and A. Y. Kawahara. 2021b. Afrotropics on the wing: phylogenomics and historical biogeography of awl and policeman skippers. Systematic Entomology 46: 172-185.
Toussaint, E. F. A., F. M. S. Dias, O. H. H. Mielke, M. M. Casagrande, C. P. Sañudo-Restrepo, A. Lam, J. Morinière, M. Balke, and R. Vila. 2019a. Flight over the Proto-Caribbean seaway: Phylogeny and macroevolution of Neotropical Anaeini leafwing butterflies. Mol. Phylogenet. Evol. 137: 86-103.
Toussaint, E. F. A., R. Vila, M. Yago, H. Chiba, A. D. Warren, K. Aduse‐Poku, C. Storer, K. M. Dexter, K. Maruyama, D. J. Lohman, and A. Y. Kawahara. 2019b. Out of the Orient: Post‐Tethyan transoceanic and trans‐Arabian routes fostered the spread of Baorini skippers in the Afrotropics. Systematic Entomology 44: 926-938.
Tseng, H.-Y., H. Chiba, D. J. Lohman, S.-H. Yen, K. Aduse-Poku, Y. Ohshima, and L.-W. Wu. 2022. Out of Asia: Intercontinental dispersals after the Eocene-Oligocene transition shaped the zoogeography of Limenitidinae butterflies (Lepidoptera: Nymphalidae). Mol. Phylogenet. Evol. 170: 107444.
Walker, E. M. 1925. The North American dragonflies of the genus Somatochlora. [Toronto] University of Toronto.
Walker, E. M., and P. S. Corbet. 1975. The Odonata of Canada and Alaska: Volume Three, Part III: The Anisoptera–Three Families. University of Toronto Press.
Ware, J. L. 2008. Molecular and morphological systematics of Libelluloidea (Odonata: Anisoptera) and Dictyoptera. Rutgers The State University of New Jersey, School of Graduate Studies.
Warnock, R. C., Z. Yang, and P. C. Donoghue. 2012. Exploring uncertainty in the calibration of the molecular clock. Biol. Lett. 8: 156-159.
White, N. E., M. J. Phillips, M. T. P. Gilbert, A. Alfaro-Núñez, E. Willerslev, P. R. Mawson, P. B. Spencer, and M. Bunce. 2011. The evolutionary history of cockatoos (Aves: Psittaciformes: Cacatuidae). Mol. Phylogenet. Evol. 59: 615-622.
Wighton, D. C., and M. V. Wilson. 1986. The Gomphaeschninae (Odonata: Aeshnidae): new fossil genus, reconstructed phylogeny, and geographical history. Systematic Entomology 11: 505-522.
Woese, C., L. Achenbach, P. Rouviere, and L. Mandelco. 1991. Archaeal phylogeny: reexamination of the phylogenetic position of Archaeoglobus fulgidus in light of certain composition-induced artifacts. Syst. Appl. Microbiol. 14: 364-371.
Yang, Z. 1996. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol. Evol. 11: 367-372.
Yang, Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24: 1586-1591.
Zheng, D., and E. A. Jarzembowski. 2020. A brief review of Odonata in mid-Cretaceous Burmese amber. International Journal of Odonatology 23: 13-21.