# Supplementary Material and Data

## “Analysis of RNA-seq, DNA target enrichment, and Sanger nucleotide sequence data resolves deep splits in the phylogeny of cuckoo wasps (Hymenoptera: Chrysididae)”

This README and the corresponding data refer to: Pauli *et al*. (2020) (Journal XXX, DOI of the article when accepted).

> Written on Oct 22, 2019

> Corresponding authors. E-mail addresses: oliver.niehuis@biologie.uni-freiburg.de (O. Niehuis), tpauli.bio@gmail.com (T. Pauli)

**Supplementary files and descriptions provided via the Digital Repository Dryad**

### CHECKSUMS

A text file containing the checksums of the following seven archives.

### Supplementary Archive S1.

**[Supplementary_Archive_S1.tar.gz: 12 MB]**

S1_TSA_ARGOCHRYSIS_ARMILLA.fas: transcriptome assembly of *Argochrysis armilla* (FASTA format).

### Supplementary Archive S2.

**[Supplementary_Archive_S2.tar.gz: 924 KB]**

S2_TARGET_DNA_ENRICHMENT_PROBES.tsv: list of probes used for target DNA enrichment (.tsv format).

### Supplementary Archive S3.

**[Supplementary_Archive_S3.tar.gz: 124 MB]**

S3_ENRICHMENT_DATA_ASSEMBLIES: this archive contains the sequences assembled from target DNA enrichment reads (see Materials and methods and Supplementary Table 2), after removal of contaminants, for each of the species listed below (one assembly per species, in FASTA format):

*Chrysurissa densa, Ceratochrysis perpulchra, Allocoelia capensis, Argochrysis toralis, Caenochrysis doriae, Chrysis fugax, Allocoelia mocsaryi, Chrysis amneris, Chrysis* sp., *Hedychridium planifrons, Chrysis pallidicornis, Chrysis impostor, Omalus* sp., *Chrysis angolensis, Caenochrysis* sp., *Adelphe* sp., *Amisega* sp., *Chrysis* sp., *Holophris* sp., *Omalus* sp., *Chrysis laodamia, Chrysis chrysostigma, Exochrysis* sp., *Gaullea argentina, Hedychridium wahisi, Spinolia unicolor, Ceratochrysis kansensis, Eypris niger, Ipsiura* sp., *Exallopyga* sp., *Chrysis chrysoprasina*

### Supplementary Archive S4.

**[Supplementary_Archive_S4.tar.gz: 24 MB]**

S4_MSA_OF_INDIVIDUAL_GENES: this archive contains 492 multiple sequence alignments of the individual single-copy genes used in supermatrix E.

Directory 01_AMINO_ACID_ALIGNMENTS: contains the 492 multiple sequence alignments on amino acid level before alignment masking (FASTA format, \*.fas). Each MSA is named after its ortholog group as derived from OrthoDB version 7.

Directory 02_NUCLEOTIDE_ALIGNMENTS: contains the multiple sequence alignments corresponding to those in directory 01, but on nucleotide level and including all three codon positions (FASTA format, \*.fas).

### Supplementary Archive S5.

**[Supplementary_Archive_S5.tar.gz: 226 MB]**

S5_SUPERMATRICES_AND_ML_TREES: this archive contains all supermatrices (FASTA format), the respective partitioning schemes including the selected models used for phylogenetic analyses under the maximum likelihood (ML) criterion, and the trees inferred from the ML phylogenetic analyses.

File 01_SUPERMATRIX_E_AA_G.fas: supermatrix on amino acid level (FASTA format) comprising all 492 enriched genes, partitioned according to a gene-based partitioning scheme.

File 02_BEST_SCHEME_E_AA_G.nex: best partitioning scheme (NEXUS format) corresponding to File 01, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 03_BEST_TREE_E_AA_G.nwk: best ML tree inferred from File 01 and File 02, with bootstrap support from 1,000 non-parametric bootstrap resamplings (NEWICK format).

File 04_SUPERMATRIX_E_AA_D.fas: supermatrix on amino acid level (FASTA format) comprising all 492 enriched genes, partitioned according to a domain-based partitioning scheme.

File 05_BEST_SCHEME_E_AA_D.nex: best partitioning scheme (NEXUS format) corresponding to File 04, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 06_BEST_TREE_E_AA_D.nwk: best ML tree inferred from File 04 and File 05, with bootstrap support from 1,000 non-parametric bootstrap resamplings (NEWICK format).

File 07_SUPERMATRIX_E_NT_G.fas: supermatrix on nucleotide level (FASTA format) comprising only the second codon positions of all 492 enriched genes, partitioned according to a gene-based partitioning scheme.

File 08_BEST_SCHEME_E_NT_G.nex: best partitioning scheme (NEXUS format) corresponding to File 07, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 09_BEST_TREE_E_NT_G.nwk: best ML tree inferred from File 07 and File 08, with bootstrap support from 1,000 non-parametric bootstrap resamplings (NEWICK format).

File 10_SUPERMATRIX_E_NT_D.fas: supermatrix on nucleotide level (FASTA format) comprising only the second codon positions of all 492 enriched genes, partitioned according to a domain-based partitioning scheme.

File 11_BEST_SCHEME_E_NT_D.nex: best partitioning scheme (NEXUS format) corresponding to File 10, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 12_BEST_TREE_E_NT_D.nwk: best ML tree inferred from File 10 and File 11, with bootstrap support from 1,000 non-parametric bootstrap resamplings (NEWICK format).

File 13_SUPERMATRIX_0_AA_G.fas: supermatrix on amino acid level (FASTA format) comprising all 3,260 single-copy orthologs, partitioned according to a gene-based partitioning scheme.

File 14_BEST_SCHEME_0_AA_G.nex: best partitioning scheme (NEXUS format) corresponding to File 13, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 15_BEST_TREE_0_AA_G.nwk: best ML tree inferred from File 13 and File 14, with bootstrap support from 1,000 ultrafast bootstrap resamplings (inferred with -bnni) (NEWICK format).

File 16_SUPERMATRIX_T_AA_G.fas: supermatrix on amino acid level (FASTA format) comprising all 3,260 single-copy orthologs (excluding all enrichment sequence data), partitioned according to a gene-based partitioning scheme.

File 17_BEST_SCHEME_T_AA_G.nex: best partitioning scheme (NEXUS format) corresponding to File 16, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 18_BEST_TREE_K_AA_G.nwk: best ML tree inferred from File 16 and File 17, with bootstrap support from 1,000 ultrafast bootstrap resamplings (inferred with -bnni) (NEWICK format).

File 19_SUPERMATRIX_CONSTRAINT_TREE_SEARCH.fas: supermatrix on nucleotide level (FASTA format) comprising the eleven genes also used in Pauli *et al*. (2019), partitioned according to the origin of the sequence (i.e., mitochondrial or nuclear-encoded) and codon position.

File 20_BEST_SCHEME_CONSTRAINT_TREE_SEARCH.nex: best partitioning scheme (NEXUS format) corresponding to File 19, including the selected model for each partition (see Materials and methods), selected according to the corrected Akaike information criterion (AICc).

File 21_BEST_TREE_CONSTRAINT_TREE_SEARCH.nwk: best ML tree inferred from File 19 and File 20, using File 01 as constraint tree, with bootstrap support from 1,000 non-parametric bootstrap resamplings (NEWICK format).

### Supplementary Archive S6.

**[Supplementary_Archive_S6.tar.gz: 170 KB]**

S6_MSA_OF_GENES_FROM_PAULI_ET_AL_2018: this archive contains the multiple sequence alignments (FASTA format, \*.fasta) on nucleotide level of the eleven genes (see Pauli *et al*. 2019, https://doi.org/10.1111/syen.12323) used to infer the constraint tree in Figure 2. The sequences in these files derive from Sanger-sequenced data, transcriptome data, or DNA target enrichment data.

The following genes were used (names in parentheses are *Drosophila melanogaster* homologs, see Hartig *et al*., 2012, https://doi.org/10.1371/journal.pone.0039826, Table S1):

- HOG2941 (tropomodulin)
- HOG3495 (Imitation SWI)
- HOG3683 (synaptotagmin 1)
- HOG4631 (MTA1-like)
- HOG5119 (Adenosylhomocysteinase like 1)
- HOG5496 (Glucose transporter 1)
- HOG5592 (Tousled like kinase)
- HOG5694 (α-catenin)
- HOG6343 (schizo)
- HOG6768 (Clathrin heavy chain)
- COI (Cytochrome c oxidase subunit I)

### Supplementary Archive S7.

**[Supplementary_Archive_S7.tar.gz: 1021 KB]**

S7_ASTRAL_RESULTS: this archive contains all results obtained from the multi-species coalescent analyses performed with ASTRAL version 5.6.3.

File 01_GENE_ORDER: list of genes in the order in which they were processed (PLAIN TEXT format). This order is relevant only for Files 02 and 03.

File 02_BEST_TREES_PER_GENE.nwk: best ML tree per gene, inferred individually for each of the 492 multiple sequence alignments (NEWICK format). All best ML trees are merged into a single file. The order of the ML trees corresponds to the order of the genes listed in File 01.

File 03_BEST_MODEL_PER_GENE: best selected substitution model for each individual multiple sequence alignment (PLAIN TEXT format).

File 04_ASTRAL_TREE_POSTERIOR_PROBABILITY.nwk: phylogenetic tree (NEWICK format) inferred with the multi-species coalescent method (ASTRAL version 5.6.3) from the individual ML gene trees in File 01, with posterior probabilities mapped onto the inferred tree.

File 05a_ASTRAL_TREE_MULTILOCUS_BOOTSTRAP.nwk: phylogenetic tree (see File 03, NEWICK format) inferred with the multi-species coalescent method (ASTRAL version 5.6.3), with multi-locus bootstrap support values mapped onto it.

File 06_ASTRAL_TREE_QUARTET_SCORES.nwk: phylogenetic tree (see File 03, NEWICK format) with quartet scores (see Materials and methods) mapped onto it.
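
Because the gene trees in File 02_BEST_TREES_PER_GENE.nwk are identified only by their position, it can help to pair them with the gene names in File 01_GENE_ORDER before further analysis. The following Python sketch shows one way to do this. It is not part of the original analysis pipeline and rests on assumptions about the file layout that are not stated in this README: that File 01 holds one gene name per non-empty line, that every tree in File 02 ends with a semicolon, and that no taxon label contains a semicolon. The function names and the extraction path are hypothetical.

```python
from pathlib import Path


def split_newick(text):
    """Split text holding several Newick trees into one string per tree."""
    # Assumption: every tree ends with ';' and no label contains a ';'.
    return [chunk.strip() + ";" for chunk in text.split(";") if chunk.strip()]


def pair_genes_with_trees(gene_order_text, trees_text):
    """Map each gene name to the tree found at the same position."""
    # Assumption: the gene list has one gene name per non-empty line.
    genes = [line.strip() for line in gene_order_text.splitlines() if line.strip()]
    trees = split_newick(trees_text)
    if len(genes) != len(trees):
        # A mismatch means the layout assumptions above do not hold.
        raise ValueError(f"{len(genes)} genes but {len(trees)} trees")
    return dict(zip(genes, trees))


if __name__ == "__main__":
    # Hypothetical path: adjust to wherever the archive was extracted.
    base = Path("S7_ASTRAL_RESULTS")
    gene_file = base / "01_GENE_ORDER"
    tree_file = base / "02_BEST_TREES_PER_GENE.nwk"
    if gene_file.exists() and tree_file.exists():
        mapping = pair_genes_with_trees(gene_file.read_text(), tree_file.read_text())
        print(f"paired {len(mapping)} genes with their ML trees")
```

The count check is deliberate: if the number of gene names and trees differs, the positional pairing cannot be trusted and the files should be inspected by hand. The same positional logic would presumably apply to File 03_BEST_MODEL_PER_GENE, provided it also lists one entry per gene in the same order.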