Butterflies and moths constitute some of the most popular and charismatic insects. Lepidoptera include approximately 160 000 described species, many of which are important model organisms. Previous studies on the evolution of Lepidoptera did not confidently place butterflies, and many relationships among superfamilies in the megadiverse clade Ditrysia remain largely uncertain. We generated a molecular dataset with 46 taxa, combining 33 new transcriptomes with 13 available genomes, transcriptomes and expressed sequence tags (ESTs). Using HaMStR with a Lepidoptera-specific core-orthologue set of single copy loci, we identified 2696 genes for inclusion into the phylogenomic analysis. Nucleotides and amino acids of the all-gene, all-taxon dataset yielded nearly identical, well-supported trees. Monophyly of butterflies (Papilionoidea) was strongly supported, and the group included skippers (Hesperiidae) and the enigmatic butterfly–moths (Hedylidae). Butterflies were placed sister to the remaining obtectomeran Lepidoptera, and the latter was grouped with greater than or equal to 87% bootstrap support. Establishing confident relationships among the four most diverse macroheteroceran superfamilies was previously challenging, but we recovered 100% bootstrap support for the following relationships: ((Geometroidea, Noctuoidea), (Bombycoidea, Lasiocampoidea)). We present the first robust, transcriptome-based tree of Lepidoptera that strongly contradicts historical placement of butterflies, and provide an evolutionary framework for genomic, developmental and ecological studies on this diverse insect order.
README.txt
This file contains descriptions of all the files associated with this Dryad package.
LEP1-COS.tar.gz
This compressed directory contains the HaMStR (Ebersberger et al. 2009: http://sourceforge.net/projects/hamstr/) formatted core-ortholog set we compiled to analyze the phylogeny of Lepidoptera in Kawahara and Breinholt (2014). See the file Taxon_list_readme.txt inside the compressed directory and Kawahara and Breinholt (2014) for more information about this core-ortholog set.
taxa_list.txt
taxa_list.txt: Tab-delimited text file listing the sample IDs used in the nexus files, species names, assembly file names, and Genbank SRA accessions.
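A minimal Python sketch for loading this table is given here. It assumes four tab-separated columns in the order described above (sample ID, species name, assembly file name, SRA accession) and no header row; neither has been checked against the file itself. The function name and the example row are illustrative only, with the row assembled from the file descriptions in this README.

```python
import csv
import io


def parse_taxa_list(text):
    """Map each sample ID to its species, assembly file and SRA accession.

    Assumes four tab-separated columns and no header row (unverified).
    """
    fields = ("species", "assembly_file", "sra_accession")
    table = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 4 or not row[0].strip():
            continue  # skip blank or incomplete lines
        sample_id = row[0].strip()
        table[sample_id] = dict(zip(fields, (cell.strip() for cell in row[1:4])))
    return table


# Illustrative row only; the real sample ID format may differ.
example = "SW130206\tNemophora sp.\tfinal_soap_SW130206.fa\tSRR1299782\n"
taxa = parse_taxa_list(example)
```

To use it on the real file, pass the contents of taxa_list.txt read with open("taxa_list.txt").read(), and drop the first entry if the file turns out to have a header row.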
final_soap_SW130206
final_soap_SW130206.fa: Assembly of Nemophora sp. from Genbank SRA accession #SRR1299782, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120096B
final_soap_FG120096B.fa: Assembly of Eacles sp. from Genbank SRA accession #SRR1299435, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120016
final_soap_FG120016.fa: Assembly of Therinia lactucina from Genbank SRA accession #SRR1299418, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120049B
final_soap_FG120049B.fa: Assembly of Adhemarius daphne from Genbank SRA accession #SRR1299394, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_Xter2
final_soap_Xter2.fa: Assembly of Xylophanes tersa from Genbank SRA accession #SRR1298384, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_Callid
final_soap_Callid.fa: Assembly of Pterodecta felderi from Genbank SRA accession #SRR1299369, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120022
final_soap_FG120022.fa: Assembly of Morpheis mathani from Genbank SRA accession #SRR1299214, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SW130197
final_soap_SW130197.fa: Assembly of Dichomeris sp. from Genbank SRA accession #SRR1299773, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SW130007
final_soap_SW130007.fa: Assembly of Thubana sp. from Genbank SRA accession #SRR1300991, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV120029
final_soap_GNV120029.fa: Assembly of Macaria distribuaria from Genbank SRA accession #SRR1299213, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV120032
final_soap_GNV120032.fa: Assembly of Nemoria lixaria from Genbank SRA accession #SRR1299347, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120055B
final_soap_FG120055B.fa: Assembly of Nothus lunus from Genbank SRA accession #SRR1299318, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SW130126
final_soap_SW130126.fa: Assembly of Lyssa zampa from Genbank SRA accession #SRR1299769, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_Pcit2
final_soap_Pcit2.fa: Assembly of Phyllocnistis citrella from Genbank SRA accession #SRR1299751, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120028B
final_soap_FG120028B.fa: Assembly of Aepytus sp. from Genbank SRA accession #SRR1299317, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120070B
final_soap_FG120070B.fa: Assembly of Artace sp. from Genbank SRA accession #SRR1299316, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120046B
final_soap_FG120046B.fa: Assembly of Lacosoma ludolpha from Genbank SRA accession #SRR1299212, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SW130232
final_soap_SW130232.fa: Assembly of Eudocima salaminia from Genbank SRA accession #SRR1300148, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SW130103
final_soap_SW130103.fa: Assembly of Anigraea sp. from Genbank SRA accession #SRR1299755, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SW130224
final_soap_SW130224.fa: Assembly of Manoba major from Genbank SRA accession #SRR1300145, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120015
final_soap_FG120015.fa: Assembly of Notoplusia minuta from Genbank SRA accession #SRR1299746, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120122
final_soap_FG120122.fa: Assembly of Macrosoma sp. from Genbank SRA accession #SRR1299306, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV120020
final_soap_GNV120020.fa: Assembly of Hylephila phyleus from Genbank SRA accession #SRR1299296, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV139000
final_soap_GNV139000.fa: Assembly of Megathymus yuccae from Genbank SRA accession #SRR1299752, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV120025
final_soap_GNV120025.fa: Assembly of Hemiargus ceraunus from Genbank SRA accession #SRR1299274, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120077
final_soap_FG120077.fa: Assembly of Semomesia campanea from Genbank SRA accession #SRR1299211, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SRR850324
final_soap_SRR850324.fa: Assembly of Papilio glaucus from Genbank SRA accession #SRR850324, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SRR850327
final_soap_SRR850327.fa: Assembly of Papilio polytes from Genbank SRA accession #SRR850327, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV120027
final_soap_GNV120027.fa: Assembly of Lantanophaga pusillidactyla from Genbank SRA accession #SRR1299210, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SRR64791_
final_soap_SRR64791_.fa: Assembly of Cnaphalocrocis medinalis from Genbank SRA accessions #SRR647910-SRR647915, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120071B
final_soap_FG120071B.fa: Assembly of Myelobia sp. from Genbank SRA accession #SRR1299267, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV139006
final_soap_GNV139006.fa: Assembly of Pseudothyris sepulchralis from Genbank SRA accession #SRR1299495, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120079
final_soap_FG120079.fa: Assembly of Zeuzerodes maculata from Genbank SRA accession #SRR1299209, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_SRR803483
final_soap_SRR803483.fa: Assembly of Grapholita dimorpha from Genbank SRA accession #SRR803483, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_GNV129007
final_soap_GNV129007.fa: Assembly of Urodus parvula from Genbank SRA accession #SRR1299750, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120035
final_soap_FG120035.fa: Assembly of Dalcera abrasa from Genbank SRA accession #SRR1299208, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
final_soap_FG120024
final_soap_FG120024.fa: Assembly of Megalopyge tharops from Genbank SRA accession #SRR1299217, using multiple kmers (13,23,33,43,63) with SOAPdenovo-Trans v1.01. Different Kmer assemblies were combined with cd-hit-est and processed with the fastx toolkit. See Kawahara and Breinholt (2014) for more details.
Kawahara_Breinholt_2014_26taxa_465loci_aa.nex
Kawahara_Breinholt_2014_26taxa_465loci_aa.nex: Amino acid nexus file containing 465 genes for 26 taxa, with a CHARSET that defines each gene. See taxa_list.txt for the species name of each taxon. The data was trimmed using ALICUT/ALISCORE (http://zfmk.de/web/Forschung/Abteilungen/AG_Wgele/Software/Aliscore/index.en.html). Gene names correspond to gene numbers in the LEP1-COS core ortholog database included in this Dryad package. For further information see text and supplementary tables in Kawahara and Breinholt (2014).
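The per-gene CHARSET definitions can be read with a few lines of Python. The sketch below assumes each gene is declared on its own line in the common NEXUS form "CHARSET name = start-end;" with a single contiguous range; the exact layout in these files has not been verified, and the gene names in the example are made up.

```python
import re

# Matches lines such as "CHARSET 1234 = 1-300;" (case-insensitive keyword).
CHARSET_RE = re.compile(
    r"^\s*charset\s+([^\s=]+)\s*=\s*(\d+)\s*-\s*(\d+)\s*;",
    re.IGNORECASE | re.MULTILINE,
)


def read_charsets(nexus_text):
    """Return {gene_name: (start, end)} with 1-based inclusive coordinates."""
    return {
        name: (int(start), int(end))
        for name, start, end in CHARSET_RE.findall(nexus_text)
    }


# Hypothetical SETS block for illustration.
example_block = """
BEGIN SETS;
    CHARSET geneA = 1-300;
    charset geneB=301-450;
END;
"""
charsets = read_charsets(example_block)
```

The coordinates are kept 1-based and inclusive, as in NEXUS, so a gene's alignment columns in Python are sequence[start - 1:end].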
Kawahara_Breinholt_2014_26taxa_465loci_aa.model.txt
Kawahara_Breinholt_2014_26taxa_465loci_aa.model.txt: RAxML formatted partitioning and model file. This partitioning file was used to analyze data in Kawahara_Breinholt_2014_26taxa_465loci_aa.nex. It contains 198 partitions with amino acid models, as defined by PartitionFinder (Lanfear et al. 2012). See text and supplementary text in Kawahara and Breinholt (2014) for more details.
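For readers who want to inspect the partition scheme programmatically, the following Python sketch parses lines in the usual RAxML partition format, "MODEL, name = start-end, start-end" with an optional "\stride" suffix on a range. That the model files in this package follow exactly this layout is an assumption, and the model and partition names in the example are illustrative.

```python
def parse_raxml_partitions(text):
    """Parse RAxML-style partition lines.

    Returns a list of (model, name, ranges), where ranges is a list of
    (start, end, stride) tuples using 1-based inclusive coordinates.
    """
    partitions = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        model, rest = line.split(",", 1)
        name, spec = rest.split("=", 1)
        ranges = []
        for chunk in spec.split(","):
            span, _, stride = chunk.strip().partition("\\")
            start, _, end = span.partition("-")
            # A single column ("42") is treated as the range 42-42.
            ranges.append((int(start), int(end or start), int(stride or 1)))
        partitions.append((model.strip(), name.strip(), ranges))
    return partitions


# Hypothetical partition lines for illustration.
example_lines = "LG, part1 = 1-300, 500-600\n" + r"DNA, pos1 = 1-100\3"
parts = parse_raxml_partitions(example_lines)
```

Counting len(parts) on the real file is a quick check that the number of partitions matches the description above.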
Kawahara_Breinholt_2014_26taxa_465loci_Degen_nt12.nex
Kawahara_Breinholt_2014_26taxa_465loci_Degen_nt12.nex: Nucleotide nexus file containing codon positions 1 and 2 for 465 genes for 26 taxa, with a CHARSET that defines each gene; each gene starts with codon position 1. See taxa_list.txt for the species name of each taxon. The nucleotide data was trimmed 3 codons at a time with ALICUT/ALISCORE (http://zfmk.de/web/Forschung/Abteilungen/AG_Wgele/Software/Aliscore/index.en.html), synonymous signal was removed using the degen v1.4 Perl script (http://www.phylotools.com), and the third codon position was removed. Gene names correspond to gene numbers in the LEP1-COS core ortholog database included in this Dryad package. For further information see text and supplementary tables in Kawahara and Breinholt (2014).
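To illustrate what removing the third codon position means for the resulting "nt12" data, here is a short Python sketch. It is not the script used to build these files, and it does not perform the degen recoding step; it only shows how an in-frame sequence starting at codon position 1 shrinks to two-thirds of its length. The function name is hypothetical.

```python
def drop_third_positions(seq):
    """Keep codon positions 1 and 2 of an in-frame nucleotide sequence."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Take the first two bases of every codon and join them back together.
    return "".join(seq[i:i + 2] for i in range(0, len(seq), 3))


nt12 = drop_third_positions("ATGGCCTAA")  # codons ATG, GCC, TAA
```

In this example the nine-base input becomes the six-base string "ATGCTA".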
Kawahara_Breinholt_2014_26taxa_465loci_Degen_nt12.model.txt
Kawahara_Breinholt_2014_26taxa_465loci_Degen_nt12.model.txt: RAxML formatted partitioning file. This partitioning file was used to analyze data in Kawahara_Breinholt_2014_26taxa_465loci_Degen_nt12.nex. It contains 178 partitions defined by PartitionFinder (Lanfear et al. 2012). See text and supplementary text in Kawahara and Breinholt (2014) for more details.
Kawahara_Breinholt_2014_46taxa_2696loci_Degen_nt12.nex
Kawahara_Breinholt_2014_46taxa_2696loci_Degen_nt12.nex: Nucleotide nexus file containing codon positions 1 and 2 for 2696 genes for 46 taxa, with a CHARSET that defines each gene; each gene starts with codon position 1. See taxa_list.txt for the species name of each taxon. The nucleotide data was trimmed 3 codons at a time with ALICUT/ALISCORE (http://zfmk.de/web/Forschung/Abteilungen/AG_Wgele/Software/Aliscore/index.en.html), synonymous signal was removed using the degen v1.4 Perl script (http://www.phylotools.com), and the third codon position was removed. Gene names correspond to gene numbers in the LEP1-COS core ortholog database included in this Dryad package. For further information see text and supplementary tables in Kawahara and Breinholt (2014).
Kawahara_Breinholt_2014_46taxa_2696loci_Degen_nt12.model.txt
Kawahara_Breinholt_2014_46taxa_2696loci_Degen_nt12.model.txt: RAxML formatted partitioning and model file. This partitioning file was used to analyze data in Kawahara_Breinholt_2014_46taxa_2696loci_Degen_nt12.nex. It defines two partitions, one per codon position. See text and supplementary text in Kawahara and Breinholt (2014) for more details.
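If the two partitions correspond to codon positions 1 and 2 of an alignment whose third positions were already removed (an assumption; the model file itself is the authority), then the columns simply alternate between the two partitions. The Python sketch below shows that split for a single sequence; the function name is hypothetical.

```python
def split_nt12(seq):
    """Split an nt12 sequence into its position-1 and position-2 columns.

    Assumes the sequence starts at codon position 1 and contains only
    positions 1 and 2, so the two positions alternate.
    """
    if len(seq) % 2 != 0:
        raise ValueError("nt12 sequence length must be even")
    return seq[0::2], seq[1::2]


pos1, pos2 = split_nt12("ATGCTA")
```

Here pos1 is "AGT" and pos2 is "TCA", which is the same column assignment a partition line with a stride of 2 would express.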
Kawahara_Breinholt_2014_46taxa_2696loci_aa.nex
Kawahara_Breinholt_2014_46taxa_2696loci_aa.nex: Amino acid nexus file containing 2696 genes for 46 taxa, with a CHARSET that defines each gene. See taxa_list.txt for the species name of each taxon. The data was trimmed using ALICUT/ALISCORE (http://zfmk.de/web/Forschung/Abteilungen/AG_Wgele/Software/Aliscore/index.en.html). Gene names correspond to gene numbers in the LEP1-COS core ortholog database included in this Dryad package. For further information see text and supplementary tables in Kawahara and Breinholt (2014).