An ambitious yet fundamental goal for comparative biology is to understand the evolutionary relationships for all of life. Yet many important taxonomic groups have remained recalcitrant to inclusion in broader scale studies. Here, we focus on the collection of 9 new 454 transcriptome data sets from Ostracoda, an ancient and diverse group with a dense fossil record, which is often under-sampled in broader studies. We combine the new transcriptomes with a new morphological matrix (including fossils) and existing Expressed Sequence Tag (EST), mitochondrial genome, nuclear genome and rDNA data. Our analyses lead to new insights into ostracod and pancrustacean phylogeny. We obtained support for three epic pancrustacean clades that likely originated in the Cambrian: Oligostraca (Ostracoda, Mystacocarida, Branchiura, Pentastomida); Multicrustacea (Copepoda, Malacostraca, Thecostraca); and a clade we refer to as Allotriocarida (Hexapoda, Remipedia, Cephalocarida, Branchiopoda). Within the Oligostraca clade, our results support ostracod monophyly, a previously unresolved question. Within Multicrustacea, we find support for Thecostraca plus Copepoda, for which we suggest the name Hexanauplia. Within Allotriocarida, some analyses support the hypothesis that Remipedia is the sister taxon to Hexapoda, but others support Branchiopoda+Cephalocarida as the sister group of hexapods. In multiple different analyses, we see better support for equivocal nodes using slow-evolving genes or when excluding distant outgroups, highlighting the increased importance of conditional data combination in this age of abundant, often anonymous data. Yet, when we analyze the same set of species and ignore rate of gene evolution, we find higher support when including all data, more in line with a ‘total evidence’ philosophy.
By concatenating molecular and morphological data, we place pancrustacean fossils in the phylogeny, which can be used for studies of divergence times in Pancrustacea, Arthropoda, or Metazoa. Our results and new data will allow for attributes of Ostracoda, such as its amazing fossil record and diverse biology, to be leveraged in broader scale comparative studies. Further, we illustrate how adding extensive next-generation sequence data from understudied groups can yield important new phylogenetic insights into long-standing questions, especially when carefully analyzed in combination with other data.
Conchoecissa sequence data
Assembled 454 transcriptome data, otherwise unpublished. Only sequences used in the MBE-published phylogenetic analysis are provided here. Collection information: Trawl, Lower Sur Canyon, on R/V Western Flyer. 36.06° N 122.29° W. December 10th, 2009.
Conchoecissa_sp.fasta
Morphological matrix for phylogenetics
Morphological data matrix, including extant and fossil taxa. Stored in Morphobank, project ID 689.
Raw 454: Vestalenula_sp
Raw transcriptome sequence from full bodies; multiple individuals pooled together. Adapter sequences included. Adapter = AAGCAGTGGTATCAACGCAGAGTACTTTTTTCTTTTTT. Collection information: Freshwater puddle, Isla Colon, Bocas del Toro, Panama. Net collecting. 9º21.17'N 82º15.45'W. July 29th, 2009. 10 cm depth.
Vestalenula_sp.fasta
Raw 454: Vestalenula_sp quality file
Quality associated with fasta file.
Vestalenula_sp.fasta.qual
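The raw read files in this package come in pairs: a FASTA file with adaptors still attached and a matching .qual file. The following is a minimal Python sketch of how such a pair might be read and a leading adaptor removed. It assumes the usual 454 layout (">" header lines shared by both files, whitespace-separated integer scores in the .qual file) and that the adaptor, when present, sits at the start of the read; neither is confirmed by the descriptions here. The function names are hypothetical, and the default adaptor is the shorter string quoted for most libraries (the Vestalenula library lists a longer one, which can be passed in instead).

```python
from typing import Dict, Iterable, List, Tuple

# Adaptor string quoted in most file descriptions; pass another one if needed.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGT"


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map read ID (first word of the header) to its sequence."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def read_qual(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Map read ID to its list of integer quality scores."""
    records: Dict[str, List[int]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].extend(int(score) for score in line.split())
    return records


def trim_leading_adapter(
    seq: str, quals: List[int], adapter: str = ADAPTER
) -> Tuple[str, List[int]]:
    """Remove the adaptor only if the read starts with it (assumed simple case)."""
    if seq.upper().startswith(adapter):
        n = len(adapter)
        return seq[n:], quals[n:]
    return seq, quals
```

In practice one would open the .fasta and .qual files, pass the file handles to read_fasta and read_qual, and trim each read whose ID appears in both. Dedicated trimming tools handle partial and internal adaptor matches, which this sketch does not.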
Raw 454: Euphilomedes_morini_Compound_Eyes
Euphilomedes morini. 454 transcriptome data from compound eyes. Pooled ~50 animals together (100 eyes). Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Stearns Wharf Pier, Santa Barbara, CA. Ekman grab. 34º24.4'N 119º40.5'W. Oct., Nov., 2008. 10 m depth.
Euphilomedes_morini_Compound_Eyes.fasta
Raw 454: Euphilomedes_morini_Compound_Eyes quality file
Quality file associated with fasta sequence file
Euphilomedes_morini_Compound_Eyes.fasta.qual
Raw 454: Euphilomedes_morini median eye transcriptome
Euphilomedes morini. 454 transcriptome data from median eyes. Pooled ~50 animals together (50 eyes). Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Stearns Wharf Pier, Santa Barbara, CA. Ekman grab. 34º24.4'N 119º40.5'W. Oct., Nov., 2008. 10 m depth.
Euphilomedes_morini_Median_Eyes.fasta
Raw 454: Euphilomedes_morini_Median_Eyes quality
Quality file associated with fasta sequence file
Euphilomedes_morini_Median_Eyes.fasta.qual
Raw 454: Cytherelloidea_californica
Cytherelloidea californica. 454 transcriptome data from full bodies. Pooled 26 animals together (50 eyes). Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Camino de la Costa Beach Access, La Jolla, San Diego. Algae collecting. 24º46.9'N 80º54.58'W. May 14th, 2010. Intertidal, only on very low tide.
Cytherelloidea_californica.fasta
Raw 454: Cytherelloidea_californica quality file
Quality file associated with fasta sequence file
Cytherelloidea_californica.fasta.qual
Raw 454: Puriana_sp_Bodies
Puriana sp. 454 transcriptome data from bodies minus median eyes. Pooled multiple animals together. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Isla Colon, Bocas del Toro, Panama. Net collecting. 9º21'N 82º15.45'W. July 23rd, 2009. 1 m depth.
Puriana_sp_Bodies.fasta
Raw 454: Puriana_sp_Median_Eyes quality file
Quality file associated with fasta sequence file
Puriana_sp_Median_Eyes.fasta.qual
Raw 454: Puriana_sp_Median_Eyes
Puriana sp. 454 transcriptome data from median eyes. Pooled multiple animals together. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Isla Colon, Bocas del Toro, Panama. Net collecting. 9º21'N 82º15.45'W. July 23rd, 2009. 1 m depth.
Puriana_sp_Median_Eyes.fasta
Raw 454: Puriana_sp_Bodies quality file
Quality file associated with fasta sequence file
Puriana_sp_Bodies.fasta.qual
Raw 454: Heterocypris_sp_Bodies
Heterocypris sp. 454 transcriptome data from bodies minus eyes. Pooled 30 animals together. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Temporary freshwater pool, More Mesa, Santa Barbara, CA. Net collecting. 34º25.23'N 119º47.29'W. 10 cm depth.
Heterocypris_sp_Bodies.fasta
Raw 454: Heterocypris_sp_Bodies Quality file
Quality file associated with fasta sequence file
Heterocypris_sp_Bodies.fasta.qual
Raw 454: Heterocypris_sp_Median_Eyes
Heterocypris sp. 454 transcriptome data from median eyes. Pooled 100 eyes together. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Temporary freshwater pool, More Mesa, Santa Barbara, CA. Net collecting. 34º25.23'N 119º47.29'W. 10 cm depth.
Heterocypris_sp_Median_Eyes.fasta
Raw 454: Heterocypris_sp_Median_Eyes quality file
Quality file associated with fasta sequence file
Heterocypris_sp_Median_Eyes.fasta.qual
Raw 454: Skogsbergia_lerneri_Compound_Eyes
Skogsbergia lerneri. 454 transcriptome data. compound eyes (more than 50 pooled together). Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Duck Key Viaduct, FL. bait trap. 24º46.9'N 80º54.58'W. July 16th thru July 18th, 2009. 2-3m depth.
Skogsbergia_lerneri_Compound_Eyes.fasta
Raw 454: Skogsbergia_lerneri_Compound_Eyes quality file
Quality file associated with fasta sequence file
Skogsbergia_lerneri_Compound_Eyes.fasta.qual
Raw 454: Skogsbergia_lerneri_Median_Eyes
Skogsbergia lerneri. 454 transcriptome data. median eyes (more than 50 pooled together). Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Duck Key Viaduct, FL. bait trap. 24º46.9'N 80º54.58'W. July 16th thru July 18th, 2009. 2-3m depth.
Skogsbergia_lerneri_Median_Eyes.fasta
Raw 454: Skogsbergia_lerneri_Median_Eyes quality file
Quality file associated with fasta sequence file
Skogsbergia_lerneri_Median_Eyes.fasta.qual
Raw 454: Vargula_tsujii_Body1
Vargula tsujii. 454 transcriptome data. 33 Bodies minus eyes. Two separate runs of 454 from same RNA. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Fishermen’s Cove, Twin Harbors, Catalina Island, CA bait trap 33º26.66'N 118º29.34'W July 10th and 11th, 2009 Depth: 5-10m
Vargula_tsujii_Body1.fasta
Raw 454: Vargula_tsujii_Body1 quality file
Quality file associated with fasta sequence file
Vargula_tsujii_Body1.fasta.qual
Raw 454: Vargula_tsujii_Body2
Vargula tsujii 454 transcriptome data. 33 Bodies minus eyes. Two separate runs of 454 from same RNA. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Fishermen’s Cove, Twin Harbors, Catalina Island, CA bait trap 33º26.66'N 118º29.34'W July 10th and 11th, 2009 Depth: 5-10m
Vargula_tsujii_Body2.fasta
Raw 454: Vargula_tsujii_Body2 quality file
Quality file associated with fasta sequence file
Vargula_tsujii_Body2.fasta.qual
Raw 454: Actinoseta_jonesi
Actinoseta jonesi 454 transcriptome data. Full bodies including eyes. Multiple pooled individuals. Raw sequence reads, including adaptors. Adaptor: AAGCAGTGGTATCAACGCAGAGT. Collection information: Cayo Enrique, La Parguera, Puerto Rico net collecting 17°57.335’N 67°03.185’W September 12th, 2010 2-3m depth.
Actinoseta_jonesi.fasta
Raw 454: Actinoseta_jonesi quality file
Quality file associated with fasta sequence file
Actinoseta_jonesi.fasta.qual
Data used in phylogenetic analyses
Data used in phylogenetic analyses, in column-separated text format. The first column is the species name, the second is the name of the data partition, the third is a unique ID for the species/partition data (e.g., a GenBank GI number), and the fourth is the sequence or morphological data. The sequence data are derived from the raw 454 data or from other sources. Raw 454 data were assembled using Newbler. We pulled genes from the transcriptomes using HaMStR, aligned sequences with MUSCLE, and removed ambiguously aligned regions with AliScore/AliCut. Unique identifiers with "Voucher" in the name are previously unpublished and will be placed in GenBank. This four-column data format can be used with many tools developed in the Osiris package for conducting phylogenetics in Galaxy.
FullData.csv
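As an illustration of the four-column layout described above, the sketch below groups rows by data partition and lists the identifiers that contain "Voucher". It is a sketch under stated assumptions: the column delimiter is not specified in the description, so it is a parameter here (the comma default is only a guess from the .csv extension), the file is assumed to have no header row, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def load_four_column(
    handle: Iterable[str], delimiter: str = ","  # delimiter is an assumption
) -> Tuple[Dict[str, Dict[str, str]], List[Tuple[str, str, str]]]:
    """Return ({partition: {species: data}}, [(species, partition, id), ...]).

    The second element lists rows whose unique ID contains "Voucher",
    i.e. data described as previously unpublished.
    """
    partitions: Dict[str, Dict[str, str]] = defaultdict(dict)
    unpublished: List[Tuple[str, str, str]] = []
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        species, partition, uid, data = (field.strip() for field in row)
        partitions[partition][species] = data
        if "Voucher" in uid:
            unpublished.append((species, partition, uid))
    return dict(partitions), unpublished


# Tiny made-up example; real rows come from FullData.csv.
example = io.StringIO(
    "Species_a,gene1,12345,ACGT\n"
    "Species_b,gene1,Voucher_01,AC-T\n"
    "Species_a,morphology,m1,0101\n"
)
by_partition, vouchers = load_four_column(example)
```

Grouping by partition first makes it straightforward to write one alignment per partition or to concatenate partitions per species, which is how a combined molecular and morphological matrix would typically be assembled.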