Data from: A phylotranscriptomic analysis of gene family expansion and evolution in the largest order of pleurocarpous mosses (Hypnales, Bryophyta)

Johnson MG, Malley C, Goffinet B, Jonathan Shaw A, Wickett NJ

Date Published: February 9, 2016

DOI: http://dx.doi.org/10.5061/dryad.475g7

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0, Open Data).

Title mosstrap-masked-coding.tar
Downloaded 9 times
Description RNA-Seq was conducted on an Illumina HiSeq 2000 platform, and reads were quality trimmed using Trimmomatic. De novo transcriptome assemblies were generated using Trinity, which can produce many isoforms per gene. We used a phylogenetic approach to decide which isoforms to retain. Transcripts that contained a valid protein with a BLASTp hit to a known land plant proteome were clustered using MCL, and gene trees were built from the sequences in each cluster. To generate the "masked" dataset, the gene trees were searched for monophyletic clades containing only sequences from the same species. One transcript per clade was retained, and the others were pruned from further analysis. This archive contains a multi-FASTA file of coding (DNA) sequences for each of the transcriptomes generated for the study.
Download mosstrap-masked-coding.tar.gz (159.3 Mb)
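The masking step described above can be illustrated with a short Python sketch. It is not the authors' code. It assumes a rooted gene tree held as nested tuples, and leaf labels of the hypothetical form "species@transcript". The description does not say which transcript in a clade was retained, so the sketch simply keeps the first one it meets.

```python
def species_of(leaf):
    # Assumed label convention: "species@transcript" (hypothetical).
    return leaf.split("@")[0]


def mask_monophyletic(tree):
    """Collapse every clade whose leaves all come from one species to a single leaf.

    A leaf is a string; an internal node is a tuple of subtrees.
    """
    if isinstance(tree, str):
        return tree
    children = [mask_monophyletic(child) for child in tree]
    all_leaves = all(isinstance(child, str) for child in children)
    if all_leaves and len({species_of(child) for child in children}) == 1:
        # Arbitrary choice for illustration: keep the first transcript.
        return children[0]
    return tuple(children)


# Three transcripts of species A form a clade; B and C are single copies.
example_tree = ((("A@t1", "A@t2"), "A@t3"), ("B@t1", "C@t1"))
masked_tree = mask_monophyletic(example_tree)
```

Because the check runs from the tips toward the root, nested same-species clades collapse in one pass. The result depends on how the gene tree is rooted.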
Title masked_orthogroup_table
Downloaded 11 times
Description This file contains a correspondence between assembled transcripts and the orthogroups generated by the Yang/Smith pipeline (see manuscript for details). Species are separated by tabs, and transcripts from the same species in the same orthogroup are separated by commas.
Download masked_orthogroup_table.txt (14.86 Mb)
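The layout described above (tabs between species, commas between transcripts) can be read with a few lines of Python. This is a minimal sketch under stated assumptions: one orthogroup per line, and an empty field meaning the species has no transcript in that orthogroup. Whether the file has a header row or a leading orthogroup ID column is not stated here, so inspect the file before relying on this. The function names and example IDs are hypothetical.

```python
def parse_orthogroup_line(line):
    """Split one row into a list (one entry per species) of transcript ID lists."""
    fields = line.rstrip("\n").split("\t")
    return [
        [tid.strip() for tid in field.split(",") if tid.strip()]
        for field in fields
    ]


def copy_numbers(line):
    """Count the transcripts each species contributes to one orthogroup."""
    return [len(ids) for ids in parse_orthogroup_line(line)]


# Hypothetical row: two transcripts from species 1, none from species 2,
# one from species 3.
example_row = "sp1_t1,sp1_t2\t\tsp3_t9\n"
parsed_row = parse_orthogroup_line(example_row)
counts = copy_numbers(example_row)
```

Per-species counts of this kind are a natural starting point for looking at gene family size across taxa.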
Title singlegene.tar
Downloaded 7 times
Description This archive contains single-gene sequence alignments for 657 "1-to-1 orthologs" used to generate a species phylogeny for the study. Each directory contains the single-gene sequence alignment in multi-FASTA format and the results of a maximum likelihood search in RAxML. Nucleotide sequences were aligned with MAFFT and trimmed with trimAl to remove columns represented by fewer than 5 taxa. Phylogenies were reconstructed with the GTRGAMMA substitution model, and nodal support was evaluated with 200 "fast bootstrap" replicates.
Download singlegene.tar.gz (11.66 Mb)
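The column filter described above (drop alignment columns represented by fewer than 5 taxa) can be sketched in Python as follows. This illustrates the rule only and is not trimAl or its command line. It assumes the alignment is already loaded as a dict of equal-length sequences and that "-" and "?" mark missing data. All names are hypothetical.

```python
def trim_sparse_columns(alignment, min_taxa=5, gap_chars="-?"):
    """Keep only columns in which at least `min_taxa` sequences have a residue."""
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences in an alignment must have the same length")
    n_columns = lengths.pop()
    keep = [
        i
        for i in range(n_columns)
        if sum(seq[i] not in gap_chars for seq in alignment.values()) >= min_taxa
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Hypothetical six-taxon alignment: column 0 has 6 taxa, column 1 has 4,
# and column 2 has 5, so only column 1 is removed.
example_alignment = {
    "t1": "ACG",
    "t2": "ACG",
    "t3": "ACG",
    "t4": "ACG",
    "t5": "A-G",
    "t6": "A--",
}
trimmed = trim_sparse_columns(example_alignment)
```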
Title supermatrix_cds_bootstrap_ML
Downloaded 9 times
Description Phylogeny reconstructed from a concatenated nucleotide matrix of single genes in RAxML version 8. A single GTRGAMMA partition was specified for each locus. Support values are from 200 full bootstrap replicates.
Download supermatrix_cds_bootstrap_ML.tre (16.03 Kb)
Title species_tree_astral_jackknife_cds
Downloaded 6 times
Description ASTRAL species tree reconstructed from nucleotide RAxML gene trees for 657 genes. Support values were generated by jackknifing: 65 gene trees were sampled without replacement, 1000 times.
Download species_tree_astral_jackknife_cds.tre (21.55 Kb)
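The jackknife resampling described above can be sketched as follows. This covers only the subsampling step: each replicate draws 65 of the 657 gene trees without replacement. Running ASTRAL on each subset and summarizing support across replicates are not shown. The function name, the seed, and the gene tree labels are hypothetical.

```python
import random


def jackknife_replicates(gene_trees, sample_size=65, n_replicates=1000, seed=0):
    """Draw `n_replicates` subsets of `sample_size` gene trees, each without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.sample(gene_trees, sample_size) for _ in range(n_replicates)]


# Placeholder labels standing in for the 657 gene tree files.
gene_trees = [f"gene_{i:03d}" for i in range(657)]
replicates = jackknife_replicates(gene_trees)
```

Within a replicate no gene tree appears twice, but the same gene tree can appear in many replicates.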
Title cluster_gocats
Downloaded 11 times
Description Gene ontology (GO) annotations for each orthogroup generated from the masked dataset. The Trinotate transcriptome annotation pipeline was used to assign GO annotations to transcripts using BLASTp searches of the UniProt database and HMM searches of the Pfam protein domain database. All unique GO annotations for sequences in an orthogroup are shown.
Download cluster_gocats.txt (2.976 Mb)
Title sequenceID_to_speciesID
Downloaded 12 times
Description Correspondence between sequenceIDs used in data files and the family, genus, and species names of samples used in the analysis.
Download sequenceID_to_speciesID.txt (1.09 Kb)

When using this data, please cite the original publication:

Johnson MG, Malley C, Goffinet B, Jonathan Shaw A, Wickett NJ (2016) A phylotranscriptomic analysis of gene family expansion and evolution in the largest order of pleurocarpous mosses (Hypnales, Bryophyta). Molecular Phylogenetics and Evolution 98: 29–40. http://dx.doi.org/10.1016/j.ympev.2016.01.008

Additionally, please cite the Dryad data package:

Johnson MG, Malley C, Goffinet B, Jonathan Shaw A, Wickett NJ (2016) Data from: A phylotranscriptomic analysis of gene family expansion and evolution in the largest order of pleurocarpous mosses (Hypnales, Bryophyta). Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.475g7