Data from: De novo transcriptome assembly databases in the butterfly orchid Phalaenopsis equestris

Niu S, Xu Q, Zhang G, Zhang Y, Tsai W, Hsu J, Liang C, Luo Y, Liu Z

Date Published: September 22, 2016

DOI: http://dx.doi.org/10.5061/dryad.8253q

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0).

Title P. equestris genome assembly
Downloaded 28 times
Description The P. equestris genome scaffolds, plus a file describing where the scaffolds and contigs are located within the superscaffolds
Download Pha_1213.scafSeq.FG2_superscaffold.tar.gz (306.4 Mb)
Title P. equestris genome repeat annotation
Downloaded 17 times
Description The P. equestris genome repeat annotation, which contains the repeat annotation files produced by proteinmasker, repeatmasker and TRF, the GFF-format files of the repeat annotation by proteinmasker, repeatmasker and TRF, the GFF-format file of the de novo repeat annotation, and an xlsx file of repeat annotation statistics.
Download pequ_repeat_dataset1.tar.gz (187.9 Mb)
Title P. equestris genome gene models
Downloaded 21 times
Description The P. equestris genome gene models, containing the predicted coding sequences, the predicted proteins and a GFF-format file
Download pequ_gene_models_dataset1.tar (52.94 Mb)
Title P. equestris genome functional annotation
Downloaded 18 times
Description The P. equestris genome functional annotation dataset contains the BLAST results against the KEGG, InterPro, Swiss-Prot and TrEMBL databases
Download pequ_functional_annotation_dataset1.tar (68.25 Mb)
Title The transcriptome assembly
Downloaded 30 times
Description The dataset contains the unigenes, taken as the longest contig per transcript generated by Trinity. The fb.flower bud.Unigene.fa file contains unigenes from the flower bud of P. equestris, the L5.root.Unigene.fa file contains unigenes from the root, the L6.stem.Unigene.fa file contains unigenes from the stem, and the PHA.leaf.Unigene.fa file contains unigenes from the leaf. The 12_day.unigene.fasta, 7_day.unigene.fasta and 4_day.unigene.fasta files contain unigenes from seeds sampled 12, 7 and 4 days, respectively, after sowing on 1/2 MS medium. The sepal.unigene.fasta, petal.unigene.fasta, lip.unigene.fasta and column.unigene.fasta files contain unigenes from the sepal, petal, lip and column, respectively.
Download unigene_dataset3.tar (383.4 Mb)
Title The transcriptome functional annotation
Downloaded 39 times
Description The dataset contains functional annotation and coding sequence annotation for 11 tissues. There are five annotation files per tissue: three functional annotation files (the KEGG, COG and Nr database annotations) and two structural annotation files (cds and pep). The cds and pep files are in FASTA format; the title line of each record contains the unigene name of the predicted coding sequence, the locus and the coding direction.
Download annotation_dataset4.tar.gz (258.7 Mb)
Title HSP gene family in the eleven transcriptomes
Downloaded 13 times
Description We tested full-length transcripts against the HSP90 and HSP70 gene families to examine the completeness of the data, comparing the transcriptomes of 11 tissues with the P. equestris genome. PEQU means P. equestris; flower bud, root, stem and leaf are labeled fb, L5, L6 and PHA, respectively. 4_day_seed, 7_day_seed and 12_day_seed are seeds sampled 4, 7 and 12 days, respectively, after sowing on 1/2 MS medium.
Download HSP_dataset5.tar (368.6 Kb)
Title 100 CEGs for checking transcript assembly completeness
Downloaded 7 times
Description The alignment results for 100 randomly selected conserved core eukaryotic genes (CEGs) among Arabidopsis thaliana, P. equestris and the eleven transcriptomes, used to examine the completeness of the transcript assemblies. 82 CEG sequences (82%) were perfectly reconstructed, showing high consistency. Some sequences, such as the homologs of At2g36880.1 and At1g12840.1, suggest that parts of the sequence are missing from the PEQU genome, and some transcriptome sequences, such as the homologs of At4g39280.1, should be merged.
Download CEGs_dataset6.tar (675.8 Kb)
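
The transcriptome assembly record above names eleven unigene files, one per tissue. As a minimal sketch of how those files might be summarized once unigene_dataset3.tar has been extracted, the Python below maps each file name to its tissue and counts the sequences in a FASTA handle. The names TISSUE_BY_FILE, read_fasta and summarize are hypothetical helpers introduced here for illustration; the file names are taken from the description and have not been checked against the archive contents.

```python
from io import StringIO

# File-name-to-tissue mapping copied from the package description.
# Assumption: the files inside the archive use exactly these names.
TISSUE_BY_FILE = {
    "fb.flower bud.Unigene.fa": "flower bud",
    "L5.root.Unigene.fa": "root",
    "L6.stem.Unigene.fa": "stem",
    "PHA.leaf.Unigene.fa": "leaf",
    "4_day.unigene.fasta": "seed, 4 days after sowing",
    "7_day.unigene.fasta": "seed, 7 days after sowing",
    "12_day.unigene.fasta": "seed, 12 days after sowing",
    "sepal.unigene.fasta": "sepal",
    "petal.unigene.fasta": "petal",
    "lip.unigene.fasta": "lip",
    "column.unigene.fasta": "column",
}


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle):
    """Return the record count, total length and longest sequence length."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "count": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
    }


if __name__ == "__main__":
    # Tiny in-memory example standing in for one extracted unigene file.
    demo = StringIO(">unigene1\nACGT\nAC\n>unigene2\nGG\n")
    print(summarize(demo))
```

To apply this to the real data, open each extracted file in text mode and pass the handle to summarize, using TISSUE_BY_FILE to label the result.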

When using this data, please cite the original publication:

Niu S, Xu Q, Zhang G, Zhang Y, Tsai W, Hsu J, Liang C, Luo Y, Liu Z (2016) De novo transcriptome assembly databases in the butterfly orchid Phalaenopsis equestris. Scientific Data 3: 160083. http://dx.doi.org/10.1038/sdata.2016.83

Additionally, please cite the Dryad data package:

Niu S, Xu Q, Zhang G, Zhang Y, Tsai W, Hsu J, Liang C, Luo Y, Liu Z (2016) Data from: De novo transcriptome assembly databases in the butterfly orchid Phalaenopsis equestris. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.8253q
