Data from: Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments.

Vijay N, Poelstra JW, Künstner A, Wolf JBW

Date Published: August 13, 2012

DOI: http://dx.doi.org/10.5061/dryad.3t3n7

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0, Open Data).

Title Polishing the reference genome
Downloaded 48 times
Description In the first step, each N in the reference genome was replaced by one of the four bases, chosen at random with Perl's built-in random number generator.
Download 1_referencepolish.pl.pl (880 bytes)
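The polishing step described above can be illustrated with a minimal Python sketch. This is not the authors' Perl script; the function name `polish_sequence` and the example sequence are hypothetical, and the sketch assumes that each N is replaced independently by A, C, G, or T with equal probability.

```python
import random

def polish_sequence(seq, rng=None):
    """Replace every N (or n) in seq with a base drawn uniformly from A, C, G, T."""
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") if base in "Nn" else base for base in seq)

# Hypothetical example sequence; a fixed seed makes the run reproducible.
original = "ACGTNNACGNT"
polished = polish_sequence(original, random.Random(42))
```

All positions that were not N are left unchanged, so the polished sequence has the same length and coordinates as the input.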
Title Simulate raw sequencing data - Step 1
Downloaded 64 times
Description In the second step, the reference genome is used along with expression profiles (exponential or uniform) to create fastq files using the program dwgsim version 0.1.2.
Download 2a_simulate_reads_expression.pl (3.274 Kb)
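As a rough illustration of what an expression profile might look like before reads are simulated, the sketch below assigns a target read count to each gene. It is not taken from 2a_simulate_reads_expression.pl and does not call dwgsim. The function name `expression_profile`, the `total_reads` parameter, and the interpretation of "uniform" as equal expression for every gene are all assumptions made for this example.

```python
import random

def expression_profile(n_genes, kind="exponential", total_reads=100_000, seed=0):
    """Return a list of target read counts, one per gene (illustrative only)."""
    rng = random.Random(seed)
    if kind == "exponential":
        # Assumption: relative expression drawn from an exponential distribution.
        weights = [rng.expovariate(1.0) for _ in range(n_genes)]
    elif kind == "uniform":
        # Assumption: "uniform" means every gene is expressed at the same level.
        weights = [1.0] * n_genes
    else:
        raise ValueError(f"unknown profile kind: {kind}")
    total = sum(weights)
    # Rounding means the counts may not sum exactly to total_reads.
    return [round(total_reads * w / total) for w in weights]

uniform_counts = expression_profile(5, "uniform", total_reads=1000)
exponential_counts = expression_profile(50, "exponential")
```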
Title Simulate raw sequencing data - Step 2
Downloaded 40 times
Description The 2b_gnrCounts.R script generates per-gene, per-library expression levels for 2x10 libraries with differential expression between them. The 2c_drawreads.pl script then uses these read count files, together with the fastq files simulated in the previous step, to create the 2x10 libraries.
Download 2b_gnrCounts.R (3.67 Kb)
Title Simulate raw sequencing data - Step 3
Downloaded 51 times
Download 2c_drawreads.pl (1.685 Kb)
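The two sub-steps above (generate per-library counts, then draw that many reads from the simulated pool) can be sketched as follows. This is an illustrative Python sketch, not a port of 2b_gnrCounts.R or 2c_drawreads.pl. The multinomial sampling model, the fixed fold change, and the names `simulate_counts` and `draw_reads` are assumptions chosen for this example.

```python
import random
from collections import Counter

def simulate_counts(base_expr, de_genes, fold_change=2.0, n_reps=10,
                    library_size=10_000, seed=1):
    """Return {condition: [Counter per library]} for two conditions, A and B."""
    rng = random.Random(seed)
    genes = list(base_expr)
    libraries = {}
    for condition in ("A", "B"):
        # Differentially expressed genes are scaled in condition B only.
        weights = [
            base_expr[g] * (fold_change if condition == "B" and g in de_genes else 1.0)
            for g in genes
        ]
        # Each library is one multinomial draw of library_size reads over genes.
        libraries[condition] = [
            Counter(rng.choices(genes, weights=weights, k=library_size))
            for _ in range(n_reps)
        ]
    return libraries

def draw_reads(read_pool, counts, rng):
    """Sample, without replacement, counts[gene] reads from each gene's read pool."""
    return {g: rng.sample(read_pool[g], min(n, len(read_pool[g])))
            for g, n in counts.items()}

# Hypothetical toy input: three genes, one of them differentially expressed.
libs = simulate_counts({"g1": 1.0, "g2": 1.0, "g3": 2.0}, de_genes={"g1"})
pool = {"g1": [f"read{i}" for i in range(5)]}
drawn = draw_reads(pool, Counter({"g1": 3}), random.Random(0))
```

With `n_reps=10`, the result holds ten libraries per condition, matching the 2x10 design described above.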
Title De novo assembly
Downloaded 60 times
Description In the next step, de novo assemblies are performed as the available computational infrastructure allows. We provide an example of the command line we used.
Download 3_denovo.sh (473 bytes)
Title Mapping assembly
Downloaded 47 times
Description The mapping assembly first requires mapping the reads onto the reference using the Stampy read mapper. The SAM/BAM file obtained from the mapping step is sorted and used to call the consensus sequence. The example command line we used can be modified as needed.
Download 4_mapping.sh (724 bytes)
Title Compare de novo assemblies
Downloaded 42 times
Description The quality of de novo assemblies can be evaluated using the scripts available on the GAGA website when alternative splicing is not being simulated. Step 5 consists of 7 sub-steps, each with its own script.
Download 5_compare_denovo.sh (2.876 Kb)
Title Compare de novo assemblies - Steps 1 to 7
Downloaded 61 times
Download 5_substeps.zip (4.866 Kb)
Title Compare mapping assemblies - Mapping 1
Downloaded 41 times
Description The quality of the mapping assemblies obtained after the consensus step is evaluated.
Download 6a_compare_mapping1.pl (4.072 Kb)
Title Compare mapping assemblies - Mapping 2
Downloaded 32 times
Description The quality of the mapping assemblies obtained after the consensus step is evaluated.
Download 6b_compare_mapping2.pl (4.071 Kb)
Title Differential expression analysis - baySeq
Downloaded 46 times
Description Differential expression analysis across the 2x10 libraries is performed using the Bioconductor package baySeq.
Download 7a_bayseq.R (5.16 Kb)
Title Differential expression analysis - edgeR
Downloaded 58 times
Description Differential expression analysis across the 2x10 libraries is performed using the Bioconductor package edgeR.
Download 7b_edgeR.R (5.369 Kb)
Title Evaluation of differential expression analysis
Downloaded 48 times
Description This script evaluates the output of the differential expression analyses, e.g. it computes false positive rates (FPRs) and true positive rates (TPRs).
Download 8_DEevaluation.R (9.059 Kb)
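For readers unfamiliar with the evaluation metrics named above, the following Python sketch shows how an FPR and a TPR can be computed when the truly differentially expressed genes are known from the simulation. It is not derived from 8_DEevaluation.R; the function name `fpr_tpr` and the toy gene sets are hypothetical.

```python
def fpr_tpr(called_de, true_de, all_genes):
    """Return (FPR, TPR) for a set of genes called differentially expressed."""
    called, truth, universe = set(called_de), set(true_de), set(all_genes)
    true_positives = len(called & truth)
    false_positives = len(called - truth)
    positives = len(truth)             # genes simulated as differentially expressed
    negatives = len(universe - truth)  # genes simulated without differential expression
    tpr = true_positives / positives if positives else float("nan")
    fpr = false_positives / negatives if negatives else float("nan")
    return fpr, tpr

# Toy example: 10 genes, 4 truly differentially expressed, 3 called by a method.
all_genes = [f"g{i}" for i in range(1, 11)]
fpr, tpr = fpr_tpr({"g1", "g2", "g5"}, {"g1", "g2", "g3", "g4"}, all_genes)
```

In this toy case, two of the four true positives are recovered (TPR = 0.5) and one of the six true negatives is called in error (FPR = 1/6).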
Title Create data for zebra finch.
Downloaded 35 times
Description Steps 1 and 2 are accomplished by running 1_2_create_datasets.sh for the zebra finch dataset.
Download 1_2_create_datasets.sh (5.286 Kb)
Title ZebraFinchData
Downloaded 61 times
Description Zebra finch CDS downloaded from Biomart (Ensembl Version 61) with 100 bp 5' UTR and 400 bp 3' UTR per gene.
Download ZebraFinchData.zip (11.71 Mb)
Title Distributions
Downloaded 55 times
Description Simulated read distributions per expression category.
Download distr.zip (3.181 Kb)

When using this data, please cite the original publication:

Vijay N, Poelstra JW, Künstner A, Wolf JBW (2012) Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments. Molecular Ecology 22(3): 620–634. http://dx.doi.org/10.1111/mec.12014

Additionally, please cite the Dryad data package:

Vijay N, Poelstra JW, Künstner A, Wolf JBW (2012) Data from: Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.3t3n7