Data from: Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence

Neme R, Tautz D

Date Published: February 12, 2016

DOI: http://dx.doi.org/10.5061/dryad.8jb83

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0, Open Data).

Title Genome coverage of the mm10 mouse reference genome of four closely related species
Description Genome coverage of the mm10 mouse reference genome, computed from genomic alignments of Apodemus uralensis, Mus mattheyi, Mus spicilegus, and Mus spretus. The first six fields give the genomic location, and the following four give the coverage for each of these species, in the order listed. Features were generated with bedtools, converted into SAF format, and extracted from BAM alignments using the featureCounts suite.
Download 200b_win.features.genomes.norm.ind.zip (169.1 Mb)
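As a rough illustration of the layout described above (six location fields followed by one value per species), the following sketch reads such a table in Python. The description does not state the delimiter, whether a header line is present, or the names of the location fields, so the tab delimiter, the header row, and the sample values below are all assumptions made only for this example.

```python
import csv
import io

# Species order as given in the description of the genome coverage file.
SPECIES = ["Apodemus uralensis", "Mus mattheyi", "Mus spicilegus", "Mus spretus"]
N_LOCATION_FIELDS = 6  # "The first six fields" in the description


def read_windows(handle, delimiter="\t"):
    """Yield (location_fields, {species: value}) for each data row.

    Rows with an unexpected number of fields, or with non-numeric
    coverage values (for example a header line), are skipped.
    """
    expected = N_LOCATION_FIELDS + len(SPECIES)
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != expected:
            continue
        try:
            values = [float(v) for v in row[N_LOCATION_FIELDS:]]
        except ValueError:
            continue
        yield row[:N_LOCATION_FIELDS], dict(zip(SPECIES, values))


# Invented sample data: the header names and numbers are hypothetical.
SAMPLE = (
    "id\tchr\tstart\tend\tstrand\tlength\tAP\tMA\tSC\tSP\n"
    "w1\tchr1\t3000001\t3000200\t+\t200\t0.0\t1.5\t2.0\t4.25\n"
)

for location, coverage in read_windows(io.StringIO(SAMPLE)):
    print(location, coverage)
```

To read the real file, the `io.StringIO(SAMPLE)` handle would be replaced by an open file handle on the unzipped table, after checking its actual delimiter and header.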
Title Transcriptome coverage of the mm10 mouse reference genome across 200bp windows in ten closely related taxa and three tissues.
Description Transcriptome coverage of the mm10 mouse reference genome, computed from transcriptome alignments of three populations of Mus musculus domesticus (AH from Iran, CB from Germany, MC from France), two populations of Mus musculus musculus (KH from Kazakhstan, WI from Austria), Mus musculus castaneus (TA), Mus spicilegus (SC), Mus spretus (SP), Mus mattheyi (MA) and Apodemus uralensis (AP). The first six fields give the genomic location, and the following fields give the coverage for each of the transcriptomes of the taxa listed here. Brain samples (pbrain), liver samples (pliver) and testis samples (ptestis) were each sequenced on approximately one third of an Illumina HiSeq 2000 lane per taxon, while additional brain samples (xbrain) were sequenced on a whole Illumina HiSeq 2000 lane per taxon. Features were generated with bedtools, converted into SAF format, and extracted from BAM alignments using the featureCounts suite.
Download pop.200b.windows.zip (267.8 Mb)
Title rarefaction.200b.windows
Description Transcriptome coverage of the mm10 mouse reference genome, computed from transcriptome alignments of three populations of Mus musculus domesticus (AH from Iran, CB from Germany, MC from France), two populations of Mus musculus musculus (KH from Kazakhstan, WI from Austria), Mus musculus castaneus (TA), Mus spicilegus (SC), Mus spretus (SP), Mus mattheyi (MA) and Apodemus uralensis (AP). This is a summary over three tissues (brain, liver, testis) for each of the taxa, resampled to obtain coverage rarefaction estimates by taxon and by fraction of data sequenced. The number in each column label indicates the percentage of data sampled, with "total" representing the maximum available sampling for each taxon. Each row of this file corresponds to the same row of the transcriptome file included in this submission, and the two files must be analyzed together to obtain genomic position information. Features were generated with bedtools, converted into SAF format, and extracted from BAM alignments using the featureCounts suite.
Download rarefaction.200b.windows.zip (350.1 Mb)
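Because the rarefaction file carries no genomic positions of its own, its rows have to be paired with the rows of the transcriptome file by order. A minimal sketch of that pairing is shown below. It assumes both files have already been parsed into lists of fields and that header lines, if any, are handled the same way in both; the function name and the sample rows are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel used to detect files of unequal length


def attach_positions(transcriptome_rows, rarefaction_rows, n_location_fields=6):
    """Yield location fields from the transcriptome file followed by the
    rarefaction values from the row at the same position."""
    pairs = zip_longest(transcriptome_rows, rarefaction_rows, fillvalue=_MISSING)
    for t_row, r_row in pairs:
        if t_row is _MISSING or r_row is _MISSING:
            raise ValueError("the two files have different numbers of rows")
        yield list(t_row[:n_location_fields]) + list(r_row)


# Invented rows for illustration only.
transcriptome = [
    ["w1", "chr1", "3000001", "3000200", "+", "200", "12", "0"],
    ["w2", "chr1", "3000201", "3000400", "+", "200", "0", "7"],
]
rarefaction = [["3", "9", "12"], ["0", "4", "7"]]

for merged in attach_positions(transcriptome, rarefaction):
    print(merged)
```

Raising an error on a length mismatch is a deliberate choice here: a silent truncation would shift every later window to the wrong genomic position.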
Title resampling_brain.200b.windows
Description Alignments from extensive sequencing of brain samples (~320 million reads) were split into three different sets of 100 million reads per taxon, so that each set contains independent observations. Pair relationships were maintained, so that both reads of a fragment fall in the same set. Here we report the quantification per window of each of those resampled transcriptome sets, as coverage of the mm10 mouse reference genome. Coverage was computed from transcriptome alignments of three populations of Mus musculus domesticus (AH from Iran, CB from Germany, MC from France), two populations of Mus musculus musculus (KH from Kazakhstan, WI from Austria), Mus musculus castaneus (TA), Mus spicilegus (SC), Mus spretus (SP), Mus mattheyi (MA) and Apodemus uralensis (AP). The first six fields give the genomic location, and the following fields give the coverage for each of the transcriptomes of the taxa listed here. Features were generated with bedtools, converted into SAF format, and extracted from BAM alignments using the featureCounts suite.
Download resample.200b.windows.zip (193.5 Mb)
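The description does not say how the reads were partitioned, so the following is only one possible way to split paired reads into non-overlapping sets while keeping mates together: the assignment is made per fragment name rather than per read. The function name, the seed, and the toy fragment names are hypothetical, and a real data set of this size would need a streaming approach rather than an in-memory shuffle.

```python
import random


def split_fragments(fragment_names, n_sets=3, set_size=None, seed=0):
    """Assign each fragment to one of n_sets disjoint, equally sized sets.

    Both mates of a pair share a fragment name, so routing every
    alignment by the returned mapping keeps pairs in the same set.
    Fragments left over after filling the sets are not assigned.
    """
    names = sorted(set(fragment_names))
    random.Random(seed).shuffle(names)
    if set_size is None:
        set_size = len(names) // n_sets
    if set_size * n_sets > len(names):
        raise ValueError("not enough fragments for the requested set size")
    used = names[: set_size * n_sets]
    return {name: index // set_size for index, name in enumerate(used)}


# Toy input: each fragment name appears twice, once per mate.
reads = [f"frag{i}" for i in range(10) for _mate in (1, 2)]
assignment = split_fragments(reads, n_sets=3)

sizes = [list(assignment.values()).count(s) for s in range(3)]
print(sizes)
```

With ten toy fragments and three sets, each set receives three fragments and one fragment is left unassigned, which mirrors drawing three sets of 100 million reads from roughly 320 million.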
Title TaxonomicRestrictedExpressionWindows
Description Multiple files corresponding to windows with expression above 50 reads in one taxon and absent in all others. Most regions represent only a single taxon, with the exception of those defined for the Mus musculus musculus and Mus musculus domesticus populations, in which windows had to be present in at least one population but could also be present in more than one population, provided they were absent in all other taxa. Taxon codes are as indicated in the main body of the manuscript. Tissue samples correspond to brain (B), liver (L), and testis (T), and to additional extensive sequencing of brain samples (UDS). Files are in bigWig format, for visualization together with the mm10 version of the mouse reference genome. We provide two IGV (Integrative Genomics Viewer) session XML files, which the user can load directly into the genome browser. One session uses local files, which have to be present in the same directory as the session file; the other uses existing files on our local FTP server and does not require the local files, but does require an internet connection. In addition, we provide the expression values supporting the taxonomically restricted status (*.dat), and a BED file of the relevant regions.
Download TaxonomicRestrictedExpressionWindows.zip (6.438 Mb)
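The single-taxon criterion stated above (more than 50 reads in one taxon, absent in all others) can be sketched as a small filter. This is an illustration under assumptions, not the authors' procedure: "absent" is taken here to mean exactly zero reads, the function name is hypothetical, and the special handling of the Mus musculus musculus and Mus musculus domesticus populations is not implemented.

```python
def restricted_taxon(counts, min_reads=50):
    """Return the only taxon with more than min_reads reads, provided every
    other taxon has zero reads; otherwise return None.

    counts maps taxon code -> read count for one window.
    """
    expressed = [taxon for taxon, n in counts.items() if n > min_reads]
    if len(expressed) != 1:
        return None
    taxon = expressed[0]
    others_absent = all(n == 0 for t, n in counts.items() if t != taxon)
    return taxon if others_absent else None


# Invented counts for three of the taxon codes used in this package.
print(restricted_taxon({"SP": 75, "SC": 0, "MA": 0}))
print(restricted_taxon({"SP": 75, "SC": 3, "MA": 0}))
```

The first call reports `SP`, and the second reports `None` because a second taxon has a nonzero count.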

When using this data, please cite the original publication:

Neme R, Tautz D (2016) Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence. eLife 5: e09977. http://dx.doi.org/10.7554/eLife.09977

Additionally, please cite the Dryad data package:

Neme R, Tautz D (2016) Data from: Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.8jb83