Data from: Genomic dynamics of transposable elements in the Western Clawed Frog (Silurana tropicalis)

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data. CC0, Open Data

Title: Logistic regression input and related R script, description of TE families
Downloaded 24 times
Description: Supplementary Material: Description of Supplementary Files.

We include 22 files as supplementary material, including:
(a) all TE and non-TE repeat fragments in the Silurana tropicalis genome, as reported by RepeatMasker
(b) input files to run logistic regression in R for the model where TEs are not included in the GC calculations
(c) input files to run logistic regression in R for the model where TEs are included in the GC calculations
(d) R script to run both logistic regression models
(e) descriptions of TE and non-TE repeat classes
(f) description of TE families

Details of the files:
(a) 2012RepeatMasked: the standard output file from RepeatMasker. Columns include: SW score, percent div. (percent diverged), percent del. (percent deletion), percent ins. (percent insertion), begin position in query, end position in query, TE family of matching repeat, repeat class/family, begin and end position in repeat, and RepeatMasker-assigned ID number.
(b) Input files contain information on the various genomic features and the presence of TE or non-TE repeats in 2 kilobase windows. All windows are ordered in the same manner across the different input files for the same model (GC content calculated either including or excluding TE and non-TE repeats). Because the number of windows differs between the 2 models (windows that are completely full of TEs have to be excluded from the model in which GC content is calculated without TEs), 2 sets of input files are used.
   a. TENoTENotContam.csv. This file denotes the presence or absence of TE and non-TE repeats with a 1 or 0. Columns correspond to the different TE or non-TE repeat classes, and rows correspond to the 2 kilobase windows.
   b. The fourth column of ConservedNoTE.txt, exonsNoTE.txt, intronsNoTE.txt and GCNoTE.txt lists the proportion of the window that is conserved across species, the proportion that is exon, the proportion that is intron, and the percent GC content, respectively. The first 3 columns are "linkage group or not" (whether we concatenated different chromosomes into a linkage group "LG" or not "nonLG"), "linkage group number or chromosome number", and "genomic window number". The total
   c. distancesNoTE.txt lists the distance, in base pairs, to the closest gene upstream and downstream in columns 4 and 5, respectively. expressionNoTE.txt lists the proportion of the window that is expressed in germline genes or soma genes in columns 4 and 5, respectively. The first 3 columns are again "linkage group or not" (whether we concatenated different chromosomes into a linkage group "LG" or not "nonLG"), "linkage group number or chromosome number", and "genomic window number".
(c) This pattern of input files is repeated for the model where percent GC content in a window is calculated including TE and non-TE repeats. The files are similarly named TENotContam.csv, Conserved.txt, exons.txt, introns.txt, GC.txt, distances.txt and expression.txt.
(d) For long TEs and short TEs, separate files are provided for the presence or absence of TEs in genomic windows, where GC content is calculated either including or excluding TEs and non-TE repeats. These files are titled "TELong.csv", "TELongNoTE.csv", "TEshort.csv" and "TEShortNoTE.csv".
(e) modelTE.R is the script used to read the input files and run the logistic regressions.
(f) Summary of classes.xlsx is the summary of the total length and number of fragments of the different classes of TEs and non-TE repeats.
(g) Summary of Families.xlsx is the summary of the total TE or non-TE repeat fragments in each TE family or non-TE repeat.
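The window-level .txt files described above share their first three columns (linkage group flag, linkage group or chromosome number, window number), so they can be joined on that key as a check that rows line up across files. The following is a minimal sketch in Python, not the authors' modelTE.R. It assumes the files are tab-delimited with no header row, which the description does not state; the function names and the sample values are hypothetical and for illustration only.

```python
import csv
import io


def read_window_feature(handle, n_values=1, delimiter="\t"):
    """Map (group_type, group_id, window) -> list of floats for one feature file.

    Assumption: columns 1-3 are the window key and the next `n_values`
    columns hold the feature values (1 for GC.txt-style files, 2 for
    distances.txt- or expression.txt-style files).
    """
    table = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        key = (row[0], row[1], int(row[2]))
        table[key] = [float(x) for x in row[3:3 + n_values]]
    return table


def merge_features(named_tables):
    """Join feature tables on the window key, keeping windows present in all."""
    names = list(named_tables)
    shared = set.intersection(*(set(named_tables[n]) for n in names))
    return {
        key: {n: named_tables[n][key] for n in names}
        for key in sorted(shared)
    }


# Hypothetical sample rows, NOT taken from the data package.
gc_text = "LG\t1\t1\t41.0\nLG\t1\t2\t39.5\nnonLG\t7\t1\t44.2\n"
dist_text = "LG\t1\t1\t1500\t32000\nLG\t1\t2\t500\t30000\n"

merged = merge_features({
    "gc": read_window_feature(io.StringIO(gc_text)),
    "distance": read_window_feature(io.StringIO(dist_text), n_values=2),
})

for key, features in merged.items():
    print(key, features)
```

With the real files, `io.StringIO(...)` would be replaced by an opened file such as `open("GCNoTE.txt", newline="")`. Because the description says windows are ordered identically across files for the same model, a row-by-row pairing should give the same result as this keyed join; any window dropped by the join would indicate a mismatch worth investigating.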
Download suppMat.zip (104.1 Mb)

When using this data, please cite the original publication:

Shen JJ, Dushoff J, Bewick AJ, Chain FJJ, Evans BJ (2013) Genomic dynamics of transposable elements in the Western Clawed Frog (Silurana tropicalis). Genome Biology and Evolution 5(5): 998-1009. http://dx.doi.org/10.1093/gbe/evt065

Additionally, please cite the Dryad data package:

Shen JJ, Dushoff J, Bewick AJ, Chain FJJ, Evans BJ (2013) Data from: Genomic dynamics of transposable elements in the Western Clawed Frog (Silurana tropicalis). Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.76487
