Data from: Transposable element diversity and activity patterns in neotropical salamanders
Data files
Version of May 22, 2025 (7.48 GB total):
- bchin_teland_final.csv (425.65 MB)
- bflavi_teland_final.csv (641.97 MB)
- bmacri_teland_final.csv (731.27 MB)
- bmex_teland_final.csv (730.20 MB)
- bocci_teland_final.csv (513.59 MB)
- bplaty_teland_final.csv (660.43 MB)
- bruf_teland_final.csv (374.59 MB)
- bstuar_teland_final.csv (815.57 MB)
- bvera_teland_final.csv (322.40 MB)
- byuca_teland_final.csv (476.85 MB)
- cdimi_teland_final.csv (302.31 MB)
- cpris_teland_final.csv (262.23 MB)
- README.md (4.89 KB)
- salamander_telib-v1.0.fasta (162.19 MB)
- tluna_teland_final.csv (273.64 MB)
- tnaris_teland_final.csv (252.04 MB)
- tspil_teland_final.csv (231.68 MB)
- ttlax_teland_final.csv (299.52 MB)
Abstract
Transposable elements (TEs) compose a substantial proportion of the largest eukaryotic genomes. TE diversity has been hypothesized to be negatively correlated with genome size, yet empirical demonstrations of such a relationship in a phylogenetic context are largely lacking. Furthermore, the most abundant type of TEs in genomes varies across groups, and it is not clear if there are patterns of TE activity consistent with genome size among different taxa with large genome sizes. We use low-coverage sequencing of 16 species of Neotropical salamanders, which vary approximately seven-fold in genome size, to estimate TE relative abundance and diversity for each species. We also compare divergence of copies of each TE superfamily to estimate patterns of TE activity in each species. We find a negative relationship between TE diversity and genome size, which is consistent with the hypothesis that either competition among TEs or reduced selection against ectopic recombination may result in lower diversity in the largest genomes. We also find divergent activity patterns in the largest vs. the smallest genomes, suggesting that the history of TE activity may explain differences in genome size. Our results suggest that both TE diversity and relative abundance may be predictable, at least within taxonomic groups.
https://doi.org/10.5061/dryad.931zcrjtm
Description of the data and file structure
The data deposited in this repository comprise the transposable element (TE) libraries (sequences) and the associated scripts used to describe the relative abundance and diversity of TEs in Neotropical salamanders.
Files and variables
On Zenodo
File: te_landscape_sh.txt
Description: Bash scripts for low-coverage sampling and for running RepeatMasker.
File: batch_te2.R
Description: R script containing all the code used for the analyses and figures in the present study.
File: te_landscape_experiment.R
Description: R code with a function to generate relative abundances of transposable elements by species.
On Dryad
File: salamander_telib-v1.0.fasta
Description: Consensus sequences of TEs generated by low-coverage assembly of four species of Neotropical salamanders, used as a library to determine TE abundance and diversity in the other Neotropical salamander species.
CSV files containing the transposable element landscapes by species (RepeatMasker output)
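A quick way to inspect the consensus library before using it is to count its records and sum their lengths. The following is a minimal sketch in Python using only the standard library. The function name `fasta_lengths` and the in-memory demo records are hypothetical and are not taken from the deposited files; no assumption is made about how the FASTA headers are formatted.

```python
import io


def fasta_lengths(handle):
    """Yield (header, sequence_length) for each record in a FASTA stream."""
    header, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, length
            header, length = line[1:], 0  # start a new record
        elif header is not None:
            length += len(line)  # sequences may span several lines
    if header is not None:
        yield header, length


# Made-up demo records; replace with open("salamander_telib-v1.0.fasta")
# to summarise the deposited library.
demo = io.StringIO(">te_a\nACGT\nAC\n>te_b\nGGG\n")
records = list(fasta_lengths(demo))
n_records = len(records)
total_bases = sum(n for _, n in records)
print(n_records, total_bases)
```

Because the generator reads one line at a time, it should cope with the full library without loading it into memory.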
File: tnaris_teland_final.csv
Description: Thorius narisovalis
File: tluna_teland_final.csv
Description: Thorius lunaris
File: ttlax_teland_final.csv
Description: Thorius tlaxiacus
File: tspil_teland_final.csv
Description: Thorius spilogaster
File: cpris_teland_final.csv
Description: Chiropterotriton priscus
File: cdimi_teland_final.csv
Description: Chiropterotriton dimidiatus
File: byuca_teland_final.csv
Description: Bolitoglossa yucatana
File: bvera_teland_final.csv
Description: Bolitoglossa veracrucis
File: bstuar_teland_final.csv
Description: Bolitoglossa stuarti
File: bruf_teland_final.csv
Description: Bolitoglossa rufescens
File: bocci_teland_final.csv
Description: Bolitoglossa occidentalis
File: bmex_teland_final.csv
Description: Bolitoglossa mexicana
File: bmacri_teland_final.csv
Description: Bolitoglossa macrinii
File: bplaty_teland_final.csv
Description: Bolitoglossa platydactyla
File: bflavi_teland_final.csv
Description: Bolitoglossa flaviventris
File: bchin_teland_final.csv
Description: Bolitoglossa chinanteca
All CSV files have the same variables
Variables
- X1: Smith-Waterman alignment score
- X2: K2P percentage divergence
- X3: Percentage of deletions
- X4: Percentage of insertions
- X5: Query start position
- X6: Query end position
- X7: Strand (+: positive strand; C: complementary strand)
- X8: Reference consensus sequence ID
- X9: Reference Consensus sequence species of origin
- X10: TE class
- X11: TE Order
- X12: TE Superfamily
- NA: Not applicable
- Bel-Pao: Bel-Pao superfamily (LTR retrotransposons)
- CACTA: CACTA superfamily (Class II DNA transposons)
- Copia: Copia superfamily (LTR retrotransposons)
- Crypton: Crypton superfamily (Class II DNA transposons)
- DIRS: DIRS superfamily (LTR retrotransposons)
- ERV: Endogenous retrovirus superfamily (integrated retroviral sequences)
- Gypsy: Gypsy superfamily (LTR retrotransposons)
- hAT: hAT superfamily (Class II DNA transposons)
- Helitron: Helitron superfamily (Class II DNA transposons)
- I: I superfamily (non-LTR retrotransposons, LINE-like)
- Jockey: Jockey superfamily (non-LTR retrotransposons, LINE-like)
- Kolobok: Kolobok superfamily (Class II DNA transposons)
- L1: L1 superfamily (LINE-1, non-LTR retrotransposons)
- Maverick: Maverick superfamily (Class II DNA transposons)
- MITE: Miniature Inverted-repeat Transposable Elements (Class II DNA transposons)
- MuDR: MuDR superfamily (Class II DNA transposons)
- Novosib: Novosib superfamily (Class II DNA transposons)
- Penelope: Penelope-like elements (PLEs; non-LTR retrotransposons)
- PIF-Harbinger: PIF/Harbinger superfamily (Class II DNA transposons)
- PiggyBac: PiggyBac superfamily (Class II DNA transposons)
- R2: R2 superfamily (non-LTR retrotransposons)
- Retrovirus: Retrovirus (exogenous retroviral elements)
- RTE: RTE superfamily (non-LTR retrotransposons)
- SINE: Short Interspersed Nuclear Elements (non-autonomous non-LTR retrotransposons)
- Sola: Sola superfamily (Class II DNA transposons)
- Tc1-Mariner: Tc1-Mariner superfamily (Class II DNA transposons)
- termLTR: TE with Long Terminal Repeat (LTR) structure (unclassified or incomplete LTR retrotransposon)
- termTIR: TE with Terminal Inverted Repeat (TIR) structure (unclassified or incomplete DNA transposon)
- " ": Empty fields indicate that no superfamily could be assigned
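The variables above can be summarised per superfamily with a few lines of code. The sketch below is in Python (standard library only) and is not the authors' analysis, which is in the R scripts on Zenodo. It rests on several assumptions: that each CSV has a header row naming the columns X1 to X12, that the aligned length of a hit is X6 - X5 + 1, and that "relative abundance" means a superfamily's share of all masked bases. The function names and the three demo rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def hit_length(row):
    # Assumption: query coordinates are 1-based and inclusive.
    return int(row["X6"]) - int(row["X5"]) + 1


def superfamily_abundance(rows):
    """Return {superfamily: share of all masked bases}."""
    masked = defaultdict(int)
    for row in rows:
        name = (row.get("X12") or "").strip()
        if name in ("", "NA"):
            name = "unassigned"  # pool blank and NA superfamilies
        masked[name] += hit_length(row)
    total = sum(masked.values())
    return {k: v / total for k, v in masked.items()} if total else {}


def divergence_histogram(rows, superfamily, bin_width=1.0):
    """Sum aligned bases per K2P divergence bin (X2) for one superfamily."""
    bins = defaultdict(int)
    for row in rows:
        if (row.get("X12") or "").strip() == superfamily:
            bins[int(float(row["X2"]) // bin_width)] += hit_length(row)
    return dict(sorted(bins.items()))


# Made-up rows for illustration only; for real data use e.g.
# csv.DictReader(open("tnaris_teland_final.csv", newline="")).
demo = io.StringIO(
    "X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12\n"
    "300,5.2,0.0,1.1,1,100,+,cons1,sp,ClassI,LTR,Gypsy\n"
    "250,12.7,2.0,0.0,11,60,C,cons2,sp,ClassII,TIR,hAT\n"
    "410,5.9,0.5,0.5,21,70,+,cons3,sp,ClassI,LTR,Gypsy\n"
)
rows = list(csv.DictReader(demo))
abundance = superfamily_abundance(rows)
landscape = divergence_histogram(rows, "Gypsy")
print(abundance, landscape)
```

Both functions accept any iterable of rows, so for the deposited files (hundreds of megabytes each) a `csv.DictReader` can be passed directly to one function at a time instead of building a list first.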
Genomic libraries for low-coverage sequencing were prepared and sequenced at the Vincent J. Coates Sequencing facility, QB3, at the University of California, Berkeley, using the NovaSeq S4 sequencing platform. Four samples were selected to assemble draft contigs using dipSPAdes (Safonova et al. 2015). Transposable elements were retrieved and annotated using PiRATE (Pipeline to Retrieve and Annotate Transposable Elements).
- Decena-Segarra, Louis Paul; Rovito, Sean M. (2025). Data from: Transposable element diversity and activity patterns in neotropical salamanders. Zenodo. https://doi.org/10.5281/zenodo.14041464
- Decena-Segarra, Louis Paul; Rovito, Sean M. (2025). Data from: Transposable element diversity and activity patterns in neotropical salamanders. Zenodo. https://doi.org/10.5281/zenodo.14041463
- Decena-Segarra, Louis Paul; Rovito, Sean M. (2024). Transposable Element Diversity and Activity Patterns in Neotropical Salamanders. Molecular Biology and Evolution. https://doi.org/10.1093/molbev/msae225
