Data from: MITE annotation and landscape in 207 plant genomes reveal their evolutionary dynamics and functional roles
Data files
May 05, 2026 version, files total 5.60 MB
- MITE_seed_sequence_207genomes.tar.gz (5.40 MB)
- MITE-derived-miRNA-hairpin.fa (168.83 KB)
- MITE-derived-miRNA-mature.fa (36.32 KB)
- README.md (762 B)
Abstract
Miniature inverted-repeat transposable elements (MITEs) are short, non-autonomous class II transposable elements prevalent in eukaryotic genomes, contributing to various genomic and genic functions in plants. However, research on MITEs has mainly targeted a few species, limiting a comprehensive understanding and systematic comparison of MITEs in plants. Here, we developed a highly sensitive MITE annotation pipeline with a low false positive rate and applied it to 207 high-quality plant genomes. We found more than 20,000-fold variation in MITE copy numbers among species, with Gnetum montanum harboring the most. The Mutator superfamily is widespread, comprising about 41.5%, whereas the Tc1/Mariner and PIF/Harbinger superfamilies expanded rapidly in monocots, particularly in Poaceae. The analysis of MITE insertion times revealed an expansion around 30 million years ago (Mya), peaking at 9−10 Mya, with some species showing another ancient, slower expansion. In three representative families, we identified many more species-specific MITE insertion sites than shared orthologous ones, underscoring MITEs' significant role in genome diversity. Phylogenomic analyses indicate that MITEs accumulated gradually and specifically during speciation, primarily through new insertions, with duplication as a secondary mechanism. MITEs tend to insert near genes, particularly within 500−1000 bp of their flanking regions, and often enhance gene expression. These insertions are associated with diverse gene functions, mainly in transport, synthesis, and metabolism. Furthermore, we identified 985 MITE-derived miRNAs from 392 families across 56 species, mainly from Mutator, Tc1/Mariner, and PIF/Harbinger, targeting a variety of gene functions. This study enhances our understanding of the evolution and functional roles of MITEs in plants and provides a basis for exploring their function in further research.
Dataset DOI: 10.5061/dryad.bg79cnpns
Description of the data and file structure
This dataset contains the MITE seed sequences derived from the analysis of 207 plant genomes, together with the hairpin and mature sequences of the MITE-derived miRNAs.
Files and variables
File: MITE_seed_sequence_207genomes.tar.gz
Description: The seed sequences of MITEs from 207 plant genomes
File: MITE-derived-miRNA-hairpin.fa
Description: The hairpin sequences of MITE-derived miRNAs
File: MITE-derived-miRNA-mature.fa
Description: The mature sequences of MITE-derived miRNAs
Code/software
Linux
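The files can also be inspected without dedicated bioinformatics software. The following is a minimal Python sketch, using only the standard library, that reads FASTA records and counts the records in each member of a gzipped tar archive. The function names `read_fasta` and `count_records_in_archive` are hypothetical, and the sketch assumes that the members of the archive are plain-text FASTA files; the internal layout of the archive is not described in this record.

```python
import io
import tarfile
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA text stream.

    The header is returned without the leading '>' and multi-line
    sequences are joined into a single string.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_records_in_archive(archive_path: str) -> Dict[str, int]:
    """Map each regular file in a .tar.gz archive to its FASTA record count.

    Assumption: every member is a plain-text FASTA file. A member in
    another format would simply be reported with a count of 0.
    """
    counts = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and links
            raw = tar.extractfile(member)
            text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
            counts[member.name] = sum(1 for _ in read_fasta(text))
    return counts
```

For example, `count_records_in_archive("MITE_seed_sequence_207genomes.tar.gz")` would return a dictionary of member names and record counts, and `read_fasta(open("MITE-derived-miRNA-mature.fa"))` would iterate over the mature miRNA sequences, provided the files have been downloaded to the working directory.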
