Recent activity in expanding populations and purifying selection have shaped transposable element landscapes across natural accessions of the Mediterranean grass Brachypodium distachyon
Data files
Apr 08, 2021 version files 4.83 GB
Abstract
Transposable element (TE) activity has emerged as a major cause of variation in genome size and structure among species. To what extent TEs contribute to genetic variation and divergence within species, however, is much less clear, mainly because population genomic data have so far only been available for the classical model organisms. In this study, we use the annual Mediterranean grass Brachypodium distachyon to investigate TE dynamics in natural populations. Using whole-genome sequencing data for 53 natural accessions, we identified more than 5,400 TE polymorphisms across the studied genomes. We found, first, that while population bottlenecks and expansions have shaped genetic diversity in B. distachyon, these events did not lead to lineage-specific activations of TE families, as observed in other species. Instead, the same families have been active across the species range and TE activity is homogeneous across populations, indicating the presence of conserved regulatory mechanisms. Second, almost half of the TE insertion polymorphisms are accession-specific, most likely because of recent activity in expanding populations and the action of purifying selection. And finally, although TE insertion polymorphisms are underrepresented in and around genes, more than 1,000 of them occur in genic regions and could thus contribute to functional divergence. Our study shows that while TEs in B. distachyon are “well-behaved” compared with TEs in other species with larger genomes, they are an abundant source of lineage-specific genetic variation and may play an important role in population divergence and adaptation.
Methods
The genomes of the 53 B. distachyon accessions analyzed in this study were recently sequenced at a mean coverage of 74× to create a B. distachyon pan-genome (supplementary table S1, Supplementary Material online; Gordon et al. 2017). For each accession, we aligned the 76 or 100 bp Illumina paired-end reads with BWA-MEM (standard settings, Li 2013) to version 2.0 of the B. distachyon reference genome. After removing duplicates with Sambamba (Tarasov et al. 2015), single nucleotide polymorphisms were called with Freebayes (Garrison and Marth 2012). Standard settings were used for both programs. The following filters were applied to the raw Freebayes output: we first removed indels and variants in low complexity regions identified with Dustmasker (Morgulis et al. 2006). VCFtools v0.1.12b (Danecek et al. 2011) was then used to remove SNPs with a quality lower than 30 and a mean depth lower than half or higher than twice the genome-wide average depth at all variant sites. This filtered data set with 5,918,789 SNPs will be referred to as the full SNP data set; further filtering steps were adapted to the requirements of the different methods and are described below.
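The filtering logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on Dustmasker and VCFtools; the `Variant` record and the `filter_snps` function are hypothetical names introduced here for illustration only. The sketch also assumes that the genome-wide average depth is computed over the sites that remain after indels and low-complexity variants have been removed, which the text does not state explicitly.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Variant:
    """Hypothetical, simplified stand-in for one record of a VCF file."""
    chrom: str
    pos: int
    qual: float            # variant quality (QUAL column)
    mean_depth: float      # mean read depth across accessions at this site
    is_indel: bool = False
    low_complexity: bool = False  # True if the site lies in a masked region


def filter_snps(variants, min_qual=30.0):
    """Apply the three filters described in the Methods, in order.

    1. Drop indels and variants in low-complexity regions.
    2. Drop SNPs with a quality lower than `min_qual`.
    3. Drop SNPs whose mean depth is lower than half or higher than twice
       the average depth across variant sites (assumption: the average is
       taken over the sites that survive step 1).
    """
    snps = [v for v in variants if not v.is_indel and not v.low_complexity]
    if not snps:
        return []

    average_depth = mean(v.mean_depth for v in snps)
    lower, upper = 0.5 * average_depth, 2.0 * average_depth

    return [
        v for v in snps
        if v.qual >= min_qual and lower <= v.mean_depth <= upper
    ]
```

Because the depth thresholds are relative to the average across sites, the same function can be applied to data sets sequenced at different coverage without changing any parameter other than `min_qual`.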