Data from: The role of structural genomic variants in population differentiation and ecotype formation in Timema cristinae walking sticks
Cite this dataset
Lucek, Kay; Gompert, Zach; Nosil, Patrik (2019). Data from: The role of structural genomic variants in population differentiation and ecotype formation in Timema cristinae walking sticks [Dataset]. Dryad. https://doi.org/10.5061/dryad.j4b543t
Abstract
Usage notes
Deletion variants
VCF file with the 194 deletion structural variants identified using Lumpy and Delly. Data for all 20 Timema cristinae individuals are included.
mod_del_genotyped.vcf.gz
Duplication variants
VCF file with the 223 duplication structural variants identified using Lumpy and Delly. Data for all 20 Timema cristinae individuals are included.
mod_dup_genotyped.vcf.gz
Inversion variants
VCF file with the 492 inversion structural variants identified using Lumpy and Delly. Data for all 20 Timema cristinae individuals are included.
mod_lumpy_inversions_genotyped.vcf.gz
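As a quick sanity check on any of the three structural variant VCF files above, the number of samples and variant records can be counted with the Python standard library alone. This is a minimal sketch, not part of the deposited scripts: the function name summarize_vcf is hypothetical, and it assumes a standard VCF layout (nine fixed columns followed by one column per sample, with one variant per data line).

```python
import gzip


def summarize_vcf(path):
    """Return (n_samples, n_records) for a gzip-compressed VCF file.

    Assumes the standard VCF layout: '##' meta lines, one '#CHROM' header
    line with nine fixed columns followed by sample columns, then one
    tab-delimited line per variant record.
    """
    n_samples = 0
    n_records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information line
            if line.startswith("#CHROM"):
                # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT + samples
                columns = line.rstrip("\n").split("\t")
                n_samples = max(len(columns) - 9, 0)
                continue
            if line.strip():
                n_records += 1  # one variant record per non-empty data line
    return n_samples, n_records
```

Calling summarize_vcf("mod_del_genotyped.vcf.gz") on a downloaded copy returns a pair that can be compared with the sample and variant counts given in the descriptions above.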
SV population genetics script
R script for population genetic analyses and plots of the structural variant data. This includes calculations for Fst.
svSummary.R
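For readers unfamiliar with Fst, the following sketch illustrates one common textbook form, Fst = (H_T - H_S) / H_T, for a single biallelic locus in two equally weighted populations. It is an illustration only: the estimator actually used in svSummary.R is not described on this page and may differ, and the function name fst_two_pops is hypothetical.

```python
import math


def fst_two_pops(p1, p2):
    """Fst = (H_T - H_S) / H_T for one biallelic locus and two populations.

    p1 and p2 are the non-reference allele frequencies in each population.
    Both populations are weighted equally (an assumption of this sketch).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        return math.nan  # locus is fixed overall, so Fst is undefined
    return (h_t - h_s) / h_t
```

With this definition, identical frequencies in both populations give 0 and a fixed difference (frequencies 0 and 1) gives 1.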
SV allele frequencies
This compressed directory includes maximum likelihood allele frequency estimates for the SVs. There is one file per SV type (inv = inversion, del = deletion, dup = duplication) and population. Files without population IDs are for all individuals together. In each file, there is one row per SV, the first column gives the locus ID, and the third column gives the non-reference SV allele frequency.
svAlleleFreqs.tar.gz
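The column layout described above (locus ID in the first column, non-reference allele frequency in the third) can be read with a few lines of Python once the archive is extracted. This is a minimal sketch under assumptions not stated on this page: columns are taken to be whitespace-delimited, any row whose third column is not numeric (for example a header) is skipped, and the function name read_allele_freqs is hypothetical.

```python
def read_allele_freqs(path):
    """Map locus ID (column 1) to non-reference allele frequency (column 3).

    Assumes whitespace-delimited columns. Rows with fewer than three
    columns, or with a non-numeric third column, are skipped.
    """
    freqs = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 3:
                continue  # blank or malformed row
            try:
                freqs[fields[0]] = float(fields[2])
            except ValueError:
                continue  # e.g. a header row
    return freqs
```

The returned dictionary can then be used to compare frequencies of the same locus across the per-population files.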
MeasureOrientationFreqs
One of two complementary Perl scripts used to identify the inversions from the whole-genome comparative alignment.
ExtractOrientInversions
One of two complementary Perl scripts used to identify the inversions from the whole-genome comparative alignment.
SNP variant file
VCF file with SNPs from the 160 Timema cristinae genomes.
filtered1X_tcr_wgs_variants_x.vcf.gz
SNP allele frequencies
This compressed directory includes maximum likelihood allele frequency estimates for the SNPs from the 160 genomes. There is one file per population. In each file, there is one row per SNP, the first column gives the locus ID, and the third column gives the non-reference allele frequency.
snpAlleleFreqs.tar.gz
R population genomics script
This R script contains the core analyses of genetic variation within inversion sequences based on SNPs from the 160 Timema cristinae genomes.
popgen.R
Funding
European Research Council, Award: R/129639
Swiss National Science Foundation, Award: P2BEP3_152103