Data from: Natural selection and genetic diversity in the butterfly Heliconius melpomene

Martin SH, Möst M, Palmer WJ, Salazar C, McMillan WO, Jiggins FM, Jiggins CD

Date Published: April 22, 2016

DOI: http://dx.doi.org/10.5061/dryad.g0874

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0, Open Data).

Title genotyping_summaries
Downloaded 8 times
Description [4 files] Genotyping summaries (numbers of genotyped sites, heterozygosity etc.) for all samples. Summaries are provided for all sites, and also for codon positions 1, 2 and 3.
Download genotyping_summaries.tar.gz (8.588 Kb)
Title structure results
Downloaded 5 times
Description [4 files] Results from STRUCTURE analyses, with k = 5, 6, 7 and 8.
Download structure.tar.gz (3.036 Kb)
Title PCA results
Downloaded 7 times
Description Results of Eigenstrat PCA analysis.
Download mel58.Zupdated.realign80.GQ30.4D.min50.min....evec (9.228 Kb)
Title dxy (absolute divergence) for 100kb windows
Downloaded 8 times
Description [8 files] Absolute divergence between all pairs of taxa for 100kb windows. Eight types of sites were considered: all sites, intergenic, intronic, codon positions 1, 2 and 3, 4D sites, and 4D sites in low-codon-usage-bias genes.
Download dxy.tar.gz (2.151 Mb)
Title Heterozygosity in windows
Downloaded 1 time
Description [2 files] Heterozygosity (here called "indPi") calculated for non-overlapping 100kb windows. One file corresponds to all samples and the other to selected Panamanian samples: the inbred reference strain and two outbred individuals.
Download heterozygosity.tar.gz (408.6 Kb)
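As a rough illustration of what a per-window heterozygosity value represents, the sketch below counts heterozygous calls among genotyped sites in non-overlapping windows for one individual. The input layout (pairs of 1-based position and a heterozygous flag) and the function name are assumptions made for this example; they do not describe the format of the files in the archive or the authors' own pipeline.

```python
from collections import defaultdict


def window_heterozygosity(sites, window=100_000):
    """Proportion of heterozygous calls per non-overlapping window.

    sites: iterable of (position, is_het) pairs for one individual on one
    scaffold, with 1-based positions (an assumed layout for this example).
    Returns a dict mapping each window's start position to het / genotyped.
    """
    totals = defaultdict(lambda: [0, 0])  # window start -> [het sites, genotyped sites]
    for pos, is_het in sites:
        start = (pos - 1) // window * window + 1
        totals[start][1] += 1
        if is_het:
            totals[start][0] += 1
    return {start: het / n for start, (het, n) in sorted(totals.items())}


# Hypothetical genotype calls, not taken from the data package.
example = [(1, True), (50_000, False), (100_000, False), (100_001, True)]
per_window = window_heterozygosity(example)
```

Only windows containing at least one genotyped site appear in the result, so no division by zero can occur.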
Title Linkage Disequilibrium
Downloaded 3 times
Description [3 files] Linkage disequilibrium data for the Eastern and Western populations. "Background LD" refers to that between unlinked SNPs on different chromosomes. "top100" refers to LD calculated for all SNP pairs on the same scaffold, averaged over the top 100 scaffolds.
Download LD.tar.gz (2.19 Kb)
Title Codon Usage
Downloaded 4 times
Description Effective number of codons and GC content at the third codon position for each gene.
Download final.codonW_with_positions.csv.gz (176.0 Kb)
Title Alpha
Downloaded 4 times
Description Estimates of alpha, the genome-wide proportion of adaptive substitutions, at different minor allele frequency thresholds.
Download alpha_by_threshold.csv (453 bytes)
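For orientation, alpha is commonly estimated from McDonald-Kreitman counts as 1 - (Ds * Pn) / (Dn * Ps), where D and P are counts of substitutions and polymorphisms at nonsynonymous (n) and synonymous (s) sites. The sketch below assumes that standard estimator and a simple minor allele frequency filter on polymorphisms; the function names and all numbers are hypothetical, and this is not necessarily the exact procedure used to produce alpha_by_threshold.csv.

```python
def mk_alpha(dn, ds, pn, ps):
    """Standard McDonald-Kreitman estimator: 1 - (Ds * Pn) / (Dn * Ps)."""
    if dn == 0 or ps == 0:
        raise ValueError("Dn and Ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


def count_above_maf(allele_counts, n_chrom, min_maf):
    """Count SNPs whose minor allele frequency is at least min_maf."""
    kept = 0
    for count in allele_counts:
        maf = min(count, n_chrom - count) / n_chrom
        if maf >= min_maf:
            kept += 1
    return kept


# Hypothetical allele counts among 16 sampled chromosomes (not real data).
nonsyn_counts = [1, 1, 2, 8]
syn_counts = [1, 2, 3, 5, 8, 8]

# Keep only polymorphisms at or above the frequency threshold.
pn = count_above_maf(nonsyn_counts, n_chrom=16, min_maf=0.1)
ps = count_above_maf(syn_counts, n_chrom=16, min_maf=0.1)

# Hypothetical substitution counts.
alpha = mk_alpha(dn=100, ds=200, pn=pn, ps=ps)
```

Raising the threshold removes low-frequency polymorphisms, which are the ones most likely to include slightly deleterious variants that bias alpha downward.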
Title PSMC results
Downloaded 4 times
Description [12 files] PSMC results for twelve selected samples.
Download psmc.tar.gz (492.1 Kb)
Title SweeD Results
Downloaded 6 times
Description [42 files] SweeD output for each of the 21 chromosomes, run for the Eastern and Western populations separately.
Download SweeD.tar.gz (239.5 Kb)
Title Analyses of chromosomes 11 and 12
Downloaded 2 times
Description [2 files] Statistics calculated in 50kb sliding windows for chromosomes 11 and 12: nucleotide diversity, absolute divergence and Kst.
Download chr11_chr12.tar.gz (298.7 Kb)
Title Genotypes 4D sites
Downloaded 4 times
Description Genotype calls for all individuals at all fourfold degenerate (4D) sites.
Download set80.Zupdated.realign80.GQ30.ALLSITES.4D.geno.gz (25.60 Mb)
Title Genotypes Codon Pos 1
Downloaded 2 times
Description Genotype calls for all individuals at first codon positions.
Download set80.Zupdated.realign80.GQ30.CODON1.ALLSI...no.gz (51.25 Mb)
Title Genotypes Codon Pos 2
Downloaded 3 times
Description Genotype calls for all individuals at second codon positions.
Download set80.Zupdated.realign80.GQ30.CODON2.ALLSI...no.gz (50.51 Mb)
Title Genotypes Codon Pos 3
Downloaded 1 time
Description Genotype calls for all individuals at third codon positions.
Download set80.Zupdated.realign80.GQ30.CODON3.ALLSI...no.gz (58.96 Mb)
Title Genotypes Intronic
Downloaded 4 times
Description Genotype calls for all individuals at intronic sites.
Download set80.Zupdated.realign80.GQ30.INTRON.ALLSI...no.gz (929.9 Mb)
Title Genotypes 4D lowCUB
Downloaded 3 times
Description Genotypes for all individuals at 4D sites in genes showing minimal codon usage bias.
Download set80.Zupdated.realign80.GQ30.4D.lowCUB.geno.gz (16.80 Mb)
Title Genotypes All Sites Part 1 of 4
Downloaded 4 times
Description Genotype calls for all individuals at all sites. Part 1 of 4.
Download set80.Zupdated.realign80.GQ30.ALLSITES.gen...t1.gz (840.3 Mb)
Title Genotypes All Sites Part 2 of 4
Downloaded 2 times
Description Genotype calls for all individuals at all sites. Part 2 of 4.
Download set80.Zupdated.realign80.GQ30.ALLSITES.gen...t2.gz (850.6 Mb)
Title Genotypes All Sites Part 3 of 4
Downloaded 2 times
Description Genotype calls for all individuals at all sites. Part 3 of 4.
Download set80.Zupdated.realign80.GQ30.ALLSITES.gen...t3.gz (838.5 Mb)
Title Genotypes All Sites Part 4 of 4
Downloaded 1 time
Description Genotype calls for all individuals at all sites. Part 4 of 4.
Download set80.Zupdated.realign80.GQ30.ALLSITES.gen...t4.gz (740.6 Mb)
Title Genotypes Intergenic Part 1 of 3
Downloaded 4 times
Description Genotype calls for all individuals at intergenic sites. Part 1 of 3.
Download set80.Zupdated.realign80.GQ30.INTERGENIC.A...t1.gz (740.5 Mb)
Title Genotypes Intergenic Part 2 of 3
Downloaded 4 times
Description Genotype calls for all individuals at intergenic sites. Part 2 of 3.
Download set80.Zupdated.realign80.GQ30.INTERGENIC.A...t2.gz (744.4 Mb)
Title Genotypes Intergenic Part 3 of 3
Downloaded 3 times
Description Genotype calls for all individuals at intergenic sites. Part 3 of 3.
Download set80.Zupdated.realign80.GQ30.INTERGENIC.A...t3.gz (594.4 Mb)
Title asymptotic_alpha
Downloaded 2 times
Description [5 files] Four files give site frequency spectra for each gene separately, sampling down to 16 individuals from the Western population. The four files correspond to i) Autosomal genes, synonymous SNPs, ii) Autosomal genes, non-synonymous SNPs, iii) Z chromosome genes, synonymous SNPs, iv) Z chromosome genes, non-synonymous SNPs. The R script was used to calculate alpha and produce the plot in Fig. 4. See the text for further details.
Download asymptotic_alpha.tar.gz (189.5 Kb)
Title sfs_v2
Downloaded 2 times
Description [18 files] Site frequency spectra for autosomal scaffolds considering different site classes: 4D sites, introns and intergenic sites. Each site was downsampled to 5 individuals per site, either considering all samples, or only those from a similar locality ('close'). Spectra for the Western and Eastern populations downsampled to 20 individuals per site are also given. See the text for further details.
Download sfs_v2.tar.gz (3.593 Kb)
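One common way to bring a site frequency spectrum down to a smaller sample size is hypergeometric projection, which gives the expected spectrum over all possible subsamples instead of drawing one at random. The sketch below shows that technique as a general illustration; it is not confirmed to be the downsampling method behind these files, and the function names and input spectrum are hypothetical. It counts chromosomes, so 5 diploid individuals would correspond to m = 10 under the assumption of diploid autosomal genotypes.

```python
from math import comb


def project_site(k, n, m):
    """Distribution of derived-allele counts in m chromosomes drawn without
    replacement from n chromosomes that carry k derived copies."""
    if not (0 <= k <= n and 0 <= m <= n):
        raise ValueError("require 0 <= k <= n and 0 <= m <= n")
    total = comb(n, m)
    return [comb(k, j) * comb(n - k, m - j) / total for j in range(m + 1)]


def project_sfs(sfs, m):
    """Project a spectrum down to m chromosomes.

    sfs[k] is the number of sites with k derived copies among
    n = len(sfs) - 1 chromosomes. Returns expected counts for 0..m copies.
    """
    n = len(sfs) - 1
    projected = [0.0] * (m + 1)
    for k, n_sites in enumerate(sfs):
        for j, prob in enumerate(project_site(k, n, m)):
            projected[j] += n_sites * prob
    return projected


# Hypothetical spectrum for 4 chromosomes, projected down to 2.
small = project_sfs([3, 4, 2, 1, 0], m=2)
```

The projection conserves the total number of sites, which is a quick sanity check on any downsampled spectrum.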
Title multiple_regression_v2
Downloaded 3 times
Description [14 files] Data and R code used for multiple regression analysis of neutral diversity. The raw unprocessed data are given in the files "set80.Zupdated.realign80.GQ30.4D.lowCUB.autoScafs.PiDxyGC.w100m150s100.csv" (pi, dxy and GC content for 100 kb windows); "set80.Zupdated.realign80.GQ30.4D.autoScafs.PiDxyGC.w100m250s100.csv" (the same but only for low-CUB genes); "mel1_hec1_wal1_hecu1_era1.realign80.GQ30.ALLSITES.cons.codeml_nW.w100m500.csv" (PAML analysis results, where branch 8..1 refers to the branch leading to H. melpomene); and "MK_gene_a.csv" (gene-by-gene alpha estimates). The "model data" files give the processed data that is imported by the R scripts to run the model, as described in the text.
Download multiple_regression_v2.tar.gz (1.742 Mb)
Title diversity_around_substitutions.tar
Downloaded 6 times
Description [5 files] Nucleotide diversity (pi) and divergence (dx) for each 4D site. The "distToNearest" files give, for each 4D site, the distance to the nearest substitution, either synonymous or nonsynonymous. For synonymous substitutions, there are 100 bootstrapped distances, obtained by subsampling synonymous substitutions. There are distance files for two groups of substitutions, identified using different outgroups, as described in the text.
Download diversity_around_substitutions.tar.gz (294.1 Mb)
Title simulated_error_rates
Downloaded 2 times
Description [20 files] Counts of pairs of actual and inferred genotype patterns for simulated data at different percentage divergences and sequence coverage depths. For example "counts_01_00" refers to the number of sites at which the actual genotype was 0/1 and the inferred genotype was 0/0.
Download simulated_error.tar.gz (1.143 Kb)
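The naming convention described above can be decoded mechanically. The sketch below parses a label such as "counts_01_00" into its actual and inferred genotypes and derives an overall error rate from a set of counts. Only "counts_01_00" comes from the description; the other label, the numbers, and the function names are assumptions for illustration, and the real files may be laid out differently.

```python
def parse_count_label(label):
    """Split a label like 'counts_01_00' into (actual, inferred) genotypes."""
    parts = label.split("_")
    if len(parts) != 3 or parts[0] != "counts":
        raise ValueError(f"unexpected label: {label!r}")
    actual, inferred = parts[1], parts[2]
    if len(actual) != 2 or len(inferred) != 2:
        raise ValueError(f"unexpected genotype code in: {label!r}")
    # '01' becomes '0/1': one character per allele.
    return "/".join(actual), "/".join(inferred)


def genotype_error_rate(counts):
    """Fraction of sites whose inferred genotype differs from the actual one."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sites counted")
    wrong = 0
    for label, n_sites in counts.items():
        actual, inferred = parse_count_label(label)
        if actual != inferred:
            wrong += n_sites
    return wrong / total


# Hypothetical counts; only the label format follows the description.
example_counts = {"counts_00_00": 90, "counts_01_00": 10}
rate = genotype_error_rate(example_counts)
```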
Title Nucleotide diversity for 100 kb windows V2
Downloaded 3 times
Description [17 files] Nucleotide diversity for each population calculated in 100kb windows. Seven different types of sites were considered: all sites, intergenic, intron, codon positions 1, 2 and 3, and four-fold degenerate (4D) sites. Dxy values between populations are also given, but these are based on just a single pair of samples and may be less accurate than those given in the dxy data. There are 8 files giving nucleotide diversity statistics considering only samples with at least 25x coverage, which should be more reliable. These correspond to the same seven site types listed above, plus one file for 4D sites only in genes showing minimal codon usage bias (CUB). There are also two files giving nucleotide diversity at intergenic sites and 3rd codon positions that were used to test whether the proportion of missing data in H. melpomene samples was correlated with nucleotide diversity.
Download pi_v2.tar.gz (1.395 Mb)

When using this data, please cite the original publication:

Martin SH, Möst M, Palmer WJ, Salazar C, McMillan WO, Jiggins FM, Jiggins CD (2016) Natural selection and genetic diversity in the butterfly Heliconius melpomene. Genetics 203(1): 525-541. http://dx.doi.org/10.1534/genetics.115.183285

Additionally, please cite the Dryad data package:

Martin SH, Möst M, Palmer WJ, Salazar C, McMillan WO, Jiggins FM, Jiggins CD (2016) Data from: Natural selection and genetic diversity in the butterfly Heliconius melpomene. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.g0874