Data from: Paralogs are revealed by proportion of heterozygotes and deviations in read ratios in genotyping by sequencing data from natural populations

McKinney GJ, Waples RK, Seeb LW, Seeb JE

Date Published: October 18, 2016

DOI: http://dx.doi.org/10.5061/dryad.cm08m

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0).

Title HDplot R code
Description R code to run HDplot on input in a generic format. An example input file is HDplot_R_genericInput.txt.
Download HDplot.R (722 bytes)
Title HDplot_R_genericInput
Description Example input file for the HDplot R code. Each row is one locus, with five columns of data. The locus_ID column contains the locus name. The depth_a and depth_b columns give the sequence read counts for each allele, summed over the heterozygous individuals at that locus. The num_hets column gives the number of heterozygous individuals at the locus, and the num_samples column gives the total number of individuals in the data set.
Download HDplot_R_genericInput.txt (451.3 Kb)
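As an illustration of how the five columns described above can be used, the following is a minimal sketch, not the authors' HDplot.R. It assumes the file is tab-delimited with a header row (the listing does not state the delimiter), and it assumes the usual formulation in which H is num_hets / num_samples and D is the z-score of depth_a under a binomial expectation of equal reads per allele. The function names `hd_stats` and `read_loci` and the two example rows are hypothetical.

```python
import csv
import io
import math


def hd_stats(depth_a, depth_b, num_hets, num_samples):
    """Return (H, read ratio, D) for one locus.

    Assumed formulation, not taken from HDplot.R:
      H     = proportion of heterozygous individuals
      ratio = share of heterozygote reads carrying allele a
      D     = z-score of depth_a under Binomial(total, 0.5)
    """
    total = depth_a + depth_b
    h = num_hets / num_samples
    if total == 0:
        # No heterozygote reads: ratio and D are undefined.
        return h, float("nan"), float("nan")
    ratio = depth_a / total
    d = (depth_a - total / 2) / math.sqrt(total * 0.5 * 0.5)
    return h, ratio, d


def read_loci(handle):
    """Yield (locus_ID, (H, ratio, D)) from tab-delimited text with a header."""
    for row in csv.DictReader(handle, delimiter="\t"):
        yield row["locus_ID"], hd_stats(
            int(row["depth_a"]),
            int(row["depth_b"]),
            int(row["num_hets"]),
            int(row["num_samples"]),
        )


# Two made-up rows in the assumed layout, not taken from the real file.
example = (
    "locus_ID\tdepth_a\tdepth_b\tnum_hets\tnum_samples\n"
    "locus_1\t50\t50\t10\t40\n"
    "locus_2\t90\t30\t36\t40\n"
)
results = dict(read_loci(io.StringIO(example)))
for locus, (h, ratio, d) in results.items():
    print(locus, round(h, 3), round(ratio, 3), round(d, 3))
```

In this made-up example, locus_1 has balanced reads and a moderate share of heterozygotes, while locus_2 combines a high share of heterozygotes with a skewed read ratio, which is the kind of pattern the title of the data package associates with paralogs.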
Title HDplot_python
Description Python code to run HDplot. It takes the .vcf output of the Stacks program directly as input. The file HDplot_python_exampleInput.vcf is included as an example of the required format. HDplot_python.py requires vcf_to_depth.py, which must be in the same directory.
Download HDplot_python.py (2.121 Kb)
Title vcf_to_depth
Description Python module called by HDplot_python.py that extracts sequence read counts from the .vcf input.
Download vcf_to_depth.py (3.426 Kb)
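The kind of extraction described above can be sketched as follows. This is not the code in vcf_to_depth.py. It is a hypothetical example that assumes each biallelic VCF record carries `GT` and `AD` keys in its FORMAT column, with `AD` written as two comma-separated read counts (reference, alternate). The function name `allele_depths_from_vcf_line` and the example record are invented for illustration.

```python
def allele_depths_from_vcf_line(line):
    """Sum per-allele read counts over heterozygous samples in one VCF record.

    Returns (locus_id, depth_a, depth_b, num_hets, num_samples).
    Assumes tab-separated VCF columns, with FORMAT in column 9 and one
    column per sample after it.
    """
    fields = line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")
    ad_index = format_keys.index("AD")
    samples = fields[9:]

    depth_a = depth_b = num_hets = 0
    for sample in samples:
        parts = sample.split(":")
        genotype = parts[gt_index].replace("|", "/")
        if genotype not in ("0/1", "1/0"):
            continue  # homozygous or missing genotype
        if ad_index >= len(parts):
            continue  # trailing fields dropped for this sample
        counts = parts[ad_index].split(",")
        if len(counts) != 2 or not all(c.isdigit() for c in counts):
            continue  # heterozygote without usable allele depths is skipped
        depth_a += int(counts[0])
        depth_b += int(counts[1])
        num_hets += 1
    return fields[2], depth_a, depth_b, num_hets, len(samples)


# Made-up record with four samples, two of them heterozygous.
record = (
    "un\t1001\tlocus_7\tA\tG\t.\tPASS\tNS=4\tGT:DP:AD"
    "\t0/1:20:12,8\t0/0:15:15,0\t0/1:10:4,6\t./.:0:.,."
)
summary = allele_depths_from_vcf_line(record)
print(summary)
```

The returned tuple matches the five columns of the generic input format (locus_ID, depth_a, depth_b, num_hets, num_samples), so records summarized this way could be written out as rows of that format.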
Title HDplot_python_exampleInput
Description Example .vcf format input for the HDplot_python.py program.
Download HDplot_python_exampleInput.vcf (43.46 Mb)
Title HDplot_simulation
Description R code to simulate data for HDplot. The code also runs HDplot internally and produces plots of the distributions expected under the chosen simulation parameters. Simulation parameters include the number of singleton, duplicate, and diverged duplicate loci; the total population size; the sampled population size; the average read depth per locus; the statistical distribution from which reads per locus are sampled; and the distribution from which reads per allele are sampled.
Download HDplot_simulation.R (22.92 Kb)
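To give a feel for what sampling reads per allele can look like, here is a heavily simplified sketch. It is not a port of HDplot_simulation.R and it ignores population sizes and diverged duplicates. It assumes, for illustration only, a fixed read depth per locus, binomial sampling of reads per allele, an expected allele share of 0.5 for heterozygotes at singleton loci, and an expected share of 0.25 at duplicated loci. The names `simulate_het_reads` and `mean_ratio` are hypothetical.

```python
import random


def simulate_het_reads(n_loci, depth, p_allele_a, rng):
    """Draw (depth_a, depth_b) for n_loci heterozygous genotypes.

    Each of the `depth` reads is assigned to allele a with probability
    p_allele_a, so depth_a follows Binomial(depth, p_allele_a).
    """
    reads = []
    for _ in range(n_loci):
        depth_a = sum(rng.random() < p_allele_a for _ in range(depth))
        reads.append((depth_a, depth - depth_a))
    return reads


def mean_ratio(reads):
    """Average share of reads carrying allele a across loci."""
    return sum(a / (a + b) for a, b in reads) / len(reads)


rng = random.Random(1)  # fixed seed so the run is repeatable
# Assumed expectations: 0.5 for singletons, 0.25 for duplicated loci.
singletons = simulate_het_reads(n_loci=200, depth=100, p_allele_a=0.5, rng=rng)
duplicates = simulate_het_reads(n_loci=200, depth=100, p_allele_a=0.25, rng=rng)

print("singleton mean ratio:", round(mean_ratio(singletons), 3))
print("duplicate mean ratio:", round(mean_ratio(duplicates), 3))
```

Under these assumptions the two classes of loci separate by read ratio, which is the signal that the read-ratio axis of HDplot is meant to expose.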
Title Chinook_sequenceReads
Description Chinook salmon dataset processed with HDplot in the associated manuscript. The file is in the .vcf format output by the Stacks genotyping program.
Download Chinook_sequenceReads.vcf (115.0 Mb)
Title Barberry_sequenceReads
Description Mountain Barberry dataset processed with HDplot in the associated manuscript. The file is in the .vcf format output by the Stacks genotyping program.
Download Barberry_sequenceReads.vcf (4.969 Mb)
Title Parrotfish_sequenceReads
Description Dusky Parrotfish dataset processed with HDplot in the associated manuscript. The file is in the .vcf format output by the Stacks genotyping program.
Download Parrotfish_sequenceReads.vcf (39.54 Mb)

When using this data, please cite the original publication:

McKinney GJ, Waples RK, Seeb LW, Seeb JE (2017) Paralogs are revealed by proportion of heterozygotes and deviations in read ratios in genotyping-by-sequencing data from natural populations. Molecular Ecology Resources 17(4): 656-669. http://dx.doi.org/10.1111/1755-0998.12613

Additionally, please cite the Dryad data package:

McKinney GJ, Waples RK, Seeb LW, Seeb JE (2016) Data from: Paralogs are revealed by proportion of heterozygotes and deviations in read ratios in genotyping by sequencing data from natural populations. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.cm08m