Data from: Population structure and gene flow in the global pest, Helicoverpa armigera

Anderson CJ, Tay WT, McGaughran A, Gordon K, Walsh TK

Date Published: September 28, 2016

DOI: http://dx.doi.org/10.5061/dryad.875n5

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0, Open Data).

Title FASTA mtDNA alignment, 12,248 bp assembly of resequencing data from heliothine species
Downloaded 4 times
Description Heliothine moths were collected between 2004 and 2014 from 16 different countries around the world across various climatic zones and altitudes (Tables S1 and S2), many of which are described in Behere et al. (2007); and Tay et al. (2013). Samples were collected as larvae from wild and crop host plants, as adult moths via light/pheromone traps, or as larvae after bioassay, and preserved in ethanol (>95%) or RNAlater, or stored at -20°C prior to DNA extraction. DNA was extracted from samples using DNeasy blood and tissue kits (Qiagen). Nextera libraries were produced following the manufacturer’s instructions and sequence was generated as 100 bp PE reads (Illumina HiSeq 2000, Biological Resources Facility, Australian National University, Canberra, Australia, as well as at Beijing Genomics Institute, Hong Kong). Sample and sequencing data are included in the supplementary material (Table S2). Raw sequence reads obtained from whole genome sequencing were aligned to the H. armigera mitochondrial genome using BBMap v. 33.43 (http://sourceforge.net/projects/bbmap/), permitting a minimum identity of 0.6 and allowing for a minimum quality threshold equivalent to Q10 over two consecutive bases before reads were trimmed. Reads were assembled using mira v. 4 (Chevreux et al. 2004) before mitobim v. 1.7 (Hahn et al. 2013) was used to iteratively map and assemble whole mitochondrial sequences. Heterozygous bases were removed, sequences were aligned using MAFFT v. 7.017 (Katoh 2002) and sequences were trimmed using the Gblocks v. 0.91b online server (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) (Talavera & Castresana 2007).
Download mtdna.fasta (833.7 Kb)
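A quick sanity check after downloading mtdna.fasta is to confirm that every sequence in the alignment has the same length (the title indicates 12,248 bp). The following is a minimal sketch, not part of the authors' pipeline; the function names read_fasta and alignment_length are illustrative, and the toy records stand in for the real file.

```python
def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into {header: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


def alignment_length(records):
    """Return the shared sequence length, or raise if lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences are not aligned; lengths seen: {sorted(lengths)}")
    return lengths.pop()


# Toy input standing in for the real file (hypothetical sample names).
toy = [">sample_A", "ACGT", "AC", ">sample_B", "ACGTAC"]
toy_records = read_fasta(toy)
toy_length = alignment_length(toy_records)
```

For the real file, the same functions can be applied to an open handle, for example `with open("mtdna.fasta") as fh: alignment_length(read_fasta(fh))`.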
Title SNP data in plink bed format from GBS analysis of Helicoverpa armigera, H. zea and H. punctigera
Downloaded 10 times
Description Heliothine moths were collected between 2004 and 2014 from 16 different countries around the world across various climatic zones and altitudes (Tables S1 and S2), many of which are described in Behere et al. (2007); and Tay et al. (2013). Samples were collected as larvae from wild and crop host plants, as adult moths via light/pheromone traps, or as larvae after bioassay, and preserved in ethanol (>95%) or RNAlater, or stored at -20°C prior to DNA extraction. DNA was extracted from samples using DNeasy blood and tissue kits (Qiagen), before being quantified with a Qubit 2.0. GBS library preparation and sequencing were performed by the Genomic Diversity Facility, Cornell University, NY, USA. Information regarding the samples used and sequencing output is recorded in the supplementary material (Table S1). Briefly, 50 ng of gDNA was digested using PstI, before library construction as in Elshire et al. (2011) and sequencing using an Illumina HiSeq. A negative control was included with each plate. Raw data were assessed for quality and processed using Stacks v. 1.30 (Catchen et al. 2013b). Briefly, process_radtags was used to demultiplex samples, trim reads to 90 bp and assess read quality; the reads were then passed to denovo_map, which was run using default settings. The Populations module was then run, limiting the output to loci existing in at least 5% of samples from each sampling location, with at least 5x coverage. The same module was used to output SNP data in Plink format.
Download denovogbs.tar (2.040 Mb)
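The locus filter described above (present in at least 5% of samples from each sampling location, with at least 5x coverage) can be illustrated with a small sketch. This is not Stacks code and does not reproduce how the Populations module works internally; it assumes that "5x coverage" is applied per sample per locus, and the data layout, function name and sample labels are hypothetical.

```python
def keep_locus(depth_by_location, min_fraction=0.05, min_depth=5):
    """Decide whether a locus passes the presence filter.

    depth_by_location maps location -> {sample: read depth at this locus}.
    A sample counts as having the locus only if its depth >= min_depth
    (assumed interpretation). The locus is kept only if, in every location,
    the fraction of such samples is >= min_fraction.
    """
    for samples in depth_by_location.values():
        if not samples:
            return False
        present = sum(1 for depth in samples.values() if depth >= min_depth)
        if present / len(samples) < min_fraction:
            return False
    return True


# Toy loci with hypothetical location and sample labels.
locus_ok = {"loc1": {"s1": 7, "s2": 0}, "loc2": {"s3": 5}}
locus_bad = {"loc1": {"s1": 4, "s2": 0}, "loc2": {"s3": 9}}
results = (keep_locus(locus_ok), keep_locus(locus_bad))
```

The second toy locus fails because no sample from "loc1" reaches the depth threshold, even though "loc2" is well covered.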
Title A VCF file of all whole-genome sequenced heliothine individuals, aligned to BACs (not those containing CYP337B1, 2 or 3) available on NCBI
Downloaded 1 time
Description BAC descriptions are available in the supplementary document. Heliothine moths were collected between 2004 and 2014 from 16 different countries around the world across various climatic zones and altitudes (Tables S1 and S2), many of which are described in Behere et al. (2007); and Tay et al. (2013). Samples were collected as larvae from wild and crop host plants, as adult moths via light/pheromone traps, or as larvae after bioassay, and preserved in ethanol (>95%) or RNAlater, or stored at -20°C prior to DNA extraction. DNA was extracted from samples using DNeasy blood and tissue kits (Qiagen), before being quantified with a Qubit 2.0. Nextera libraries were produced following the manufacturer’s instructions and sequence was generated as 100 bp PE reads (Illumina HiSeq 2000, Biological Resources Facility, Australian National University, Canberra, Australia, as well as at Beijing Genomics Institute, Hong Kong). Sample and sequencing data are included in the supplementary material (Table S2). Raw reads were aligned to BAC sequences, originally derived from H. armigera and available on NCBI (accessions in supplementary document), using BBMap. Reads were trimmed when quality in at least 2 bases fell below Q10. Only uniquely aligning reads were included in the analysis, to prevent spuriously inferring evolutionary processes occurring independently on each BAC. Outputted BAM files were sorted before duplicate reads were removed and files were annotated with read groups using Picard v. 1.138 (http://picard.sourceforge.net). BAC reference sequences were indexed using Samtools v. 1.1.0 (Li et al. 2009). UnifiedGenotyper in GATK v. 3.3-0 (McKenna et al. 2010) was used to estimate genotypes across all individuals simultaneously, implementing a heterozygosity value of 0.01. Variant call format files containing SNP calls were reformatted into Plink format using VCFtools v. 0.1.12b (Danecek et al. 2011).
Download all_bacs.vcf.bz2 (193.3 Mb)
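Because the VCF is distributed as a bzip2 archive, it can be streamed without decompressing it to disk. The sketch below counts variant records per reference sequence (the CHROM column) using only the Python standard library; it is an illustration rather than part of the authors' workflow, and the function names and toy contig labels are hypothetical.

```python
import bz2
from collections import Counter


def count_records_by_contig(lines):
    """Count VCF data lines per CHROM value, skipping header lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        contig = line.split("\t", 1)[0]
        counts[contig] += 1
    return counts


def count_in_bz2(path):
    """Stream a .vcf.bz2 file in text mode and count records per contig."""
    with bz2.open(path, "rt") as handle:
        return count_records_by_contig(handle)


# Toy VCF lines with hypothetical contig names.
toy_vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "bac_A\t10\t.\tA\tG\n",
    "bac_B\t5\t.\tC\tT\n",
    "bac_A\t20\t.\tG\tA\n",
]
toy_counts = count_records_by_contig(toy_vcf)
```

For the downloaded file, `count_in_bz2("all_bacs.vcf.bz2")` would return the per-BAC record counts.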
Title Supplementary document containing accession codes and eigenstrat analyses of all BACs used as references for whole genome sequencing of heliothine species
Downloaded 6 times
Description Supplementary document containing accession codes and eigenstrat analyses of all BACs used as references for whole genome sequencing of heliothine species.
Download Supplementary document FINAL.docx (712.9 Kb)
Title VCF file of all heliothine individuals that have undergone whole genome sequencing, aligned to the B3 and B1/B2 BACs
Downloaded 7 times
Description Chromosome "1" contains the B3 BAC, Chromosome "B1_B2" contains the B1/B2 BAC. Heliothine moths were collected between 2004 and 2014 from 16 different countries around the world across various climatic zones and altitudes (Tables S1 and S2), many of which are described in Behere et al. (2007); and Tay et al. (2013). Samples were collected as larvae from wild and crop host plants, as adult moths via light/pheromone traps, or as larvae after bioassay, and preserved in ethanol (>95%) or RNAlater, or stored at -20°C prior to DNA extraction. DNA was extracted from samples using DNeasy blood and tissue kits (Qiagen), before being quantified with a Qubit 2.0. Nextera libraries were produced following the manufacturer’s instructions and sequence was generated as 100 bp PE reads (Illumina HiSeq 2000, Biological Resources Facility, Australian National University, Canberra, Australia, as well as at Beijing Genomics Institute, Hong Kong). Sample and sequencing data are included in the supplementary material (Table S2). Raw reads were aligned to BAC sequences, originally derived from H. armigera and available on NCBI (accessions in supplementary document), using BBMap. Reads were trimmed when quality in at least 2 bases fell below Q10. Only uniquely aligning reads were included in the analysis, to prevent spuriously inferring evolutionary processes occurring independently on each BAC. Outputted BAM files were sorted before duplicate reads were removed and files were annotated with read groups using Picard v. 1.138 (http://picard.sourceforge.net). BAC reference sequences were indexed using Samtools v. 1.1.0 (Li et al. 2009). UnifiedGenotyper in GATK v. 3.3-0 (McKenna et al. 2010) was used to estimate genotypes across all individuals simultaneously, implementing a heterozygosity value of 0.01. Variant call format files containing SNP calls were reformatted into Plink format using VCFtools v. 0.1.12b (Danecek et al. 2011).
When linkage disequilibrium (LD)-based pruning was necessary, Plink v. 1.07 (Purcell et al. 2007) was used to remove one SNP from each pair exceeding a pairwise LD threshold (r² = 0.5) within windows of 50 SNPs, moving forward 5 SNPs per iteration.
Download b3_b1b2_snps.vcf.bz2 (28.12 Mb)
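The window-based LD pruning described above (windows of 50 SNPs, a step of 5 SNPs, r² threshold of 0.5) can be sketched in a few lines. This is a simplified greedy illustration and not Plink's implementation: it assumes complete genotype dosages (0, 1 or 2) with no missing data, computes r² as the squared Pearson correlation of dosages, and always drops the later SNP of a correlated pair, which may differ from Plink's choice. The function names are illustrative.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as unlinked
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def ld_prune(genotypes, window=50, step=5, threshold=0.5):
    """Return indices of SNPs kept after greedy sliding-window pruning.

    genotypes is a list of per-SNP dosage lists, ordered by position.
    """
    n = len(genotypes)
    removed = set()
    start = 0
    while start < n:
        idx = [i for i in range(start, min(start + window, n)) if i not in removed]
        for pos, i in enumerate(idx):
            if i in removed:
                continue
            for j in idx[pos + 1:]:
                if j in removed:
                    continue
                if r_squared(genotypes[i], genotypes[j]) > threshold:
                    removed.add(j)  # simplification: drop the later SNP
        start += step
    return [i for i in range(n) if i not in removed]


# Toy data: SNP 1 duplicates SNP 0, SNP 2 is uncorrelated, SNP 3 is monomorphic.
toy_genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 0, 2],
    [0, 0, 0, 0, 0, 0],
]
kept = ld_prune(toy_genotypes)
```

In the toy data only SNP 1 is removed, because it is perfectly correlated with SNP 0; the other pairs fall below the threshold.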

When using this data, please cite the original publication:

Anderson CJ, Tay WT, McGaughran A, Gordon K, Walsh TK (2016) Population structure and gene flow in the global pest, Helicoverpa armigera. Molecular Ecology 25(21): 5296–5311. http://dx.doi.org/10.1111/mec.13841

Additionally, please cite the Dryad data package:

Anderson CJ, Tay WT, McGaughran A, Gordon K, Walsh TK (2016) Data from: Population structure and gene flow in the global pest, Helicoverpa armigera. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.875n5