SNP Data for Aedes aegypti populations in Florida and southern California
Cite this dataset
Pless, Evlyn; Powell, Jeffrey (2021). SNP Data for Aedes aegypti populations in Florida and southern California [Dataset]. Dryad. https://doi.org/10.5061/dryad.8gtht76m8
In the affiliated paper, we compare what are likely the oldest populations of Aedes aegypti in continental North America with some of the newest, to illuminate the range of genetic diversity and structure found within the invasive range of this important disease vector. Aedes aegypti populations in Florida have likely persisted since the 1600s-1700s, while populations in southern California derive from new invasions that occurred in the last ten years. For this comparison, we genotyped 1,193 individuals from 29 sites at 12 highly variable microsatellites, and a subset of these individuals at 23,961 single nucleotide polymorphisms (SNPs). This dataset contains the SNP genotype data.
A total of 156 individuals from ten Florida sites and four southern California sites were genotyped for single-nucleotide polymorphisms (SNPs) using Axiom_aegypti, a high-throughput genotyping chip that has 50,000 probes (Evans et al. 2015). Genotyping was conducted by the Functional Genomics Core at the University of North Carolina, Chapel Hill. To prune the SNPs, we first excluded 2,166 that failed a test of Mendelian inheritance (Evans et al. 2015). Because some analyses can be confounded by SNPs in linkage disequilibrium, we excluded tightly linked SNPs with Plink 1.9 using the command “--indep-pairwise 50 5 0.5”. We also excluded any SNPs that were genotyped in fewer than 98% of the individuals and those with a minor allele frequency of <1%, as these could be genotyping errors. This left 23,961 SNPs for analysis.
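The call-rate and minor-allele-frequency thresholds described above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the dataset (the filtering was done in Plink 1.9); the function names `snp_stats` and `passes_filters` are hypothetical, and the sketch assumes biallelic SNPs with missing alleles coded as "0".

```python
from collections import Counter

MISSING = "0"  # assumed missing-allele code, as in the ped files


def snp_stats(calls):
    """Return (call_rate, minor_allele_frequency) for one SNP.

    `calls` is a list of (allele1, allele2) tuples, one per individual.
    Assumes a biallelic SNP; a monomorphic SNP gets a MAF of 0.
    """
    if not calls:
        return 0.0, 0.0
    # Keep only individuals with both alleles called.
    called = [(a, b) for a, b in calls if a != MISSING and b != MISSING]
    call_rate = len(called) / len(calls)
    counts = Counter(allele for pair in called for allele in pair)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return call_rate, 0.0
    maf = min(counts.values()) / total
    return call_rate, maf


def passes_filters(calls, min_call_rate=0.98, min_maf=0.01):
    """Keep a SNP genotyped in at least 98% of individuals with MAF of at least 1%."""
    call_rate, maf = snp_stats(calls)
    return call_rate >= min_call_rate and maf >= min_maf
```

For example, a SNP with 3 missing calls among 100 individuals has a call rate of 0.97 and would be dropped, regardless of its allele frequencies.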
Evans, B. R., A. Gloria-Soria, L. Hou, C. McBride, M. Bonizzoni, H. Zhao, and J. R. Powell. 2015. A multipurpose, high-throughput single-nucleotide polymorphism chip for the dengue and yellow fever mosquito, Aedes aegypti. G3 (Bethesda) 5:711-718.
We include files in Plink format (ped and map). For more information on these data formats, see https://www.cog-genomics.org/plink/1.9/formats. The ped files contain the SNP calls, with missing data represented as "0". We include these files for both the full, unfiltered dataset and the filtered dataset. For more information about the populations and their abbreviations, see the attached file and the affiliated manuscript.
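As a rough guide to reading these files, the following Python sketch parses single lines of the ped and map formats. It assumes the standard Plink 1.9 text layout: six leading columns in the ped file (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) followed by two allele columns per SNP, and four columns in the map file (chromosome, SNP ID, genetic distance, base-pair position). The function names are hypothetical, and the exact layout of the files in this dataset should be checked before relying on it.

```python
MISSING = "0"  # missing-allele code used in the ped files


def parse_map_line(line):
    """Parse one map line: chromosome, SNP ID, genetic distance, bp position."""
    chrom, snp_id, cm, bp = line.split()
    return {"chrom": chrom, "id": snp_id, "cm": float(cm), "bp": int(bp)}


def parse_ped_line(line):
    """Parse one ped line into (family_id, individual_id, genotypes).

    Each genotype is an (allele1, allele2) tuple, or None if missing.
    """
    fields = line.split()
    family_id, individual_id = fields[0], fields[1]
    alleles = fields[6:]  # skip the six leading sample columns
    if len(alleles) % 2 != 0:
        raise ValueError("expected two allele columns per SNP")
    genotypes = [
        None if MISSING in (a, b) else (a, b)
        for a, b in zip(alleles[0::2], alleles[1::2])
    ]
    return family_id, individual_id, genotypes
```

In Plink text files the family ID column often carries the population label, but whether that holds here is an assumption; the attached population file is the authoritative source for abbreviations.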
National Cancer Institute, Award: RO1 AI101112