
Climate adaptation and genetic differentiation in the mosquito species Culex tarsalis

Cite this dataset

Liao, Yunfei (2024). Climate adaptation and genetic differentiation in the mosquito species Culex tarsalis [Dataset]. Dryad.


The increasing prevalence of vector-borne diseases around the world highlights the pressing need to understand the genetic and environmental factors that shape the adaptability and widespread distribution of mosquito populations. This research focuses on Culex tarsalis, a principal vector for various viral diseases including West Nile virus (WNV). Through the development of a new reference genome and the examination of Restriction-Site Associated DNA sequencing (RAD-seq) data from over 300 individuals across 28 locations, we demonstrate that variables such as temperature, evaporation rates, and the density of vegetation significantly impact the genetic makeup of Cx. tarsalis populations. Among the alleles most strongly associated with environmental factors is a nonsynonymous mutation in a key gene related to circadian rhythms. These results offer new insights into the mechanisms of spread and adaptation in a key North American vector species, which is poised to become a growing health threat to both humans and animals in the face of ongoing climate change.


Sample Collection

Individual mosquitoes were trapped and collected from 28 different locations across the United States and Canada as part of the North American Mosquito Project (NAMP).  All samples used in this study were collected in 2012 between the months of April and October.

Genome Sequencing, Assembly, and Annotation

An F4 population was used to generate the reference genome assembly, and high molecular weight DNA was extracted and sequenced on a Pacific Biosciences (PacBio) RS II (University of Delaware). Thirty-five SMRTcells were generated. The resulting reads provided 76X coverage of the ~790Mb Cx. tarsalis genome, and were assembled with MECAT. Gene annotation was completed by MAKER using EST and protein data from the Culex quinquefasciatus and Aedes aegypti mosquitoes. Sequences were downloaded from the NCBI Taxonomy database, and both Trinotate and InterProScan were used for functional annotation of the MAKER-predicted genes. The annotated assembly was assessed for completeness and quality using BUSCO and QUAST.
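As a rough check on the figures above, coverage is total sequenced bases divided by genome size. The short Python sketch below works backwards from the stated 76X coverage and ~790Mb genome to the implied sequencing yield. The results are back-of-the-envelope estimates derived from those two rounded numbers, not reported values.

```python
# Values stated in the methods text (both are approximate).
GENOME_SIZE_BP = 790e6   # ~790 Mb genome
COVERAGE = 76            # 76X coverage
SMRT_CELLS = 35          # number of SMRT cells sequenced

# coverage = total_bases / genome_size, so total_bases = coverage * genome_size
total_bases = COVERAGE * GENOME_SIZE_BP

# Implied average yield per SMRT cell, assuming all cells contributed equally
# (an assumption of this sketch; real per-cell yields vary).
bases_per_cell = total_bases / SMRT_CELLS

print(f"Implied total yield: {total_bases / 1e9:.1f} Gb")
print(f"Implied mean yield per SMRT cell: {bases_per_cell / 1e9:.2f} Gb")
```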

RAD-Seq Library Preparation, Sequencing, and SNP Calling

DNA was extracted from individual mosquitoes and libraries were constructed for Restriction-site Associated DNA Sequencing (RAD-Seq) according to previously established protocols.  The SbfI enzyme was used to digest purified DNA, and individual samples were barcoded prior to Illumina sequencing.  Raw sequencing reads were subsequently filtered to remove any reads with an uncalled base, an error in the restriction enzyme cut site, or with an average Phred quality score less than 20 over 15 consecutive nucleotides.  Filtered reads were then de-multiplexed using the Stacks software package.
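The three read-level filters described above can be sketched in Python as follows. This is an illustration only, not the pipeline used to produce the dataset. The function names, the Phred+33 quality encoding, the `barcode_len` parameter, and the expected cut-site remnant `TGCAGG` (the sequence an SbfI-digested fragment is assumed to begin with) are assumptions of this sketch.

```python
from typing import List

# Assumption: SbfI cuts CCTGCA^GG, so a read is expected to carry this
# remnant immediately after its barcode.
SBFI_REMNANT = "TGCAGG"


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def passes_filters(seq: str, qual: str, barcode_len: int = 0,
                   window: int = 15, min_mean_q: float = 20.0) -> bool:
    """Return True if a read survives all three filters."""
    seq = seq.upper()
    # 1. Discard reads containing an uncalled base.
    if "N" in seq:
        return False
    # 2. Discard reads whose restriction-site remnant is not intact.
    if seq[barcode_len:barcode_len + len(SBFI_REMNANT)] != SBFI_REMNANT:
        return False
    # 3. Discard reads where any window of `window` consecutive bases
    #    has a mean quality below `min_mean_q`.
    scores = phred_scores(qual)
    if len(scores) < window:
        return False  # too short for a full window; a choice made by this sketch
    threshold = min_mean_q * window
    window_sum = sum(scores[:window])
    if window_sum < threshold:
        return False
    for i in range(window, len(scores)):
        # Slide the window one base to the right.
        window_sum += scores[i] - scores[i - window]
        if window_sum < threshold:
            return False
    return True
```

A call such as `passes_filters(read_seq, read_qual, barcode_len=5)` would apply the filters to a read that still carries a hypothetical 5-nt barcode, which matches the order described above (filtering before de-multiplexing).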

After de-multiplexing, raw reads from each individual were aligned to the draft assembly of the Cx. tarsalis genome using BWA MEM, and individuals with poor mapping rates (less than 50%) were excluded from subsequent analyses.  The mapped reads for the remaining 378 samples were then merged using the Samtools pipeline and SNPs were called using the GATK HaplotypeCaller.  The SNPs were filtered using VCFtools v0.1.12a to retain only sites with a minimum average individual read depth of 10X and a maximum of 20% missing data, resulting in a total of 457,387 sites.  Individual samples were then filtered again to remove individuals with missing data at more than 50% of the remaining SNP sites, leaving 322 samples from 28 different locations for further analysis.
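The two-stage missing-data filter (sites first, then individuals) can be illustrated with a small NumPy sketch. It operates on a hypothetical depth matrix rather than a VCF file and is not the VCFtools implementation. The function name, the use of NaN to mark a missing genotype, and the choice to average depth over called genotypes only are all assumptions of this example.

```python
import numpy as np


def filter_sites_then_samples(depth, min_mean_depth=10.0,
                              max_site_missing=0.20, max_sample_missing=0.50):
    """Return boolean masks (keep_sites, keep_samples).

    depth: array of shape (n_sites, n_samples) holding per-genotype read
    depth, with np.nan marking a missing genotype (assumption of this sketch).
    """
    depth = np.asarray(depth, dtype=float)
    called = ~np.isnan(depth)

    # Stage 1: site filter on mean depth and on fraction of missing genotypes.
    n_called = called.sum(axis=1)
    depth_sum = np.where(called, depth, 0.0).sum(axis=1)
    mean_depth = np.divide(depth_sum, n_called,
                           out=np.zeros_like(depth_sum), where=n_called > 0)
    site_missing = (~called).mean(axis=1)
    keep_sites = (mean_depth >= min_mean_depth) & (site_missing <= max_site_missing)

    # Stage 2: sample filter, computed over the retained sites only.
    if not keep_sites.any():
        return keep_sites, np.zeros(depth.shape[1], dtype=bool)
    sample_missing = (~called[keep_sites]).mean(axis=0)
    keep_samples = sample_missing <= max_sample_missing
    return keep_sites, keep_samples


if __name__ == "__main__":
    nan = np.nan
    toy_depth = [
        [12, 15, 11, 14, nan],    # adequate depth, 20% missing
        [3, 4, 5, 2, 6],          # mean depth too low
        [20, nan, nan, 18, 25],   # 40% missing
        [10, 10, 10, 10, nan],    # adequate depth, 20% missing
        [30, 22, 16, 12, nan],    # adequate depth, 20% missing
    ]
    sites, samples = filter_sites_then_samples(toy_depth)
    print("sites kept:", sites)
    print("samples kept:", samples)
```

Applying the sample filter after the site filter matters: an individual's missingness is measured only against sites that were already judged reliable, which is the order described above.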

Environmental data

Climate data were extracted from the ERA5-Land monthly averaged dataset provided by the Copernicus Climate Change Service. The original dataset has a temporal resolution of 1 hour and a native spatial resolution of 9 km on a reduced Gaussian grid (TCo1279). To make the data more accessible and suitable for a range of analyses, they were regridded to a regular lat-lon grid with a resolution of 0.1 x 0.1 degrees.
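On a regular 0.1 x 0.1 degree grid, matching a collection site to a grid cell reduces to rounding its coordinates to the nearest tenth of a degree. The Python sketch below shows one way to do this. It assumes grid nodes sit at integer multiples of 0.1 degrees and offers an optional switch to a 0 to 360 longitude convention. The function name and the example coordinates are hypothetical, and the lookup method actually used for this dataset is not described in the text.

```python
NODES_PER_DEGREE = 10  # a 0.1-degree grid has 10 nodes per degree


def nearest_grid_node(lat: float, lon: float, wrap_360: bool = False):
    """Snap a coordinate to the nearest node of a regular 0.1-degree grid.

    Assumes nodes lie at integer multiples of 0.1 degrees. Set wrap_360=True
    if the target grid stores longitude as 0..360 instead of -180..180.
    """
    if wrap_360:
        lon = lon % 360.0
    # Work in integer tenths of a degree to avoid accumulating float error.
    lat_idx = round(lat * NODES_PER_DEGREE)
    lon_idx = round(lon * NODES_PER_DEGREE)
    return lat_idx / NODES_PER_DEGREE, lon_idx / NODES_PER_DEGREE


if __name__ == "__main__":
    # Hypothetical site coordinates, for illustration only.
    print(nearest_grid_node(38.5449, -121.7405))
    print(nearest_grid_node(38.5449, -121.7405, wrap_360=True))
```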