
A genomic and morphometric analysis of alpine bumblebees: Ongoing reductions in tongue length but no clear genetic component

Cite this dataset

Webster, Matthew T. et al. (2022). A genomic and morphometric analysis of alpine bumblebees: Ongoing reductions in tongue length but no clear genetic component [Dataset]. Dryad.


Over the last six decades, populations of the bumblebees Bombus sylvicola and Bombus balteatus in Colorado have experienced decreases in tongue length, a trait important for plant-pollinator mutualisms. It has been hypothesized that this observation reflects selection resulting from shifts in floral composition under climate change. Here we used morphometrics and population genomics to determine whether morphological change is ongoing, investigate the genetic basis of morphological variation, and analyse population structure in these populations.

We analysed whole-genome sequencing data and morphometric measurements of 580 samples of both species from seven high-altitude localities. Out of 281 samples originally identified as B. sylvicola, 67 formed a separate genetic cluster comprising a newly discovered cryptic species (“incognitus”). However, an absence of genetic structure within species suggests that gene flow is common between mountains. We did not discover any genetic associations with tongue length, but a SNP related to production of a proteolytic digestive enzyme was implicated in body size variation. We identified evidence of covariance between kinship and both tongue length and body size, which is suggestive of a genetic component of these traits, although it is possible that shared environmental effects between colonies are responsible. Our results provide evidence for ongoing modification of a morphological trait important for pollination and indicate that this trait probably has a complex genetic and environmental basis.

This archive contains genetic variation data derived from genome sequencing of 580 bumblebee samples collected from high-elevation locations in Colorado. The species are Bombus sylvicola (n=214), Bombus balteatus (n=299) and "incognitus" (n=67).


We extracted DNA from the thoraces of bumblebees collected from across the seven sampling sites using the Qiagen Blood and Tissue kit. We prepared dual-indexed libraries using the Nextera Flex kit and performed sequencing on an Illumina HiSeq X to produce 2 × 150 bp reads, using an average of 36 samples per lane.

The Bombus sylvicola and "incognitus" samples were mapped to the Bombus sylvicola reference assembly (GCA_019677175.1) and the Bombus balteatus samples were mapped to the Bombus balteatus assembly (GCA_019201815.1). We mapped reads to the two reference genome assemblies using the mem algorithm in BWA. We performed sorting and indexing of the resultant BAM files using samtools and marked duplicate reads using Picard.

We used the Genome Analysis Toolkit (GATK) to call variants. We first ran HaplotypeCaller using default parameters on the BAM file of each sample to generate a gVCF file for each sample. We then used GenomicsDBImport and GenotypeGVCFs with default parameters to call variants for all samples mapping to each reference assembly separately. We applied a set of hard filters using the VariantFiltration tool to filter for reliable SNPs using the following thresholds: QD < 2, FS > 60, MQ < 40, MQRankSum < −12.5, ReadPosRankSum < −8 (see the GATK documentation for full descriptions of each filter). Only biallelic SNPs were considered for downstream analysis.
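
To make the hard-filter logic concrete, the following is a minimal Python sketch, not the pipeline used to produce this dataset (which ran GATK VariantFiltration). It assumes that each listed expression describes a condition under which a SNP is excluded, as in GATK's general hard-filtering guidance, and that an annotation missing from a record does not cause a failure. The names `HARD_FILTERS`, `failed_filters` and `is_biallelic_snp` are hypothetical and introduced only for illustration.

```python
# Thresholds quoted in the methods text. ASSUMPTION: each expression
# describes a condition under which a SNP FAILS the filter.
HARD_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def parse_info(info_field):
    """Parse a VCF INFO string such as 'QD=12.3;FS=1.2' into a dict of strings."""
    info = {}
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value
    return info


def failed_filters(info_field):
    """Return the names of the hard filters that a record fails.

    ASSUMPTION: annotations that are absent or set to '.' are skipped
    rather than treated as failures.
    """
    info = parse_info(info_field)
    failed = []
    for name, fails in HARD_FILTERS.items():
        value = info.get(name)
        if value is None or value == ".":
            continue
        if fails(float(value)):
            failed.append(name)
    return failed


def is_biallelic_snp(ref, alt):
    """True if REF and ALT are each a single nucleotide (one ALT allele only)."""
    return len(ref) == 1 and len(alt) == 1 and ref in "ACGT" and alt in "ACGT"
```

A record would be retained in this sketch only when `failed_filters` returns an empty list and `is_biallelic_snp` is true for its REF and ALT columns.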

Usage notes

The files are VCF (variant call format) files generated by GATK (genome analysis toolkit). They are plain text files that can be read by a wide range of bioinformatic software including VCFtools. A description of the format is here:


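Because VCF files are tab-delimited plain text, they can also be inspected without specialised software. The following is a minimal Python sketch that reads VCF lines and extracts per-sample genotypes. It assumes the standard VCF column layout (eight fixed columns, then FORMAT, then one column per sample); the function names, the sample names `S1` and `S2`, and the contig name are hypothetical and are not taken from this archive.

```python
def iter_vcf_records(lines):
    """Yield one dict per variant record from an iterable of VCF text lines."""
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        # FORMAT lists the per-sample keys; locate the genotype (GT) key.
        gt_index = fields[8].split(":").index("GT")
        genotypes = {
            sample: call.split(":")[gt_index]
            for sample, call in zip(samples, fields[9:])
        }
        yield {
            "chrom": fields[0],
            "pos": int(fields[1]),
            "ref": fields[3],
            "alt": fields[4],
            "genotypes": genotypes,
        }


def alt_allele_count(genotype):
    """Count alternate alleles in a GT string such as '0/1' or '1|1'.

    Missing alleles ('.') are ignored. Only allele '1' is counted,
    which is sufficient for biallelic sites.
    """
    alleles = genotype.replace("|", "/").split("/")
    return sum(1 for allele in alleles if allele == "1")


if __name__ == "__main__":
    # Hypothetical in-memory example; real files would be opened with open().
    example = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
        "contig1\t100\t.\tA\tG\t50\tPASS\tQD=20\tGT:DP\t0/1:10\t1/1:8\n",
    ]
    for record in iter_vcf_records(example):
        counts = {s: alt_allele_count(g) for s, g in record["genotypes"].items()}
        print(record["chrom"], record["pos"], counts)
```

For files of this size, dedicated tools such as VCFtools will be far faster; the sketch is only meant to show how the columns are laid out.
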

Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning, Award: 2016-00535

Vetenskapsrådet, Award: 2018‐05973

Science for Life Laboratory, Award: NP00046