
Admixture in Africanized honey bees (Apis mellifera) from Panamá to San Diego, California (U.S.A.) honey bee dataset

Cite this dataset

Zarate, Daniela (2022). Admixture in Africanized honey bees (Apis mellifera) from Panamá to San Diego, California (U.S.A.) honey bee dataset [Dataset]. Dryad.


The Africanized honey bee (AHB) is a New World amalgamation of several subspecies of the western honey bee (Apis mellifera), a diverse taxon historically grouped into four major biogeographic lineages: A (African), M (Western European), C (Eastern European), and O (Middle Eastern). In 1956, the accidental release of experimentally bred “Africanized” hybrids from a research apiary in São Paulo, Brazil, initiated a hybrid species expansion that now extends from northern Argentina to northern California (U.S.A.). Here, we assess nuclear admixture and mitochondrial ancestry in 60 bees from four countries (Panamá, Costa Rica, Mexico, and the U.S.A.) across this expansive range to characterize the ancestry of AHB several decades after the initial introduction and to test the prediction that African ancestry decreases with increasing latitude. We find that AHB nuclear genomes from Central America and Mexico are of predominantly African ancestry (76–89%), with smaller contributions from Western and Eastern European lineages. Similarly, nearly all honey bees from Central America and Mexico possess mitochondrial ancestry from the African lineage, with few individuals having European mitochondria. In contrast, AHB from San Diego (CA) show markedly lower African ancestry (38%), with substantial genomic contributions from all four major honey bee lineages and mitochondrial ancestry from all four clades as well. Genetic diversity measures from all New World populations equal or exceed those of ancestral populations. Interestingly, the feral honey bee population of San Diego emerges as a reservoir of diverse admixture and high genetic diversity, making it a potentially rich source of genetic material for honey bee breeding.


We collected 60 western honey bees (n = 15 per country) from sites in each of four countries: the isthmus of Panamá; Guanacaste National Park, Costa Rica; Chiapas, Mexico; and San Diego County, California, U.S.A. (Table 1). All samples were collected by hand-netting between June 2015 and August 2016. Honey bees in Panamá were collected with an insect net while they foraged either on natural vegetation in rural areas or on street vendor syrup dispensers in urban areas. They were collected across the isthmus of Panamá from five sites, each separated by > 5 km: Panamá City, Gamboa, Barro Colorado Island (BCI), Santa Rita Arriba, and Colón. Individuals from Costa Rica were collected from the Santa Rosa sector of Guanacaste National Park in northwestern Costa Rica. These bees were collected from a localized region and likely originate from a small number of feral colonies. Honey bees from Mexico were collected from an apiary in the southern state of Chiapas, with each bee collected from a different hive. Honey bees from San Diego County, California, U.S.A. were workers collected while foraging on flowers. San Diego bees were collected across 15 sites, each separated by > 5 km, so that each bee likely represents a worker from a different colony. The most distant collection sites were separated by 65 km. Collection sites ranged from urban to rural settings. Because hobbyist and agricultural beekeeping are present in the area, we cannot rule out the possibility that some captured honey bees were from managed rather than feral hives. However, most honey bee foragers in San Diego are from feral hives.
We extracted DNA from crushed heads of the 60 sampled honey bees using the standard protocol of the Qiagen DNeasy Blood & Tissue extraction kit. DNA purity and appropriate concentration for sequencing were validated with a Qubit fluorometer prior to submission for library preparation.
The DNA was submitted for KAPA library construction and whole-genome sequencing at the Institute for Genomic Medicine (IGM), UC San Diego. All 60 individuals were multiplexed and sequenced across three lanes of an Illumina HiSeq 4000 platform to produce 100-bp paired-end reads. Average genomic coverage per individual was 29 ± 1.2×.
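For readers working with the raw reads, the reported coverage can be related to read counts with the usual back-of-the-envelope estimate: total sequenced bases divided by genome size. The sketch below is illustrative only. The record does not state how coverage was computed; the genome size of roughly 225 Mb is an assumption (an approximate size for the A. mellifera reference assembly), and the function name is hypothetical. Only the 100-bp read length and the 29× target come from the text above.

```python
def expected_coverage(read_pairs: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size.

    Each read pair contributes two reads of read_length_bp bases.
    """
    return (2 * read_pairs * read_length_bp) / genome_size_bp


# Assumption: approximate A. mellifera genome size; not stated in this record.
GENOME_SIZE_BP = 225_000_000
# From the text: 100-bp paired-end reads.
READ_LENGTH_BP = 100

# Read pairs per bee that would correspond to 29x under these assumptions.
pairs_for_29x = round(29 * GENOME_SIZE_BP / (2 * READ_LENGTH_BP))

print(pairs_for_29x)
print(expected_coverage(pairs_for_29x, READ_LENGTH_BP, GENOME_SIZE_BP))
```

This estimate ignores duplicate, unmapped, and low-quality reads, so realized coverage from aligned data would typically be somewhat lower than the value it returns.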