DNA sequence data - Bicyclus

Cite this dataset

Aduse-Poku, Kwaku (2022). DNA sequence data - Bicyclus [Dataset]. Dryad.


Compared to other regions, the drivers of diversification in Africa are poorly understood. We studied a radiation of insects with over 100 species occurring in a wide range of habitats across the Afrotropics to investigate the fundamental evolutionary processes and geological events that generate and maintain patterns of species richness on the continent. By investigating the evolutionary history of Bicyclus butterflies within a phylogenetic framework, we inferred the group’s origin at the Oligo-Miocene boundary from ancestors in the Congolian rainforests of central Africa. Abrupt climatic fluctuations during the Miocene (ca. 19–17 Ma) likely fragmented ancestral populations, resulting in at least eight early-divergent lineages. Only one of these lineages appears to have diversified during the drastic climate and biome changes of the early Miocene, radiating into the largest group of extant species. The other seven lineages diversified in forest ecosystems during the late Miocene and Pleistocene when climatic conditions were more favorable—warmer and wetter. Our results suggest changing Neogene climate, uplift of eastern African orogens, and biotic interactions have had different effects on the various subclades of Bicyclus, producing one of the most spectacular butterfly radiations in Africa.


Of the 102 currently recognized Bicyclus species, 94 (92%) were included in this study (Supplementary Appendix S1, available on Dryad). The included samples covered all 16 currently recognized species-groups (Aduse-Poku et al. 2017). We also included the two recognized species of its sister genus, Hallelesis (H. halyma and H. asochis), and 11 other closely related Satyrinae taxa as outgroups, selected on the basis of evolutionary relationships recovered in two earlier higher-level phylogenetic studies (Espeland et al. 2018; Chazot et al. 2019). We used a total of ten protein-coding loci: one mitochondrial (cytochrome c oxidase subunit I, COI) and nine nuclear (carbamoyl phosphate synthetase domain protein, CAD; ribosomal protein S5, RpS5; ribosomal protein S2, RpS2; wingless, wgl; cytosolic malate dehydrogenase, MDH; glyceraldehyde-3-phosphate dehydrogenase, GAPDH; elongation factor 1 alpha, EF-1α; arginine kinase, ArgKin; and isocitrate dehydrogenase, IDH) (Wahlberg and Wheat 2008).

Most sequences used in this study were obtained from Aduse-Poku et al. (2017); additional sequences from six taxa were obtained using the protocols described in that study. All sequences were aligned manually using BioEdit 7.2 (Hall 1999), and the properties and reading frames of the protein-coding genes were examined in MEGA X 10.0.5 (Kumar et al. 2018). Individual gene trees were first generated using IQ-TREE 1.6.11 (Nguyen et al. 2015) to check for contamination and sequence quality. Cleaned sequences were then concatenated to produce a final matrix of up to 7735 aligned nucleotides from 107 taxa (Supplementary Appendix S1 available on Dryad).
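
The concatenation step can be illustrated with a minimal Python sketch. This is not the workflow used to build the dataset (the tool used for concatenation is not stated here); the function name `concatenate_alignments`, the `?` missing-data symbol, and the toy taxa and sequences are all assumptions made for illustration. It shows why the matrix holds "up to" a given number of nucleotides per taxon: a taxon lacking a locus is padded with missing-data characters so that every row keeps the same length.

```python
from typing import Dict, List


def concatenate_alignments(
    loci: Dict[str, Dict[str, str]],
    order: List[str],
    missing: str = "?",
) -> Dict[str, str]:
    """Join per-locus alignments into one matrix (hypothetical helper).

    loci maps a locus name to {taxon: aligned sequence}.
    Taxa absent from a locus are padded with the missing-data symbol.
    """
    # Collect every taxon that appears in at least one locus.
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for locus in order:
        aln = loci[locus]
        # All sequences within one locus must already be aligned.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences differ in length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy data only; these are not sequences from the dataset.
toy_loci = {
    "COI": {"taxon_A": "ATGC", "taxon_B": "ATGA"},
    "CAD": {"taxon_A": "GGT"},
}
supermatrix = concatenate_alignments(toy_loci, order=["COI", "CAD"])
for name, seq in supermatrix.items():
    print(name, seq)
```

In this toy case, taxon_B has no CAD sequence, so its row is filled out with three missing-data characters and both rows end up seven characters long.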