Skip to main content

Radiation of tropical island bees and the role of phylogenetic niche conservatism as an important driver of biodiversity

Cite this dataset

Dorey, James B et al. (2020). Radiation of tropical island bees and the role of phylogenetic niche conservatism as an important driver of biodiversity [Dataset]. Dryad.


Island biogeography explores how biodiversity in island ecosystems arises and is maintained. The topographical complexity of islands can drive speciation by providing a diversity of niches that promote adaptive radiation and speciation. However, recent studies have argued that phylogenetic niche conservatism, combined with topographical complexity and climate change, could also promote speciation if populations are episodically fragmented into climate refugia that enable allopatric speciation. Adaptive radiation and phylogenetic niche conservatism therefore both predict that topographical complexity should encourage speciation, but they differ strongly in their inferred mechanisms. Using genetic (mtDNA and SNP) and morphological data we show high species diversity (22 species) in an endemic clade of Fijian Homalictus bees, with most species restricted to highlands and frequently exhibiting narrow geographical ranges. Our results indicate that elevational niches have been conserved across most speciation events, contradicting expectations from an adaptive radiation model but concordant with phylogenetic niche conservatism. Climate cycles, topographical complexity, and niche conservatism could interact to shape island biodiversity. We argue that phylogenetic niche conservatism is an important driver of tropical bee biodiversity but that this phylogenetic inertia also leads to major extinction risks for tropical ectotherms under future warming climates.


Sample locations and collection methods. Collections throughout Fiji were made between 2010 and 2017 from multiple localities including the main islands of Viti Levu, Vanua Levu, Kadavu and Taveuni, as well as multiple small islands in the Lau group (SI Appendix, Fig. S5). Sampling of specimens at each location was not biased towards particular species because, for these very small bees, only H. achrostuscould be easily identified in the field due to its distinctive coloration; all other species required microscopy or DNA sequencing for species identification.

Samples were collected from 3 m to 1,324 m asl (highest elevation of Fiji) by sweep netting both native and introduced plants, and from nesting aggregations along roadsides. For each collection site, latitude, longitude and elevation were recorded using a Garmin 550 (Garmin Ltd., USA); latitude and longitude were then checked against satellite images (Google Earth) to confirm accuracy. Once collected, bees were immediately transferred into vials containing 98% ethanol. Vials were kept cool at ~5˚C and ethanol was replaced within a week of collection to reduce DNA degradation.

Maps of Fiji were produced in ArcMap (50)and a digital elevation model (DEM) of the archipelago was provided by Fiji Lands Information System (FLIS).

Sampling bias and elevational species richness. It was not possible to evenly sample bees across all geographical and elevational regions of Fiji because physical access to many regions was restricted by terrain and lack of roads. Access constraints could therefore affect sampling effort and this, in turn, could influence ability to recover true species richness in different elevational bands.  Here, we quantize sampling effort as the number of DNA sequences obtained for different elevations, categorized into 200 m asl bands. Because specimens were only identified to species levels after DNA sequencing, the number of obtained sequences represents sampling effort. We examined whether this sampling effort may have influenced our estimates of species richness using multiple regression, where the number of detected species was the dependent variable and the number of sequences (sampling effort) and elevational band were the independent variables. The relative importance of sampling effort and elevation band for detected species richness can then be explored by regression bvalues and their statistical significance.

DNA extraction and sequencingTissue samples for DNA extraction were obtained by removing a single hind leg from each of the 764 specimens. For all samples obtained after 2014, DNA extraction and PCR amplification was completed at the South Australian Regional Facility for Molecular Ecology and Evolution (SARFMEE). DNA extraction and PCR amplification of COI prior to the 2014 samples was completed at the Canadian Centre for DNA Barcoding (CCDB) at the Biodiversity Institute of Ontario (31)and amplification used the universal primer pair LepF1 and LepR2 (31, 51). Extractions at SARFMEE followed protocols described by (52)with the subsequent DNA eluted into 75 µL of TLE buffer. PCR amplification of the 710 bp fragment of the DNA (COI) was completed using the primers LCO1490 (forward) and HCO2198 (reverse). The 25 µL PCR reactions comprised the following reagents: Sterile H2O (15.9 µL), MRT buffer (5 µL), 1 µL (5 µM) of LCO1490, 1 µL(5 µM) of HCO2198, Immolase Taq (0.1 µL) and DNA from specimen (2 µL). The thermocycling regime comprised of one cycle at 94˚C for 10 minutes, then five cycles at 94˚C for 60 seconds, 46˚C for 90 seconds, 72˚C for 75 seconds, followed by 35 cycles at 94˚C for 60 seconds, 51˚C for 90 seconds, 72˚C for 75 seconds, followed by 72˚C for 10 minutes and then 25˚C for 2 minutes. 

Sequences were checked against the NCBI BLAST database to screen for non-target DNA. Forward and reverse sequences were aligned and chromatograms visually checked before creating final consensus sequences in Geneious version 10.2.2 (53). Initial alignments were trimmed to 630 bp to avoid any problems associated with missing data.

Phylogenetic, elevational and species analyses. The full COI alignment consisted of 630 bp for 764 specimens. Partition finder version 2 was employed using BIC and a greedy algorithm to find the best partition schemes and DNA substitution models from widely-used (i.e. MrBayes) models (54-56). The first and second codon positions were combined into a single partition with an HKY + I substitution model. A GTR substitution model was applied to third codon position. The BEAST file and parameters for phylogenetic analyses were set using BEAUti version 1.8.4 (57). Because of the small numbers of substitutions on each branch, a strict clock was used to avoid overparameterization. To infer changes in elevation across the tree we included elevation as a continuous trait using a strict or relaxed Brownian motion model (confirmed as adequate given our λ, κ, and δestimates;Table 1). Phylogenetic analyses were implemented in BEAST version 1.10(34)with 200 million iterations sampled every 50,000thiteration. Resulting log files were analyzed in Tracer version 1.6 (58)and a burnin of 2.5x107iterations was employed, which was always after stationarity had been achieved. Maximum clade credibility trees and posterior probability support values were obtained using TreeAnnotator Version 1.8.4 (57). Each run was performed four times for each analysis to ensure consistent results and stationarity. Post-burnin log and tree files for each run were then combined using LogCombiner version 2.5.2 (59)for the final analysis.

To infer the evolutionary mode and phylogenetic signal in the elevation data, we used BayesTraits version 3.0 (29). The tree-transformation models employed in BayesTraits assume that each terminal taxon is a species, hence we repeated the BEAST analysis using only one DNA sample from each species, and elevation data as either the median or minimum for all samples of that species. The (reduced) BEAST analysis used 100 million iterations, sampling every 50,000thiteration; stationarity and burnin was checked as above. The resulting consensus tree was run in BayesTraits using the median and the minimum elevational value for each terminal taxon to estimate λ (degree of phylogenetic signal), κ (degree of punctuated evolution), and δ (degree of early burst, adaptive radiation). The model of best fit for each estimate was chosen using Akaike’s Information Criterion with 100 bootstrap replicates in Tracer (60). Analyses in BayesTraits used 500 million iterations sampled every 50,000thiteration. Each run was performed four times for each model at each elevation to ensure consistent results. BayesTraits log files for each run were then combined using LogCombiner version 2.5.2 (59)for the final analysis.

We attempted to co-estimate phylogeny and elevational niche evolution, but these analyses repeatedly failed to converge. Thus, to infer elevational changes across the full phylogeny, we mapped elevation across all post-burnin trees sampled in the full COI analysis. This was done using BEAST, under a standard rate-constant Brownian motion model, as well as a rate-variable Brownian motion model, which assumes rates vary across branches according to an uncorrelated relaxed clock (35). Stationarity and burnin were confirmed as above. Both models gave very similar ancestral state reconstructions, but the latter model fitted better and is shown in Fig. 1. 

Genetic analyses of bee clades were explored using Arlequin version 3.11 (61). For each species with multiple haplotypes and a sample size of more than 10 specimens we calculated haploytype diversity (h) and pairwise FSTvalues.  


SNP quality filtering and analyses. The thorax and front legs were taken from 19 individuals from H. fijiensis, H. tuiwawae, H. ostridorsum, H. groomi and H.sp.S, respectively. To perform Restriction-site Associated DNA sequencing (RAD seq), the solid state method Diversity Arrays Technology (DArT) was used (62). The restriction enzymes used werea combination of PstI and HpaIIenzymes. Only female specimens were used to avoid the impact of male haploidy on SNP diversity. Post filtering, missing data was capped at 1.16%.

A total of 62,426 SNP loci were called across all species. Using the R package ‘DArTR’ version 1.0.5 low quality loci were removed at a threshold of 0.85% removing loci with 15% or more missing values (63), leading to retention of 8,381 SNP loci.The neighbour joining tree (SI Appendix, Fig. S4)was made using the R package ‘ape’ with the ‘nj’ function (64).

Once SNP data were filtered they were subjected to a discriminant analysis of principal components (DAPC) using the DAPC procedure (65)in the Adegenet package in R (66). The DAPC was used to identify the number of genetic clusters within the SNP data and the relationship between these clusters. DAPC uses synthetic variables constructed as linear combinations from the original alleles, showing the largest between group variations and lowest within group variation. Discriminant analysis also provides membership probabilities of each individual to the different clusters. Our DAPC followed protocols outlined by Jombart (65). 

Usage notes

Files are uploaded in a zip folder. Within this zip, files and folders are labelled according to analyses. These materials are mostly run files.


New Colombo Plan

New Colombo Plan

Playford Trust