Skip to main content

Data for: Biogeographic history predicts bee community structure across floral resource gradients in southeast Australia

Cite this dataset

Brown, Julian; Cunningham, Saul (2022). Data for: Biogeographic history predicts bee community structure across floral resource gradients in southeast Australia [Dataset]. Dryad.


Aim: Plants populations are declining in their native ranges around the globe through the expansion of agriculture, urbanization, and plant invasions. We test the hypothesis that animal species that have spent more of their evolutionary history in a region are more dependent on native plants, particularly those plants that have spent more of their evolutionary history in the region, and are therefore more negatively impacted by native plant decline.

Location: Yarra Valley landscapes, Australia

Methods: We test the presence and pattern of phylogenetic signal in native bee community responses to local flower density of ancient Australian plant lineages and the amount of native vegetation in the surrounding landscape across farm and native vegetation sites. We also test phylogenetic signal in the frequency of bee visitation to flowers from ancient Australian plant lineages. We compare the patterns of phylogenetic signal to the current understanding of bee biogeographic histories to evaluate our hypothesis.

Results: There was significant phylogenetic signal in responses to flower density of plants from ancient Australian lineages, and the frequency of visitation to these flowers, with most species from the ancient Australian bee clade being positively associated with these flowers. This is consistent with our hypothesis. Significant phylogenetic signal in response to native vegetation in the surrounding landscape was driven primarily by the more recently arrived bee linages, with ancient lineages able to persist on some farms where ancient Australian flowers were present (e.g. on roadsides).  

Main conclusions: Bee species that have spent more of their evolutionary history in Australia are more dependent on ancient Australian plant lineages and so most negatively impacted by the decline of these plants. This may be a broader phenomenon because phylogenetic conservatism in host plant use, the main assumption underlying our hypothesis, is common among herbivorous arthropods (~500,000 species).


Bee sampling

We surveyed bees visiting flowers within a single circular sampling zone (100 m radius) at each site during spring (September-November) in 2017 and 2018. Sampling zones were positioned so as to avoid non-vegetated areas (e.g. farm dams and buildings) and represent the floral resources available at the site (e.g. sample zones included crops, weeds, and roadside native vegetation in small orchards bordering remnant vegetation). Each site was surveyed four times in 2017 and five times in 2018. Each survey was conducted between 10:30 and 16:30 when conditions were suitable for bee foraging (>17 degrees, low wind and cloud cover), and lasted 40 mins, giving a total of 2.6 survey hours per site in 2017 and 3.3 survey hours per site in 2018 (and a total of 100 survey hours across all sites and both years). We were unable to survey overstories comprised of tall trees which in our forest sites included the world’s tallest flowering plant, Eucalyptus regnans. Bees observed contacting flowers were captured with plastic jars and later identified to species by Michael Schwarz (Exoneura and Brevineura species) and Michael Batley (all other species). Plants were identified in the field with reference to the Yarra Ranges local plant directory ( Note that interactions between bees and native plants were recorded frequently on some farm sites, even though exotic plants dominated in the farm context.

Flower sampling

The flower community was sampled in each circular sampling zone during each bee survey period in 2018. A 100 x 1 m belt transect was divided up in each sample zone into sections proportional to the area covered by each patch type (e.g. native vegetation, apple rows, weedy/grassy areas), and the number of flowers (number of inflorescences in the case of Asteraceae species) of each species present in each transect section was counted. Native plants were included in surveys on some farm sites where they grew along creek-lines or roadsides. The densities of native flowers and Gondwanan flowers (GFD, the number of flowers belonging to plants with known Gondwanan origin according to Crisp and Cook (2013)) were summed across all surveys within each sample zone and divided by the sample area to provide a single density value for native flowers and for Gondwanan flowers at each site. Gondwanan flower density was not correlated with total flower density (i.e. the number of flowers of all species in each sample zone, Pearson’s correlation coefficient = 0.143, p = 0.557), native flower density (i.e. the number of flowers of all native species in each sample zone, Pearson’s correlation coefficient = 0.381, p = 0.108), or floral resource diversity (i.e. the number of plant species flowering within each sample zone, Pearson’s correlation coefficient = -0.245, p = 0.312), such that our measure of local Gondwanan flower density was independent of the overall level of floral resources, the level of native floral resources, and species richness within local plant communities.

Landscape sampling

We generated land cover maps in ArcGIS by drawing polygons corresponding to native vegetation (any area with an over-story of native trees, including Gondwanan and non-Gondwanan native plants), agriculture (cropland and open grassy areas of the agricultural matrix, and other cover types (water, exotic tree plantations or windbreaks, and houses and farm infrastructure) over the “Imagery” base map, and subsequently ground-truthing polygons (i.e. determining whether imagery matched existing cover) during site visits. The proportion of area covered by each cover type within 1000 m of each site in 2018 was then calculated using the ‘Spatial Analyst Tools’ in ArcGIS. The proportion of area covered by native vegetation (hereafter NV) and agriculture comprised 43% and 40%, respectively, of total land cover across sites, and NV was used in bee community modelling as it represents natural habitat for bees and in our study region was strongly negatively associated with agricultural cover (Pearson’s correlation coefficient = - 0.76, p = 0.000).

Phylogenetic tree construction

In the absence of a species-level phylogeny for the bee species we collected, we assembled a phylogenetic tree by combining a backbone phylogeny for bee sub-families (Cardinal and Danforth, 2013), pruned for the sub-families present in our study, with phylogenies for different clades from separate published dated phylogenies to produce a grafted tree with genus or sub-genus (depending on whether species from different sub-genera were detected) soft polytomies. Divergence times for genera and sub-genera in Halictinae (Homalictus, Lasioglossum Callalictus, Lasiolgossum Chilalictus, Lasioglossum Ctenonomia, Lasioglossum Parapshecodes, and Lasioglossum Pseudochilalictus) were obtained from Gibbs et al. (2012), for genera within Apidae (Exoneura and Brevineura) from Bousjein et al. (2021), and genera within the three Colletidae sub-families Euryglossinae (Euryglossa and Euhesma), Hylaeinae (Hylaeus and Amphylaeus), and Neopasiphaeinae (Leioproctus) from Almeida et al. (2012).

Biogeographic histories of bee lineages in the above phylogeny were also obtained from multiple sources. Ancestral range reconstructions and dated phylogenies place the origin of Colletidae in Australia or Australia plus South America (Almeida et al., 2012), or South America (Hedtke et al., 2013), 80-90 mya (Almeida et al., 2012, Cardinal and Danforth, 2013) when South America and Australia were still connected via Antarctica. The clade containing all three Colletidae subfamilies in the present study most likely arose in Australia 60-80 mya and diversified there, though with possible interchanges with South America until around 30 mya (Almeida et al., 2012, Cardinal and Danforth, 2013). We thus assume Colletidae linages in the present study were in Australia 60-90 mya. Halictidae, and sub-families Halictinae (containing Lasioglossum and Homalictus) and Nomiinae (containing Lipotriches), most likely arose in the Americas (Hedtke et al., 2013) 100 mya and 70 mya, respectively (Cardinal and Danforth, 2013). The ancestor of Lasioglossum and Homalictus arrived in Australia from Asia 20-30 mya (Danforth and Ji, 2001, Gibbs et al., 2012). The origin of Lipotriches in Australia is poorly understood, but outside Australia this genus is only present in Africa (where it is most diverse) and Asia (Michener, 2007), so we assume Lipotriches arrived in Australia 10-40 mya in line with arrival times estimated for other Australian bee linages arriving from Africa or Asia (Michener, 1979, Fuller et al., 2005, Trunz et al., 2016, Leys et al., 2002, Chenoweth and Schwarz, 2011). Finally, the ancestral range of Apidae is unresolved, though is unlikely to be Australia (Hedtke et al., 2013), and Exoneura species most likely arrived in Australia from Africa 30-40 mya (Chenoweth and Schwarz, 2011).

Statistical analyses

We used bee visitation data from 2017 and 2018 for plant-pollinator interaction networks, but only from 2018 for testing phylogenetic signal because floral abundance data was only recorded in 2018 and bee sampling effort was greater in this year. All statistical analyses were performed in the R statistical environment (R Core Team, 2013).

Plant-pollinator interaction network

We used network analyses to determine whether foraging patterns differed between ancient and more recently arrived bee lineages. We described foraging preferences of bees with three plant-pollinator interaction networks using flower visitor surveys, one network for all sites (AS), one network for farm sites (FS), and one network for native vegetation sites (NS). Bipartite plant-pollinator interaction networks were created by pooling flower visitor observations across all sample periods and sites in 2017 and 2018 for the AS network, and within each site category (i.e. farm vs native vegetation) for the FS and NS networks, to create matrices showing the number of times each bee species was recorded visiting each plant species. For the AS network we also analysed modularity, the extent to which the network can be divided into groups of species (modules) that interact more with each other than with species from other modules (Olesen et al., 2007). Modularity was analysed by creating a modular matrix version of the AS network and calculating a modularity index (Q) that measures the degree to which the network can be divided into modules, ranging from 0 to 1 (from less to more modular). The algorithm to compute Q is a stochastic process, so to stabilize modularity computation we re-ran the algorithm 50 times and used the modularity configuration with the highest Q (e.g. Carstensen et al., 2016). We then tested the significance of this Q value using the null-model approach described in Dormann and Strauss (2014). We generated 100 random networks with the same row and column totals as the AS network (i.e. interaction partner identities were randomized while abundance distributions of plants and pollinators were held constant), and calculated the number of standard deviations by which the observed AS network Q value exceeded the expected Q value from random networks, with > 2 standard deviations being considered significantly modular (Dormann and Strauss, 2014). All network analyses were performed using the package “bipartite” (Dormann et al., 2019).

Environmental and phylogenetic effects

We tested the effects of phylogenetic relatedness on bee responses to local Gondwanan flower density and the amount of native vegetation in the surrounding landscape in 2018 using Hierarchical Modelling of Species Communities (HMSC) implemented in the R package ‘Hmsc’ (Tikhonov et al., 2019). HMSC is a joint species distribution modelling approach implemented in the Bayesian framework with Markov chain Monte Carlo (MCMC) sampling that uses a hierarchical structure to model the responses of species to environmental variables as functions of traits and/or phylogenetic relationships (Ovaskainen and Abrego, 2020). This hierarchical approach is advantageous in situations where most species are rare (which is the situation in the present study) because the assumption of joint species responses (e.g. species with similar trait values respond similarly to the environment) facilitates model parameterization for rare species (Ovaskainen and Abrego, 2020). We modelled species presence/absence (rather than abundance) with a binomial error distribution to avoid problems arising from zero-inflation, given that 75% of cells in the site-by-species matrix contained zero.

We wanted to test the effects of three environmental variables on bee species composition using HMSC: log-transformed Gondwanan flower density (GFD), site type (farm vs native vegetation; ST), and native vegetation within 1000 m of sample locations (NV). GFD and NV were moderately correlated with each other (Pearson’s correlation coefficient = 0.507, p-value = 0.027), so to avoid issues of multi-collinearity we tested their effects in separate models (though report results from a model included NV and GFD in the supplementary material, Figures S2). This means their effects are not entirely independent of each other. Each model included species abundances as functions of one environmental variable included as a fixed effect, and a random effect for spatial location (latitude and longitude, to account for spatial autocorrelation in species responses), and a hierarchical layer with species environmental responses (beta parameters) varying as a function of phylogenetic relatedness (the phylogenetic tree with genus- or sub-genus-level polytomies) as a random effect.

We obtained posterior parameter estimates with 95% credible intervals (95% CIs) for species responses to the environmental variables (Betas), and for the effects of phylogeny (Rho, which takes a value between 0, indicating no phylogenetic signal, and 1 indicating Brownian motion model of trait evolution) on these species’ responses. Herein we refer to parameter estimates with 95% CIs not overlapping zero as providing strong support for an effect of the associated variable, since the interpretation of the 95% CI is that there is a 95% probability that the true parameter estimate lies within the interval (given the data), and is therefore not zero if the CI does not contain zero. We also report overall area under the receiver operator curve (AUC) values (AUC = 1 when a model perfectly discriminates presences and absences, and 0.5 when discrimination is as good as random) for HMSC models by averaging AUC across all species (Ovaskainen and Abrego, 2020). Four independent chains of 1000 samples each (i.e. 4000 recorded samples in total), with a thinning interval (the interval between recorded samples) of 100 and burn-in of 500 (recorded) samples was sufficient to obtain satisfactory model convergence for all parameters estimated in all models. Model convergence was assessed with the effective number of samples (which was close to the actual number of samples, indicating independence between consecutive recorded samples) and the potential scale reduction factor (which was close to one, indicating the four independent chains produced similar results and posterior is well sampled) (Ovaskainen and Abrego, 2020).

We calculated additional indices of phylogenetic signal because indices differ in their robustness to incomplete phylogenetic information and whether they are global or local measures of phylogenetic signal (Münkemüller et al., 2012, Keck et al., 2016). We calculated Abouheif’s Cmean as it is a well performing global measure under a range of conditions including incompletely resolved phylogenies (i.e. polytomies, such as our phylogeny) (Münkemüller et al., 2012). We also calculated local Moran’s I which gives information about the species contributing to phylogenetic signal (i.e. species exhibiting trait values that are similar to their close relatives/neighbors in the phylogeny) (Keck et al., 2016). We used Abouheif’s Cmean and local Moran’s I to detect species that were similar to their relatives in their responses to environmental variables (GFD, NV) using beta parameter estimates of species responses to GFD and NV extracted from joint species distribution models without any trait or phylogenetic effects (see above). We also used Abouheif’s Cmean and local Moran’s I to test for phylogenetic signal in two resource-use traits: 1) foraging preferences for plants with Gondwanan ancestry, defined for each bee species as the proportion of visits that were made to Gondwanan flowers across all sites and sample periods in which the species was observed (herein referred to as Gondwanic resource use GRU, not to be confused with GFD); and 2) the specialization index (d’), a measurement of how specialized each bee species is in relation to the available plant species, calculated from the AS network using the ‘bipartite’ package. We calculated Abouheif’s Cmean and local Moran’s I using the ‘phylosignal’ package (Keck et al., 2016) and null hypothesis significance tests were made using randomization tests.

We also performed a complementary test of the prediction that more ancient Australian bee lineages will be more negatively impacted by the decline of ancient Gondwanan plants, using a different HMSC model formulation. Instead of testing phylogenetic signal and interpreting it in light of biogeographic history, we treated biogeographic history as a species trait. We tested the effect of order of bee lineage arrival in Australia – a trait with three levels; first to arrive (Colletidae species), second to arrive (Apidae species), and third to arrive (Halictidae species) – on bee species responses to Gondwanan flower density. Parameter estimation and checks of model fit and convergence were all performed as described above.     

Prior to HMSC analysis and calculations of Abouheif’s Cmean and local Moran’s I we removed bee species for which only one individual was observed, to reduce the influence of extremely rare species.


Australian Government Department of Agriculture, Water and Environment