# Data from: Odonate species occupancy frequency distribution and abundance – occupancy relationship patterns in temporal and permanent water bodies in a subtropical area

## Citation

Renner, Samuel et al. (2021), Data from: Odonate species occupancy frequency distribution and abundance – occupancy relationship patterns in temporal and permanent water bodies in a subtropical area, Dryad, Dataset, https://doi.org/10.5061/dryad.d51c5b008

## Abstract

This paper investigates species richness and species occupancy frequency distributions (SOFD) as well as patterns of abundance-occupancy relationship (SAOR) in Odonata (dragonflies and damselflies) in a subtropical area. A total of 82 species and 1983 individuals were recorded from 73 permanent and temporal water bodies (lakes and ponds) in the Pampa biome in southern Brazil. Odonate species occupancy ranged from 1 to 54. There were few widely distributed generalist species and several specialist species with a restricted distribution. About 70% of the species occurred in fewer than 10% of the water bodies, yielding a surprisingly high number of rare species, often making up the majority of the communities. No difference in species richness was found between temporal and permanent water bodies. Both temporal and permanent water bodies had odonate assemblages that fitted best with the unimodal satellite SOFD pattern. It seems that the unimodal satellite SOFD pattern occurs frequently in aquatic habitats. The SAOR pattern was positive and did not differ between permanent and temporal water bodies. Our results are consistent with a niche-based model rather than a metapopulation dynamics model.

## Methods

#### 2.1 Data, methods and study area

Originally, most of the Brazilian Pampa biome (Figure 1) consisted of grassland (hence, the Pampa is treated by many authors as the “Southern Fields”), and of sparse shrub and forest vegetation (Overbeck et al., 2009). Many areas within this biome have, however, been changed by human activities, mainly agriculture, cattle farming and silviculture (Baldi & Paruelo, 2008; Overbeck et al., 2013). This biome is still one of the least protected in Brazil: Oliveira et al. (2017) note that only 0.8% of the Pampa is protected.

We used data from 53 permanent and 20 temporal lakes and ponds (water bodies) (Figure 1; Renner et al., 2018; 2019). We sampled adult dragonflies from March 2011 to April 2017, visiting the localities from one (temporary waters) up to seven times during this period. We followed the method described by Renner et al. (2018; 2019; see the original publications for more detailed information), collecting dragonflies on sunny days during the peak period of odonate activity (between 09:00 h and 16:00 h). Two persons using hand-held insect nets walked along the perimeter of the sites, along the water edges and marginal zones. The average time spent at each sampling site was 45 min. This sampling method is opportunistic, and although its efficiency is constant, the probability of detecting the rarest species is reduced. Several papers discuss the problem of detecting all species at a given site (e.g., Hedgren & Weslien, 2008; Raebel et al., 2010; Bried et al., 2012a; 2012b; Hardersen et al., 2017), highlighting the importance of also detecting rare species (Cao et al., 1998). Mao & Colwell (2005) pointed out that there is only a small chance of detecting the rarest species at a site, but modern modelling approaches combined with iterative sampling seem to be a way forward (Young et al., 2019). To ascertain the species occupancy relationship patterns, it is crucial to show whether the number of rare species (satellite species; see below) is high or low. The impact on the results of a possible underestimate of the number of rare species due to incomplete sampling is addressed in the discussion.

Another limitation of our method is that temporary waters cannot be sampled repeatedly over a number of months (as they dry out). Although most of our permanent sites were visited repeatedly, it is therefore impossible to test for temporal variation among our samples. Renner et al. (2016) showed, for a smaller dataset within the same area, that although some of the species were seasonal, the species composition remained relatively similar throughout the year (Sørensen index 0.73–0.83).

#### 2.2 Statistical methods

Although blunt compared to more complicated hierarchical multispecies models (Iknayan et al., 2014), the traditional jackknife method was better suited to our opportunistic data. We used it to estimate species richness in temporal, permanent and combined water bodies, and a 95% confidence interval was applied to each type of water body separately (see Krebs, 1999, for details). The jackknife species richness estimate builds on the frequency of rare species observed within the community. Here each odonate species was recorded as present (1) or absent (0) in each water body. We also calculated the number of unique species, defined as those occurring in only one water body. We used the equation by Heltsche & Forrester (1983) to calculate the estimated species richness:

Ŝ = s + ((n-1)/n)*k

where Ŝ = jackknife estimate of species richness, s = observed total number of species present in n water bodies, n = total number of water bodies and k = total number of unique species in n water bodies. For each water body we also estimated species richness with the Chao1 method in the R package ‘vegan’ v.2.4-2.
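As an illustration of the jackknife equation above, the following minimal Python sketch computes Ŝ from a presence/absence matrix with species in rows and water bodies in columns. The function name `jackknife_richness` and the toy matrix are hypothetical and are not part of the dataset; the confidence interval is not computed here.

```python
def jackknife_richness(presence):
    """First-order jackknife estimate: S_hat = s + ((n - 1) / n) * k.

    presence: list of rows, one per species; each row holds 0/1 per water body.
    Returns (S_hat, s, k).
    """
    n = len(presence[0])                          # number of water bodies
    occupancy = [sum(row) for row in presence]    # water bodies occupied per species
    s = sum(1 for occ in occupancy if occ > 0)    # observed species
    k = sum(1 for occ in occupancy if occ == 1)   # unique species (one water body only)
    return s + ((n - 1) / n) * k, s, k


# Hypothetical toy matrix: 4 species (rows) x 5 water bodies (columns).
toy = [
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
]
s_hat, s, k = jackknife_richness(toy)
print(f"observed = {s}, unique = {k}, jackknife estimate = {s_hat:.1f}")
```

With four observed species, two of them unique, and five water bodies, the estimate is 4 + (4/5) × 2 = 5.6.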

We used the Mantel test to assess the relationship between faunistic similarity and geographical distance between water bodies. As distance measures we used the Jaccard dissimilarity index for odonate species dissimilarity and the Bray dissimilarity index, which is based on the abundance of each individual species, for community dissimilarity. The Euclidean distance (in km) was calculated from the geographical coordinates of the water bodies. The Mantel test was calculated with the R package ‘vegan’ v.2.4-2, and statistical significance was estimated by running 999 permutations. As spatial autocorrelation would nullify or affect the results of the Mantel test, we tested the spatial independence of species composition at the 73 sampling sites using a Moran’s I analysis. We used individual species occurrences as variables in a Principal Component Analysis (PCA), and the first axis was used as the response variable in the Moran’s I analysis, with coordinate variables for ten different distance classes. The global Moran’s I analysis detected no significant spatial structure in species composition for any distance class (minimal distance class average: 0.148 degree; Moran’s I = 0.018; p = 0.059). Hence, we can rely on the results of the Mantel test.
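The logic of a Mantel test can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the ‘vegan’ implementation used for the dataset: the helper names, the toy sites and the toy coordinates are hypothetical, and only the Jaccard (presence/absence) dissimilarity is shown.

```python
import math
import random


def jaccard_dissimilarity(a, b):
    """1 - shared species / species present in either site (0/1 vectors)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - shared / either if either else 0.0


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def distance_matrix(items, dist):
    """Full symmetric matrix of pairwise distances."""
    return [[dist(a, b) for b in items] for a in items]


def mantel(d1, d2, permutations=999, seed=0):
    """Mantel r and one-sided permutation p-value for two distance matrices."""
    n = len(d1)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    y = [d2[i][j] for i, j in pairs]
    r_obs = pearson([d1[i][j] for i, j in pairs], y)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Shuffle site labels: rows and columns of d1 are permuted together.
        x = [d1[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: 5 sites (rows) x 4 species, sites along a line.
sites = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]  # assumed to be in km

species_dist = distance_matrix(sites, jaccard_dissimilarity)
geo_dist = distance_matrix(coords, math.dist)
r_obs, p_value = mantel(species_dist, geo_dist)
print(f"Mantel r = {r_obs:.3f}, p = {p_value:.3f}")
```

Note that this sketch takes sites in rows, whereas the occupancy matrices described in these methods have species in rows, so a real matrix would need to be transposed first.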

Following McGeoch and Gaston (2002), we used classes of 10% occupancy, and the number or percentage of odonate species in each class, to demonstrate the variation in occupancy frequency distribution between temporal and permanent water bodies (see also Korkeamäki et al., 2018). We also tested the relationship between water body area (m²), length of shoreline (m) and species richness, using Pearson correlation with a $\log_{10}$ transformation to compensate for large differences in size.
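A minimal sketch of the 10% occupancy classes, assuming a presence/absence matrix with species in rows and water bodies in columns. The function name and the toy data are hypothetical, and the choice of upper-inclusive class boundaries (a species occupying exactly 10% of the water bodies falls in the first class) is an assumption of this sketch, not something stated in the methods.

```python
def occupancy_classes(presence, n_classes=10):
    """Count species per occupancy class (first class = up to 10 %, and so on)."""
    n_sites = len(presence[0])
    counts = [0] * n_classes
    for row in presence:
        occ = sum(row)
        if occ == 0:
            continue  # species not recorded in this data set
        # Integer form of ceil(occ / n_sites * n_classes) - 1; avoids float rounding.
        idx = (occ * n_classes - 1) // n_sites
        counts[idx] += 1
    return counts


# Hypothetical toy data: 6 species across 10 water bodies,
# occupying 1, 1, 1, 2, 5 and 10 water bodies respectively.
n_sites = 10
toy = [[1] * occ + [0] * (n_sites - occ) for occ in (1, 1, 1, 2, 5, 10)]
counts = occupancy_classes(toy)
for i, c in enumerate(counts):
    print(f"{i * 10:>3}-{(i + 1) * 10:<3} %: {c} species")
```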

We used the same approach as Korkeamäki et al. (2018), in which a multi-model inference approach is applied to regressions of ranked species-occupancy curves (RSOCs; Jenkins, 2011). The three data sets (temporal, permanent, and combined data; species in rows and water bodies in columns) were processed separately, based on occupancy (presence/absence) data for individual water bodies. First, we calculated the number of water bodies occupied by each species (occupancy frequency). Second, we divided the occupancy frequency of each species by the total number of water bodies, giving the relative proportion of water bodies occupied by each species. Next, we arranged the species in decreasing order of relative occupancy, setting $R_i$ as the rank of species $i$, and plotted the relative occupancy of each species ($O_i$) as a function of $R_i$ (the RSOC). Finally, we evaluated whether a unimodal-satellite dominant, a bimodal symmetrical, a bimodal asymmetrical or a random pattern best fitted our odonate community (cf. Jenkins, 2011). We used the IBM SPSS statistical package version 23 for all statistical calculations. As in Jenkins (2011), the Levenberg–Marquardt algorithm (with 999 iterations) was used for the nonlinear regressions, estimating the parameters ($y_0$, $a$, $b$ and $c$) of the following four equations by ordinary least squares (OLS) to find the best-fitting SOFD pattern. The equations are:

- $O_i = y_0 + a \exp(-b R_i)$, with initial parameters $y_0 = 0.01$, $a = 1.0$, $b = 0.01$; unimodal-satellite mode (exponential concave) pattern.
- $O_i = a / (1 + \exp(-b R_i + c))$, with initial parameters $a = 1.0$, $b = -0.1$, $c = -1.0$; bimodal symmetrical (sigmoidal symmetric) pattern.
- $O_i = a [1 - \exp(-b R_i^{c})]$, with initial parameters $a = 1.0$, $b = -1.0$, $c = -1.0$; bimodal asymmetric (sigmoidal asymmetric) pattern.
- $O_i = a R_i + b$, with initial parameters $a = 0.01$, $b = 0.01$; uniform (random) pattern.
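The RSOC construction and the four candidate curves can be sketched in Python as below. The function names and the toy matrix are hypothetical. The sketch only builds the ranked curve and evaluates the residual sum of squares of a model at given parameter values; the Levenberg–Marquardt fitting itself, which was carried out in SPSS, is not reproduced here.

```python
import math


def rsoc(presence):
    """Ranked species-occupancy curve as (rank R_i, relative occupancy O_i) pairs."""
    n_sites = len(presence[0])
    rel = sorted((sum(row) / n_sites for row in presence), reverse=True)
    return list(enumerate(rel, start=1))


def unimodal_satellite(r, y0, a, b):
    return y0 + a * math.exp(-b * r)


def bimodal_symmetric(r, a, b, c):
    return a / (1.0 + math.exp(-b * r + c))


def bimodal_asymmetric(r, a, b, c):
    return a * (1.0 - math.exp(-b * r ** c))


def uniform(r, a, b):
    return a * r + b


def residual_sum_of_squares(model, params, curve):
    """Sum of squared differences between observed O_i and the model curve."""
    return sum((o - model(r, *params)) ** 2 for r, o in curve)


# Hypothetical toy matrix: 4 species (rows) x 5 water bodies (columns).
toy = [
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
]
curve = rsoc(toy)
print(curve)

# RSS of the unimodal-satellite curve at the initial parameter values listed above.
rss_start = residual_sum_of_squares(unimodal_satellite, (0.01, 1.0, 0.01), curve)
print(f"RSS at starting values: {rss_start:.3f}")
```

In a full analysis, each model would be fitted to the curve with a nonlinear least-squares routine starting from the listed initial parameters, and the resulting RSS values would then be compared.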

We also examined the regressions graphically for homogeneity of variance, normality of residuals and independent error terms, as well as the tails and shoulders of the data and models (see more details in Jenkins 2011; Korkeamäki et al., 2018).

The Akaike Information Criterion corrected for small sample sizes (AICc) was used to compare the four alternative models; the model with the smallest AICc is considered best, based on the Kullback-Leibler distance (Burnham and Anderson 2000). This approach works well to detect differences between models when values of ΔAICc (= $\mathrm{AICc}_i - \mathrm{AICc}_{\min}$) are higher than 7 (Anderson et al., 2000; Burnham & Anderson, 2000; Burnham et al., 2011; Jenkins, 2011).
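The model comparison can be illustrated with the following Python sketch. It assumes the common least-squares form of AICc, in which the parameter count includes the residual variance, and it takes ΔAICc as the non-negative difference between each model's AICc and the smallest AICc. The RSS values are hypothetical and serve only to show the calculation; they are not results from the dataset.

```python
import math


def aicc_least_squares(rss, n, n_params):
    """AICc for an OLS fit: n*ln(RSS/n) + 2K + 2K(K+1)/(n-K-1).

    K = n_params + 1, counting the residual variance as an estimated parameter
    (an assumption of this sketch).
    """
    k = n_params + 1
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def delta_aicc(scores):
    """Difference between each model's AICc and the smallest AICc."""
    best = min(scores.values())
    return {name: value - best for name, value in scores.items()}


n_species = 82  # number of points on the combined RSOC (one per species)

# Hypothetical (RSS, number of regression parameters) per model.
fits = {
    "unimodal_satellite": (0.05, 3),
    "bimodal_symmetric": (0.09, 3),
    "bimodal_asymmetric": (0.08, 3),
    "uniform": (0.60, 2),
}
scores = {m: aicc_least_squares(rss, n_species, p) for m, (rss, p) in fits.items()}
deltas = delta_aicc(scores)
for model, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{model:<20} dAICc = {delta:6.1f}")
```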

We used a Generalized Linear Model (GLM) to investigate the relationship between the number of individuals and occupancy frequency (independent variable) in the SAOR model. The model used a negative binomial distribution with a log link (type III errors) (O’Hara & Kotze, 2010). Habitat preference was divided into three categories: species observed only in (i) temporary water bodies (9 species), (ii) permanent water bodies (34 species) and (iii) both types of habitat (hereafter generalist species; 39 species). To test for differences in occupancy frequency and number of individuals among the three habitat preference categories, we applied a GLM with a negative binomial distribution and log link (type III errors), using habitat preference as a factor.
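The three habitat preference categories can be derived from per-species occupancy counts, as in this minimal Python sketch. The function name, the category labels and the species names are hypothetical; the negative binomial GLM itself is not reproduced here.

```python
from collections import Counter


def habitat_preference(occ_temporary, occ_permanent):
    """Classify a species by the water body types in which it was observed."""
    if occ_temporary > 0 and occ_permanent > 0:
        return "generalist"
    if occ_temporary > 0:
        return "temporary only"
    if occ_permanent > 0:
        return "permanent only"
    return "not observed"


# Hypothetical occupancy counts: species -> (temporary, permanent) water bodies.
occupancy = {
    "species_A": (3, 12),
    "species_B": (0, 4),
    "species_C": (1, 0),
    "species_D": (0, 1),
}
categories = {sp: habitat_preference(t, p) for sp, (t, p) in occupancy.items()}
summary = Counter(categories.values())
print(dict(summary))
```

The resulting category would then enter the GLM as a factor.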

## Funding

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Award: 88881.068147/2014-01