A global map of microbial residence time

Cite this dataset

He, Liyuan; Xu, Xiaofeng (2021). A global map of microbial residence time [Dataset]. Dryad.


Soil microbes are the fundamental engine for carbon (C) cycling. Microbial residence time (MRT) therefore determines the mineralization of soil organic C, releasing C as heterotrophic respiration and contributing substantially to the C efflux in terrestrial ecosystems. We used a comprehensive dataset (2627 data points) and calculated the MRT based on the basal respiration and microbial biomass C. Large variations in MRT were found among biomes, with the largest MRT in boreal forests and grasslands and the smallest in natural wetlands. Biogeographic patterns of MRT were found along climate (temperature and precipitation), vegetation variables (root C density and net primary productivity), and edaphic factors (soil texture, pH, topsoil porosity, soil C, and total nitrogen). Among environmental factors, edaphic properties dominate the MRT variations. We further mapped the MRT at the global scale with an empirical model. The simulated and observed MRT were highly consistent at plot‐ (R2=0.86), site‐ (R2=0.88), and biome‐ (R2=0.99) levels. The global average of MRT was estimated to be 38 (±5) days. A clear latitudinal biogeographic pattern was found for MRT with lower values in tropical regions and higher values in the Arctic. The biome‐ and global‐level estimates of MRT serve as valuable data for parameterizing and benchmarking microbial models.


Data sources

This study was based on the soil microbial metabolic quotient dataset in Xu et al. (2017), which synthesized data spanning from 1970 to 2016. In this study, we further updated that dataset to 2020. The same criteria for data compilation as in Xu et al. (2017) were applied to update the dataset. Specifically, we searched publications in Google Scholar using the keyword combinations of “basal respiration”, “soil microbial biomass”, “soil microbial turnover rate”, “soil microbial metabolic quotient”, and “soil microbial residence time”. We screened the papers using the following criteria: 1) both soil basal respiration and microbial biomass C were reported; 2) any of soil microbial turnover rate, soil microbial metabolic quotient, and MRT estimated based on basal respiration rate in lab conditions was clearly reported; 3) no contamination or disturbance occurred during sampling; and 4) lab incubation for basal respiration lasted less than 40 days, as long incubation experiments may cause a shift in the microbial community, which would not represent MRT in the sampled soils.

Collectively, the final dataset included 2627 observations retrieved from 232 papers, covering 9 biomes (i.e., boreal forest, temperate broadleaf forest, temperate coniferous forest, tropical/subtropical forest, grassland, shrubland, bare soils/desert, natural wetlands, and cropland) (Fig. 1). Cropland, temperate broadleaf forest, grassland, and temperate coniferous forest accounted for approximately 46%, 13%, 11%, and 9%, respectively, whereas all other biomes combined accounted for 21% of the whole dataset. The majority of the field sites are located in Europe, Asia, and North America, whereas a relatively small number of observations are from South America, Africa, Australia, and Antarctica. For data points without coordinate information reported, we searched the geographical coordinates based on the names of the study site, city, state, and country. The geographical information was further used for locating the sampling points on the global map to extract climate, edaphic properties, vegetation productivity, and soil microclimate long-term data from global datasets (Xu et al. 2017).

Climate, edaphic, vegetation, and microbial data

Because climatic, edaphic, vegetation, and microbial variables were not fully reported in the published studies, we extracted these variables from global datasets following our previous studies (Xu et al. 2013, Xu et al. 2017, Guo et al. 2020, He et al. 2020). For climatic variables, we extracted mean annual temperature (MAT) and mean annual precipitation (MAP) during 1970-2000 from the WorldClim database version 2 at a spatial resolution of 30 seconds. In addition, we obtained monthly and annual mean soil moisture (SM) and soil temperature (ST) of the top 10 cm during 1979-2018 from the NCEP/DOE AMIP-II Reanalysis. We also obtained soil pH and soil texture (i.e., sand, silt, and clay) data from the Harmonized World Soil Database (HWSD) at a spatial resolution of 0.05° × 0.05°. Soil bulk density (BD), soil C, and total nitrogen (TN) were extracted from the IGBP-DIS dataset (IGBP) at a spatial resolution of 0.5′ × 0.5′. Root C density (Croot) data were extracted from a global dataset of 0.5-degree resolution based on observational data (Gibbs and Ruesch 2008, Song et al. 2017). We extracted topsoil porosity data from a global dataset produced by the Global Land Data Assimilation System (GLDAS) at a spatial resolution of 0.25° × 0.25°. Annual net primary productivity (NPP) for the period of 2000-2015 was obtained from the MODIS gridded dataset at a spatial resolution of 30 seconds. Soil microbial biomass C (MBC) and nitrogen (MBN) were retrieved from a compiled global soil microbial biomass C and nitrogen (N) dataset archived at Oak Ridge National Laboratory (Xu et al. 2015b).

The auxiliary datasets used included the global land area database and the global vegetation distribution dataset. The global vegetation distribution dataset was obtained from a spatial map of 11 major biomes: boreal forest, temperate forest, tropical/subtropical forest, mixed forest, grassland, shrubland, tundra, desert, natural wetlands, cropland, and pasture, which has been used in our previous publications (Xu et al. 2013, Xu et al. 2017, Guo et al. 2020, He et al. 2020). The global land area database was from the surface data map of 0.5° × 0.5° generated for E3SM.

To generate the global map of MRT, the global datasets of varied spatial resolutions were resampled to 0.5 degree using the “bilinear” algorithm. For datasets formatted as NetCDF, we performed the interpolation using the “linint2_Wrap” function in NCAR Command Language (Version 6.3.0). For datasets in other formats, the interpolation was conducted in ArcGIS 10.6 (Esri, Redlands, CA, USA).
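For readers unfamiliar with bilinear resampling, the idea can be sketched in a few lines of Python. This is an illustration only, not the NCL or ArcGIS routine used in the study: the function name `bilinear`, the toy 2 × 2 grid, and the absence of any missing-value handling are all assumptions made for this sketch.

```python
from bisect import bisect_right


def bilinear(xs, ys, grid, x, y):
    """Interpolate grid[j][i] (the value at ys[j], xs[i]) at the point (x, y).

    xs and ys must be strictly increasing and (x, y) must lie inside the grid.
    Hypothetical helper: missing values and longitude wrap-around are ignored.
    """
    if not (xs[0] <= x <= xs[-1] and ys[0] <= y <= ys[-1]):
        raise ValueError("point lies outside the source grid")
    # Index of the lower-left neighbour, clamped so that i + 1 and j + 1 exist.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    j = min(bisect_right(ys, y) - 1, len(ys) - 2)
    # Fractional position of the target point inside the enclosing cell.
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    # Interpolate along x on the two bounding rows, then along y.
    lower = grid[j][i] * (1 - tx) + grid[j][i + 1] * tx
    upper = grid[j + 1][i] * (1 - tx) + grid[j + 1][i + 1] * tx
    return lower * (1 - ty) + upper * ty


# Toy source grid (values are made up for illustration).
xs = [0.0, 1.0]
ys = [0.0, 1.0]
grid = [[0.0, 10.0], [20.0, 30.0]]
centre_value = bilinear(xs, ys, grid, 0.5, 0.5)
print(centre_value)
```

Resampling a whole map would repeat this call for the centre of every 0.5-degree target cell.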

Temperature correction for lab incubations

Soil basal respiration is defined as the steady rate of respiration in soil, which originates from the mineralization of organic matter (Bloem et al. 2005). The temperature response of basal respiration follows an exponential function (Moyano et al. 2007). The sensitivity of microbial respiration to temperature is commonly described by Q10, the factor by which carbon dioxide (CO2) production increases with a 10°C increase in temperature. Under steady‐state conditions, soil microbial biomass does not change over the long term. The specific growth rate of the soil microbial community is then equivalent to the microbial biomass turnover rate, and its inverse is the soil microbial biomass residence time, as below:

$$\mathrm{MRT} = \frac{\mathrm{MBC}}{\mathrm{BR}} \qquad \text{(equation 1)}$$

where MRT is the microbial residence time, MBC is microbial biomass C, and BR is the basal respiration rate. 
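As a quick numeric illustration, the sketch below assumes equation 1 is the ratio MRT = MBC / BR, with BR expressed per day so that MRT comes out in days. The function name, the units, and the input values are hypothetical and are not taken from the dataset.

```python
def microbial_residence_time(mbc, br):
    """Return MRT in days as MBC / BR.

    mbc: microbial biomass C (assumed units: mg C per kg soil)
    br:  basal respiration rate (assumed units: mg C per kg soil per day)
    """
    if br <= 0:
        raise ValueError("basal respiration must be positive")
    return mbc / br


# Hypothetical values, for illustration only.
mrt_days = microbial_residence_time(mbc=400.0, br=10.0)
print(mrt_days)  # 40.0
```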

Due to the differences between lab incubation temperature and in situ soil temperature, temperature correction is necessary for comparing estimated MRT across studies in a quantitative manner. We adjusted the reported basal respiration to the long-term (1979-2018) average ST at each site following equation 2. This function has been previously used to mathematically simulate the temperature dependence of microbial respiration (Rey and Jarvis 2006, Wei et al. 2014). The corrections were performed under the assumption that basal respiration is temperature dependent, while soil microbial biomass remains unchanged during the typically short soil incubations.

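$$\mathrm{BR}_{T_1} = \mathrm{BR}_{T_2} \times Q_{10}^{(T_1 - T_2)/10} \qquad \text{(equation 2)}$$

where T1 and T2 are temperatures in Celsius, $\mathrm{BR}_{T_2}$ is basal respiration at a given temperature of T2, $\mathrm{BR}_{T_1}$ is the estimated basal respiration at T1, and Q10 is the temperature sensitivity parameter.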
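The temperature correction can be sketched as follows, assuming equation 2 has the standard Q10 form BR(T1) = BR(T2) × Q10^((T1 − T2)/10). The function name and the example temperatures are assumptions for illustration, not values from the study.

```python
def correct_basal_respiration(br_incub, t_incub, t_field, q10=2.0):
    """Rescale basal respiration from incubation to in situ temperature.

    br_incub: basal respiration measured at t_incub (any rate unit)
    t_incub:  lab incubation temperature in degrees Celsius (T2)
    t_field:  long-term mean soil temperature in degrees Celsius (T1)
    q10:      temperature sensitivity parameter
    """
    return br_incub * q10 ** ((t_field - t_incub) / 10.0)


# Hypothetical example: a 25 °C incubation corrected to a 15 °C soil.
br_field = correct_basal_respiration(10.0, t_incub=25.0, t_field=15.0, q10=2.0)
print(br_field)
```

With Q10 = 2.0, a soil that is 10°C cooler than the incubation respires at half the measured rate, which doubles the estimated MRT.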

The temperature sensitivity Q10 is an important parameter in modeling temperature effects on basal respiration and has been extensively investigated over the past several decades. Experimental studies consistently indicate large spatial heterogeneity in Q10: it is not a constant, and reported values differ among soils and ecosystems (Davidson et al. 1998, Wang et al. 2019). Despite these uncertainties, a fixed Q10 of 2.0 has gained wide acceptance in modelling ecosystem respiration responses to climate change (Sistla et al. 2014, Xu et al. 2014). Although Q10 is commonly set to 2.0, reported values range from 1.4 to 2.6 among studies (Mahecha et al. 2010, Wang et al. 2010, Hamdi et al. 2013, Wang et al. 2019, Li et al. 2020). To account for this variation, we selected seven Q10 values (i.e., 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, and 2.6), spaced at intervals of 0.2 within 1.4-2.6 and centered on 2.0, to adjust basal respiration from lab incubation temperature to in situ soil temperature.
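A minimal sketch of the seven-value Q10 ensemble for a single observation is given below. It assumes the standard Q10 correction and MRT = MBC / BR. The function name and the input values are hypothetical.

```python
Q10_VALUES = [1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6]


def mrt_ensemble(mbc, br_incub, t_incub, t_field):
    """Return {Q10: MRT} for one observation, one entry per Q10 value."""
    results = {}
    for q10 in Q10_VALUES:
        # Basal respiration rescaled to the in situ soil temperature.
        br_field = br_incub * q10 ** ((t_field - t_incub) / 10.0)
        results[q10] = mbc / br_field
    return results


# Hypothetical observation: 25 °C incubation, 15 °C long-term soil temperature.
ensemble = mrt_ensemble(mbc=400.0, br_incub=10.0, t_incub=25.0, t_field=15.0)
mean_mrt = sum(ensemble.values()) / len(ensemble)
for q10, mrt in ensemble.items():
    print(q10, round(mrt, 1))
print("mean:", round(mean_mrt, 1))
```

When the soil is cooler than the incubation, a larger Q10 implies a lower in situ respiration rate and therefore a longer MRT, so the spread across the seven values gives a sense of how much the correction depends on the choice of Q10.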

In the dataset, seventeen studies did not report an explicit incubation temperature. ISO 16072 (2002) recommends an incubation temperature range of 20-30°C. The incubation temperature is closely associated with Q10 values, and 25°C has been shown to be a threshold incubation temperature above which Q10 varies less: Q10 values decreased significantly when the temperature was below 25°C, whereas the mean Q10 remained relatively constant above 25°C (Wang et al. 2019). Therefore, for studies without a reported incubation temperature, we performed the temperature correction for lab incubations assuming an incubation temperature of 25°C.

Model selection

The MRT exhibited clear biogeographic patterns, indicating the important role of environmental factors in MRT distribution (Figs. S1-S6; Fig. 2). Therefore, we created a generalized linear model to quantify the independent and interactive impacts of soil microbes (MBC and MBN), climate (MAP and MAT), soil microclimate (ST and SM), vegetation (NPP and Croot), and edaphic properties (silt, sand, soil pH, BD, topsoil porosity, soil C, and TN) on the MRT.

Based on the generalized linear model, we further built an empirical model for the mean MRT by selecting the most important factors in explaining its variation. To identify these factors, we repeatedly removed the least important variables (<0.1%) from the generalized linear model. Finally, we retained the 23 most important terms (main effects and interactions) in explaining the variation in mean MRT. In addition, we randomly split the dataset into two portions: 75% of the data were used to train the model, and the other 25% were used for model validation. The selected empirical model explained 32% of the variation in mean MRT, and it had the formula: log10 (MRT) = -1.529 - 0.04866 * MAT + 0.01663 * soil C + 3.04 * topsoil porosity + 0.01047 * sand - 0.01197 * pH + 0.1618 * Croot + 0.0774 * BD - 0.01122 * ST - 0.00003072 * sand * NPP - 0.3789 * Croot * topsoil porosity + 2.061 * BD * SM + 0.01182 * MAT * pH - 0.001064 * MAT * Croot + 0.0007919 * Croot * MBN + 0.001077 * sand * Croot - 0.0001516 * sand * ST + 0.002041 * NPP * topsoil porosity + 0.0000003703 * NPP * BD * MAP - 0.000002451 * topsoil porosity * MAP * MBC - 0.002437 * MAT * SM * silt + 0.001634 * MAT * SM * MBN - 0.00002335 * Croot * MAP * SM - 0.00005116 * MAT * NPP * topsoil porosity.
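For convenience, the formula above can be transcribed into a small Python function. This is a sketch: the function and argument names are assumptions, and the inputs are expected to be in the units of the source global datasets, which are not restated here and are not checked by the code. MRT is presumably returned in days, consistent with the reported global average.

```python
PREDICTORS = ("MAT", "soil_C", "porosity", "sand", "silt", "pH", "Croot",
              "BD", "ST", "SM", "NPP", "MAP", "MBC", "MBN")


def log10_mrt(*, MAT, soil_C, porosity, sand, silt, pH, Croot, BD, ST, SM,
              NPP, MAP, MBC, MBN):
    """log10 of mean MRT from the empirical model stated in the text."""
    return (
        -1.529
        # main effects
        - 0.04866 * MAT
        + 0.01663 * soil_C
        + 3.04 * porosity
        + 0.01047 * sand
        - 0.01197 * pH
        + 0.1618 * Croot
        + 0.0774 * BD
        - 0.01122 * ST
        # two-way interactions
        - 0.00003072 * sand * NPP
        - 0.3789 * Croot * porosity
        + 2.061 * BD * SM
        + 0.01182 * MAT * pH
        - 0.001064 * MAT * Croot
        + 0.0007919 * Croot * MBN
        + 0.001077 * sand * Croot
        - 0.0001516 * sand * ST
        + 0.002041 * NPP * porosity
        # three-way interactions
        + 0.0000003703 * NPP * BD * MAP
        - 0.000002451 * porosity * MAP * MBC
        - 0.002437 * MAT * SM * silt
        + 0.001634 * MAT * SM * MBN
        - 0.00002335 * Croot * MAP * SM
        - 0.00005116 * MAT * NPP * porosity
    )


def predict_mrt(**predictors):
    """Back-transform the model output from log10 space."""
    return 10 ** log10_mrt(**predictors)


# Sanity check with all predictors at zero: only the intercept remains.
zeros = dict.fromkeys(PREDICTORS, 0.0)
print(log10_mrt(**zeros))
```

Keyword-only arguments are used so that the 14 predictors cannot be passed in the wrong order.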

After the model was developed, we used the 25% of the data that were not used in model development to validate the model, and we found a significant consistency between model predictions and observed data (Fig. S7). We generated the global map of mean MRT by applying the empirical model to the related global maps of biotic and environmental variables (Fig. 3). Given the large uncertainties in MRT for desert and natural wetland soils, we excluded deserts and natural wetlands from mapping, uncertainty analysis, and biome-level comparison. To keep the simulated MRT within a plausible range, we used the 95% confidence interval of the synthesized dataset to confine the simulated values in the global map of MRT. To test the accuracy of MRT simulated in the global map, we compared the modeled results against the observed data at multiple levels (i.e., plot-, site-, and biome-levels) (Fig. 4).
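One plausible reading of “confine” is that each simulated grid-cell value is clamped to the bounds of the 95% confidence interval of the observed dataset. The sketch below assumes that reading. The bounds and the simulated values are hypothetical, and the text does not say whether out-of-range cells were clamped or masked.

```python
def confine(values, lower, upper):
    """Clamp each simulated value to the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return [min(max(v, lower), upper) for v in values]


# Hypothetical bounds (days) and simulated grid-cell values, for illustration.
simulated = [1.0, 50.0, 500.0]
confined = confine(simulated, lower=5.0, upper=200.0)
print(confined)  # [5.0, 50.0, 200.0]
```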

Uncertainty analysis

To estimate the parameter-induced uncertainties in MRT distribution, we used an improved Latin Hypercube Sampling (LHS) approach to quantify variations in MRT. The LHS approach can randomly produce an ensemble of parameter combinations with high efficiency. This approach has been widely used to estimate uncertainties in model outputs (Haefner 2005, Xu 2010, Xu et al. 2014). Specifically, we assumed that all parameters of the empirical model followed a normal distribution. Then, we used the LHS algorithm to randomly select an ensemble of 3000 parameter sets for the variables listed in Table S1 using the improvedLHS function in the R package “lhs” (Carnell 2019). Next, we computed the inverse of the standard normal cumulative distribution for the 3000 parameter sets using the norminv function in MATLAB 2018b (The MathWorks Inc., Natick, Massachusetts, USA). Finally, we calculated the biome-averages and corresponding 95% confidence intervals of MRT for reporting (Table 2).
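The two-step procedure (stratified uniform sampling, then the inverse normal cumulative distribution) can be sketched with the Python standard library. This is a plain Latin hypercube, not the improvedLHS algorithm of the R “lhs” package, which additionally optimizes the spacing of the points. The parameter means reuse two coefficients from the empirical model, while the standard deviations are hypothetical because Table S1 is not reproduced here.

```python
import random
from statistics import NormalDist


def latin_hypercube(n_samples, n_params, rng):
    """Return n_samples rows of n_params values in [0, 1].

    Each column holds exactly one value per equal-width stratum.
    """
    columns = []
    for _ in range(n_params):
        # One uniform draw inside each stratum, then shuffle the strata order.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def to_normal(unit_samples, means, sds):
    """Map unit-interval samples to normal parameters via the inverse CDF."""
    dists = [NormalDist(m, s) for m, s in zip(means, sds)]
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [
        [d.inv_cdf(min(max(u, eps), 1 - eps)) for d, u in zip(dists, row)]
        for row in unit_samples
    ]


rng = random.Random(42)
means = [-1.529, -0.04866]   # intercept and MAT coefficient from the model
sds = [0.1, 0.005]           # hypothetical standard deviations
unit = latin_hypercube(3000, len(means), rng)
parameter_sets = to_normal(unit, means, sds)
print(len(parameter_sets), parameter_sets[0])
```

Each of the 3000 rows would then be plugged into the empirical model to produce one realization of the MRT map, from which biome averages and confidence intervals can be summarized.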

Statistical analysis

We first tested the normality of the data distribution using the shapiro.test function in the R package “stats” (R Core Team 2018). Because normality was violated, we applied a base 10 logarithm transformation to MRT corrected to long-term ST using seven Q10 values. The log-transformed MRT values obtained with multiple Q10 values were then used for comparison among biomes. The mean and 95% confidence boundaries of MRT were transformed back to the original scale for reporting (Table 1). For the investigation of the biogeographic pattern, the identification of environmental controls, and the selection of the empirical model for MRT, we used the mean of MRT calibrated with seven Q10 values for data analysis. To quantify the environmental controls and build the empirical model (Fig. 2; Table S1), we constructed the generalized linear models using the glm function in the R package “stats” (R Core Team 2018). We used the Akaike information criterion as the model selection criterion. Before fitting the generalized linear model, we tested for multicollinearity among the variables within and among each variable group, i.e., climate, soil microclimate, edaphic properties, vegetation, and soil microbes, and we found no significant multicollinearity (variance inflation factor < 5). In addition, a structural equation model was built to depict the direct and indirect effects of environmental factors on mean MRT (Fig. S6). The structural equation model was constructed using the R package “lavaan” (Rosseel 2012). All statistical analyses were performed and relevant figures were plotted using the “agricolae” (de Mendiburu 2019), “multcomp” (Hothorn et al. 2016), “soiltexture” (Moeys 2018), “VennDiagram” (Chen and Boutros 2011), “ggplot2” (Wickham et al. 2016), and “basicTrendline” (Mei et al. 2018) packages in R version 3.5.3 for Mac OS X. Fig. 1 and Fig. 3 were produced with NCAR Command Language (version 6.3.0) and ArcGIS (version 10.6), respectively.
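The log transformation and back-transformation of the mean and 95% confidence bounds described above can be sketched as follows. The sketch uses a normal approximation (z = 1.96) for the interval, which is an assumption. The text does not state how the interval was computed, and the input values are made up.

```python
import math
from statistics import mean, stdev


def back_transformed_mean_ci(mrt_values, z=1.96):
    """Return (mean, lower, upper) of MRT, computed in log10 space.

    The mean and the confidence bounds are calculated on log10(MRT) and
    then converted back to the original scale.
    """
    logs = [math.log10(v) for v in mrt_values]
    centre = mean(logs)
    half_width = z * stdev(logs) / math.sqrt(len(logs))
    return (10 ** centre,
            10 ** (centre - half_width),
            10 ** (centre + half_width))


# Made-up MRT values in days, for illustration only.
centre_mrt, lower_mrt, upper_mrt = back_transformed_mean_ci([10.0, 100.0, 1000.0])
print(round(centre_mrt, 1), round(lower_mrt, 1), round(upper_mrt, 1))
```

Because the averaging happens in log space, the back-transformed centre is a geometric mean, and the interval is asymmetric on the original scale.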


San Diego State University

CSU Program for Education & Research in Biotechnology
