Skip to main content

Soil microbial diversity and community composition during conversion from conventional to organic agriculture

Cite this dataset

van Rijssel, Sophie et al. (2022). Soil microbial diversity and community composition during conversion from conventional to organic agriculture [Dataset]. Dryad.


It is generally assumed that the dependence of conventional agriculture on artificial fertilizers and pesticides strongly impacts the environment, while organic agriculture relying more on microbial functioning may mitigate these impacts. However, it is not well known how microbial diversity and community composition change in conventionally managed farmers’ fields that are converted to organic management. Here, we sequenced bacterial and fungal communities of 34 organic fields on sand and marine clay soils in a time series (chronosequence) covering 25 years of conversion. Nearby conventional fields were used as references. We found that community composition of bacteria and fungi differed between organic and conventionally managed fields. In the organic fields, fungal diversity increased with time since conversion. However, this effect disappeared when the conventional paired fields were included. There was a relationship between pH and soil organic matter content and the diversity and community composition of bacteria and fungi. In marine clay soils, when time since organic management increased, fungal communities in organic fields became more dissimilar to those in conventional fields. We conclude that conversion to organic management in these Dutch farmers’ fields did not increase microbial community diversity. Instead, we observed that in organic fields in marine clay when time since conversion increased soil fungal community composition became progressively dissimilar from that in conventional fields. Our results also showed that the paired sampling approach of organic and conventional fields was essential in order to control for environmental variation that was otherwise unaccounted for.


Site selection

We established two time series (chronosequences), one on sand and one on marine clay soils, of arable fields that have been converted from intensive conventional arable farming to organic arable farming. The chronosequences ranged from fields that had been converted 0 to 25 years ago. Each organic field was paired with a nearby conventional field that fit the selection criteria in order to obtain a control that was subjected to the same local variation in environmental factors. This resulted in a total of 11 pairs of organic and conventional fields on sandy soil and 23 pairs of organic and conventional fields on marine clay soil, so in total 68 fields were sampled. Organic fields all had a SKAL certificate (“Stichting Keur Alternatief voortgebrachte Landbouwproducten” which is translated as “Control authority for organic production”), based on European legislation. This includes that minimally 70% of the fertilizer is certified organic animal manure, plant materials or compost. Furthermore, there are restrictions in the use of mineral fertilizers and pesticides.

We had four selection criteria: (1) Standardized soil type: sandy soils were Anthrosols with a very low elutriable fraction and an A-horizon of at least 30 cm, whereas marine clay soils were Fluvisols from marine origin with an elutriable fraction of 17.5-45%. (2) Standardized crop: the fields must be covered by a monocotylous crop, preferably winter wheat (Triticum aestivum), and if not possible, a grass-clover mixture or luzern. (3) Crop rotation: each field was part of a wider crop rotation with tuber crops (potatoes/onions). (4) Soil tillage: inversion tillage must have been applied at least once in the last 5 years. Locations are added in the supplement in a way that respects the privacy of the farmers involved in this research (Fig. S1). Within pairs, samples were taken minimally 40 meters apart (in the case of neighboring fields) and maximally 13 kilometers. The maximal distance between the most extremely situated fields within soil type was about 140 kilometers.

Soil sampling and analysis of chemical properties

Within each field we collected soil samples from three subsites that were situated minimally 30 m from the field edge and approx. 2 m from tractor tracks. Subsites were minimally 15 m apart. At each sampling subsite, we collected a soil sample of appr. 3 kg from 5 - 15 cm depth. The topsoil was removed because processes in the top soil may be largely driven by variations in daily weather conditions. Soil samples were stored at 4°C. Within a week soil samples were gently homogenized and a subsample of 6 g was sieved using a 4 mm mesh to remove coarse elements. From this subsample, we stored 1 g of soil in an Eppendorf tube at -80°C for analysis of the soil microbial community composition. The remaining soil was stored at field moisture at 4oC. From this soil we used a subsample to determine soil organic matter (SOM) content by loss-on-ignition. Samples were dried at 105°C and then placed in a muffle furnace for 8 hours at 430°C. soil organic matter content was calculated as the weight difference between samples heated at 105°C and 430°C, respectively. Another subsample was used to determine pH by shaking 10 g dry weight equivalent fresh soil in 25 ml of demi water for 2 hrs. at 250 rounds/minute (Mettler Toledo, Ohio, USA). Chemical analyses were done within 2 months after collection.

DNA extraction and amplification

To identify the alpha and beta diversity of bacteria and fungi, DNA was extracted from 0.25 g bulk soil using the DNeasy PowerSoil Kit (Masella et al., 2012) (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. Community composition of bacteria was determined by amplifying the V4-region of the 16S rRNA gene using 515f and 806rbc primers (Caporaso et al., 2012) and the fungal community composition was determined by targeting the ITS2 region using ITS4 (reverse) and fITS9 (forward) primers (Ihrmark et al., 2012; White et al., 1990). Polymerase Chain Reactions (PCRs) were performed in reaction mixtures containing 15.6 µl MQ, 0.15 µl Fast Start High Fidelity PCR System (Roche), 2.5 µl FastStart High Fidelity Reaction Buffer with 18 mM MgCl2 (10x concentrated), 2.5 µl 2 mM dNTP’s, 1 µl 25 mM MgCl2 Stock Solution, 1.25 µl BSA (4 mg/ml) and 0.5 µl of forward and 0.5 µl of tagged reverse primer and 1 µl genomic DNA. The PCR conditions were as follows: an initial denaturation step of 94oC for 5 min, 35 cycles of 45 s at 94oC, 60 s at either 51oC (16S) or 54oC (ITS) and 90 s at 72oC, followed by a final extension step for 10 min at 72oC. We purified the PCR products using Agencourt AMPurebeads (Beckman Coulter, Indianapolis, IN, USA) using a ratio of 1:0.7 of PCR product to bead volume. The purification was carried out according to the manufacturer’s protocol and purified products were diluted in 30 μl MQ-water. We then measured the concentrations of the purified PCR products with a fragment analyser (Advanced Analytical, Ankeny, IA, USA) using the standard sensitivity NGS fragment analysis kit (Advanced Analytical). The products were mixed in equal nano g quantities and sent to BGI (Shenzhen, China) for 250 bp paired-end sequencing with Illumina MiSeq.

Sequence processing

After sequencing, the raw reads were filtered. Data filtering includes removing adaptor sequences, contamination and low-quality reads from raw reads. This was done by the sequencing company BGI (Shenzhen, China). The 16S rRNA gene amplicon reads were analyzed using the dada2 pipeline (version 1.18) (Callahan et al., 2016) using truncLen 250 for forward reads and 200 for reversed reads (maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE). Consensus method was used to remove chimeric sequences. Taxonomy was assigned using the SILVA SSU database (version 138) (Quast et al., 2012).

ITS sequences were analyzed using the PIPITS pipeline (version 2.4, standard settings) (Gweon et al., 2015) with VSEARCH implemented to pair sequences. The ITS2 region was extracted using ITSx (Bengtsson‐Palme et al., 2013) and short reads (<100bp) were removed. Chimeric sequences were removed by comparing with UNITE UCHIME database (version 8.2) (Edgar et al., 2011). Sequences were aligned and classified using with RDP against the UNITE fungal database (Kõljalg et al., 2013; Nilsson et al., 2019).

Sequence data preparation

Low abundant Amplicon Sequence Variants (ASVs) with 3 or fewer reads were removed as they are potentially sequencing artefacts (Auer et al., 2017). As well, we removed ASVs that occurred in < than 3 samples. In 191 out of 204 16S and in 183 out of 195 ITS samples were included in downstream analyses.

Data analyses

We analyzed the original number of reads, diversity and composition of bacterial and fungal data separately. For each sample we calculated the original number of reads before rarefying or normalizing. To test whether number of reads, ASV richness and Shannon diversity were affected by soil type (sand vs. marine clay) and management (organic vs. conventional) we first calculated the average of each field, and used that as a dependent variable in the linear models. Similar models were applied to describe pH and soil organic matter content differences between soil types and management.

Before calculating Observed ASV richness and Shannon diversity index we removed non-targeted ASVs and rarefied the data (16S to 5000 reads per sample; ITS to 1500 reads per sample). Rarefaction curves of 16S and ITS are presented in Fig. S2. We checked our rarefaction using bootstrapping by running the rarefaction 1000 times before testing the effect of management and time on richness and diversity for all groups. This resulted in the same results, or at least the same trend, in 97-100% of the rarefactions, indicating that the results we obtained were quite robust. For compositional analysis we first normalized the 16S and ITS data using cumulative sum scaling. We used principal coordinate analysis (PCoA) based on the Bray-Curtis dissimilarity matrix to perform unconstrained ordination. This was done on individual sample level, but then we averaged the PCoA coordinates at the field level. All subsequent analyses were run at field level, were we used PCoA axes as response variables.

To test differences in microbial community composition between regions (Zeeland and Flevoland for marine clay and Drenthe/Overijssel, Gelderland and Brabant/Limburg for sand; samples were located within or on the border of these Dutch provinces) using analyses of variance (ANOVA’s) and a Tukey post-hoc test on both PCoA axes. However, since each region included both older and younger organically managed fields we did not do separate analyses per region. Instead, we performed separate analyses for marine clay and sandy soils and for the models run on marine clay soils we included region as a random factor. Within sand we did not find differences in composition between regions.

We used linear mixed models (marine clay) and linear models (sand) to test the relationship between alpha or beta diversity and time since conversion, pH and soil organic matter content using management as covariable. To control for potentially unobserved factors underlying time since transition, we assigned the same time since conversion of organic farms to their paired conventional farm. In addition, we constructed a linear mixed effects model with pH and soil organic matter content as explanatory variables of microbial richness and diversity. Here we used forward model selection based on AIC (Akaike, 1974) to determine which factors to include in the model making the models not more complex than needed.

 To test if beta diversity between organic and conventional farms increased over time, we also calculated the average Bray-Curtis dissimilarity within each pair of organic and conventional fields. We then used the average dissimilarity as a response, and time as an explanatory variable in a linear model.

We tested the correlations between time since conversion and pH and soil organic matter for both organic and conventional fields. We assigned the same time since conversion of organic farms to their paired conventional farm. We assessed this by linear mixed effects models (marine clay) and linear models (sand) with time as explanatory variable with region as random factor (marine clay).

All statistical analyses were performed in R, version (version 4.1.1) (R Core Team, 2021). We used the “phyloseq” package (version 1.36.0) for filtering and rarefying our data (McMurdie & Holmes, 2013) and the metagMisc package (version 0.0.4) (Mikryukov, 2019) and ‘metagenomeSeq’ package (version 1.34.0) for normalization (Paulson et al., 2013). Bootstrapping was performed using a “for loop”; running 1000 times the rarefaction and subsequent analyses; R code can be found in Supplementary Materials. Linear (mixed) models were performed using the package “nlme” (Pinheiro et al., 2021). We used Type I Analysis of Variance. PCoA axis scores were performed using the ape package (version 5.5) (Paradis et al., 2004). Figures were made with “ggplot2” (version 3.3.5) (Wickham, 2016)

Usage notes

Microsoft Excel



Dutch Research Council, Award: ALWGR.2015.5