Mutualism mediates legume response to microbial climate legacies
Data files
Oct 14, 2025 version files 6.69 MB
- Climate_legacy_symbiosis.Rmd (71.06 KB)
- Condition_dependency_of_drought_microbiome.csv (11.31 KB)
- field2funrhizo_merged_table.qza (283.36 KB)
- field2nod_merged_table.qza (1.33 MB)
- merged_metadata_symbiosis_plants.tsv (236.51 KB)
- merged-rooted-tree-fungi.qza (180.26 KB)
- merged-taxonomy-bacteria.qza (1.57 MB)
- merged-taxonomy-fungi.qza (221.93 KB)
- merged-unrooted-tree-bacteria.qza (1.17 MB)
- metadata_symbiosis_plants.tsv (73.61 KB)
- README.md (21.71 KB)
- rooted-tree-bacrhiz.qza (248.24 KB)
- rooted-tree-funrhiz.qza (64.71 KB)
- rooted-tree-nod.qza (44.68 KB)
- symb_leaf_no_over_time.csv (7.98 KB)
- Symbiosis_data.csv (102.89 KB)
- table-bacrhiz-noplant.qza (381.05 KB)
- table-funrhiz-noplant.qza (92.12 KB)
- table-nod-noplant.qza (81.92 KB)
- taxonomy-bacrhiz.qza (351.64 KB)
- taxonomy-funrhiz.qza (70.66 KB)
- taxonomy-nod.qza (69.18 KB)
Abstract
Climate change is altering both soil microbial communities and the ecological context of plant-microbe interactions. Predicting how soil microbes modulate plant resilience to climate change is critical to mitigating the negative effects of climate change on ecosystems and agriculture. Previously, it was demonstrated that heat, drought, and their legacies altered soil microbiomes and potential plant symbionts. In this study, we conducted growth chamber experiments to isolate the microbially mediated indirect effects of heat and drought on plant performance and symbiosis. In the first experiment, we found that drought and drought-treated microbes, along with their interaction, significantly decreased the biomass of Medicago lupulina plants compared to well-watered microbiomes and conditions. In a second experiment, we then tested how the addition of a well-known microbial mutualist, the rhizobium Sinorhizobium meliloti, affected the impact of climate-treated microbiomes on M. lupulina. We found that drought-adapted microbiomes negatively impacted legume performance by increasing mortality and reducing leaf number early in life, but that adding rhizobia erased climate treatment effects. Drought can negatively affect legume performance through microbial legacy effects alone, but the addition of rhizobia buffers legumes against climate-mediated variation in the microbiome. In contrast, heat-adapted microbiomes did not differ significantly from control microbiomes in their effects on a legume.
https://doi.org/10.5061/dryad.fj6q57449
Description of the data and file structure
This dataset contains the data and analysis code from two experiments. In the first experiment, we tested whether drought-treated soil microbes affected legumes in both dry and well-watered soil conditions in a growth chamber. We collected data on the plants in this experiment. In the second experiment, we tested whether heat- and drought-treated soil microbes affected legumes and rhizobia, under well-watered conditions only. In this second experiment, we collected data on the plants and sequenced the root microbiome of a subset of the plants. This generated microbiome data for fungi, bacteria, and root nodule endophytes.
Files and variables
Used in Experiment 1 and 2
File: Climate_legacy_symbiosis.Rmd
Description: R markdown file containing all data wrangling and analysis.
Experiment 1: Effect of drought-treated microbes on legume performance in drought
File: Condition_dependency_of_drought_microbiome.csv
Description: Data collected on plants in the first experiment. The meaning of empty cells (i.e., absent values) is explained in the description of each column.
Variables
- Plant_ID: Unique plant id.
- Block: Spatial block of plant in the growth chamber
- Position: Spatial position of plant within block in the growth chamber
- Microbes: The conditions soil microbes have been exposed to in the field. DroughtSoil= Droughted field conditions. ControlSoil= Control field conditions
- Water: The water availability conditions of plants in the growth chamber. TerminalDrought= Drought condition. FullWater= Plants were watered as needed.
- Full_Treatment: The combination of "Microbes" and "Water" treatments that plants received.
- Germination_success: Whether plants successfully germinated or not. 1=Germination success, 0=Did not germinate.
- Leaf_num_dec5 to Leaf_num_jan22: Counts of leaves on the plants on a given date (indicated by column name). We only counted leaves that had chlorophyll/green colouration, which indicated active photosynthesis. Counts began in December 2022 and ended in January 2023. Absent values indicate no measurements were taken because the plant was dead, or in the case of counts on January 9th, incomplete sampling.
- Branch_num_jan22: The number of branches on each plant (regardless of leaf colouration). This measurement was done on January 22nd of 2023. Absent values indicate no measurements were taken because the plant was dead.
- Death_JB_jan26: A measure of plant mortality on January 26th 2023, done by Julia Boyle. 0=alive, 1=dead.
- Nodule_num: The number of nodules on a plant. Absent values indicate no measurements were taken because the plant roots/belowground biomass were not in good enough condition to assess or measure.
- Above_biomass_g: The weighed dried aboveground biomass in grams. Absent values mean this part of the plant was not recoverable or intact in full enough to weigh.
- Below_biomass_g: The weighed dried belowground biomass in grams. Absent values mean this part of the plant was not recoverable or intact in full enough to weigh.
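Because empty cells carry meaning in this file, it can help to keep them distinct from zeros when loading the data. The following is a minimal sketch in Python (standard library only), not the R workflow in Climate_legacy_symbiosis.Rmd. The rows in `SAMPLE` are invented for illustration, including the format of `Plant_ID`, and only a few of the columns described above are shown.

```python
import csv
import io

# Hypothetical rows that mimic a few columns of
# Condition_dependency_of_drought_microbiome.csv; all values are invented.
SAMPLE = """Plant_ID,Microbes,Water,Nodule_num,Above_biomass_g
1,DroughtSoil,TerminalDrought,,0.012
2,ControlSoil,FullWater,3,0.045
3,DroughtSoil,FullWater,0,
"""

NUMERIC_COLUMNS = ("Nodule_num", "Above_biomass_g")


def to_number(cell):
    """Return a float for a filled cell and None for an empty (absent) cell."""
    cell = cell.strip()
    return float(cell) if cell else None


def load_rows(handle):
    """Read rows, keeping absent values as None so they are never read as 0."""
    rows = []
    for row in csv.DictReader(handle):
        for column in NUMERIC_COLUMNS:
            row[column] = to_number(row[column])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))
measured = [r["Nodule_num"] for r in rows if r["Nodule_num"] is not None]
print(len(rows), "plants,", len(measured), "with a nodule count")
```

To use the real file, replace `io.StringIO(SAMPLE)` with an open file handle, e.g. `open("Condition_dependency_of_drought_microbiome.csv", newline="")`, and extend `NUMERIC_COLUMNS` as needed.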
Experiment 2: Soil climate legacy effects with and without an additional mutualist
File: Symbiosis_data.csv
Description: Data collected on plants in the second experiment. The meaning of empty cells (i.e., absent values) is explained in the description of each column.
Variables
- Unique_ID: Unique plant ID
- Position: Spatial position of plant within a conetainer rack.
- Rack: Which conetainer rack plant was in, in the growth chamber
- Block: Spatial block of conetainer racks in the growth chamber. Racks were blocked together under light canopies.
- Treatment_full: The full treatment a plant received, in the order of climate treatment, warming array plot in the field, whether microbes were sterilized or not, and whether additional rhizobia were added or not.
- Climate_Treatment: The climate treatment applied to microbes in the field, prior to their use in the experiment. Control= ambient, Drought= rainout structures, Heat= active infrared heaters, Heatwave= Both rainout structures and active infrared heaters
- Heat_applied: Whether heat was applied, 0= heat not applied, 1=heat was applied.
- Drought_applied: Whether drought was applied, 0= drought not applied, 1= drought was applied.
- Array_Plot: The array plot in the field that experienced climate treatments. There were 12, with 3 replicates per climate treatment.
- Microbes: Indicates whether microbes were sterilized or not prior to being added to the plants. Sterile= autoclaved soil, unsterile= fresh biotic soil.
- Added_GFPrhizobia: Whether we added additional rhizobia to the plants in the growth chamber. The Sinorhizobium meliloti rhizobia had a Green Fluorescent Protein (GFP) gene inserted into their genome.
- Germination_success: Whether plants successfully germinated in the growth chamber. 1=Germination success, 0=Did not germinate.
- Total_nod_num: Total number of nodules on the plant. Absent values indicate no measurements were taken because the plant roots/belowground biomass were not in good enough condition to assess or measure.
- GFP_nod_num: The number of nodules that showed any degree of fluorescence. Absent values mean we did not record the value for that plant; these data were not used because of our sequencing data.
- Field_nod_num: The number of nodules that did not show any fluorescence. Absent values mean we did not record the value for that plant; these data were not used because of our sequencing data.
- Unifoliate_leaf_june9 to Unifoliate_leaf_june20: Presence (1) or absence (0) of the first leaf to emerge after the cotyledons. The survey date (all in 2022) is indicated in the column name. Absent values indicate the plant did not germinate.
- Leaf_num_june14 to Leaf_num_july25: Counts of total leaves produced by the plant on the date indicated in the column name. All leaves were green and producing chlorophyll during this period, meaning these values are also branch numbers. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the number.
- Leaf_num_alive_aug3 to Leaf_num_alive_aug19: Counts of green, photosynthetically active ('alive') leaves on the plant on the date indicated in the column name. These values are not the same as branch numbers, which represent the total number of leaves produced. Absent values indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the number.
- Harvested_aug3 and Harvested_aug15: A 1 means the plant was recently dead and we harvested the plant to still retain as much data as possible on nodules. Date harvested indicated by column name. Absent values mean the plant was not harvested.
- Death_survey_aug12 and Death_survey_aug19: Survey of which plants were dead (1) or alive (absent value) on the date indicated in the column name.
- Branch_number_end_checked: A final count of the number of branches (regardless of leaf colouration) on plants, done after harvest and after drying. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the number, or in some cases that a confident count could not be made.
- Aboveground_biomass_g: Dried and weighed aboveground biomass in grams. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the plant structure.
- Below_ground_biomass_g: Dried and weighed belowground biomass in grams. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the plant structure.
- Trichoderma_june9: Whether there was evidence of fungal contamination (presumed to be Trichoderma sp.) in the conetainer on June 9th. 0=No evidence, 1=evidence.
- Trichoderma_visible_june24: Whether there was evidence of fungal contamination (presumed to be Trichoderma sp.) in the conetainer on June 24th. Absent values=No evidence, 1=evidence.
- Trichoderma_presence: Whether there was ever evidence of fungal contamination (presumed to be Trichoderma sp.) in the conetainer. 0=No evidence, 1=evidence.
- Notes: Any additional notes. Absent values mean there wasn't anything of note to describe.
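Several columns in this file use an empty cell to encode a real state rather than a missing measurement: per the descriptions above, an absent value means "alive" in the death surveys, "not harvested" in the harvest columns, and "no evidence" in Trichoderma_visible_june24. A minimal Python sketch of recoding those cells to 0 follows. The example row is invented, and whether to apply the recoding to plants that never germinated is a judgment call left to the user.

```python
# Columns in Symbiosis_data.csv where, per the column descriptions, an empty
# cell encodes a state (alive / not harvested / no evidence), not missing data.
ABSENT_MEANS_ZERO = (
    "Harvested_aug3",
    "Harvested_aug15",
    "Death_survey_aug12",
    "Death_survey_aug19",
    "Trichoderma_visible_june24",
)


def recode_flags(row):
    """Return a copy of `row` with flag columns recoded to integers 0 or 1."""
    recoded = dict(row)
    for column in ABSENT_MEANS_ZERO:
        if column in recoded:
            recoded[column] = 1 if str(recoded[column]).strip() == "1" else 0
    return recoded


# Hypothetical row as read from the CSV (all cells are strings); values invented.
example = {"Unique_ID": "17", "Death_survey_aug12": "", "Death_survey_aug19": "1"}
recoded_example = recode_flags(example)
print(recoded_example)
```

Other columns, such as biomass and nodule counts, should keep their absent values as missing data rather than being recoded this way.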
File: symb_leaf_no_over_time.csv
Description: Branch counts over time, aggregated as means and standard errors for plants given microbes from each climate treatment. Only plants given biotic soil are shown, with treatments with and without added rhizobia kept separate. Values were generated using code in the R Markdown file.
Variables
- Treatment: Climate treatment that microbes received in the field, prior to their addition to plants in the growth chamber
- Date: Date on which the branches were counted
- Days_after_germination: How many days the survey date is from when plants were planted
- MeanLeaf: The mean number of leaf branches on the plant
- SELeaf: The standard error of the mean number of leaf branches on the plant
- Tissue_counted: Whether the value refers to branch number, or alive leaf count only
- N: Number of plants included in the mean
- Soil: Whether plants received biotic soil with or without rhizobia
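The MeanLeaf, SELeaf, and N columns can be understood through the usual definition of the standard error of the mean (sample standard deviation divided by the square root of N). The sketch below is a Python illustration of that definition, not the authors' R code. The counts are invented, and skipping absent values before aggregating is an assumption.

```python
import math
import statistics


def mean_and_se(counts):
    """Return (mean, standard error, n) for a list of counts.

    Absent values, represented here as None, are skipped (an assumption).
    """
    values = [c for c in counts if c is not None]
    n = len(values)
    if n == 0:
        return None, None, 0
    mean = statistics.mean(values)
    # Standard error of the mean = sample standard deviation / sqrt(n);
    # it is undefined for a single observation.
    se = statistics.stdev(values) / math.sqrt(n) if n > 1 else None
    return mean, se, n


# Invented counts for one treatment on one survey date.
mean_leaf, se_leaf, n_plants = mean_and_se([4, 6, None, 5, 5])
print(mean_leaf, se_leaf, n_plants)
```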
File: metadata_symbiosis_plants.tsv
Description: Metadata for plants that had their root microbes sequenced. The meaning of empty cells (i.e., absent values) is explained in the description of each column.
Variables
- sample-id: Unique sample ID, corresponds to sample ID in Symbiosis_data.csv
- Performance_level_bacteria: Whether this plant belonged in the top 20% or bottom 20% of performing plants in the rarefied bacterial dataset, depending on aboveground biomass. Differences in sequencing depth and subsequent rarefaction means not all plants will be in the same performance category for bacteria and fungal datasets. Absent values indicate the plant was not in the top or bottom 20% of performance categories.
- Performance_level_fungi: Whether this plant belonged in the top 20% or bottom 20% of performing plants in the rarefied fungal dataset, depending on aboveground biomass. Differences in sequencing depth and subsequent rarefaction means not all plants will be in the same performance category for bacteria and fungal datasets. Absent values indicate the plant was not in the top or bottom 20% of performance categories.
- Rack: Which conetainer rack plant was in, in the growth chamber
- Block: Spatial block of conetainer racks in the growth chamber. Racks were blocked together under light canopies.
- Treatment_full: The full treatment a plant received, in the order of climate treatment, warming array plot in the field, whether microbes were sterilized or not, and whether additional rhizobia were added or not.
- Climate_Treatment: The climate treatment applied to microbes in the field, prior to their use in the experiment. Control= ambient, Drought= rainout structures, Heat= active infrared heaters, Heatwave= Both rainout structures and active infrared heaters
- Heat_applied: Whether heat was applied, 0= heat not applied, 1=heat was applied.
- Drought_applied: Whether drought was applied, 0= drought not applied, 1= drought was applied.
- Array_Plot: The array plot in the field that experienced climate treatments. There were 12, with 3 replicates per climate treatment.
- Microbes: Indicates whether microbes were sterilized or not prior to being added to the plants. Sterile= autoclaved soil, unsterile= fresh biotic soil.
- Added_GFPrhizobia: Whether we added additional rhizobia to the plants in the growth chamber. The Sinorhizobium meliloti rhizobia had a Green Fluorescent Protein (GFP) gene inserted into their genome.
- Leaf_num_alive_aug19: Counts of green, photosynthetically active ('alive') leaves on the plant on August 19th. These values are not the same as branch numbers, which represent the total number of leaves produced. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the number.
- Death_survey_aug19: Survey of which plants were dead (1) or alive (no value) on August 19th.
- Aboveground_biomass_g: Dried and weighed aboveground biomass in grams. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the plant structure.
- Below_ground_biomass_g: Dried and weighed belowground biomass in grams. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the plant structure.
- Trichoderma_presence: Whether there was ever evidence of fungal contamination (presumed to be Trichoderma sp.) in the conetainer. 0=No evidence, 1=evidence.
- Total_nod_num: Total number of nodules on the plant. Absent values indicate no measurements were taken because the plant roots/belowground biomass were not in good enough condition to assess or measure.
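The Performance_level_bacteria and Performance_level_fungi columns group plants into the top and bottom 20% by aboveground biomass. One way such a grouping could be computed is sketched below in Python. The labels "top" and "bottom", the rounding-down cutoff, and the plant IDs are all assumptions of this sketch; the values actually stored in the file and the authors' exact procedure are defined in the R Markdown file.

```python
def performance_levels(biomass_by_plant, fraction=0.2):
    """Label the lowest and highest `fraction` of plants by aboveground biomass.

    Returns a dict mapping plant ID to "top", "bottom", or None (absent value).
    Plants without a biomass measurement are never assigned a level.
    """
    measured = {k: v for k, v in biomass_by_plant.items() if v is not None}
    ranked = sorted(measured, key=measured.get)  # lowest biomass first
    k = int(len(ranked) * fraction)              # group size, rounded down
    levels = {plant: None for plant in biomass_by_plant}
    for plant in ranked[:k]:
        levels[plant] = "bottom"
    for plant in ranked[len(ranked) - k:]:
        levels[plant] = "top"
    return levels


# Ten hypothetical plants with invented biomass values (grams).
biomass = {f"p{i}": i / 100 for i in range(1, 11)}
levels = performance_levels(biomass)
print(levels)
```

Because the ranking is done within each rarefied dataset, a plant can fall in different categories for bacteria and fungi, as noted in the column descriptions.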
File: merged_metadata_symbiosis_plants.tsv
Description: The merged_metadata_symbiosis_plants.tsv contains microbial metadata from the soil, rhizosphere and nodule samples all in one file. Absent values mean that the column was not relevant for that sample (e.g. growth chamber plant metadata for field-collected soil sample sequences), or that the data was not possible to collect due to plant death.
Variables
- sample-id: Unique ID for the samples. Plant-associated samples have a number 1-720 to represent the metadata for each sample. Samples with the BACRHIZ prefix represent bacteria in the rhizosphere. Samples with the FUNRHIZ prefix represent fungi in the rhizosphere. Samples that are just a number represent the metadata for that plant, and represent nodule data. Numbered samples like "1-1" denote warming array plot#-subsample# and represent soil samples. While metadata for all 1-720 plants are included, not all plants have corresponding sequencing data associated with them.
- Location: Where the sample came from. Can be rhizosphere, nodule, or soil.
- Rack: Which conetainer rack plant was in, in the growth chamber
- Block: Spatial block of conetainer racks in the growth chamber. Racks were blocked together under light canopies.
- Treatment_full: The full treatment a plant received, in the order of climate treatment, warming array plot in the field, whether microbes were sterilized or not, and whether additional rhizobia were added or not.
- Climate_Treatment: The climate treatment applied to microbes in the field, prior to their use in the experiment. Control= ambient, Drought= rainout structures, Heat= active infrared heaters, Heatwave= Both rainout structures and active infrared heaters
- Heat_applied: Whether heat was applied, 0= heat not applied, 1=heat was applied.
- Drought_applied: Whether drought was applied, 0= drought not applied, 1= drought was applied.
- Array_Plot: The array plot in the field that experienced climate treatments. There were 12, with 3 replicates per climate treatment.
- Microbes: Indicates whether microbes were sterilized or not prior to being added to the plants. Sterile= autoclaved soil, unsterile= fresh biotic soil.
- Added_GFPrhizobia: Whether we added additional rhizobia to the plants in the growth chamber. The Sinorhizobium meliloti rhizobia had a Green Fluorescent Protein (GFP) gene inserted into their genome.
- Leaf_num_alive_aug19: Counts of green, photosynthetically active ('alive') leaves on the plant on August 19th. These values are not the same as branch numbers, which represent the total number of leaves produced. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the number.
- Death_survey_aug19: Survey of which plants were dead (1) or alive (no value) on August 19th.
- Aboveground_biomass_g: Dried and weighed aboveground biomass in grams. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the plant structure.
- Below_ground_biomass_g: Dried and weighed belowground biomass in grams. Absent values usually indicate the plant did not germinate, or sometimes that the plant died/withered such that we could not assess the plant structure.
- Trichoderma_presence: Whether there was ever evidence of fungal contamination (presumed to be Trichoderma sp.) in the conetainer. 0=No evidence, 1=evidence.
- Total_nod_num: Total number of nodules on the plant. Absent values indicate no measurements were taken because the plant roots/belowground biomass were not in good enough condition to assess or measure.
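The sample-id naming scheme described for this file can be turned into a small classifier, which may be useful as a cross-check against the Location column. This Python sketch is based only on the description above. The example IDs are hypothetical, and in particular what follows the BACRHIZ and FUNRHIZ prefixes in the real file is an assumption.

```python
import re


def sample_kind(sample_id):
    """Guess the sample type from the sample-id naming scheme described above."""
    if sample_id.startswith("BACRHIZ"):
        return "bacterial rhizosphere"
    if sample_id.startswith("FUNRHIZ"):
        return "fungal rhizosphere"
    if re.fullmatch(r"\d+-\d+", sample_id):
        return "soil"            # warming array plot#-subsample#
    if re.fullmatch(r"\d+", sample_id):
        return "nodule"          # plain plant number, 1-720
    return "unknown"


# Hypothetical IDs; only "1-1" appears verbatim in the description above.
for sid in ("BACRHIZ12", "FUNRHIZ12", "12", "1-1"):
    print(sid, "->", sample_kind(sid))
```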
File: rooted-tree-bacrhiz.qza
Description: Sequence tree made in QIIME2 based on 16S sequences from the rhizosphere.
File: rooted-tree-funrhiz.qza
Description: Sequence tree made in QIIME2 based on ITS sequences from the rhizosphere. Not used for any analyses because inference based on ITS is too rough and may be erroneous.
File: merged-rooted-tree-fungi.qza
Description: Sequence tree made in QIIME2 based on ITS sequences from the rhizosphere and soil. Not used for any analyses because inference based on ITS is too rough and may be erroneous.
File: rooted-tree-nod.qza
Description: Sequence tree made in QIIME2 based on 16S sequences from the nodules.
File: merged-unrooted-tree-bacteria.qza
Description: Sequence tree made in QIIME2 based on 16S sequences from the rhizosphere and soil.
File: table-nod-noplant.qza
Description: Feature table of bacterial samples from nodules, where rows are ASVs and columns are sample IDs (corresponding to metadata 'sample-id' columns). This is the resulting table after using QIIME2 to remove chloroplasts and mitochondria.
File: table-funrhiz-noplant.qza
Description: Feature table of fungal samples from the rhizosphere, where rows are ASVs and columns are sample IDs (corresponding to metadata 'sample-id' columns). This is the resulting table after using QIIME2 to remove chloroplasts and mitochondria, and filter out ASVs that had fewer than 10 reads across all samples.
File: table-bacrhiz-noplant.qza
Description: Feature table of bacterial samples from the rhizosphere, where rows are ASVs and columns are sample IDs (corresponding to metadata 'sample-id' columns). This is the resulting table after using QIIME2 to remove chloroplasts and mitochondria, and filter out ASVs that had fewer than 10 reads across all samples.
File: field2funrhizo_merged_table.qza
Description: Feature table of fungal samples from the rhizosphere and soil, where rows are ASVs and columns are sample IDs (corresponding to metadata 'sample-id' columns). This is the resulting table after using QIIME2 to remove chloroplasts and mitochondria, and filter out ASVs that had fewer than 10 reads across all samples.
File: field2nod_merged_table.qza
Description: Feature table of bacterial samples from the nodules, rhizosphere, and soil, where rows are ASVs and columns are sample IDs (corresponding to metadata 'sample-id' columns). This is the resulting table after using QIIME2 to remove chloroplasts and mitochondria, and filter out ASVs that had fewer than 10 reads across all samples.
File: taxonomy-bacrhiz.qza
Description: Taxonomy file linking ASVs in rhizosphere samples with assigned taxonomy (Kingdom; Phylum; Class; Order; Family; Genus; Species) using the Greengenes reference database. Created using QIIME2.
File: taxonomy-funrhiz.qza
Description: Taxonomy file linking ASVs in rhizosphere samples with assigned taxonomy (Kingdom; Phylum; Class; Order; Family; Genus; Species) using the UNITE9 fungal reference database. Created using QIIME2.
File: merged-taxonomy-fungi.qza
Description: Taxonomy file linking ASVs in rhizosphere and soil samples with assigned taxonomy (Kingdom; Phylum; Class; Order; Family; Genus; Species) using the UNITE9 fungal reference database. Created using QIIME2.
File: merged-taxonomy-bacteria.qza
Description: Taxonomy file linking ASVs in rhizosphere and soil samples with assigned taxonomy (Kingdom; Phylum; Class; Order; Family; Genus; Species) using the Greengenes reference database. Created using QIIME2.
File: taxonomy-nod.qza
Description: Taxonomy file linking ASVs in nodule samples with assigned taxonomy (Kingdom; Phylum; Class; Order; Family; Genus; Species) using the Greengenes reference database. Created using QIIME2.
Code/software
I used R (version 4.2.0) and RStudio (version 2024.09.0 Build 375) to run the analyses. The .Rmd (R Markdown) file contains all code required to run the analyses and create the figures.
Files with .qza are outputs from QIIME2, and are used as-is in the R code provided.
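For readers who want to inspect a .qza file outside of R or QIIME2: a QIIME 2 artifact is a zip archive whose payload sits in a `data/` directory under a top-level folder named by the artifact's UUID. The Python sketch below lists those payload files. So that it runs without the real files, it builds a small stand-in archive; the folder name and contents of that stand-in are invented and only imitate the layout.

```python
import os
import tempfile
import zipfile


def data_members(qza_path):
    """Return the payload file names stored under data/ in a .qza archive."""
    with zipfile.ZipFile(qza_path) as archive:
        return [name for name in archive.namelist()
                if "/data/" in name and not name.endswith("/")]


# Stand-in archive for demonstration only; a real artifact uses a UUID as the
# top-level folder name and also contains provenance information.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.qza")
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("hypothetical-uuid/metadata.yaml", "type: demo\n")
        archive.writestr("hypothetical-uuid/data/taxonomy.tsv", "Feature ID\tTaxon\n")
    demo_members = data_members(demo_path)

print(demo_members)
```

Calling `data_members("taxonomy-nod.qza")` on a downloaded file should list its payload in the same way.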
Access information
Raw sequences are available at NCBI's Sequence Read Archive (SRA) under BioProject PRJNA1332817: https://dataview.ncbi.nlm.nih.gov/object/PRJNA1332817.
Study system
The temperature-free-air-controlled enhancement (T-FACE) experiment is located in an old field at Koffler Scientific Reserve (KSR, www.ksr.utoronto.ca) in Ontario, Canada (44°01′48″N, 79°32′01″W). The experimental design, treatment effectiveness, and soil microbiome analysis are fully described in Boyle et al. (2024; https://doi.org/10.1101/2023.08.04.551981), but we give a brief overview here. Plots in the T-FACE experiment grew only white spruce, Picea glauca, and were either heated, droughted, heated and droughted, or ambient (3 plots/treatment). Boyle et al. (2024) applied treatments during the growing seasons of 2020 and 2021, with rainout structures present for 8 months and heaters activated for 9 months. During active treatment, the mean soil temperature of heated plots was 3.7 ℃ and 3.6 ℃ hotter than un-heated plots in 2020 and 2021, respectively. In 2020, the mean soil volumetric water content (VWC) during droughts was 0.28 m³/m³ in non-drought plots and 0.25 m³/m³ in drought plots, and in 2021, mean VWC was 0.26 m³/m³ in non-drought plots and 0.21 m³/m³ in drought plots. We collected sifted soil (passed through a 4.75 mm metal sieve) from each plot on June 5th, 2022, sterilizing tools between each collection. We kept soil bags stored slightly open in the dark at 4 ℃ until application to the plants.
We used seeds of a single genotype of Medicago lupulina, an annual legume that forms indeterminate root nodules with rhizobia in the genus Sinorhizobium. Seeds were collected from KSR and selfed for two generations in the University of Toronto greenhouses. Medicago lupulina is naturalized at KSR near the experimental warming array, and its growing season peaks in June, making the timing of our soil collection and plant system ecologically relevant.
Experiment 1: Effect of drought-treated microbes on legume performance in drought
We tested whether drought-treated microbes affected legumes in both dry and well-watered soil conditions. We implemented a 2×2 factorial design with plants receiving live drought soil or live control soil, and experiencing a terminal drought or well-watered conditions, with 30 replicates per treatment, for a total of 120 plants. We pooled soil within treatment types to generate inocula. We began the gradual terminal drought treatment with 2 weeks of well-watered conditions, followed by 2 weeks of ⅔ volume water, then 3 weeks of ⅓ volume water, and finally no further water. Ten weeks post-planting, just over 80% of terminal drought plants were dead and we ended the experiment. We counted branch number through time and measured the nodule number and dry weight of above- and belowground biomass of each plant at the end of the experiment.
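The design arithmetic and the stepwise watering schedule described above can be written out as a short Python sketch. It assumes the schedule is counted in whole weeks starting at planting, which the text does not state explicitly; the function name is illustrative.

```python
from fractions import Fraction

# 2 microbe treatments x 2 water treatments x 30 replicates.
n_plants_exp1 = 2 * 2 * 30


def water_fraction(week):
    """Fraction of the full water volume given in a 1-based week.

    Assumes the terminal drought schedule starts at week 1 (an assumption).
    """
    if week < 1:
        raise ValueError("week must be 1 or greater")
    if week <= 2:
        return Fraction(1)       # 2 weeks well-watered
    if week <= 4:
        return Fraction(2, 3)    # 2 weeks at two-thirds volume
    if week <= 7:
        return Fraction(1, 3)    # 3 weeks at one-third volume
    return Fraction(0)           # no further water


schedule = [water_fraction(week) for week in range(1, 11)]
print(n_plants_exp1, [str(f) for f in schedule])
```

Under this assumption, terminal drought plants would receive no water from week 8 until the experiment ended ten weeks post-planting.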
Experiment 2: Soil climate legacy effects with and without an additional mutualist
We next tested whether heat- and drought-treated microbes affected legumes and rhizobia, under well-watered conditions only. We factorially tested the four climate treatments, each with 3 replicate plots from the warming array, and with sterilized or fresh soil. We used sterilized soil to test for abiotic differences in soil from different field plots and climate treatments. Because the field soil resulted in poor nodulation in Experiment 1, in Experiment 2 we also inoculated half of the plants with Sinorhizobium meliloti 1021-71 tagged with green fluorescent protein (GFP) (courtesy of Daniel Gage; Gage et al. 1996) to determine whether sufficient rhizobia buffer legumes against the negative effects of drought-treated microbiomes. In total, we had 48 treatments replicated 15 times for a total of 720 plants. We also included an additional 15 sterile control plants with no additional soil.
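The count of 48 treatments follows from crossing the four factors described above: 4 climate treatments × 3 replicate plots × 2 soil sterilization levels × 2 rhizobia levels. A short Python sketch enumerating the combinations is shown below. The level names for climate and microbes follow the variable descriptions in this README, while the tuple layout is only illustrative.

```python
from itertools import product

CLIMATE = ("Control", "Drought", "Heat", "Heatwave")
PLOT_REPLICATE = (1, 2, 3)            # 3 field plots per climate treatment
MICROBES = ("Sterile", "unsterile")
ADDED_RHIZOBIA = (False, True)
REPLICATES = 15

# Every combination of the four factors is one treatment.
treatments = list(product(CLIMATE, PLOT_REPLICATE, MICROBES, ADDED_RHIZOBIA))
n_plants = len(treatments) * REPLICATES
n_with_controls = n_plants + 15       # plus the sterile, no-soil control plants

print(len(treatments), n_plants, n_with_controls)
```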
Over almost 12 weeks, we surveyed branch number and mortality. For all plants, we counted the number of nodules, then dried and weighed above- and belowground biomass. At the end of the experiment, we randomly sampled five live plants per treatment receiving biotic soil to sequence the microbial communities in their rhizosphere and nodules (n = 120), prior to drying.
We extracted DNA from the rhizosphere and nodule microbes, then sequenced the fungal ITS region and bacterial 16S V4 region. We used Quantitative Insights Into Microbial Ecology 2 (QIIME2) v.2022.2 (Bolyen et al. 2019). We trimmed the sequences for quality. For 16S reads we trimmed the left 20 bp from forward and reverse reads and truncated reads at 240 bp, while for ITS reads we trimmed the left 25 bp for forward sequences, trimmed 20 bp for reverse sequences, and truncated at 240 bp for both forward and reverse reads. We denoised the sequences with DADA2 (Callahan et al. 2016) into amplicon sequence variants (ASVs). We removed ASVs that had fewer than 10 reads across all samples, and assigned taxonomy using the ‘sklearn’ feature classifier (Pedregosa et al. 2011) with the 2021 Greengenes 16S V4 region reference for bacteria (McDonald et al. 2012), and UNITE version 9.0 with dynamic clustering of global and 97% singletons for fungi (Abarenkov et al. 2022). After assigning taxonomy, the bacterial rhizosphere had 8,058 ASVs with 3,055,989 reads and a median 26,923 reads/sample. The fungal rhizosphere had 1,016 ASVs and 2,902,562 total reads, and a median of 25,278 reads/sample. Nodule samples had 269 ASVs with 1,842,801 total reads, and median 28,518 reads/sample. Then we filtered out reads assigned as chloroplasts and mitochondria to remove plant DNA. After filtering, the bacterial rhizosphere retained 8,019 ASVs and 3,043,867 total reads, the fungal rhizosphere retained all ASVs and reads, and the nodule data retained 237 ASVs and 1,114,982 reads. We rarefied bacterial and fungal rhizosphere samples to 15,000 reads and 12,400 reads respectively, and rarefied nodule bacteria to 1,000 reads. After rarefaction, we had 113 samples for the bacterial rhizosphere, 115 samples for the fungal rhizosphere, and 58 samples for nodules. Finally, we constructed phylogenies for bacterial reads using QIIME2’s MAFFT (Katoh & Standley 2013) and FastTree (Price et al. 2010) functions to obtain rooted trees.
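Two of the steps above, removing ASVs with fewer than 10 reads across all samples and rarefying to a fixed depth, can be illustrated with a toy Python sketch. This is not the QIIME2 implementation used in the study; it is a generic version of the same ideas on an invented table, with subsampling without replacement and a fixed random seed as assumptions. It also shows why the number of samples can shrink after rarefaction: samples with fewer reads than the chosen depth cannot be subsampled and are dropped.

```python
import random
from collections import Counter


def drop_rare_asvs(table, min_reads=10):
    """Remove ASVs whose reads, summed across all samples, are below min_reads."""
    totals = Counter()
    for counts in table.values():
        totals.update(counts)
    keep = {asv for asv, total in totals.items() if total >= min_reads}
    return {sample: {a: c for a, c in counts.items() if a in keep}
            for sample, counts in table.items()}


def rarefy(table, depth, seed=0):
    """Subsample each sample to `depth` reads without replacement.

    Samples with fewer than `depth` reads are dropped.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample, counts in table.items():
        reads = [asv for asv, c in counts.items() for _ in range(c)]
        if len(reads) < depth:
            continue
        rarefied[sample] = dict(Counter(rng.sample(reads, depth)))
    return rarefied


# Invented toy table: sample -> {ASV: read count}.
toy = {
    "s1": {"asv_a": 30, "asv_b": 8, "asv_c": 2},
    "s2": {"asv_a": 8, "asv_b": 4, "asv_c": 1},
    "s3": {"asv_a": 2, "asv_c": 1},
}
filtered = drop_rare_asvs(toy, min_reads=10)
rarefied = rarefy(filtered, depth=10)
print(sorted(rarefied), [sum(c.values()) for c in rarefied.values()])
```

In the toy table, `asv_c` has only 4 reads in total and is removed, and sample `s3` is left with too few reads to reach the depth of 10.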
