
Data from: A test of community assembly rules using foliar endophytes from a tropical forest canopy

Cite this dataset

Donald, Julian et al. (2019). Data from: A test of community assembly rules using foliar endophytes from a tropical forest canopy [Dataset]. Dryad.


Community assembly theory assumes that ecological communities are spatially delimited into patches. Within these patches, coexistence results from environmental filtering, competition, and immigration. Truly delineated communities exist in laboratory studies of microbial cultures in Petri dishes, yet empirical tests conducted in continuous environments often use patches defined by convention as opposed to realised boundaries. Here we perform a test of ecological community assembly rules using foliar endophyte communities from a tropical rainforest, where leaves are considered as patches for both fungal and bacterial communities. We determined the diversity of fungal and bacterial endophytes using environmental DNA sequencing of 365 top-canopy leaves, collected from 38 host trees belonging to 22 different species across a 4-hectare research plot. Three leaves were collected from each of three or more branches within each tree crown. We tested the effect of host tree species and their level of phylogenetic relatedness on community composition, as well as the contribution of geographic distance between leaves to endophyte community diversity. Endophyte diversity significantly differed across host tree species, as did community composition. Within certain endophytic orders (Xylariales, Rhizobiales) species assemblages significantly differed across host tree species, but this trend was weaker or non-existent in other orders known to contain pathogens and saprotrophs (Polyporales, Solirubrobacterales). Phylogenetically related host tree species displayed more similar endophyte communities than expected by chance, but geographically close trees did not. Consistent with the finding of host-specificity, nearby leaves tended to host more similar communities than distantly positioned ones.
These findings demonstrate that foliar endophytes are structured by dispersal across small spatial scales, but at the scale of the canopy they display patterns of neutral filtering, with only a small part of variation described by host tree differences. Endophyte communities thus act as a model system in evoking the rules predicted by theoretical community ecology.


Data collection took place at the Nouragues Ecological Research Station in French Guiana, within an area of pristine lowland tropical rainforest protected since 1996. This area is characterised by a high level of floral diversity, with over 1700 recorded angiosperm species (Sabatier, 1990; van der Meer and Bongers, 1996). The reserve experiences mid-day temperatures of 26 °C and rainfall of 2861 mm per year (Réjou-Méchain et al. 2015). It has two dry seasons: a longer one between September and mid-November, and a shorter one in March.

Fieldwork described here took place in August 2017. Sampling was conducted in a 4-ha permanent sampling plot adjacent to the Pararé research station (4°02’ N, 52°41’ W) located on the Arataye tributary of the Approuague River. Access to the canopy was made possible by the ‘Canopy Operational Permanent Access System’ (COPAS), consisting of a trio of 45 m high pylons set up at the vertices of an equilateral triangle with edges 180 m in length, equipped with a mechanically operated harness which allows for the displacement of an individual anywhere within the upper canopy across the research plot (Fig. 1).

Sample design and preparation

Leaves were collected from the upper crown of 38 trees of 22 different species, selected to represent a phylogenetically diverse range of species (Fig. 2 & S.I. 1). A stratified sampling design was used for each tree: three leaves were collected from each of a minimum of three branches positioned across the crown. For trees with a large crown, 4 or 5 branches were selected, so that sampling accurately represented the total spread of the crown. In total this resulted in 365 leaves.

Leaves were immediately transported to a field laboratory where they were surface sterilised to ensure that only endophytes were sampled. Sterilisation involved several steps. First, leaves were mechanically cleaned and scraped with toothbrushes. Second, following the protocol detailed by Arnold et al. (2000), leaves were soaked in diluted bleach (0.525%) for 2 min before transfer into ethanol (70%) for a further 2 min. This method is the standard in eDNA extraction for endophyte research (e.g. Izuno et al. 2016, Haruna et al. 2018), and is efficient at removing epiphytic fungi and bacteria from the leaf surface. Finally, leaves were dried with tissue paper and stored in Ziploc bags containing silica gel before transfer to the lab for DNA extraction.

DNA Isolation, PCR amplification and sequencing

Two 0.5 cm² pieces of each leaf were placed into a 96-well plate, and a 3 mm aluminium bead was added to each well. To facilitate the grinding step, the sealed plates were placed in a -80 °C freezer for at least an hour before grinding using a Tissue Lyser (Qiagen, Germany) for 45 sec at 30 Hz, followed by centrifugation for 30 sec at 6000 x g. DNA was extracted from samples using a NucleoSpin Plant II (96) kit (Macherey-Nagel, Düren, Germany) according to the manufacturer's protocol, modifying only the cell lysis time (to 1 h) and the elution volume (to 100 µL).

PCRs were performed using two primer pairs: ITS1 nuclear rDNA primers to target fungi (Fwd: ITS5 GGAAGTAAAAGTCGTAACAAGG (Epp et al. 2012) and a modified version of Rev: 5.8S_Fungi CAAGAGATCCGTTGTTGAAAGTK (Taberlet et al. 2018)), and 16S rDNA (V5-V6) primers to target bacteria (Bact01 primers; Fwd: GGATTAGATACCCTGGTAGT and Rev: CACGACACGAGCTGACG (Fliegerova et al. 2014)). To discriminate samples after sequencing, forward and reverse primers were synthesised with a combination of two different 8-nucleotide tags per sample, following a double-indexing strategy (Binladen et al. 2007). As such, each PCR was amplified with a unique combination of tagged primers. Each PCR was performed in a total volume of 20 µL and comprised 10 µL of AmpliTaq Gold Master Mix (Life Technologies, Carlsbad, CA, USA), 5.84 µL of Ambion™ nuclease-free water (Invitrogen, Waltham, Massachusetts, USA), 0.25 µM of each primer, 3.2 µg of BSA (Roche Diagnostic, Basel, Switzerland), and 2 µL of DNA template. Two PCRs per sample were performed under the following conditions: polymerase reactivation for 10 min at 95 °C, followed by 35 (fungi) or 30 (bacteria) cycles of 30 sec at 95 °C, 30 sec at 55 °C (fungi) or 57 °C (bacteria) and 1 min at 72 °C, followed by a final step of 7 min at 72 °C. Two wells in each plate were filled with water to act as PCR negative controls. Eight wells positioned randomly across each plate were left empty (with no PCR products) to act as sequencing controls (non-used tag combinations).
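The double-indexing layout described above can be illustrated with a minimal Python sketch. It is not the authors' plate design. The tag sequences are toy 8-mers generated for illustration, the 12 x 8 tag grid is an assumed plate layout, and the names `toy_tags` and `assign_tag_pairs` are hypothetical. Real tag sets are normally designed so that tags differ at several positions, which these toy tags do not guarantee.

```python
from itertools import product
import random


def toy_tags(n, offset):
    """Generate n distinct hypothetical 8-nucleotide tags (illustration only)."""
    bases = "ACGT"
    tags = []
    for i in range(n):
        k = offset + i * 257  # arbitrary stride; distinct k gives distinct 8-mers
        tags.append("".join(bases[(k >> (2 * j)) & 3] for j in range(8)))
    return tags


def assign_tag_pairs(reactions, forward_tags, reverse_tags, n_unused=8, seed=0):
    """Give each PCR a unique (forward, reverse) tag pair.

    n_unused pairs are set aside and never assigned; reads carrying them
    after sequencing can only come from tag switching or contamination.
    """
    pairs = list(product(forward_tags, reverse_tags))
    if len(reactions) + n_unused > len(pairs):
        raise ValueError("not enough tag combinations for this plate")
    rng = random.Random(seed)
    rng.shuffle(pairs)  # spread assigned and unused pairs randomly
    assignment = dict(zip(reactions, pairs))
    unused = pairs[len(reactions):len(reactions) + n_unused]
    return assignment, unused


# Assumed plate: 96 wells = 86 samples + 2 water controls + 8 empty wells.
reactions = [f"sample_{i:02d}" for i in range(86)] + ["water_1", "water_2"]
assignment, unused = assign_tag_pairs(reactions, toy_tags(12, 0), toy_tags(8, 40000))
```

With 12 forward and 8 reverse tags there are 96 possible pairs, so 88 reactions plus 8 reserved pairs use the grid exactly.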

Amplicons were pooled, and libraries were prepared with the TruSeqNano PCR free Illumina kit and sequenced paired-end (2 x 250 bp) on the Illumina MiSeq platform (Illumina, San Diego, CA, USA) at the INRA Genotoul-GetPlaGe core facility (Toulouse, France) using the Paired-end MiSeq Reagent Kit V3 (Illumina, San Diego, CA, USA), following the manufacturer’s instructions.

Sequence data curation

The generated reads were subjected to a data-curation pipeline using scripts in R (R Development Core Team, 2013) and the OBITools package (Boyer et al. 2016). Firstly, ‘illuminapairedend’ was used for paired-end read assembly, where an exact alignment algorithm assigns a quality score for each nucleotide position, generating a score for each read based on the number of mismatches. This was followed by ‘ngsfilter’, which removes primer and tag sequences and assigns reads to samples; here, reads with more than two mismatches in a primer, or with one or more mismatches in a tag, were removed. ‘obiuniq’ then dereplicates only those sequences which are strictly identical, and ‘obigrep’ removes sequences of low quality (containing Ns or with a paired-end alignment score below 50). Finally, sequences with only one read across the global dataset (singletons) were removed, since these represent potential sequencing artefacts. ‘sumaclust’ was then used to generate Operational Taxonomic Units (OTUs): pairwise sequence dissimilarity was calculated from the raw number of mismatches, and sequences with ≥ 97% similarity were assigned to the same OTU, a threshold commonly applied to delimit proxies for species during molecular analysis, of fungi in particular (Nilsson et al., 2008; Coissac 2012). For each OTU, the sequence with the highest number of reads in the cluster was assigned as the seed sequence and used for subsequent taxonomic assignment.
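The logic of the later curation steps (dereplication, removal of reads containing Ns, singleton removal, and abundance-ordered clustering at 97%) can be sketched in plain Python. This is a simplified illustration, not the OBITools or sumaclust implementation. The alignment-score filter is omitted, similarity is assumed to be one minus the edit distance divided by the longer sequence length, and the function names are hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    # Assumption: normalise the mismatch count by the longer sequence.
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def curate(reads):
    """Dereplicate identical reads, drop reads containing N, drop singletons."""
    counts = Counter(r for r in reads if "N" not in r)
    return {seq: n for seq, n in counts.items() if n > 1}


def cluster(counts, threshold=0.97):
    """Greedy clustering: sequences are visited from most to least abundant,
    and each one joins the first seed it matches or founds a new OTU."""
    otus = {}  # seed sequence -> member sequences (seed first)
    for seq in sorted(counts, key=lambda s: (-counts[s], s)):
        for seed in otus:
            if similarity(seq, seed) >= threshold:
                otus[seed].append(seq)
                break
        else:
            otus[seq] = [seq]
    return otus


# Toy reads: A and B differ by one substitution over 40 nt (similarity 0.975).
A = "ACGT" * 10
B = A[:10] + "T" + A[11:]
C = "TTGCA" * 8
reads = [A] * 5 + [B] * 2 + [C] * 3 + ["GGGG" * 10] + ["N" + A[1:]] * 2
otus = cluster(curate(reads))
```

Because sequences are visited in order of decreasing read count, the seed of each cluster is automatically its most abundant sequence, matching the seed rule described in the text.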

Taxonomic assignment of OTUs was conducted using reference databases with the ‘ecotag’ function, which uses global alignment of sequences against full-length references. For fungi, the reference database was obtained by running an in-silico PCR with the ecoPCR program (Ficetola et al., 2010) on GenBank (release 197) using the primer pairs employed here. The taxonomic assignment yielding the highest similarity score was kept, with similarity scores for fungi ranging from 0.5 to 1 against sequences in NCBI. For bacteria, taxonomic assignment was conducted using the SILVA database (Quast et al. 2012), with the similarity parameter kept at 0.97, higher than for fungi since bacterial sequence databases are much more complete than fungal ones (Truong et al., 2017). This approach allowed for the elimination of a certain number of chimeras, which are likely to have formed during PCR. In particular, taxonomic identification facilitated later analysis by distinguishing between orders known to contain OTUs corresponding to endophytic fungi (Xylariales, Capnodiales, Chaetothyriales, Phyllachorales, Diaporthales, Sordariales, Pleosporales, Hypocreales) and orders known to contain generalist saprotrophs (Polyporales, Eurotiales, Corticiales). Similarly, bacterial endophyte orders which have strict relationships with plants (e.g. Rhizobiales, Acetobacterales, Sphingomonadales) could be compared with orders known to contain more generalist bacteria (e.g. Solirubrobacterales, Pseudonocardiales, Acidobacteriales).
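As an illustration of how the order-level grouping above could be applied to an assigned OTU table, the following Python sketch labels each OTU by the group its order belongs to. The order lists are copied from the text, and the bacterial lists are examples only, as in the source. The record fields (`id`, `order`, `similarity`), the function names, and the toy OTUs are hypothetical. Using the similarity cut-off as a filter is an assumption about how the scores were applied.

```python
# Orders named in the text.
GROUPS = {
    "endophytic fungi": {
        "Xylariales", "Capnodiales", "Chaetothyriales", "Phyllachorales",
        "Diaporthales", "Sordariales", "Pleosporales", "Hypocreales",
    },
    "generalist saprotrophic fungi": {"Polyporales", "Eurotiales", "Corticiales"},
    # The two bacterial lists are given as examples ("e.g.") in the source.
    "plant-associated bacteria": {"Rhizobiales", "Acetobacterales", "Sphingomonadales"},
    "generalist bacteria": {"Solirubrobacterales", "Pseudonocardiales", "Acidobacteriales"},
}


def order_group(order):
    """Return the group label for a taxonomic order, or 'unclassified'."""
    for label, orders in GROUPS.items():
        if order in orders:
            return label
    return "unclassified"


def label_otus(otus, min_similarity):
    """Keep OTUs whose best match reaches min_similarity and add a group label."""
    return [
        dict(otu, group=order_group(otu["order"]))
        for otu in otus
        if otu["similarity"] >= min_similarity
    ]


# Toy bacterial OTUs (hypothetical values), filtered at the 0.97 level.
toy_bacteria = [
    {"id": "otu_1", "order": "Rhizobiales", "similarity": 0.99},
    {"id": "otu_2", "order": "Solirubrobacterales", "similarity": 0.98},
    {"id": "otu_3", "order": "Rhizobiales", "similarity": 0.90},
    {"id": "otu_4", "order": "Bacillales", "similarity": 1.00},
]
labelled = label_otus(toy_bacteria, min_similarity=0.97)
```

Here `otu_3` is dropped for falling below the cut-off, and `otu_4` is kept but labelled "unclassified" because its order is not in any of the lists.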

Usage notes

Here we provide the sequence files obtained after data curation with the OBITools pipeline (ITSpoolTOT for fungi, SECIL3_NoIndex_R1R2_ngsfilt_n50_ali_uniq_noS_cl97 for bacteria). We also provide the sample files which associate these OTUs with the leaves they were generated from (Sample_Info_blanks4 for fungi, Sample_Info_blanks3 for bacteria).


Labex CEBA, Award: ANR-10-LABX-25-01

Labex Tulip, Award: ANR-10-LABX-0041

Agence Nationale de la Recherche, Award: ANR-11-INBS-0001

Agence Nationale de la Recherche, Award: ANR-15-CE21-0016

Swiss National Science Foundation, Award: SNF N° 310030E-164289