Plant diversity surely determines arthropod diversity, but only moderate correlations between arthropod and plant species richness had been observed until Basset et al. (Science, 338, 2012, 1481) finally undertook an unprecedentedly comprehensive sampling of a tropical forest and demonstrated that plant species richness could indeed accurately predict arthropod species richness. We now require a high-throughput pipeline to operationalize this result so that we can (i) test competing explanations for tropical arthropod megadiversity, (ii) improve estimates of global eukaryotic species diversity, and (iii) use plant and arthropod communities as efficient proxies for each other, thus improving the efficiency of conservation planning and of detecting forest degradation and recovery. We therefore applied metabarcoding to Malaise-trap samples across two tropical landscapes in China. We demonstrate that plant species richness can accurately predict arthropod (mostly insect) species richness and that plant and insect community compositions are highly correlated, even in landscapes that are large, heterogeneous and anthropogenically modified. Finally, we review how metabarcoding makes feasible highly replicated tests of the major competing explanations for tropical megadiversity.
Raw Short Reads of Insects in Mengsong
The data is for insect samples from Malaise traps in Mengsong, Xishuangbanna, Yunnan Province, China. Samples were prepared by using one leg from all specimens equal to or larger than a large fly (~5 mm length) and whole bodies of everything smaller. Following the protocol of Ji et al. (2013), samples were homogenized, and DNA was extracted, quality-checked, PCR-amplified with indexed, degenerate primers for the standard mtCOI barcode region, and gel-purified. The PCR products were A-amplicon-sequenced on a Roche GS FLX at the Kunming Institute of Zoology. The 28 Mengsong samples were sequenced on one whole run (four 1/4 regions, November-December 2010: wet season) and two 1/4 regions (May-June 2011: dry season), producing 519 865 and 253 025 raw reads, respectively.
submission_fastq_MS.tar.gz
Raw Short Reads of Insects in Yinggeling National Nature Reserve
The data is for insect samples from Malaise traps in Yinggeling National Nature Reserve, Hainan Province, China. Samples were prepared by using one leg from all specimens equal to or larger than a large fly (~5 mm length) and whole bodies of everything smaller. Following the protocol of Ji et al. (2013), samples were homogenized, and DNA was extracted, quality-checked, PCR-amplified with indexed, degenerate primers for the standard mtCOI barcode region, and gel-purified. The PCR products were A-amplicon-sequenced on a Roche GS FLX at the Kunming Institute of Zoology. The 21 Yinggeling samples were sequenced on two 1/8 regions (one 1/8 region shared with other samples), producing 40 261 raw reads.
submission_fastq_YGL.tar.gz
454 sequencer datasets of Mengsong
These datasets are the output of the 454 sequencer and the input to the bioinformatic analyses.
454sequencer_dataset_MS.tar.gz
454 sequencer datasets of Yinggeling
These datasets are the output of the 454 sequencer and the input to the bioinformatic analyses.
454sequencer_dataset_YGL.tar.gz
Bioinformatic scripts of Mengsong
Command history (command_history_mengsong.txt) and related files, e.g. map files, reference data, programs written in Perl, etc.
bioinformatic_script_MS.zip
Bioinformatic scripts of Yinggeling
Command history (command_history_yinggeling.txt) and related files, e.g. map files, reference data, programs written in Perl, etc.
bioinformatic_script_YGL.zip
R scripts
The file includes all R scripts used and is in R Markdown format. You may need to open it in RStudio.
Plants_accurately_predict_insects.Rmd
Environmental variables in Mengsong
The data includes environmental variables in Mengsong.
The row names are IDs of survey plots (e.g. 7, which corresponds to PLOT007 in the data of raw short reads). Abbreviations for column names are: HAB = habitat type (RF: regenerating forest, OL: open land, MF: mature forest), UTM_E and UTM_N = geographic coordinates of each survey plot, F_NR = whether the survey plot lies inside the forests of Bulong Nature Reserve (TRUE: yes, FALSE: no).
MSenv.csv
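As a minimal sketch of how the plot IDs in this table might be matched to the sample labels in the raw-read data, the snippet below assumes that a label is the plot number zero-padded to three digits after `PLOT` (generalized from the single documented example, 7 and PLOT007) and that the row names sit in an unnamed first CSV column. The helper name `plot_label` is hypothetical, and the toy rows and coordinate values are invented for illustration; they are not taken from MSenv.csv.

```python
import csv
import io


def plot_label(plot_id):
    """Return the read-data label for a plot ID, e.g. 7 -> 'PLOT007'.

    Assumption: three-digit zero padding, inferred from one documented example.
    """
    return f"PLOT{int(plot_id):03d}"


# Toy stand-in for MSenv.csv. Only the column names follow the description;
# every value below is invented.
toy_env = io.StringIO(
    ",HAB,UTM_E,UTM_N,F_NR\n"
    "7,MF,100,200,TRUE\n"
    "12,OL,110,210,FALSE\n"
)

rows = list(csv.DictReader(toy_env))
# The unnamed first column (the row names) is read under the empty-string key.
labels = {row[""]: plot_label(row[""]) for row in rows}
print(labels)
```

If the real file labels its first column differently, only the `row[""]` lookups would need to change.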
Environmental variables in Yinggeling
The data includes environmental variables in Yinggeling.
The row names are IDs of survey plots. Abbreviations for column names are: UTM_E and UTM_N = geographic coordinates of each survey plot, Time = the time when survey plots were established and trees were surveyed (2009 or 2011).
YGLenv.csv
Plant community composition in Mengsong
Plant community composition of each survey plot in Mengsong.
The rows are survey plots and columns are plant species. This is an input file for R.
MS_Plant_output.csv
Tree community composition in Yinggeling
Tree community composition of each survey plot in Yinggeling.
The rows are survey plots and columns are plant species. This is an input file for R.
YGL_treedensity.csv
Tree community composition of survey plots in the forests of Bulong Nature Reserve
The dataset is a subset of the plant community composition data in Mengsong. For cross-site predictions, we used only the Mengsong plots (n = 16) located within the forest of Bulong Nature Reserve (~60 km2) and included only trees > 5 cm DBH in each plot, to maximize comparability between the two landscapes.
The rows are survey plots and columns are tree species. This is an input file for R.
MS_Tree_output_5cm.csv
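The plot-selection step described for this subset can be sketched roughly as follows. This is an illustration under assumptions, not the authors' script: it assumes both tables carry plot IDs in an unnamed first column, uses the F_NR flag from the environmental table to pick reserve plots, and then drops species that no longer occur in any retained plot. It does not reproduce the > 5 cm DBH filter, which cannot be derived from a plot-by-species matrix. The function names and all toy values are hypothetical.

```python
import csv
import io


def reserve_plot_ids(env_rows):
    """Return IDs of plots flagged as inside the reserve (F_NR == 'TRUE')."""
    return {row[""] for row in env_rows if row["F_NR"].strip().upper() == "TRUE"}


def subset_community(comm_rows, keep_ids):
    """Keep only the selected plots, then drop species absent from all of them."""
    kept = [row for row in comm_rows if row[""] in keep_ids]
    species = [name for name in kept[0] if name != ""] if kept else []
    present = [s for s in species if any(float(row[s]) > 0 for row in kept)]
    return [{"": row[""], **{s: row[s] for s in present}} for row in kept]


# Toy stand-ins; species names and counts are invented.
toy_env = io.StringIO(",F_NR\n7,TRUE\n12,FALSE\n15,TRUE\n")
toy_comm = io.StringIO(",sp_a,sp_b,sp_c\n7,3,0,0\n12,0,5,0\n15,1,0,0\n")

keep = reserve_plot_ids(list(csv.DictReader(toy_env)))
subset = subset_community(list(csv.DictReader(toy_comm)), keep)
print(subset)
```

Dropping all-zero species columns after subsetting keeps the matrix free of taxa that occur only outside the selected plots.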
The Operational Taxonomic Unit (OTU) representative sequences in Mengsong
After denoising and deconvoluting, we clustered the reads into 97%-similarity Operational Taxonomic Units (OTUs), which should approximate or somewhat underestimate biological species. We further assigned taxonomic information to the OTUs. The data here shows the OTU representative sequences in Mengsong.
MSall_CROP97_DNACLUST_otus_r0.cluster.fasta
The Operational Taxonomic Unit (OTU) representative sequences in Yinggeling
After denoising and deconvoluting, we clustered the reads into 97%-similarity Operational Taxonomic Units (OTUs), which should approximate or somewhat underestimate biological species. We further assigned taxonomic information to the OTUs. The data here shows the OTU representative sequences in Yinggeling.
YGL_CROP97_otus_z5k.cluster.fasta
The Operational Taxonomic Unit (OTU) table with assigned taxonomies in Mengsong
Arthropod community composition of each survey plot in both seasons of Mengsong.
MS1 = November-December 2010 (the end of wet season),
MS2 = May-June 2011 (the end of dry season).
The rows are OTUs, and the columns are survey plots together with the taxonomy of each OTU. The format of the dataset has been slightly modified for input to R.
MBCar_MS.rar
The Operational Taxonomic Unit (OTU) table with assigned taxonomies in Yinggeling
Arthropod community composition of each survey plot in Yinggeling.
The rows are OTUs, and the columns are survey plots together with the taxonomy of each OTU. The format of the dataset has been slightly modified for input to R.
MBCar_YGL.csv
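Because the OTU tables are laid out with OTUs as rows and survey plots as columns, per-plot arthropod OTU richness can be obtained by counting the non-zero entries in each plot column. The sketch below shows that count on a toy table. The column names (`OTU`, `Order`, and the plot labels), the function name `otu_richness`, and all read counts are hypothetical and would need to be adapted to the actual headers in the files.

```python
import csv
import io


def otu_richness(table_text, plot_columns):
    """Count, per plot, the OTUs with a read count above zero."""
    richness = {plot: 0 for plot in plot_columns}
    for row in csv.DictReader(io.StringIO(table_text)):
        for plot in plot_columns:
            if float(row[plot]) > 0:
                richness[plot] += 1
    return richness


# Toy OTU table: rows are OTUs, columns are plots plus one taxonomy column.
toy_table = (
    "OTU,PLOT007,PLOT012,Order\n"
    "otu1,10,3,Diptera\n"
    "otu2,0,4,Coleoptera\n"
    "otu3,2,1,Diptera\n"
)

result = otu_richness(toy_table, ["PLOT007", "PLOT012"])
print(result)
```

Passing the plot columns explicitly keeps the taxonomy columns out of the count without having to guess their names.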