README for data files for: Large herbivore nemabiomes: patterns of diversity and sharing

Corresponding authors: Georgia Titcomb (georgiatitcomb@gmail.com) and Robert Pringle (rmpringle@princeton.edu)

Data Contents:

*A. Folder: "raw_sequence_data_and_filtering_steps.zip"
This folder contains the raw sequence data and the files needed to obtain the read count dataset (item B below):
1) two raw fastq files from nemabiome sequencing (forward and reverse reads)
2) an ngsfilter file to demultiplex data
3) a script to demultiplex data and infer mOTUs
4) an R script to subset the data to Mpala herbivores

------------------------------------------------------

*B. A sample-by-mOTU table containing counts of sequence reads.
This table was created by running the scripts provided in item A.
5) "raw_motu_table_mpala_nematodes.csv"

------------------------------------------------------

*C. Two relative read abundance (RRA) tables containing the proportion of each sample's reads that belong to each mOTU.
Both tables can be recreated from "raw_motu_table_mpala_nematodes.csv" (item B above) using the script 1_clean_raw_motu_and_create_rra_tables (item G below), which is also available at https://github.com/gtitcomb/nemabiome_herbivores/tree/master/scripts
6) "RRA_table_2pct_rarefied.csv"
   This table was created using a 2% RRA filtering threshold, in which RRA values below the threshold were set to zero. This threshold was used in the main text, and using this table will replicate those results.
7) "RRA_table_02pct_rarefied.csv"
   This table was created using a 0.2% RRA filtering threshold. This threshold was used for Appendix III, and using this table will replicate those results.

------------------------------------------------------

*D. A host metadata table that contains all sample information and values analyzed in the paper.
8) "host_metadata.csv"

------------------------------------------------------

*E. A pruned mammal tree, which is used in scripts 2, 4, 5, and 6 found in item G below (and in the GitHub repository).
9) "new_mammal_tree_pruned.newick"

------------------------------------------------------

*F. Two tables containing mOTU taxonomic information for the 2% and 0.2% datasets.
These files were created using the assignTaxonomy() function from the dada2 package.
10) "nem_taxa_table_2pct.csv"
    For use with the 2% RRA filtering dataset.
11) "nem_taxa_table_02pct.csv"
    For use with the 0.2% RRA filtering dataset.

------------------------------------------------------

*G. Folder: "analysis_scripts.zip"
12) six scripts to conduct filtering and analyses
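
Illustrative note on the RRA filtering described in item C: the sketch below shows one way a relative read abundance threshold could be applied to a sample-by-mOTU count table. It is not the authors' script (their R script is provided in item G), and it omits any rarefaction step. The function name rra_table, the sample and mOTU labels, and the read counts are hypothetical, and treating values strictly below the threshold as the ones set to zero is an assumption.

```python
def rra_table(counts, threshold=0.02):
    """Convert read counts to relative read abundance (RRA) per sample.

    counts: dict mapping sample name -> dict of mOTU name -> read count.
    RRA values strictly below `threshold` are set to zero (assumed rule).
    """
    table = {}
    for sample, reads in counts.items():
        total = sum(reads.values())
        row = {}
        for motu, n in reads.items():
            # Proportion of this sample's reads assigned to this mOTU.
            rra = n / total if total > 0 else 0.0
            row[motu] = rra if rra >= threshold else 0.0
        table[sample] = row
    return table


# Hypothetical counts for illustration only; not taken from the dataset.
example = {
    "sample_1": {"motu_a": 984, "motu_b": 15, "motu_c": 1},
    "sample_2": {"motu_a": 0, "motu_b": 500, "motu_c": 500},
}

filtered_2pct = rra_table(example, threshold=0.02)    # 2% threshold
filtered_02pct = rra_table(example, threshold=0.002)  # 0.2% threshold
```

In this toy input, motu_b in sample_1 makes up 1.5% of reads, so it is zeroed under the 2% threshold but retained under the 0.2% threshold, which is the kind of difference that separates the two RRA tables.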