File Diet_GNP2016_trnL_raw_data.fasta

This fasta file contains unfiltered sequences amplified with the primers g (5'-GGGCAATCCTGAGCCAA-3') and h (5'-CCATTGAGTCTCTGCACCTATC-3').

The sequences were produced using Illumina technology (HiSeq platform, 1x150bp paired-end). First filtering steps were performed using the OBITOOLS software (Boyer et al. 2016). Direct and reverse reads corresponding to the same sequence were aligned and merged using the IlluminaPairedEnd program. Only merged sequences with a high alignment quality score (>=40) were retained. Each merged sequence was assigned to its original sample with the ngsfilter program, using the tag information previously added to the primers. Only sequences containing both primers (with a maximum of 2 mismatches per primer) and exact tag sequences were selected. Strictly identical sequences were merged together.

Data collection: Johan Pansu, Arjun Potter, Justine Atkins, Robert Pringle.
Johan Pansu, Arjun Potter, Justine Atkins and Robert Pringle collected fecal samples.
Johan Pansu and Arjun Potter performed DNA extractions.
Johan Pansu and Arjun Potter performed DNA amplifications.
Sequencing was performed by the Lewis-Sigler Institute for Integrative Genomics at Princeton University (Princeton, NJ, USA).
Johan Pansu performed data filtering.

Contact author: Johan Pansu (johan.pansu@gmail.com)

The header of each sequence contains:
(i) the ID of the sequence
(ii) the total number of occurrences of the sequence in the dataset (count)
(iii) the number of occurrences in the different PCRs (merged_sample)

Code for identifying the PCR:
(i) Characters 1-11 correspond to the sample name
(ii) The last character corresponds to the PCR replicate (a, b or c)
(v) PCRCx: PCR control x
(vi) EXTCx: extraction control x
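
The header fields and PCR codes described above could be read with a short Python script along the following lines. This is a minimal sketch, not the authors' own pipeline. It assumes the usual OBITools header layout, in which attributes follow the sequence ID as `key=value;` pairs and merged_sample is written as a Python-style dictionary. It also assumes that a sample PCR code is exactly 12 characters long (11 for the sample name plus the replicate letter). The function names and the example header, including the sample name GNP16S00042, are hypothetical and are not taken from the dataset.

```python
import ast
import re


def parse_header(line):
    """Split one fasta header line into (sequence ID, count, merged_sample).

    Assumed layout (OBITools style, not confirmed by this README):
    >ID count=15; merged_sample={'GNP16S00042a': 10, 'PCRC1': 5};
    """
    seq_id, _, rest = line.lstrip(">").strip().partition(" ")
    # Collect every "key=value;" pair that follows the sequence ID.
    attrs = dict(re.findall(r"(\w+)=([^;]*);", rest))
    count = int(attrs["count"])
    # merged_sample is assumed to be a dictionary literal: PCR code -> reads.
    merged_sample = ast.literal_eval(attrs["merged_sample"])
    return seq_id, count, merged_sample


def decode_pcr(code):
    """Interpret a PCR code according to the rules listed in this README."""
    control = re.fullmatch(r"(PCRC|EXTC)(\w+)", code)
    if control:
        kind = "pcr_control" if control.group(1) == "PCRC" else "extraction_control"
        return {"type": kind, "control": control.group(2)}
    # Assumption: 11 characters of sample name followed by replicate a, b or c.
    if len(code) == 12 and code[-1] in "abc":
        return {"type": "sample", "sample": code[:11], "replicate": code[-1]}
    raise ValueError("unrecognised PCR code: " + code)


if __name__ == "__main__":
    # Hypothetical header, for illustration only.
    example = ">seq_0001 count=15; merged_sample={'GNP16S00042a': 10, 'PCRC1': 5};"
    seq_id, count, merged_sample = parse_header(example)
    print(seq_id, count)
    for code, reads in merged_sample.items():
        print(code, reads, decode_pcr(code))
```

Reads assigned to PCRCx or EXTCx codes come from controls rather than fecal samples, so a script like this makes it easy to separate them before any downstream filtering. If the real headers differ from the assumed layout, only parse_header needs to be adapted.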