Molecular systematics, species concepts and myrmecophytism in Cecropia (Cecropieae: Urticaceae): Insights from restriction-site associated DNA

Authors: Erin L. Treiber, Paul-Camilo Zalamea, María Fernanda Torres, Santiago Madriñán, and George D. Weiblen
Contact: treib020@umn.edu
Software used: pyrad 3.0.66 (https://github.com/dereneaton/pyrad)

Methods: We examined 47 collections representing 31 Cecropia species and four other members of the tribe Cecropieae (Coussapoa, Musanga, Myrianthus, and Pourouma). Silica-dried material collected in the field was used for DNA extractions, except for one sample for which only herbarium material was available. DNA was extracted using a modified CTAB method (Doyle and Doyle 1987) with a 2% CTAB buffer. Samples were sent to Floragenex Inc. (Eugene, OR) for RAD library preparation and sequencing. Libraries were prepared using the PstI restriction enzyme following the methods of Baird et al. (2008). The library was created from 95 pooled and barcoded samples and sequenced on an Illumina HiSeq 2000 to generate 100 bp single-end reads. Samples were combined for each collection when demultiplexing the library. Sequences were demultiplexed using ea-utils (Aronesty 2011) with default settings, which allowed for one mismatch in the barcode sequence. The remaining steps of quality filtering and assembly of sequences into de novo loci were done using pyRAD v. 3.0.63.

Phylogenetic analyses - Maximum likelihood analyses were performed on each assembled data set using RAxML version 8.2.4 (Stamatakis 2014) on the CIPRES Science Gateway (Miller et al. 2010). Bootstrap support was estimated from 300 replicate searches from random starting trees using the GTR+Γ model of nucleotide substitution.

Test for Introgression - We used pyRAD v. 3.0.63 to calculate the D-statistic using 1000 bootstrap replicates. Significance was assessed at P < 0.01 after Bonferroni correction for multiple testing.
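The introgression test described above can be illustrated with a minimal Python sketch. It assumes per-locus ABBA and BABA site-pattern counts are already available and uses the standard formulation D = (ABBA - BABA) / (ABBA + BABA) with bootstrap resampling of loci. It is not pyRAD's own implementation, and all function names here are hypothetical.

```python
import math
import random
import statistics


def d_statistic(abba, baba):
    """Patterson's D from per-locus ABBA and BABA counts."""
    total = sum(abba) + sum(baba)
    if total == 0:
        return 0.0
    return (sum(abba) - sum(baba)) / total


def bootstrap_d(abba, baba, n_boot=1000, seed=0):
    """Resample loci with replacement; return (D, bootstrap SD, Z, two-sided P)."""
    if len(abba) != len(baba) or not abba:
        raise ValueError("abba and baba must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(abba)
    observed = d_statistic(abba, baba)
    replicates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        replicates.append(
            d_statistic([abba[i] for i in idx], [baba[i] for i in idx])
        )
    sd = statistics.pstdev(replicates)
    if sd == 0:
        # No variation among replicates: Z is undefined, treat as 0 or infinite.
        z = math.inf if observed != 0 else 0.0
    else:
        z = observed / sd
    # Two-sided P-value under a standard normal approximation (assumption).
    p = math.erfc(abs(z) / math.sqrt(2))
    return observed, sd, z, p


def bonferroni_significant(p_values, alpha=0.01):
    """Flag tests whose P-value falls below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

With m tests in a run, the Bonferroni-corrected threshold in this sketch is 0.01 / m, so a single test with P = 0.004 would no longer count as significant once three tests are performed.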
Data Set and Processing: For phylogenetic analyses, multiple matrices were run to explore the effect of assembly parameters on the final phylogenetic tree. The included file "pyRAD_runinformation.csv" outlines which of the data sets (.phy) corresponds to each of the resulting phylogenetic tree (.result) files in the supplemental files. The values and column headings match the parameters entered into pyrad. The file names and corresponding parameters for the data sets (.phy) run for phylogenetic analysis (RAxML) on the CIPRES portal are listed below. More information on the pipeline used can be found at https://github.com/dereneaton/pyrad. All data sets were created in 2017.

file,min depth for base calling,Nqual,clustering threshold,minCov,maxShared
Cecropia_1.phy,5,5,0.82,5,5
Cecropia_2.phy,5,5,0.9,5,5
Cecropia_3.phy,5,5,0.98,5,5
Cecropia_4.phy,15,5,0.82,15,5
Cecropia_5.phy,15,5,0.9,15,5
Cecropia_6.phy,15,5,0.98,15,5

The data matrix used for phylogenetic analyses can be reduced to SNPs and used for tests of introgression between species/samples in the data set. The following are example input and output files for the D-statistic test done using ipyrad in 2017. These files include the species that were tested in each run. The tutorial at https://github.com/dereneaton/pyrad outlines how the pipeline and analyses are run.

Example Input file: input.24.C16PlIn.txt
Example Output file (taxa tested are included in the file): out.24.C16PlIn.D4.txt
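As a convenience, the parameter table above can be read programmatically to look up which settings produced a given matrix. The following is a minimal Python sketch using only the standard library. The values are copied from the table in this file, while the function name and the assumption that "pyRAD_runinformation.csv" has this same comma-separated layout are illustrative only.

```python
import csv
import io

# Parameter table as listed in this README (header whitespace trimmed).
PARAMS_CSV = """file,min depth for base calling,Nqual,clustering threshold,minCov,maxShared
Cecropia_1.phy,5,5,0.82,5,5
Cecropia_2.phy,5,5,0.9,5,5
Cecropia_3.phy,5,5,0.98,5,5
Cecropia_4.phy,15,5,0.82,15,5
Cecropia_5.phy,15,5,0.9,15,5
Cecropia_6.phy,15,5,0.98,15,5
"""


def load_run_parameters(text):
    """Map each .phy file name to a dict of its numeric assembly parameters."""
    runs = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop("file")
        # Strip stray whitespace from headers and convert values to float.
        runs[name] = {key.strip(): float(value) for key, value in row.items()}
    return runs


if __name__ == "__main__":
    runs = load_run_parameters(PARAMS_CSV)
    # List the matrices assembled with a clustering threshold of 0.9.
    for name, params in sorted(runs.items()):
        if params["clustering threshold"] == 0.9:
            print(name, params)
```

To read the file shipped with the data set instead, the same function could be given the contents of "pyRAD_runinformation.csv", provided its columns match the table above.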