This DATSETNAMEreadme.txt file was generated on 2021-04-15 by Gordon Burleigh


GENERAL INFORMATION

1. Title of Dataset: A target enrichment probe set for resolving the flagellate land plant tree of life

2. Author Information
   Principal Investigator Contact Information
      Name: Gordon Burleigh
      Institution: University of Florida
      Address: P.O. Box 118526 / Gainesville, FL 32611
      Email: gburleigh@ufl.edu

3. Date of data collection (single date, range, approximate date): 2018-2020

4. Geographic location of data collection: Gainesville, FL

5. Information about funding sources that supported the collection of the data: This project was supported by NSF DEB-1541506


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data:

2. Links to publications that cite or use the data: https://doi.org/10.1002/aps3.11406

3. Links to other publicly accessible locations of the data: NONE

4. Links/relationships to ancillary data sets: The raw reads are in the NCBI SRA; BioProject PRJNA630729

5. Was data derived from another source? yes/no
   A. If yes, list source(s):

6. Recommended citation for this dataset:
   Breinholt, J.W., S.B. Carey, G.P. Tiley, E.C. Davis, L. Endara, S.F. McDaniel, L.G. Neves, E.B. Sessa, M. von Konrat, S. Chantanaorrapint, S. Fawcett, S.M. Ickert-Bond, P.H. Labiak, J. Larraín, M. Lehnert, L.R. Lewis, N.S. Nagalingum, N. Patel, S.A. Rensing, W. Testo, A. Vasco, J.C. Villarreal, E.W. Williams, J.G. Burleigh. 2021. Target enrichment probe set for resolving the flagellate land plant tree of life. Dryad Dataset. https://doi.org/10.5061/dryad.7pvmcvdqg.


DATA & FILE OVERVIEW

1. File List:
FOLDER: Appendix_Tables: Tab-delimited text files of the appendix tables from Breinholt et al. (2021). Appendix_Table_Legends contains the legends for each of these tables.
FOLDER: GoFlag_ProbeSets: The actual probe sets described by and used in Breinholt et al. (2021).
FOLDER: MatricesAndTrees: Contains the phylogenetic datasets in Phylip format used to construct the phylogenetic trees in Figure 4, Appendix S5, and Appendix S7 in Breinholt et al. (2021). It also contains a text file with the gene boundaries for each Phylip dataset, and a Newick tree file (annotated with branch lengths and bootstrap support values) of the maximum likelihood tree from each dataset.
FOLDER: Locus_Alignments: The individual Phylip alignment files for the target region of each locus in the probe set. At the end of the processing pipeline there is sometimes more than one sequence per sample. When this happens, the "Keep1" files keep only the single sequence with the most nucleotides for that sample; the "NoDups" files remove all sequences from any sample with more than one sequence.
FOLDER: Table2_Congener_Supermatrices: Phylip alignment files for the congeners used to make Table 2 in Breinholt et al. (2021).
FILE: GoFlag_PipelineTemplate.tar.gz - Compressed folder that contains the scripts and files (including reference sequences) used to process the genomic raw reads for each sample into a phylogenetic dataset. The pipeline is described in Breinholt et al. (2021), and in more depth in the "Supplemental_Methods" file on Dryad.
FOLDER: PostProcessingScripts - Perl scripts used to further process files output from GoFlag_PipelineTemplate into phylogenetic datasets. These scripts are described in the "Supplemental_Methods" file on Dryad.
FILE: Supplemental_Methods.txt or Supplemental_Methods.pdf - Document that provides details about the pipeline for processing the raw targeted enrichment sequence data and some of the post-processing scripts.
FILE: Genome_Data_EachLocus.xlxs - Excel file with information about each of the nuclear loci covered by the probe set.

2. Relationship between files, if important: N/A

3. Additional related data collected that was not included in the current data package: NONE

4. Are there multiple versions of the dataset? NO


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data: See Breinholt et al. (2021) - https://doi.org/10.1002/aps3.11406

2. Methods for processing the data: See Breinholt et al. (2021) - https://doi.org/10.1002/aps3.11406 - and "Supplemental_Methods.txt" in Dryad

3. Instrument- or software-specific information needed to interpret the data: NONE

4. Standards and calibration information, if appropriate: N/A

5. Environmental/experimental conditions: N/A

6. Describe any quality-assurance procedures performed on the data: See https://doi.org/10.1002/aps3.11406

7. People involved with sample collection, processing, analysis and/or submission: Breinholt, J.W., S.B. Carey, G.P. Tiley, E.C. Davis, L. Endara, S.F. McDaniel, L.G. Neves, E.B. Sessa, M. von Konrat, S. Chantanaorrapint, S. Fawcett, S.M. Ickert-Bond, P.H. Labiak, J. Larraín, M. Lehnert, L.R. Lewis, N.S. Nagalingum, N. Patel, S.A. Rensing, W. Testo, A. Vasco, J.C. Villarreal, E.W. Williams, J.G. Burleigh
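

ILLUSTRATIVE NOTE ON "Keep1" AND "NoDups" FILES

The File List describes two ways of handling samples that end up with more than one sequence for a locus. The sketch below illustrates those two rules in Python; it is not the authors' code (the distributed post-processing scripts are in Perl). It assumes sequences have already been read into (sample, sequence) pairs, that "most nucleotides" means the count of A/C/G/T characters, and that ties go to the first sequence encountered. The sample names and sequences are hypothetical.

```python
from collections import defaultdict


def count_nucleotides(seq):
    """Count A/C/G/T characters (assumption: gaps, N, and ambiguity codes are ignored)."""
    return sum(1 for base in seq.upper() if base in "ACGT")


def group_by_sample(records):
    """Group (sample, sequence) pairs into {sample: [sequences]}."""
    groups = defaultdict(list)
    for sample, seq in records:
        groups[sample].append(seq)
    return groups


def keep1(records):
    """'Keep1' rule: retain one sequence per sample, the one with the most nucleotides."""
    return {
        sample: max(seqs, key=count_nucleotides)  # ties resolve to the first sequence
        for sample, seqs in group_by_sample(records).items()
    }


def no_dups(records):
    """'NoDups' rule: drop every sample that has more than one sequence."""
    return {
        sample: seqs[0]
        for sample, seqs in group_by_sample(records).items()
        if len(seqs) == 1
    }


# Hypothetical records for a single locus.
records = [
    ("SampleA", "ACGT--ACGT"),
    ("SampleA", "ACGTACGTAC"),
    ("SampleB", "AC--GTNNAC"),
]

keep1_result = keep1(records)
no_dups_result = no_dups(records)
```

Here SampleA has two sequences, so the Keep1 rule retains its longer sequence while the NoDups rule drops SampleA entirely; SampleB is kept under both rules.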