This information corresponds to genetic and geographic data associated with the Parsons et al. PNAS manuscript, ‘Analysis of biodiversity data suggests that mammal species are hidden in predictable places’ (2022). Dataset information has been split into two sections (genetic and geographic) as described below.

#### GENETIC DATA ####

The genetic data have been compressed into a single zip file titled "genetic_data.zip".

- Sequence alignments are organized taxonomically in folders corresponding to orders and families.
- Within each family folder there are alignment files titled "COI_unique.fas" and "cytb_unique.fas" that contain family-level alignments for COI and cytb, respectively.

NOTE: Certain family-level folders may be split into smaller taxonomic groups (e.g., clades, tribes, etc.), depending on the number of DNA sequences available.
NOTE: Certain groups may be missing one or both files, indicating that there was not enough sequence data to generate an alignment for the particular group.

#### GEOGRAPHIC DATA ####

The geographic data have been compressed into a single zip file titled "geographic_data.zip".

- Occurrence records are organized taxonomically in folders corresponding to orders and families.
- Within each family folder there are species files containing the corresponding species' geographic data.
- Files titled "genus_species.csv" contain a list of all coordinates gathered for that species.
- Files titled "genus_species_clean.txt" contain the results of CoordinateCleaner curation.
    - Each row after the species name represents a specific flag.
    - For each entry, row values of "FALSE" indicate the occurrence was flagged and discarded, and values of "TRUE" indicate the occurrence was included in analysis.
    - See the CoordinateCleaner documentation (https://cran.r-project.org/web/packages/CoordinateCleaner/CoordinateCleaner.pdf) for details regarding the flags used.
- Files titled "genus_species_var.txt" contain values extracted from GIS layers using the species' curated occurrence records.
    - Please see the file "supplemental_var.xlsx" for more detailed information on the GIS layers used and descriptions of the relevant values extracted.
- Each family folder also contains a "family_var.txt" file, which contains values from all of the "genus_species_var.txt" files located within the family folder.
- Each order folder also contains an "order_var.txt" file, which contains values from all of the "family_var.txt" files located within the order folder.

NOTE: Certain groups may be missing one or more files, indicating that there was not enough occurrence data to generate occurrence files for the particular group.
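
Because some groups in the genetic archive lack one or both alignment files, it can help to take an inventory after extraction. The following is a minimal Python sketch, not part of the dataset itself. It assumes that "genetic_data.zip" has been extracted to a local folder and that alignments sit in the deepest ("leaf") folders of the order/family tree. The function name alignment_inventory is hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Alignment file names as given in this README.
ALIGNMENT_FILES = ("COI_unique.fas", "cytb_unique.fas")


def alignment_inventory(root):
    """Map each leaf folder under `root` to the alignment files it contains.

    Assumption: alignments live in folders that have no subfolders
    (families, or the smaller groups some families are split into).
    Folders with an empty list are missing both alignments.
    """
    root = Path(root)
    inventory = {}
    for folder in sorted(p for p in root.rglob("*") if p.is_dir()):
        # Skip order-level or split family-level folders that only hold subfolders.
        if any(child.is_dir() for child in folder.iterdir()):
            continue
        present = [name for name in ALIGNMENT_FILES if (folder / name).is_file()]
        inventory[folder.relative_to(root).as_posix()] = present
    return inventory
```

Calling alignment_inventory on the extracted folder returns a dictionary keyed by relative folder path. Entries listing fewer than two file names correspond to the groups described in the NOTE above as lacking enough sequence data.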