This SeedTraitsofSeedsDispersedbyDungbeetlesandPrimatesreadme.txt file was generated on 2021-11-29 by Karen Marie Pedersen

GENERAL INFORMATION

1. Title of Dataset: Seed Traits of Seeds Dispersed by Dungbeetles and Primates

2. Author Information
   A. Principal Investigator Contact Information
      Name: Karen Marie Pedersen
      Institution: Technische Universität Darmstadt
      Address: Schnittspahnstraße 3 64287 Darmstadt
      Email: karenpedersen2@gmail.com

   B. Associate or Co-investigator Contact Information
      Name: Nico Blüthgen
      Institution: Technische Universität Darmstadt
      Address: Schnittspahnstraße 3 64287 Darmstadt
      Email: bluethgen@bio.tu-darmstadt.de

3. Date of data collection: 2019-01-07 to 2019-06-10

4. Geographic location of data collection: 00°31′2′′N, 79°12′13′′W

SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: None

2. Links to publications that cite or use the data: https://doi.org/10.1111/btp.13052

3. Links to other publicly accessible locations of the data: https://data.kew.org/

4. Links/relationships to ancillary data sets:

5. Was data derived from another source? Yes
   A. If yes, list source(s):
      1) Cornejo, F., & Janovec, J. (2010). Seeds of Amazonian plants. Princeton University Press.
      2) Royal Botanic Gardens Kew (2020). Seed Information Database (SID). Version 7.1. https://data.kew.org/sid/

6. Recommended citation for this dataset: Pedersen, K. M., & Blüthgen, N. (2021). Data from: Seed size and pubescence facilitate secondary dispersal by dung beetles. Dryad Digital Repository.

DATA & FILE OVERVIEW

1. File List:

2. Relationship between files, if important:
   a) Literature data sets
      "Seed Traits.csv" and "dispersal syndromes KEW.csv" are linked by the genus name.
   b) Field-collected data sets
      "Traits.csv" and "Dung Ball Seeds New ID Pub.csv" connect through the column named Seed Name.
      "Traits.csv" and "Monkey Seeds New ID Pub.csv" also connect through the column named Seed Name.
      "Traits.csv" and "Direct Dung Beetle Dung Observations.csv" do not connect directly until DB is added to the Collection Number. This link allows us to identify the beetle species responsible for the dung ball.

3. Additional related data collected that was not included in the current data package:

4. Are there multiple versions of the dataset? Yes, on a personal computer, but only one version is uploaded. In the uploaded file "Direct Dung Beetle Dung Observations Masked.csv" the GPS points are masked: the column "GPS" is converted to NA and the columns "Lat" and "Long" are rounded to the nearest degree.

METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data:
   a) The data file "Seed Traits.csv" was generated from the key in: Cornejo, F., & Janovec, J. (2010). Seeds of Amazonian plants. Princeton University Press.
   b) The data file "dispersal syndromes KEW.csv" was generated from Royal Botanic Gardens Kew (2020). Seed Information Database (SID). Version 7.1. https://data.kew.org/sid/. Each non-wind-dispersed genus included in "Seeds of Amazonian Plants" was matched against the SID to extract dispersal information, and a logical "YES/NO" variable was then created for mammal dispersal.
   c) Data in the remaining data files were generated from monkey fecal samples and dung beetle dung balls collected in the field.

2. Methods for processing the data:
   a) The seed traits in the "Seed Traits.csv" file are all taken from the genus identification key, using the characters defined by the book: size, shape, color, and surface. Some genera have more than one combination of characters.
   b) For the data file "dispersal syndromes KEW.csv", each non-wind-dispersed genus included in "Seeds of Amazonian Plants" was matched against the SID to extract dispersal information, and a logical "YES/NO" variable was then created for mammal dispersal.
   c) The remaining data files contain field-collected data. In the field, the monkey species that produced the feces was identified, and if the sample was a dung ball, the beetle was collected with the ball for identification. Fecal samples and dung balls were dissected to remove the seeds. The seeds were then grouped by morphospecies and identified to genus as well as possible. Seed length and width were measured, and the seed surface was characterized.

3. Instrument- or software-specific information needed to interpret the data:
   RStudio version 2022.07.1+554 "Spotted Wakerobin". Packages needed to run the scripts: dplyr, stringr, bipartite, weights, MASS, plyr, and sbinning. Details are included at the top of the individual code files.

4. Environmental/experimental conditions: Samples were collected in the Ecuadorian Choco and then processed in the lab in Germany. Please see Pedersen, Karen M., and Nico Blüthgen. "Seed size and pubescence facilitate secondary dispersal by dung beetles." Biotropica 54.1 (2022): 215-225.

5. Describe any quality-assurance procedures performed on the data: Samples with too much damage were not processed. Data were double-checked against the physical samples and then checked again in silico for plausibility.

6. People involved with sample collection, processing, analysis and/or submission:
   Data processing: Karen Marie Pedersen, Nico Blüthgen, Andrea Hilpert, Maxim Ionov
   Data analysis: Karen Marie Pedersen, Nico Blüthgen
   Sample collection: Karen Marie Pedersen, Citlalli Morelos-Juarez, Argoti Avila, Bryan Xavier Tamayo Zambrano, Jorge Alipio Zambrano Velez, José Amado De la Cruz Chávez, José Roberto de la Cruz Loor, Alcides Agustín Zambrano Velez, José Manuel Añapa Añapa, Ronaldo Mesías, Vanessa Moreira, Daniel Velázquez, and Yadira Giler.

1) DATA-SPECIFIC INFORMATION FOR: [Direct Dung Beetle Dung Observations Masked.csv]

1. Number of variables: 17

2. Number of cases/rows: 183

3. Variable List:
   1) Collection.Number, the number that corresponds to the label on the sample tube or seed envelope; values range from 1 to 183 and should be integers
   2) Monkey, a categorical variable of "Howler" or "Spider"; "Howler" corresponds to "Alouatta palliata" and "Spider" to "Ateles fusciceps fusciceps"
   3) Field.Guess, a best guess at the dung beetle species made in the field
   4) Species, the actual species name of the dung beetle after identification in the lab
   5) Ball.Weight, the weight in grams of the collected dung ball after being dried in silica
   6) Beetle.Weight, the weight in grams of the dung beetle after being dried in silica
   7) Gender, recorded where it was possible to assign a gender to an individual; M indicates a male and F a female
   8) Pair, a Y/N variable, where Y indicates two beetles were collected together as a pair and N indicates a solitary beetle
   9) Date, the collection date of the beetle, the dung ball, or both
   10) Sample, indicates whether there was a physical collection of either a beetle or a dung ball; Y indicates that there was a collection
   11) Sample.Type, indicates what was collected in the field: "Ball" indicates a dung ball was collected, "Beetle" indicates a beetle was collected, "Seeds" indicates that only seeds were collected, and "Beetle and Ball" indicates that both one or more dung beetles and a dung ball were collected
   12) Behavior, an in-the-field impression of nidification behavior: "Tunneler" indicates the beetle started to make a tunnel not too far from the original dung pat by pushing sections of dung, "Roller" indicates that the beetle made a ball and rolled it away, "Tunneler?" indicates that the observation has some degree of uncertainty, and "Roller/Tunneler" indicates a mixture of behavior types
   13) GPS, NA in this data set to protect the location information of the critically endangered "Ateles fusciceps fusciceps"
   14) Lat, latitude, rounded to the nearest degree to protect the location information of the critically endangered "Ateles fusciceps fusciceps"
   15) Long, longitude, rounded to the nearest degree to protect the location information of the critically endangered "Ateles fusciceps fusciceps"
   16) Notes, additional notes about observations, including some natural history
   17) Additional.Notes, other notes about the quality of the specimen etc.

4. Missing data codes: n NA

5. Specialized formats or other abbreviations used:

2) DATA-SPECIFIC INFORMATION FOR: [dispersal syndromes KEW pub.csv]

1. Number of variables: 5

2. Number of cases/rows: 1165

3. Variable List:
   1) Genus, plant genus
   2) Species, plant species
   3) Dispersal method, the method of dispersal for those plants or animals
   4) Dispersers, the animal disperser, if there is one
   5) Disp. by mammals, a YES/NO vector; YES indicates that this species is mammal dispersed and NO indicates that it is not mammal dispersed

4. Missing data codes: n NA

5. Specialized formats or other abbreviations used:

3) DATA-SPECIFIC INFORMATION FOR: [Dung Ball Seeds New ID Pub.csv]

1. Number of variables: 10

2. Number of cases/rows: 251

3. Variable List:
   1) Sample.ID, number on the seed envelope
   2) Seed.Count, number of seeds counted within that morphospecies within that sample
   3) Size, a very approximate size in mm (we suggest using the more precisely measured sizes in "Traits.csv")
   4) Seed.Genus, the morphospecies name for the seed
   5) Monkey.Species, a categorical variable of "Howler" or "Spider"; "Howler" corresponds to "Alouatta palliata" and "Spider" to "Ateles fusciceps fusciceps"
   6) NewID, adds the indicator "DB" to the Sample.ID to show that this data point came from a dung ball, so that the file can be used with "Monkey Seeds New ID Pub.csv"
   7) Seed.Name, if not NA, the name of a seed within the sample that was photographed
   8) Ball.Number, the number that links the seed envelopes to the dung beetle vials or pinned specimens
   9) Folder.Name, if there is a named seed, the name of the folder in which its photo is saved
   10) Comment, additional information

4. Missing data codes: n NA

5. Specialized formats or other abbreviations used:

4) DATA-SPECIFIC INFORMATION FOR: [Monkey Seeds New ID Pub.csv]

1. Number of variables: 9

2. Number of cases/rows: 125

3. Variable List:
   1) Sample.ID, number on the seed envelope
   2) Seed.Count, number of seeds counted within that morphospecies within that sample
   3) Size, a very approximate size in mm (we suggest using the more precisely measured sizes in "Traits.csv")
   4) Seed.Genus, the morphospecies name for the seed
   5) Monkey.Species, a categorical variable of "Howler" or "Spider"; "Howler" corresponds to "Alouatta palliata" and "Spider" to "Ateles fusciceps fusciceps"
   6) NewID, adds the indicator "F" to the Sample.ID to show that this data point came from a monkey fecal sample, so that the file can be used with "Dung Ball Seeds New ID Pub.csv"
   7) Seed.Name, if not NA, the name of a seed within the sample that was photographed
   8) Ball.Number, the number that links the seed envelopes to the dung beetle vials or pinned specimens
   9) Folder.Name, if there is a named seed, the name of the folder in which its photo is saved

4. Missing data codes: NA

5. Specialized formats or other abbreviations used:

5) DATA-SPECIFIC INFORMATION FOR: [quartiles.csv]

1. Number of variables: 14

2. Number of cases/rows: 56

3. Variable List:
   1) obs_ID
   2) species
   3) meanArea
   4) sd
   5) numbOfSeeds
   6) sumOfSeeds
   7) inDungball
   8) inPoo
   9) isPubescent
   10) isStriate
   11) isSmooth
   12) isCanthon
   13) isOxysternon
   14) quartile

4. Missing data codes: n

5. Specialized formats or other abbreviations used:

6) DATA-SPECIFIC INFORMATION FOR: [Seed Traits.csv]

1. Number of variables: 9

2. Number of cases/rows: 579

3. Variable List:
   1) Family, plant family
   2) Genus, plant genus
   3) Size Class, seed size class, where 0.5 corresponds to seeds from 0 to 0.5 cm in length, 0.99 to seeds greater than 0.5 cm and up to 0.99 cm, 1.99 to seeds greater than 0.99 cm and up to 1.99 cm, and 2 to seeds of 2 cm and longer
   4) Color, a color classification for seeds
   5) Shape, Elongate, Round, Flat, or Irregular
   6) Surface, a descriptive variable for the seed surface
   7) Hairy, Yes/No variable where "Yes" corresponds to seeds with a pubescent surface and "No" corresponds to all other surface types
   8) Mammal Dispersed, Yes/No variable where "Yes" corresponds to seeds with a literature record of mammal dispersal and "No" indicates that there is no record of mammal dispersal
   9) Source, name of the source of the data

4. Missing data codes: n NA

5. Specialized formats or other abbreviations used:

7) DATA-SPECIFIC INFORMATION FOR: [Traits.csv]

1. Number of variables: 9

2. Number of cases/rows: 228

3. Variable List:
   1) Morphospecies, folder where the photo of the seed is stored
   2) Species, morphospecies name
   3) Seed Name, name of the photo of the individual seed
   4) Source, categorical variable of DB/F, where DB indicates the seed came from a dung ball and F indicates it came from a monkey fecal sample
   5) Length, measured longest length of the seed in mm
   6) Width, measured in mm at 90 degrees to the longest length
   7) Surface, Pubescent (hairy surface), Smooth (an undisrupted surface), or Striate (a disrupted surface with grooves or other disruptions)
   8) Color, seed color
   9) Shape, seed shape

4. Missing data codes: n NA

5. Specialized formats or other abbreviations used:
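
USAGE NOTE: COMPARING FIELD AND LITERATURE TRAITS

The field file "Traits.csv" records Length in mm and a three-level Surface, while "Seed Traits.csv" uses the Size Class codes (in cm) and the Yes/No Hairy variable described above. The following is a minimal Python sketch of how the field measurements could be put on the literature coding. It is not taken from the authors' R scripts. The function names are hypothetical, and the handling of lengths between 1.99 cm and 2 cm (assigned to class 1.99) is an assumption, because the class definitions leave that interval open.

```python
def size_class(length_mm: float) -> float:
    """Map a seed length in mm (as in Traits.csv) to the Size Class
    codes described for Seed Traits.csv (class limits are given in cm)."""
    if length_mm < 0:
        raise ValueError("seed length cannot be negative")
    if length_mm <= 5.0:    # 0 to 0.5 cm
        return 0.5
    if length_mm <= 9.9:    # greater than 0.5 cm, up to 0.99 cm
        return 0.99
    if length_mm < 20.0:    # greater than 0.99 cm; ASSUMPTION: anything below 2 cm stays in 1.99
        return 1.99
    return 2.0              # 2 cm and longer


def is_hairy(surface: str) -> str:
    """Map the Traits.csv Surface value to the Yes/No Hairy coding:
    "Yes" for a pubescent surface, "No" for all other surface types."""
    return "Yes" if surface.strip().lower() == "pubescent" else "No"


# Hypothetical example rows (Length in mm, Surface); not real data from the package.
example_rows = [(3.0, "Smooth"), (7.2, "Pubescent"), (15.0, "Striate"), (32.5, "Smooth")]
for length_mm, surface in example_rows:
    print(length_mm, size_class(length_mm), is_hairy(surface))
```

Rows with NA in Length or Surface would need to be skipped or handled before calling these functions.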