Data, code, and supplementary information for Hyracoid locus identification
Data files
Mar 28, 2024 version files 103.24 KB
- hyracoidea_fayum_locus_comparisons.xlsx (17.64 KB)
- modern-procavia-lowers.xlsx (13.33 KB)
- modern-procavia-uppers.xlsx (13.02 KB)
- published-tooth-sizes.xlsx (41.97 KB)
- README.md (7.45 KB)
- test_case_meroehyrax.xlsx (9.84 KB)
Abstract
Serially homologous elements pose an identification problem in fragmentary records, particularly those of vertebrate fossils. Examples include individual vertebrae in the vertebral column and teeth in a tooth row. Until an isolated element can be accurately attributed to a specific position within its series, multiple lines of ecological and evolutionary research cannot be conducted. However, varying levels of differentiability between loci, and varying patterns of differentiation across clades, make it impossible to develop a single set of diagnostic traits for any particular set of serial homologs, particularly mammalian molars. Here, we test the utility of a set of classification criteria for distinguishing molar tooth positions of hyraxes (Mammalia, Afrotheria, Hyracoidea), which have been considered indistinguishable in previous taxonomic studies. As part of the test, we evaluate the degree to which between-locus variation is conservative in this taxon, which would strengthen the predictive power of proposed traits even in cases where species identity is unknown. Suitable tests for hypotheses of conservatism in categorical traits did not exist, to our knowledge, and we, therefore, explored the behavior of a previously developed metric, Borges et al.’s δ, to assess conservatism in contrast to the phylogenetic signal produced by Brownian motion. This metric shows some promise, but the nature of resulting distributions makes tests difficult to interpret, indicating a line of potential future methods improvement. We used a linear morphometric characterization of shape to validate the candidate traits. In the case of hyracoid molars, relatively simple ratios of linear measurements have strong discriminatory power despite evolutionary variation in between-locus differences. Overall, new or understudied taxa are likely to have lower molar loci differentiable by their relative length and talonid vs. trigonid width.
https://doi.org/10.5061/dryad.n2z34tn40
This repository includes
- supplementary metadata describing the sources of specimen data and age information for species included in phylogenetic comparative analyses
- copies of input data (measurements) suitable for replicating analytical results
- an archived copy of code used to generate the results found in the published study.
Description of the data and file structure
File list:
published-tooth-sizes.xlsx
(measurements in millimeters (mm) collated from the literature; complete citations in associated publication)
- institution and number refer to the museum accession code for each specimen
- species refers to species. The genus currently associated with each species is given in a separate sheet called ‘GenusKey’
- publication refers to the publication from which measurements were collated
- page refers to the page number within the publication on which the measurement was published
- side refers to whether the tooth measured comes from the left (l) or right (r) side of the specimen or if the reported measurement is an average (a) of both sides
- in measurement columns:
  - uppercase refers to upper teeth, lowercase refers to lower teeth
  - ‘I’ = incisor, ‘C’ = canine, ‘P’ = premolar, ‘M’ = molar
  - ‘#’ refers to the position of the tooth within a dental field (ex: m1 = first molar, m2 = second molar, m3 = third molar)
  - “L” refers to the mesiodistal length (mm) of the tooth crown
  - “W” refers to the buccolingual width (mm)
- empty cells refer to measurements not reported, likely either because that tooth position was not part of the specimen or because damage prevented a particular measurement from being taken.
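The naming convention above can be unpacked programmatically. The following is a minimal sketch in R, assuming that measurement column names concatenate the tooth letter, the position number, and the dimension letter (for example `m1L` or `P4W`); that exact pattern is an assumption and should be checked against the spreadsheet headers. The helper name `parse_tooth_columns` is hypothetical and is not part of the archived code.

```r
# Hypothetical helper: split measurement column names such as "m1L" or "P4W"
# into arcade, tooth class, position, and measurement type.
# ASSUMPTION: names follow <tooth letter><position><L or W>; verify against the file.
parse_tooth_columns <- function(col_names) {
  pattern <- "^([IiCcPpMm])([0-9]*)([LW])$"
  matched <- grepl(pattern, col_names)
  # Columns that do not match (e.g. "species") get NA in every parsed field
  letter <- ifelse(matched, sub(pattern, "\\1", col_names), NA_character_)
  position <- ifelse(matched, sub(pattern, "\\2", col_names), NA_character_)
  dimension <- ifelse(matched, sub(pattern, "\\3", col_names), NA_character_)
  classes <- c(i = "incisor", c = "canine", p = "premolar", m = "molar")
  data.frame(
    column = col_names,
    # Uppercase tooth letter = upper tooth, lowercase = lower tooth
    arcade = ifelse(letter == toupper(letter), "upper", "lower"),
    tooth_class = unname(classes[tolower(letter)]),
    position = as.integer(position),
    measurement = ifelse(dimension == "L", "length", "width"),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_tooth_columns(c("m1L", "P4W", "species"))
print(parsed)
```

Columns that do not follow the assumed pattern are returned with `NA` fields rather than raising an error, so the helper can be applied to all column names at once.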
modern-procavia-uppers.xlsx
(measurements in millimeters (mm) newly collected from 3D surface files of specimens of upper teeth of modern Procavia capensis)
- Sample refers to the museum accession code for each specimen
- Side refers to whether the tooth measured comes from the left or right side of the specimen
- Position refers to the molar number (first molar, second molar, third molar)
- Filename refers to the 3D model file from which measurements were taken
- in measurement columns:
  - measurements were taken in triplicate. The number appended to each column refers to the replicate number.
  - ‘length’ refers to mesiodistal length. For more details on how measurements were taken please see methods and figures in the associated publication.
  - ‘trigonid.width’ refers to maximum width of the trigonid
  - ‘talonid.width’ refers to maximum width of the talonid
modern-procavia-lowers.xlsx
(measurements in millimeters (mm) newly collected from 3D surface files of specimens of lower teeth of modern Procavia capensis)
- Sample refers to the museum accession code for each specimen
- Side refers to whether the tooth measured comes from the left or right side of the specimen
- Position refers to the molar number (first molar, second molar, third molar)
- Filename refers to the 3D model file from which measurements were taken
- in measurement columns:
  - measurements were taken in triplicate. The number appended to each column refers to the replicate number.
  - ‘length’ refers to mesiodistal length. For more details on how measurements were taken please see the methods and figures in the associated publication.
  - ‘width.para’ refers to the width of the crown along the paracone-protocone crest
  - ‘width.meta’ refers to the width of the crown along the metacone-hypocone crest
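Because the modern Procavia measurements were taken in triplicate, a common first step is to collapse the replicates into one value per tooth. The sketch below averages replicate columns in R. It assumes replicate columns are named with the measurement stem followed by the replicate number (for example `length1`, `length2`, `length3`, optionally with a `.` or `_` separator); the exact headers, the helper name `average_replicates`, and the toy values are all illustrative assumptions, not the authors' code or data.

```r
# Toy data with made-up values, standing in for one of the Procavia sheets
toy <- data.frame(
  Sample = c("A", "A"),
  Position = c(1, 2),
  length1 = c(6.0, 7.0),
  length2 = c(6.2, 7.1),
  length3 = c(6.1, 6.9)
)

# Hypothetical helper: mean of all replicate columns sharing one stem.
# ASSUMPTION: replicate columns are named <stem><n>, <stem>.<n>, or <stem>_<n>.
average_replicates <- function(df, stem) {
  suffix <- substring(names(df), nchar(stem) + 1)
  is_rep <- startsWith(names(df), stem) & grepl("^[._]?[0-9]+$", suffix)
  if (!any(is_rep)) stop("no replicate columns found for stem: ", stem)
  # na.rm = TRUE keeps a tooth if at least one replicate was recorded
  rowMeans(df[, is_rep, drop = FALSE], na.rm = TRUE)
}

toy$length.mean <- average_replicates(toy, "length")
print(toy[, c("Sample", "Position", "length.mean")])
```

The same call can be repeated with any other measurement stem present in the sheet being read.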
hyracoidea_fayum_locus_comparisons.xlsx
(measurements in millimeters (mm) newly collected from 3D surface files of specimens of fossilized hyracoids from the Fayum, Egypt, both upper and lower teeth)
- ‘coll’ and ‘cat_num’ refer to the museum collection code and catalog number for each specimen
- ‘genus’ and ‘species’ refer to the Linnean binomial for each specimen
- ‘Side’ refers to whether the tooth measured comes from the left (L) or right (R) side of the specimen
- Position refers to the molar number (first molar, second molar, third molar)
- in measurement columns:
  - ‘length’ refers to mesiodistal length. For more details on how measurements were taken please see the methods and figures in the associated publication.
  - ‘width.para’ refers to the width of the crown along the paracone-protocone crest
  - ‘width.meta’ refers to the width of the crown along the metacone-hypocone crest
  - ‘trigonid.width’ refers to the maximum width of the trigonid
  - ‘talonid.width’ refers to the maximum width of the talonid
  - ‘talonid.length’ refers to the maximum length of the talonid
- ‘NA’ refers to measurements that could not be taken because of preservation issues or other problems.
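The abstract reports that lower molar loci tend to differ in relative length and in talonid vs. trigonid width. As a rough illustration of how such ratios could be derived from the columns described above, here is a minimal R sketch. The ratio definitions (length divided by trigonid width, and talonid width divided by trigonid width), the helper name `add_lower_ratios`, and the toy values are assumptions made for this example; the definitions used in the study are given in the associated publication and archived scripts.

```r
# When reading the real spreadsheet, literal 'NA' cells need to be treated as
# missing. One option (an assumption; the README does not say which reader the
# archived scripts use) is the readxl package:
# fayum <- readxl::read_excel("hyracoidea_fayum_locus_comparisons.xlsx",
#                             na = c("", "NA"))

# Illustrative values only; these are not measurements from the dataset.
toy <- data.frame(
  Position = c(1, 2, 3),
  length = c(10.0, 12.0, 15.0),
  trigonid.width = c(5.0, 6.0, 6.0),
  talonid.width = c(5.5, 6.0, NA)
)

# Hypothetical helper: add two shape ratios for lower molars.
# ASSUMPTION: "relative length" is taken here as length / trigonid width.
add_lower_ratios <- function(df) {
  df$rel.length <- df$length / df$trigonid.width
  df$talonid.trigonid <- df$talonid.width / df$trigonid.width
  df
}

toy <- add_lower_ratios(toy)
print(toy[, c("Position", "rel.length", "talonid.trigonid")])
```

Missing measurements propagate as `NA` in the derived ratios, so incomplete teeth are kept in the table but flagged rather than silently dropped.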
test_case_meroehyrax.xlsx
(measurements in millimeters (mm) copied from the publication describing Meroehyrax kyongoi, Gutierrez and Rasmussen 2009, full citation in associated publication)
- ‘m#’ refers to molar number (first molar, second molar, third molar)
- “L” refers to the mesiodistal length of the tooth crown
- “WM” refers to the mesial width
- “WD” refers to the distal width
- empty cells refer to measurements not reported, likely either because that tooth position was not part of the specimen or because damage prevented a particular measurement from being taken.
Sharing/Access information
These data are associated with the publication: Vitek, N.S., and P.M. Princehouse. In Review. Evaluating the utility of linear measurements to identify isolated tooth loci of extinct Hyracoidea. Acta Palaeontologica Polonica. DOI: TBD.
Please cite that publication if you use the data or code from this repository.
Corresponding Author:
Natasha S. Vitek
Department of Ecology & Evolution, Stony Brook University
natasha.vitek@stonybrook.edu
Code/Software
The code files are organized as follows for analysis:
Start at `hyracoid_base.R`. This is the entry point that loads libraries, sets file paths, etc.
The next steps (which are automatically sourced in the base script) are in `hyracoid_merge_data.R`. The first parts of this script source the formatting scripts and get the data in a state to be analyzed. The data merge script calls from among the following five scripts: `literature_format_linear_data.R`, `procavia_format_linear_data_lowers.R`, `fayum_format_linear_data_lower.R`, `procavia_format_linear_data_uppers.R`, and `fayum_format_linear_data_uppers.R`.
The next part (`hyracoid_analyses.R`) produces descriptive statistics and then conducts analyses, including calling scripts that are specific to the upper or lower tooth arcade (because slightly different sets of variables are evaluated in each). The analytical code calls either of two sets:
- `hyracoid_analyses_lowers.R` & `hyracoid_analyses_LDA_lowers.R`
- `hyracoid_analyses_uppers.R` & `hyracoid_analyses_LDA_uppers.R`

as well as `hyracoid_ratio_physig.R`, which itself calls `retention_borges_delta.R`.
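Since the base script sources the remaining scripts automatically, a missing file will only surface partway through a run. The following R sketch checks that every script named above is present before starting. The helper name `find_missing_scripts` and the `code_dir` location are hypothetical; the script names are copied from the description above, and the file paths set inside `hyracoid_base.R` will likely still need to be adapted to the local machine.

```r
# Script names as listed in this README
expected_scripts <- c(
  "hyracoid_base.R",
  "hyracoid_merge_data.R",
  "literature_format_linear_data.R",
  "procavia_format_linear_data_lowers.R",
  "fayum_format_linear_data_lower.R",
  "procavia_format_linear_data_uppers.R",
  "fayum_format_linear_data_uppers.R",
  "hyracoid_analyses.R",
  "hyracoid_analyses_lowers.R",
  "hyracoid_analyses_LDA_lowers.R",
  "hyracoid_analyses_uppers.R",
  "hyracoid_analyses_LDA_uppers.R",
  "hyracoid_ratio_physig.R",
  "retention_borges_delta.R"
)

# Hypothetical helper: return the names of expected scripts not found in code_dir
find_missing_scripts <- function(code_dir, scripts = expected_scripts) {
  scripts[!file.exists(file.path(code_dir, scripts))]
}

code_dir <- "."  # ASSUMPTION: folder holding the downloaded code; adjust as needed
missing_scripts <- find_missing_scripts(code_dir)
if (length(missing_scripts) == 0) {
  message("All scripts found; the analysis can be started with:")
  message('  source(file.path(code_dir, "hyracoid_base.R"))')
} else {
  message("Missing scripts: ", paste(missing_scripts, collapse = ", "))
}
```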
This dataset contains raw data, a snapshot of code as analyzed, and associated supplementary data describing specimens, scans, and species used in a study attempting to diagnose tooth positions in isolated teeth of Hyracoidea. A live version of the code can be found in the GitHub repository 'hyracoid-locus-example' managed by the lead author (GitHub username: nsvitek). Full citations for associated references can be found in the manuscript associated with this dataset. Raw data consists of linear measures from multiple sources, including direct measurements collected from 3D surfaces in MeshLab and a compilation of published measurements from the literature. Separate datasets from separate sources are kept in separate spreadsheet files. Names correspond to their usage in code. All code was written for the R statistical computing language.