
Data from: Learning to see the wood for the trees: machine learning, decision trees and the classification of isolated theropod teeth

Cite this dataset

Wills, Simon; Underwood, Charlie J.; Barrett, Paul M. (2021). Data from: Learning to see the wood for the trees: machine learning, decision trees and the classification of isolated theropod teeth [Dataset]. Dryad. https://doi.org/10.5061/dryad.1zcrjdfq9

Abstract

Taxonomic identification of fossils based on morphometric data traditionally relies on the use of standard linear models to classify such data. Machine learning and decision trees offer powerful alternative approaches to this problem but are not widely used in palaeontology. Here, we apply these techniques to published morphometric data of isolated theropod teeth in order to explore their utility in tackling taxonomic problems. We chose two published datasets consisting of 886 teeth from 14 taxa and 3020 teeth from 17 taxa, respectively, each with five morphometric variables per tooth. We also explored the effects that missing data have on the final classification accuracy. Our results suggest that machine learning and decision trees yield superior classification results over a wide range of data permutations, with decision trees achieving accuracies of 96% in classifying test data in some cases. Missing data, or attempts to generate synthetic data to overcome missing data, seriously degrade the predictive accuracy of all classifiers. The results of our analyses also indicate that the use of ensemble classifiers that combine different classification techniques, together with the examination of posterior probabilities, is a useful aid in checking final class assignments. The application of such techniques to isolated theropod teeth demonstrates that simple morphometric data can be used to yield statistically robust taxonomic classifications and that lower classification accuracy is more likely to reflect preservational limitations of the data or poor application of the methods.
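
As a rough illustration of the idea of combining classifiers and inspecting posterior probabilities before accepting a class assignment, the following Python sketch averages the per-taxon posteriors from several classifiers (soft voting) and flags assignments whose averaged probability falls below a cut-off. It is not the authors' implementation, which is provided as an R script with the dataset. The function names, the 0.6 threshold, the taxon labels and the probability values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def soft_vote(posteriors):
    """Average per-class posterior probabilities across classifiers."""
    classes = posteriors[0].keys()
    return {c: mean(p[c] for p in posteriors) for c in classes}


def assign(posteriors, threshold=0.6):
    """Return (best class, its averaged posterior, whether it meets the threshold).

    The threshold is an arbitrary illustrative cut-off, not a value from the study.
    """
    avg = soft_vote(posteriors)
    best = max(avg, key=avg.get)
    return best, avg[best], avg[best] >= threshold


# Hypothetical posteriors for a single tooth from three hypothetical classifiers.
# These numbers are invented for illustration and are not taken from the dataset.
tooth = [
    {"Taxon A": 0.70, "Taxon B": 0.20, "Taxon C": 0.10},
    {"Taxon A": 0.55, "Taxon B": 0.35, "Taxon C": 0.10},
    {"Taxon A": 0.40, "Taxon B": 0.50, "Taxon C": 0.10},
]

label, prob, confident = assign(tooth)
print(label, round(prob, 2), confident)
```

In this toy case the classifiers disagree, so the averaged posterior for the top-ranked taxon stays below the cut-off and the assignment would be flagged for manual checking rather than accepted outright.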

Methods

Data used to test the models were sourced from:

HENDRICKX, C., MATEUS, O. and ARAÚJO, R. 2015. The dentition of megalosaurid theropods. Acta Palaeontologica Polonica, 60, 627–642.

LARSON, D. W., BROWN, C. M. and EVANS, D. C. 2016. Dental disparity and ecological stability in bird-like dinosaurs prior to the end-Cretaceous mass extinction. Current Biology, 26, 1325–1333.

Usage notes

Please read the README file for detailed instructions.

An R script is also included; it has been tested on R versions 3.6.0 and 4.0.2.