Data from: Machine learning confirms new records of maniraptoran theropods in Middle Jurassic UK microvertebrate faunas
Data files
Apr 20, 2023 version files 566.19 KB
- README.txt
- trainingData.csv
- UKBathonianTeeth.csv
Abstract
Current research suggests that the initial radiation of maniraptoran theropods occurred in the Middle Jurassic, although their fossil record is known almost exclusively from the Cretaceous. However, fossils of Jurassic maniraptorans are scarce, usually consisting solely of isolated teeth, and their identifications are often disputed. Here, we apply different machine learning models, in conjunction with morphological comparisons, to a suite of isolated theropod teeth from Bathonian microvertebrate sites in the UK in order to determine whether any of these can be confidently assigned to Maniraptora. We generated three independent models developed on a training dataset with a wide range of theropod taxa and broad geographical and temporal coverage. Classifying the Middle Jurassic teeth in our sample against these models indicates the presence of at least three distinct dromaeosaur morphotypes, plus a therizinosaur and troodontid, in these assemblages, a conclusion supported by morphological comparison. These new referrals significantly extend the ranges of Therizinosauroidea and Troodontidae, by some 27 million years. These results indicate that not only were maniraptorans present in the Middle Jurassic, as predicted by previous phylogenetic analyses, but they had already radiated into a diverse fauna that pre-dated the break-up of Pangaea. This study also demonstrates the power of machine learning to provide quantitative assessments of isolated teeth, offering a robust, testable framework for taxonomic identifications, and highlights the importance of assessing and including evidence from microvertebrate sites in faunal and evolutionary analyses.
Methods
Four CSV and XLSX files of morphometric measurements from isolated theropod teeth from various sources. References for each individual tooth are contained within the dataset.
Training Data (trainingData.csv) contains some 1,700 individual theropod tooth measurements to be used as input to the machine learning models.
UK Bathonian Teeth (UKBathonianTeeth.csv) contains 94 individual theropod tooth measurements from the Bathonian microvertebrate sites of Hornsleasow, Kirtlington and Woodeaton Quarries and Watton Cliff in the UK. The dataset contains only those teeth for which we have complete measurements for the morphometric variables.
An R script to generate the machine learning models for classification.
README.txt, which contains information on how to generate the models.
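The workflow described above (train a classifier on complete labelled measurements, then classify unlabelled teeth) can be sketched in R as follows. This is a minimal illustration, not the authors' script: it uses the built-in iris data as a stand-in because the column names of the CSV files are not given here, it fits only a linear discriminant model from MASS, and the helper keep_complete is a hypothetical name introduced for this example.

```r
library(MASS)  # provides lda(); MASS is one of the packages this dataset lists

# Hypothetical helper: keep rows with no missing values in the chosen columns,
# mirroring the "complete measurements only" rule described for the UK teeth.
keep_complete <- function(df, cols) {
  df[complete.cases(df[, cols, drop = FALSE]), , drop = FALSE]
}

# Stand-in data (assumption): iris replaces the tooth measurements, and
# Species plays the role of the taxonomic label. With the real files one
# would instead start from read.csv("trainingData.csv") and
# read.csv("UKBathonianTeeth.csv") and use their actual column names.
predictors <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
train <- keep_complete(iris, predictors)

# Fit a linear discriminant model on the labelled rows.
fit <- lda(Species ~ ., data = train)

# Treat a few rows as "unknown" specimens and classify them.
unknown <- train[c(1, 51, 101), predictors]
pred <- predict(fit, newdata = unknown)

print(pred$class)      # predicted label for each unknown row
print(pred$posterior)  # posterior probability of each class, one row per specimen
```

The posterior probabilities are worth inspecting alongside the predicted class, since a low maximum posterior flags a specimen whose assignment is weakly supported.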
Usage notes
All machine learning models were generated using R version 3.6.0 or later.
The following packages are required: caret, MASS, mda, randomForest and C50. The script will attempt to download and install any that are not already on the system.
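For readers who want to check their environment before running the script, the package and version requirements can be verified along these lines. This is a sketch under stated assumptions, not the code used in the dataset's own script: install_missing and its dry_run argument are hypothetical names, and the default is to report missing packages without installing anything.

```r
# Packages named in the usage notes.
required <- c("caret", "MASS", "mda", "randomForest", "C50")

# Hypothetical helper: report which packages are missing and, unless
# dry_run is TRUE, install them from the configured CRAN mirror.
install_missing <- function(pkgs, dry_run = TRUE) {
  missing <- setdiff(pkgs, rownames(installed.packages()))
  if (!dry_run && length(missing) > 0) {
    install.packages(missing)
  }
  invisible(missing)
}

# Dry run: list what would be installed without touching the system.
missing <- install_missing(required)
if (length(missing) > 0) {
  message("Not installed: ", paste(missing, collapse = ", "))
}

# Check the R version against the stated minimum.
r_ok <- getRversion() >= "3.6.0"
if (!r_ok) {
  warning("R 3.6.0 or later is expected; found ", as.character(getRversion()))
}
```

Calling install_missing(required, dry_run = FALSE) performs the installation; keeping the dry run as the default avoids unexpected downloads on shared systems.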