Data from: Phylogenetic inference from atomised 3D morphometric data: a case study using kangaroos
Abstract
Reconstructing phylogeny from morphological data remains mired in investigator biases, including subjective inclusion and discretisation of phenotypic variation. Geometric morphometrics and multivariate statistical analyses provide an alternative array of tools for studying variation in morphological traits. However, direct analysis of landmark data is often unreliable for phylogeny reconstruction. Morphological variation is typically highly correlated among nearby landmarks and may evolve saltationally between adaptive peaks instead of gradually, thereby violating the assumptions of typical continuous models. To address these concerns, we developed an approach to more objectively discretise morphometric data and applied it to 3D surface scans of mandibles and postcranial elements of Macropodiformes (kangaroos, bettongs, and rat-kangaroos). These scans were partitioned into sets of locally co-varying landmarks which approximate functional units. These “atomised” characters were then discretised using novel approaches to combine the objectivity of continuous shape variation for delineating discrete states with the model flexibility offered for multi-state characters. This allows us to (1) potentially reduce the influence of non-independence among neighbouring landmarks, (2) accommodate multimodal variation from saltational evolution, (3) accommodate missing data, such as from fragmentary fossils, and (4) promote tree-search efficiency. We built discrete morphological character matrices using three alternative approaches: commonly used clustering algorithms (UPGMA, *k*-means, *k*-medoids, Gaussian mixture modelling), a minimum evolution branch length criterion, and a tree sampling procedure.
Our phylogenetic analyses with these novel matrices generally succeeded in recovering genera and several deep-level macropodiform clades, but failed to accurately reconstruct intergeneric relationships within the rapid diversification of the macropodine sub-family; those relationships were also not recovered with continuous morphological data or traditionally discretised characters and are the most poorly resolved with DNA data. On balance, our atomised characters, which derive from only mandibular and three postcranial elements, show promise for improving objectivity, accuracy and clocklikeness in morphological phylogenetics and provide pathways for accommodating correlated homoplasy and for more accurately estimating rates of morphological evolution, and thereby better integrating phenotypic and genomic data for phylogenetic inference.
https://doi.org/10.5061/dryad.rn8pk0pm4
Description of the data and file structure
These supplementary materials provide landmark definitions, raw and mean rotated landmark coordinates, phylogenetic character matrices (available in data.zip), and R code for the analyses. See the main paper and its supplementary information for more details.
Files in data.zip:
Character matrices:
Phylogenetic character matrices for analysis in PAUP*, IQ-TREE, TNT and ContML.
Matrices are coded as multistate or binary characters resulting from discretisation using the clustering approaches, the branch length criterion, or the tree sampling (supertree) approach described in the main manuscript and supplementary information.
Landmark definitions:
landmark_definition_all.xlsx
Landmark definitions for each subregion, with the corresponding sliding semilandmarks (Subregion_mandible.R; Subregion_scapula.R; Subregion_humerus.R; Subregion_femur.R)
raw landmark coordinates:
Raw landmark coordinates for the mandible, scapula, humerus and femur.
mean rotated landmark coordinates:
Mean rotated landmark coordinates for each subregion of the mandible, scapula, humerus and femur. Each .csv file corresponds to a matrix with landmark coordinates arranged in rows, where columns V1, V2, V3; V4, V5, V6; ... correspond to x1, y1, z1; x2, y2, z2; ... Note that landmark numbers restart from one for each subregion.
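The column layout described above can be unpacked into one (x, y, z) triplet per landmark. The following is a minimal sketch in Python (the deposited scripts themselves are in R); the helper name `row_to_landmarks` and the example values are hypothetical and not taken from the dataset, and it assumes each row holds one complete landmark configuration.

```python
def row_to_landmarks(values):
    """Group a flat row (x1, y1, z1, x2, y2, z2, ...) into (x, y, z) triplets."""
    values = [float(v) for v in values]
    if len(values) % 3 != 0:
        raise ValueError("expected a multiple of 3 coordinate values")
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


# Illustrative values only: columns V1..V6 of a single hypothetical row.
example_row = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
landmarks = row_to_landmarks(example_row)
# landmarks[0] is landmark 1 of the subregion, landmarks[1] is landmark 2, ...
```

Because landmark numbers restart from one in every subregion, the index into `landmarks` is only meaningful within the file it was read from.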
Scalar characters:
Scalar characters for the mandible, scapula, humerus and femur. See the supplementary material for full definitions. Units are in mm unless otherwise specified.
mandible_scalar.csv
dentaryMaxwidth: maximum width of the dentary
dentaryMinwidth: minimum width of the dentary
articular_width: width of the articular process
articular_length: length of the articular process
teethwidth: width of tooth row
teeth_row_height: height of tooth row
ForamenLength: length of mandibular foramen
RowWidth_at_m4: width of tooth row at m4
Char 31: Dentary minimum width relative to maximum width
Char 32: Articular process width relative to its length
Char 33: Width of the mandibular fovea relative to width of the tooth row behind m4
Char 34: Tooth row width relative to its height
scapula_scalar.csv
infraspinous_Area: area of infraspinous fossa (in mm²)
supraspinous_Area: area of supraspinous fossa (in mm²)
infra_width: width of infraspinous fossa
infra_length: length of infraspinous fossa
infra_width_at_neck: width of infraspinous fossa at scapular neck
spine_width: width of scapular spine at neck
glenoid_length: length of glenoid articular surface
glenoid_length_total: total length of glenoid
Char 35: Area (convex hull) of supraspinous fossa (18 landmarks on outline) relative to infraspinous fossa (14 landmarks on outline)
Char 36: Infraspinous fossa width relative to its length
Char 37: Spine width at neck relative to infraspinous fossa width at neck
Char 38: Angle at the base of the scapular spine between the line passing through the superior angle and the line passing through the inferior angle (in degrees)
Char 39: Angle at the scapular neck between the line passing along the superior border and the line passing through the coracoid
Char 40: Glenoid: length of the articular surface relative to total length
humerus_scalar.csv
VOL_medTub: volume of medial tuberosity (in mm³)
VOL_latTub: volume of lateral tuberosity (in mm³)
olecranonLength: olecranon fossa length
trochleaLength: trochlea length
proxdeltoLength: length of deltoid tuberosity to proximal end of humerus
deltodistL: length of deltoid tuberosity to distal end of humerus
totalLength: total humerus length
proxepicondyle_Length: width of proximal epiphysis
distepicondyle_Length: width of distal epiphysis
medepiicondyle_Length: width of medial epicondyle
supinatorLength: length of the supinator crest
Head_length_sagittal: length of humeral head in sagittal plane
Head_length_transvers: length of humeral head in transverse plane
Char 41: Volume of medial tuberosity relative to volume of lateral tuberosity
Char 42: Olecranon fossa length relative to trochlea length
Char 43: Extension of deltoid tuberosity relative to the total humerus length
Char 44: Width of medial epicondyle relative to width of proximal epiphysis
Char 45: Width of distal epiphysis relative to width of proximal epiphysis
Char 46: Length of the supinator crest to distal epiphysis relative to total humerus length
Char 47: Trochlear angle formed by distal extremities and upper middle part in medial view of the humerus
Char 48: Proximal epiphysis: length of frontal section relative to length of sagittal section
femur_scalar.csv
troch_length: trochanteric fossa length
troch_width: trochanteric fossa width
length_prox.muscle: length of muscle attachment fovea to the proximal end of the femur
length_muscle.dist: length of muscle attachment fovea to the distal end of the femur
circonference length: femur shaft minimum perimeter
TibiaLength: tibia length
Char 49: Trochanteric fossa width relative to its length
Char 50: Length of muscle attachment fovea to the proximal end relative to total length of the femur
Char 51: Shaft minimum perimeter relative to total length of the femur
Char 52: Tibia length relative to femur length
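Several of the numbered characters above are ratios of two listed measurements. As a minimal sketch in Python (the deposited scripts are in R), and assuming that "A relative to B" means A divided by B, three of the mandible characters could be derived from the columns of mandible_scalar.csv as follows. The specimen values are invented for illustration and the helper `relative_to` is hypothetical.

```python
def relative_to(numerator, denominator):
    """Return numerator / denominator, i.e. 'numerator relative to denominator'."""
    if denominator == 0:
        raise ValueError("denominator measurement must be non-zero")
    return numerator / denominator


# Invented measurements (mm) for one hypothetical specimen, keyed by column name.
specimen = {
    "dentaryMaxwidth": 20.0,
    "dentaryMinwidth": 10.0,
    "articular_width": 6.0,
    "articular_length": 8.0,
    "teethwidth": 5.0,
    "teeth_row_height": 4.0,
}

ratios = {
    # Char 31: dentary minimum width relative to maximum width
    "Char 31": relative_to(specimen["dentaryMinwidth"], specimen["dentaryMaxwidth"]),
    # Char 32: articular process width relative to its length
    "Char 32": relative_to(specimen["articular_width"], specimen["articular_length"]),
    # Char 34: tooth row width relative to its height
    "Char 34": relative_to(specimen["teethwidth"], specimen["teeth_row_height"]),
}
```

Characters defined as angles or as convex-hull areas need the landmark coordinates themselves and are not covered by this sketch.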
Software/Code
character_discretisation.R
R script for running the discretisation approaches on the mean rotated landmark coordinates of each subregion of the mandible, scapula, humerus and femur (files available in mean rotated landmark coordinates). Analyses were run in R 4.2.1 with the packages ape, cluster, ggplot2, ggrepel, mclust, phangorn, shape and vegan.
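To give a feel for what clustering-based discretisation does, here is a minimal sketch of plain k-means (one of the clustering algorithms named in the abstract), written in Python rather than R. It is not the authors' implementation: the function `kmeans_states`, the fixed initialisation, the taxon labels and the coordinates are all hypothetical, and the real script works on the rotated landmark coordinates with the R packages listed above.

```python
import math


def kmeans_states(points, k, n_iter=100):
    """Assign each point to one of k discrete states with Lloyd's k-means.

    Centroids start at the first k points, so the result is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    states = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_states = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_states == states:
            break  # assignments are stable, so the algorithm has converged
        states = new_states
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, s in zip(points, states) if s == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return states


# Invented 2D shape scores for five hypothetical taxa.
taxa = ["taxon_A", "taxon_B", "taxon_C", "taxon_D", "taxon_E"]
scores = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
character = dict(zip(taxa, kmeans_states(scores, k=2)))
```

Each taxon ends up with an integer state label, which is the form a multistate character column takes in a discrete matrix. The number of states `k` is fixed by hand in this sketch.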
Access
Links to other publicly accessible locations of the data:
https://github.com/melinacelik/Phylogenetic-Inference-from-atomised-3D-morphometric-data
