Skip to main content

Data Matrices Nukukammophis

Cite this dataset

Caldwell, Michael (2021). Data Matrices Nukukammophis [Dataset]. Dryad.


Reassessment of the anatomy of a three-dimensionally preserved, partial braincase from the Lower Cretaceous (lower Valanginian; ~139 mya) Kirkwood Formation of the Kirkwood Cliffs Quarry, South Africa, reveals it to be that of a new fossil snake species. The braincase presents a number of squamate skull features but also specific braincase features shared with early snakes such as Najash, Dinilysia, Sanajeh, and Menarana (e.g., presence of a large fenestra ovalis; fenestra rotunda separated from fenestra ovalis by prominent crista interfenestralis; reduced basal tubera on basioccipital). Phylogenetic relationships were analyzed using two distinct data sets and under different optimality criteria, recovering this new snake as an early-evolving lineage with close affinities to other Cretaceous Gondwanan snakes including Najash, Dinilysia, and other typical madtsoiids. This three-dimensionally preserved braincase reveals important aspects of early snake cranial evolution that were previously unknown from the oldest known fossil snakes. This specimen extends the fossil record of Gondwanan snakes by at least 45 my, making it the oldest known African and Gondwanan fossil snake and supporting a much broader palaeobiogeographical distribution of early snakes across ancient Gondwanan continents during the Early Cretaceous.


(a) Phylogenetic data sets

Two phylogenetic data sets were used to investigate the relationships of the new fossil snake with other snake and non-snake lizards. Data Set 1 is a recent snake phylogeny data set [3] with a number of updates made herein. For this analysis, we included only fossil taxa that preserve associated cranial materials; thus, chimaeric taxa such as Coniophis [22], or taxa lacking cranial remains of any kind such as Xiaophis [13], were excluded. Similarly, Nanowana was only scored for skull characters as the type or paratype do not include associated vertebrae [18,19]. In total, 35 taxa comprising early-evolving snakes (17 extinct and 18 extant) were scored for 247 osteological characters (see electronic supplementary materials). Two versions of Data Set 1 were used: one consisting of the complete data set and one excluding character 217 (absence/presence of parazygantral foramen) in order to test the effect of that character on retrieving, or not, a monophyletic Madtsoiidae. The presence of parazygantral foramina has been considered a strong synapomorphy of the Madtsoiidae [19,23–25]; however, their presence is not restricted to madtsoiids and they can also be absent in taxa considered closely related (e.g., Dinilysia), indicating that this character is variably distributed, possibly within a more inclusive clade than just the conventional Madtsoiidae [26].

Data Set 2 is a much broader early diapsid and squamate data set [12] that includes recently added fossil and extant snake taxa [3], and thus differs from the original matrix [12] (see electronic supplementary materials). The use of this data set is intended to investigate the placement of snakes among other squamates, and their divergence times, based on morphological and molecular data.

(b) Maximum parsimony analyses (MP)

Analyses were conducted in TNT 1.5-beta [27]. For Data Set 1, all heuristic searches were performed under equal weights and consisted of 1000 rounds of random addition sequence (RAS) of taxa followed by Tree Bisection Reconnection (TBR) branch swapping, holding ten trees per replication and collapsing branches of zero length after tree search. The resulting trees were subjected to a final round of TBR branch swapping and optimal topologies were rooted with Varanus (see electronic Supplementary Materials). Node support was calculated by the Bremer support and under absolute frequencies by jackknifing (p = 0.36, 1000 pseudo-replicates). For Data Set 2, we utilized all New Technology Search algorithms available in TNT, following the same procedures as in [12].

(c) Non-clock Bayesian inference (BI) analyses

Non-clock BI analyses were conducted for Data Set 1 and Data Set 2 using Mr. Bayes v. 3.2.7a [28] via the CIPRES Science Gateway v.3.3 [29]. For Data Set 2, as there were no changes to the molecular data set previously used, molecular partitions and models of evolution were the same as in [3]. The morphological partition was analyzed with the Mkv model [30].

(d) Time-calibrated relaxed-clock Bayesian inference analyses (TED-BI)

For Data Set 2, we implemented “total-evidence-dating” using the fossilized birth-death (FBD) tree model allowing for sampled ancestors, under a relaxed-clock model in Mr. Bayes v.3.2.6 [31–34], following the same procedures as in [3,12]. We chose to use the Independent Gamma Rate (IGR) relaxed-clock model [31–34]. The base of the clock rate was given an informative prior based on the previous non-clock analysis: the median value for tree height in substitutions from posterior trees divided by the age of the tree based on the median of the distribution for the root prior: 29.4493/325.45 = 0.09048, in natural log scale = -2.402626. We chose to use the exponent of the mean to provide a broad standard deviation: e0.09048 = 1.094699. Sampling strategy was set to diversity, which is more appropriate when extant taxa are sampled in a way to maximizes diversity (as performed herein) and fossils are sampled randomly [31,32]. The vast majority of our calibrations were based on tip-dating, which accounts for the uncertainty in the placement of fossil taxa and avoids the issue of bound estimates for node-based age calibrations [31]. The fossil ages used for tip-dating correspond to the uniform prior distributions on the age range of the stratigraphic occurrence of the fossils (available in electronic supplementary materials). For the clades for which we lacked some of the oldest known fossils in our analysis, and for which there is overwhelming support in the literature (and in all our other analyses) regarding their monophyly, and for which the age of the oldest known fossil is well-established, we employed node age calibrations with a soft maximum age calibration (available in electronic supplementary materials). The age of the root was set with a soft lower bound to the same age as in [3,12].

Initial runs of Data Set 2 found two taxa, Paliguana and Pamelina, to behave as wildcards, as also previously detected in previous studies [12,35,36]; we therefore removed those two taxa for the final relaxed clock BI analysis reported here.

For all BI analyses, convergence of independent runs was assessed using the average standard deviation of split frequencies (ASDSF ≈ 0.01), potential scale reduction factor (PSRF ≈ 1 for all parameters), and effective sample size (ESS > 200 for each parameter). All input and output files (summary trees, log files, etc.) from all analyses (including Mr. Bayes blocks with all prior and analytical parameters) are available online as electronic supplementary materials.


Natural Sciences and Engineering Research Council, Award: 23458