Data from: Palaeontology meets metacommunity ecology: The Maastrichtian dinosaur fossil record of North America as a case study

Cite this dataset

García-Girón, Jorge et al. (2021). Data from: Palaeontology meets metacommunity ecology: The Maastrichtian dinosaur fossil record of North America as a case study [Dataset]. Dryad.


Documenting the patterns and potential associated processes of ancient biotas has always been a central challenge in palaeontology. Over the last decades, intense debate has focused on the organisation of dinosaur–dominated communities, yet no general consensus has been reached on how these communities were organised in a spatial context or whether they were primarily affected by abiotic or biotic agents. Here, we used analytical routines typically applied in metacommunity ecology to provide novel insights into dinosaurian distributions across the latest Cretaceous of North America. To do this, we combined fossil occurrences with functional, phylogenetic and palaeoenvironmental modelling, and adopted the perspective that more reasonable conclusions on palaeoecological reconstructions can be gained from studies that consider the organisation of biotas along ecological gradients at multiple spatial scales. Our results showed that dinosaurs were restricted in range to different parts of the Hell Creek Formation, prompting the recognition of discrete and compartmentalised faunal areas during the Maastrichtian at fine-grained scales, whereas taxa ranges formed quasi–nested groups when combining data from various geological formations across the Western Interior of North America. Although groups of dinosaurs had coincident range boundaries, their communities responded to multiple ecologically–important gradients when compensating for differences in sampling effort. Metacommunity structures of both ornithischians and theropods were correlated with climatic barriers and potential trophic relationships between herbivores and carnivores, thereby suggesting that dinosaurian faunas were shaped by physiological constraints and a combination of bottom-up and top-down forces across multiple spatial grains and extents.


Dinosaur occurrences for the Maastrichtian of North America were retrieved from the Palaeobiology Database in May 2020, using the taxon name ‘Dinosauria’ and a time span of 72.1 – 66.0 Ma. Critically, although studies on modern community associations are limited to relatively brief periods of sampling time, fossil assemblages are windows on the faunas of ancient worlds occurring within particular chronostratigraphic units (Benson et al. 2018). Although this coarse temporal resolution will undoubtedly confound the data (which is addressed in detail below), it would be problematic to subdivide the time bins further, not least because only a handful of fossil assemblages are sufficiently informative to provide confident community-level estimates so far (Vavrek & Larsson 2010). Additionally, due to an insufficient amount of comparative data within high–resolution time bins (Dean et al. 2020) and the inherent errors in radiometric dating (Gates et al. 2010), the creation of a more tightly constrained correlative window is presently impractical. Here, we only retained occurrences belonging to Ornithischia and Theropoda since these two clades were the most diverse and abundant non–avian dinosaur groups in the latest Cretaceous of North America (Brusatte et al. 2015). Generic–level identifications were used in our study, and all avian taxa were excluded when delineating community types to keep our data more comparable to previous works (e.g. Vavrek & Larsson 2010; Dean et al. 2020). While birds are phylogenetically part of the dinosaurian clade, the different habits and habitats of latest Cretaceous Avialae (either diving or volant taxa) separate these faunas enough from ground-dwelling dinosaurs to justify their functional distinction in the context of the communities modelled here (see Heino et al. 2015b for an example on present-day biotas).
Although the value of generic taxonomic ranks in community analyses has been debated, palaeontologists have used generic–level clades to investigate distributional patterns and variation in community composition of fossil taxa (e.g. Vavrek & Larsson 2010; Chiarenza et al. 2019; Dean et al. 2020). Indeed, generic–level identifications are preferred over species taxonomic ranks in dinosaur palaeobiology studies as most dinosaur genera (c. 87%) are easily diagnosed and monospecific (Weishampel et al. 2004; Mannion et al. 2012). Moreover, genus-level and species–level diversity patterns generally appear to track each other for Mesozoic tetrapods (Barrett et al. 2009), and genera are more taxonomically stable than species for many groups (Robeck et al. 2000). Here, however, taxa with unclear genus identification were discarded (i.e. we did not incorporate ‘cryptic’ diversity represented by taxonomically undiagnostic fossil remains that potentially represent distinct taxa, nor did we infer ghost lineages based on phylogenetic diversity estimates; Barrett et al. 2009; Mannion et al. 2011), and so were collections lacking formational assignment. If questionable ages appeared (e.g. ages notably deviating from ages of other collections from the same formation), they were either revised or excluded. These data are an up–to–date record of North American dinosaur faunas and therefore incorporate new Late Cretaceous fossils discovered over the past few years. Overall, our pruned dataset comprised 43 dinosaur genera, and consisted of 11 formations across the Western Interior Basin (WIB) of North America and 17 well–sampled locations across the Hell Creek landscape.
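The pruning rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow: the record layout and field names (`genus`, `clade`, `formation`, `is_avian`, `max_ma`, `min_ma`) are hypothetical and do not reflect the database export format, the ages are placeholders set to the stage bounds, and the rule that an occurrence must fall entirely within 72.1 – 66.0 Ma is an assumption. The manual revision of questionable ages is not modelled.

```python
# Hypothetical record layout; field names are illustrative only.
MAASTRICHTIAN_MA = (72.1, 66.0)           # time span quoted in the text
KEPT_CLADES = {"Ornithischia", "Theropoda"}

def keep_occurrence(rec):
    """Return True if a record passes the filters described in the text."""
    # Discard taxa without a genus-level identification and
    # collections lacking a formational assignment.
    if not rec.get("genus") or not rec.get("formation"):
        return False
    # Retain only the two focal clades, and exclude avian taxa.
    if rec.get("clade") not in KEPT_CLADES or rec.get("is_avian", False):
        return False
    # Assumption: the age range must sit inside the Maastrichtian window.
    oldest, youngest = MAASTRICHTIAN_MA
    return rec["max_ma"] <= oldest and rec["min_ma"] >= youngest

def genera_by_formation(records):
    """Group the retained genera by formation (a site-by-taxon list)."""
    grouped = {}
    for rec in records:
        if keep_occurrence(rec):
            grouped.setdefault(rec["formation"], set()).add(rec["genus"])
    return grouped

# Toy records; ages are placeholders, not values from the dataset.
records = [
    {"genus": "Triceratops", "clade": "Ornithischia", "formation": "Hell Creek",
     "max_ma": 72.1, "min_ma": 66.0},
    {"genus": "Tyrannosaurus", "clade": "Theropoda", "formation": "Hell Creek",
     "max_ma": 72.1, "min_ma": 66.0},
    {"genus": None, "clade": "Theropoda", "formation": "Hell Creek",
     "max_ma": 72.1, "min_ma": 66.0},                  # no genus-level ID
    {"genus": "Avisaurus", "clade": "Theropoda", "formation": "Hell Creek",
     "is_avian": True, "max_ma": 72.1, "min_ma": 66.0},  # avian taxon
    {"genus": "Alamosaurus", "clade": "Sauropoda", "formation": "Ojo Alamo",
     "max_ma": 72.1, "min_ma": 66.0},                  # clade not retained
    {"genus": "Edmontosaurus", "clade": "Ornithischia", "formation": None,
     "max_ma": 72.1, "min_ma": 66.0},                  # no formation
]

retained = genera_by_formation(records)
```

With these toy records, only the two Hell Creek genera with complete information survive the filters.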


Palaeoclimatic general circulation model. In this study, we used palaeoclimatic model outputs (here, near-surface [1.5 m] mean annual temperature (TempMean), near-surface [1.5 m] annual temperature standard deviation (TempSDann), annual average precipitation (PrecMean) and annual precipitation standard deviation (PrecSDann)) from the fully coupled Atmosphere–Ocean General Circulation Model (GCM) HadCM3L v. 4.5 (Valdes et al. 2017). More specifically, we followed the nomenclature of Valdes et al. (2017) and applied the HadCM3BL–M2.1aE version of the model. The conditions of the model simulations for the Maastrichtian consist of an atmospheric CO2 concentration of 1120 ppmv, which is within the range of uncertainty provided by the recent proxy pCO2 reconstructions of Foster et al. (2017). The model simulations were run for a total of 1422 years, and the climate variables used in our analyses were annual averages over the last 30 years of these simulations. HadCM3L has contributed to the Coupled Model Intercomparison Project experiments, demonstrating skill in reproducing present-day climates (Collins et al. 2001; Valdes et al. 2017), and has also been used for an array of different palaeoclimate evaluations during the Eocene (Lunt et al. 2012), the Oligocene (Li et al. 2018) and the Miocene (Bradshaw et al. 2012). Detailed information on this palaeoclimatic model, including large–scale circulation (and associated energy and momentum fluxes) and temporal fluctuations, as well as the impacts of fine-scale orographic features on climate signals, is available elsewhere (e.g. Lunt et al. 2016; Chiarenza et al. 2019).

Palaeogeographical digital elevation models (DEMs). The Maastrichtian palaeogeography for this study is that of Scotese & Wright (2018), which has been compiled as a palaeo-digital elevation model to facilitate grid-based analyses. In brief, these maps were created from publicly available stratigraphic literature, supplemented by fieldwork, including lithology, palaeoenvironmental information and broad-scale facies identification. For large–scale analyses, these palaeogeographies were upscaled to the palaeoclimatic model resolution (3.75° × 2.5°). Topographic and bathymetric information was therefore broadly conserved, although it was resolved at a coarser resolution than the original maps (see Chiarenza et al. 2019 for a similar approach).
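One simple way to bring a fine elevation grid onto a coarser climate-model grid is block averaging, sketched below. This is only an illustration under stated assumptions: the text does not say which resampling method was used, the 1.25° source grid mentioned in the comments is hypothetical, and the unweighted mean ignores the fact that latitude–longitude cells shrink in area towards the poles.

```python
import numpy as np

def block_mean(dem, rows_per_cell, cols_per_cell):
    """Coarsen a 2-D grid by averaging non-overlapping blocks of cells."""
    n_rows, n_cols = dem.shape
    if n_rows % rows_per_cell or n_cols % cols_per_cell:
        raise ValueError("grid shape must be a multiple of the block size")
    # Split each axis into (coarse index, position within block) ...
    blocks = dem.reshape(n_rows // rows_per_cell, rows_per_cell,
                         n_cols // cols_per_cell, cols_per_cell)
    # ... and average over the two within-block axes.
    return blocks.mean(axis=(1, 3))

# Toy example: on a hypothetical 1.25-degree grid, 2 rows x 3 columns
# make up one 2.5-degree (latitude) x 3.75-degree (longitude) cell.
fine = np.arange(24, dtype=float).reshape(4, 6)
coarse = block_mean(fine, rows_per_cell=2, cols_per_cell=3)
```

Because every coarse cell averages the same number of fine cells, the overall mean elevation of the grid is unchanged, which is one sense in which the broad topographic signal is conserved while local relief is smoothed out.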

Functional features. Each dinosaur taxon was classified into several functional guilds based on body mass (very small, small, medium, large and very large), locomotor mode (bipeds, facultative bipeds [capable of both quadrupedal and bipedal motion] and quadrupeds) and trophic habits (carnivores, omnivores and herbivores, and for the latter, low and high browsers).

Body mass is perhaps the single most important and meaningful functional trait for animals, as it ultimately affects many aspects of their biology including metabolic rates, mechanical constraints, ecological performance and lifestyle strategies related to feeding, locomotion and reproduction (Loeuille & Loreau 2006; Iossa et al. 2008). Here, we used body mass estimates (very small ≤ 10 kg; 10 kg < small ≤ 100 kg; 100 kg < medium ≤ 1000 kg; 1000 kg < large ≤ 10000 kg; very large > 10000 kg; Noto & Grossman 2010) based on adult representatives from the comprehensive dataset of Benson et al. (2014), which provides a wide list of dinosaur taxa using the scaling relationship of limb bone robustness (stylopodial circumference; Campione & Evans 2012). To obtain a more comprehensive understanding of body mass distributions in our data, we further applied an inflection point criterion based on the Barry & Hartigan (1993) product partition model with Markov chain Monte Carlo (MCMC). More specifically, this algorithm used the posterior probability of changes over 10000 MCMC iterations, excluding the first 1000 as burn-in, to distinguish among different body mass categories in the latest Cretaceous dinosaurs of North America. Interestingly, this Bayesian analysis roughly identified most of the original body mass categories used in our study, with each category broadly representing an order of magnitude (García-Girón et al. 2020b, appendix S1, fig. S1).
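The body mass guilds amount to a lookup against order-of-magnitude thresholds, which can be sketched as follows. The sketch assumes that every upper bound is inclusive, following the pattern of "very small ≤ 10 kg"; the function and constant names are illustrative, and the masses in the usage lines are arbitrary test values rather than estimates for any taxon.

```python
import bisect

# Upper bounds (kg) of the first four classes; anything above the last
# bound is "very large". Inclusive upper bounds are an assumption.
UPPER_BOUNDS_KG = [10, 100, 1000, 10000]
LABELS = ["very small", "small", "medium", "large", "very large"]

def body_mass_class(mass_kg):
    """Assign a body mass estimate (kg) to one of the five size guilds."""
    if mass_kg <= 0:
        raise ValueError("body mass must be positive")
    # bisect_left returns the index of the first bound >= mass_kg,
    # so a mass equal to a bound stays in the lower class.
    return LABELS[bisect.bisect_left(UPPER_BOUNDS_KG, mass_kg)]

# Arbitrary illustrative masses, one per class plus two boundary cases.
classes = [body_mass_class(m) for m in (5, 10, 50, 500, 5000, 10000, 20000)]
```

Each class spans one order of magnitude, which matches the observation that the Bayesian change-point analysis broadly recovered order-of-magnitude groupings.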

Trophic habits refer to the food processing strategies and diet of an animal, and they generally include three primary categories, i.e. carnivores, herbivores and omnivores. Further subdivisions depend on the biological knowledge of the morphology (e.g. teeth morphology and skull) and behaviour of the study organismal group. Here, we assigned herbivores to categories of browse height rather than plant type due to the virtually unknown nature of plant preferences in dinosaurs. More specifically, we roughly assigned a simple maximum browsing limit (low ≤ 2 m; high > 2 m) based on characters such as limb length and neck posture using Noto & Grossman (2010) and Mallon et al. (2013).

We further divided locomotor mode into two major categories: quadrupeds and bipeds. For those taxa with intermediate axial and limb morphologies in proportions between those of bipeds and obligate quadrupeds (e.g. Hadrosauridae), we included an additional locomotor division, i.e. facultative bipeds (see Noto & Grossman 2010 for a similar approach). For the following analyses, we applied the mixed–variables coefficient of distance (i.e. a generalisation of Gower’s distance; Pavoine et al. 2009) to extract a functional distance matrix, which described the functional differences between all taxon pairs based on body mass, trophic habits and locomotor mode (e.g. Heino & Tolonen 2017). Thereafter, the pairwise output values for the functional distance matrix were synthesised into separate axes using principal coordinate analysis (PCO), following Duarte et al. (2012).
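The two steps described above (a mixed-variable distance matrix, then PCO axes) can be sketched with NumPy. This is a simplified stand-in, not the authors' routine: it uses a plain Gower dissimilarity with equal trait weights rather than the generalised coefficient of Pavoine et al. (2009), it scales body mass ranks by the full five-class range, and the four guild profiles are hypothetical rather than taken from Sheet 3.

```python
import numpy as np

MASS_RANK = {"very small": 0, "small": 1, "medium": 2,
             "large": 3, "very large": 4}

def gower_distance(taxa):
    """Plain Gower dissimilarity for (mass class, locomotion, diet) tuples."""
    n = len(taxa)
    dist = np.zeros((n, n))
    max_gap = max(MASS_RANK.values())  # widest possible rank difference
    for i in range(n):
        for j in range(i + 1, n):
            mass_i, loco_i, diet_i = taxa[i]
            mass_j, loco_j, diet_j = taxa[j]
            # Ordinal trait: rank difference scaled to [0, 1].
            d_mass = abs(MASS_RANK[mass_i] - MASS_RANK[mass_j]) / max_gap
            # Nominal traits: simple mismatch (0 = same, 1 = different).
            d_loco = float(loco_i != loco_j)
            d_diet = float(diet_i != diet_j)
            dist[i, j] = dist[j, i] = (d_mass + d_loco + d_diet) / 3
    return dist

def pcoa(dist):
    """Principal coordinate analysis via double-centring and eigendecomposition."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-10                     # drop zero/negative axes
    return eigvecs[:, keep] * np.sqrt(eigvals[keep]), eigvals[keep]

# Hypothetical guild profiles, for illustration only.
taxa = [
    ("very large", "biped", "carnivore"),
    ("large", "quadruped", "low browser"),
    ("large", "facultative biped", "low browser"),
    ("small", "biped", "omnivore"),
]
dist = gower_distance(taxa)
coords, eigvals = pcoa(dist)
```

Each row of `coords` gives one taxon's position on the retained PCO axes, and the eigenvalues indicate how much functional variation each axis carries. Axes with negative eigenvalues, which can arise because Gower dissimilarities are not always Euclidean, are simply dropped in this sketch.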

See the main text for References.

Usage notes

Additional Supporting files include the following Appendices:

Appendix S1. Body mass distributions based on product partition models with Markov sampling computations.

Appendix S2. Functional and phylogenetic features for each spatial scale and study clade.

Appendix S3. R packages and statistical routines.

Appendix S4. Elements of metacommunity structure for the conservative fixed–fixed null model.

Appendix S5. Results for the forward selection of explanatory variables.

Appendix S6. Results for ordinary least squares (OLS) regression models.

Appendix S7. Results for commonality analysis (CA) for each spatial scale and study clade.

Appendix S8. Measuring the spatial autocorrelation of OLS model residuals.

The Excel file includes occurrence data, palaeoenvironmental reconstructions, and functional features:

Sheets 1 and 2 contain raw information on each study site for the Hell Creek and other North American geological formations, respectively.

Sheet 1 includes palaeoenvironmental information for the Hell Creek Formation (i.e. lithofacies [C, channel; FP, floodplain] and palaeotopography [m a.s.l., after log-transformation]). Raw PalaeoDEM data (Scotese & Wright 2018) are also available here:

Sheet 2 contains raw information on the log-transformed palaeoenvironmental reconstructions for the Maastrichtian of North America (palaeotopography in m a.s.l.; TempMean and TempSDann in K; Prec and PrecSDann in kg m⁻²). Raw palaeoclimate GCMs (Valdes et al. 2017) can also be obtained here:

Sheet 3 includes a taxon-specific classification into several functional guilds (see the main text for details):

These files may be opened and edited in Excel. 
For details or further queries, please contact Jorge García-Girón (


University of León, Award: 2017

Spanish Ministry of Economy and Industry, Award: CGL2017–84176R

Junta de Castilla y León, Award: LE004G18

Academy of Finland, Award: 331957

Academy of Finland, Award: 322652

European Research Council Starting Grant, Award: ERC StG 2017, 756226, PalM
