Skip to main content

Data for: Insights for modern invasion ecology from biotic changes of the Clarksville Phase of the Richmondian Invasion (Ordovician, Katian)

Cite this dataset

Stigall, Alycia; Forsythe, Ian (2022). Data for: Insights for modern invasion ecology from biotic changes of the Clarksville Phase of the Richmondian Invasion (Ordovician, Katian) [Dataset]. Dryad.


The frequency of biotic invasions in modern ecosystems is increasing due to global trade moving taxa outside their native ranges and climate change facilitating establishment of taxa in previously inhospitable regions. Thus, developing a holistic understanding of biotic invasions and how they impact ecosystems over different timescales—from annual to geologic time scales—is vital. Herein we examine a geologically brief invasion event, the Clarksville Phase of the Richmondian Invasion. Prior analyses have established general ecological and evolutionary patterns across the entire Richmondian Invasion, but recent sequence stratigraphic refinement makes analysis of individual invasion pulses possible for the first time. We examine biotic change across the Clarksville Phase and identify invasion impacts on diversity, paleocommunity composition, and niche stability. Invader arrival and success were strongly linked to increased propagule pressure facilitated by sea level changes. Invaders initially colonized deep subtidal environments and then moved offshore facilitated by rapid niche evolution during the invasion interval. Invasive taxa that attained the largest population sizes belonged to previously underutilized ecological guilds. Overall, the introduction of the invasive taxa resulted in increased diversity that was maintained into the post-invasion interval accompanied by a change in community composition in which the invaders became dominant paleocommunity members. Combined these analyses document a biotic invasion facilitated by climate change which increased local diversity through invaders occupying underutilized ecospace and competition-related niche contraction on millennial time scales. Developing a long-term perspective to accompany shorter-term studies facilitates predicting the long-term impacts of modern invasions and creating better-informed policies and practices. 


Data Collection

To quantify how paleocommunity structure changed across the Clarksville Phase of the Richmondian Invasion, paleocommunity parameters were reconstructed from genus-level count data of fossil taxa collected from bedding planes using quadrat sampling. Sampling was conducted at multiple stratigraphic horizons during and after the Clarksville Pulse of the Richmondian Invasion, across a range of depositional environments within the Cincinnati Basin. In this study, a paleocommunity or community is defined as a generalized group of taxa that characterize a particular environment and may represent a segment of a biotic gradient and is synonymous with the biofacies of Brett et al. (2007). Paleocommunities represent the preserved biota and not the complete set of organisms (the ecological community) present during deposition.

The primary data collected for this study are counts of macrofaunal invertebrate fossils exposed on bedding plane surfaces. Fossil count data were collected from four submembers of the Waynesville Formation using the stratigraphic framework of Brett et al. (2020) (Fig. 2). The early invasion data were collected from the Bon Well Hill Submember (transgressive systems tract of the C5B sequence), lag phase data were collected from the Harpers Run Submember (highstand to falling stage systems tract of the C5B sequence), main invasion data were collected from the Stony Hollow Creek Submember (transgressive systems tract of the C5C sequence), and post-invasion data were collected from the Middle Clarksville Submember (highstand to falling stage systems tract of the C5C sequence). Based on the average timing of 4th-order sequences and glacial cycles, the interval considered in this study represents no more than 200,000 to 400,000 years (Brett et al. 2020). Because the invasion is recorded primarily in the lower beds of the transgressive systems tract of C5C sequence, the main invasion pulse at the base of the Stony Hollow Creek Submember likely records less than a few thousand years (Brett et al. 2020). 

Geographic coverage was established by collecting data from 11 localities (Supplementary Tables 1, 2) along a 187-kilometer northwest-to-southeast trending transect. Localities were selected to maximize exposure of the study interval and include the maximum coverage of depositional settings. Sites representing shoal, shallow subtidal, deep subtidal, and offshore environments were selected so that the impact of the invaders in different depositional settings could be quantified. In total, 92 individual beds were sampled with a cumulative sample area of 56.6 m2. Sampled area for each bed was strongly controlled by exposure and accessibility of the layer and ranged from 100 cm2 to 6300 cm2.

Faunal count data were collected from bedding planes using a 100 cm2 quadrant. Data collection focused on obtaining census data from limestone layers rather than mudstone units because limestone units represent a time-averaged faunal assemblage and have been shown to record a more complete census of alpha diversity in comparable Ordovician strata (Finnegan and Droser 2008). During sampling, bryozoans were identified by colony morphology (massive, encrusting, thin ramose, thick ramose, thin bifoliate) and all other taxa were identified to the genus level. The following elements were counted as individuals: single valves, or steinkerns of mollusks (e.g., gastropods, bivalves) more than 50% complete, pygidia, cranidia, and hypostomes of trilobites, single valves of brachiopods more than 50% complete, solitary rugose corals, and each centimeter of coralline algae. Bryozoans were classified based on zoarium morphology using the classification scheme of Holland et al. (2001) and every 1 cm of length for thick ramose (> 5 mm), thin ramose(< 5 mm), encrusting, massive, and thin bifoliate (< 5 mm) bryozoan was counted as one individual (Patzkowsky and Holland 1999). Relative abundance of crinoid columnals was tallied with 1–5 columnals counted as one individual, 6-10 columnals counted as two individuals, etc. The skeletal-element-to-individual ratios employed for each taxon were optimized to produce the strongest data signal and to prevent a single clade, such as bryozoa, from dominating and obscuring the signal from the other taxa. The same correction factors apply to all parts of the study interval equally making data for each time slice directly comparable (Finnegan and Droser 2008). 

Data Processing

Sampling resulted in a raw dataset of 14,574 tabulated individuals belonging to 37 taxa. Minimum number of individuals (MNI) (Gilinsky and Bennington 1994) was then calculated for brachiopods, bivalves, and trilobites to ensure that the abundances of these taxa were not overestimated. The counts for thick ramose, thin ramose, and thin bifoliate bryozoans were then divided by 10 to capture differences in zoarium size between these forms and encrusting and massive morphologies and prevent them from dominating and obscuring the signal from the other taxa. Once this was complete, all samples collected from the same site and layer were pooled to increase signal strength. This resulted in a final dataset of 5,231 tabulated individuals belonging to 37 taxa (Supplementary Table 3). 

Before analyses were conducted, the data were separated into four discrete taxon x sample matrices corresponding to the four sampled submembers. Prior to multivariate analyses, all samples containing only one taxon and all taxa occurring in only one sample were removed from each of the four datasets to prevent distortion following Holland and Patzkowsky (2007). The full dataset was used without culling for calculating measures of biodiversity. Data were subjected to percent and maximum value transformations which were conducted in the vegan R package using the decostand function (Oksanen et al. 2020). Percent transformation, in which each value in a row is transformed to a percentage of the row total, was conducted to prevent multivariate analyses from being dominated by variations in sample size (McCune et al. 2002). Maximum transformation, which converts each value in a column to a percentage of the maximum value in that column, was conducted to give all the taxa the same weight in analyses (McCune et al. 2002). 


A set of analyses were conducted to address our hypotheses of how invaders impacted the recipient biota. Analyses include rarefaction to quantify changes in richness, Simpson’s index of dominance to test for changes in evenness (Twitchett et al. 2004; Wheeley and Twitchett 2005; Chen et al. 2007), cluster analysis to investigate changes in paleocommunity composition, modified indirect gradient analysis to investigate the distribution of taxa along the onshore-offshore gradient, comparison of environmental preferences and tolerances through time to test for shifts in habitat occupation, and guild analysis to investigate changes in ecospace utilization.

Diversity analysis.––Rarefied richness was calculated for each of the four submembers at a subsample size of 700 individuals to test for changes in richness within the study interval while accounting for differences in sample size. Rarefied richness was calculated using the rarefy function in the vegan R package and rarefaction curves were plotted using the rarecurve function from the same package (Oksanen et al. 2020). Simpson’s Index of Diversity was calculated for each of the four submembers to test for changes in evenness and dominance through the study interval. Simpson’s Index of Diversity was calculated as the mean of 10,000 bootstrap replicates using the diversity_boot function in the poppr R package (Kamvar et al. 2015). 

Cluster analyses.—Q-mode and R-mode cluster analysis were conducted to reconstruct paleocommunity composition. The dissimilarity matrix for cluster analysis was calculated via the Bray-Curtis dissimilarity index using the vegdist function in the vegan R package (Oksanen et al. 2020). Cluster analysis was then conducted via Ward’s agglomeration method using the agnes function in the cluster R package (Maechler et al. 2013). Two-way cluster analysis was then conducted so that the abundance of taxa within each paleocommunity could be visualized. The two dendrograms that would be used were calculated using the hclust function in the vegan R package (Oksanen et al. 2020) and a heatmap was produced using the heatmap function in the stats package (R Core Team 2013). 

Modified Indirect Gradient Analysis.—Detrended correspondence analysis (DCA) was conducted to uncover the structure of the primary gradient, which is commonly water depth for ecological datasets collected from a marine environment (Holland et al. 2001). DCA was conducted in the R programming environment using the decorana function in the vegan R package with downweighting of rare taxa turned on (Oksanen et al. 2020). Nonmetric multidimensional scaling (NMDS) was also conducted for comparison. NMDS was conducted in the R programming environment using the metaMDS function in the vegan r package (Oksanen et al. 2020). The number of dimensions was set to two and the maximum number of random starts was set to 50. The Bray-Curtis dissimilarity index was used because it is an appropriate distance measure for analysis of multivariate ecological datasets (Beals 1984). Results for both ordination analyses are consistent, indicating strong signal within the data, and DCA results are reported herein. To test the hypothesis that axis one of the DCA plots represents water depth, resulting DCA axis one scores were extracted to create a quantitative depth map of the study area which could then be compared to basin bathymetry as it has been previously interpreted based on lithology, ichnology, and faunal assemblages. To do this, all the data collected from each site were pooled and a single DCA axis one score for each site was calculated. Coordinates of sites and their associated DCA axis one scores were imported into ArcGIS Pro and a continuous raster layer of DCA scores was produced using the inverse distance weighting method of interpolation. 

Niche stability analyses.—An analysis of taxon’s habitat preferences and tolerances throughout the study interval was conducted using the weighted averaging technique of Holland and Zaffos (2011) and the R script published therein. A detailed discussion of this method can be found in Holland and Zaffos (2011); only a brief overview is provided here. When DCA is conducted, axis one corresponds to the primary source of variation in the dataset which is commonly depth for marine ecological datasets (Holland et al. 2001). When axis one is confirmed to correspond to water depth, the axis one scores of sites and taxa can be used to estimate aspects of habitat occupation such as the preferred environment, environmental tolerance, and peak abundance of a taxon. The environmental tolerance of a taxon is calculated as the standard deviation of the DCA axis one scores for all samples in which the taxon occurs. The preferred environment of this taxon is the DCA axis one score of that taxon. These two parameters were calculated for each taxon in each of the fourth-order systems tracts in the study interval (Submembers). Through-time comparison of taxon scores was conducted via Pearson’s product moment correlation to assess temporal stability of habitat preferences. The calculations were conducted only for taxa with a peak abundance (probability of collecting that taxon in its preferred environment) of greater that 40% to ensure that the parameters were estimated as accurately as is possible with the current dataset. 

Guild analysis.—Guild analysis was conducted to examine patterns of ecospace utilization through space and time. Guild membership was assigned using tier, food source, mobility, and habitat utilization based on a literature review (Supplementary Table 4). A modified version of the tiering scheme of Watkins (1991) was used in which low tier corresponds to 0–6 cm above the substrate, mid-tier corresponds to 6–10 cm above the substrate, and high tier corresponds to >10 cm above the substrate. The counts of taxa assigned to each guild were tabulated for each paleocommunity recovered from cluster analysis in each of the four study intervals. 


Paleontological Society

Ohio University, Award: Geological Sciences Alumni Research Grant