Data from: The paleobiologic implications of modern nonmarine ecological gradients

Holland, Steven

Research facility: University of Georgia

Published May 17, 2024 on Dryad. https://doi.org/10.5061/dryad.pc866t1z7

Data files

May 17, 2024 version files (6.94 KB total)

coastalPlainSEUS.csv (1.88 KB)
README.md (5.06 KB)

Abstract

In modern nonmarine settings, previous studies have demonstrated the importance of elevation-correlated ecological gradients, but such studies tend to focus on relatively small areas and only one higher taxon. Here, we analyze GBIF occurrence records from a wide variety of taxa across the southeastern United States coastal plain. Many taxa display ecological gradients (gradients in proportional or relative abundance) correlated with elevation, distance to the coast, and latitude. These gradients tend to be steepest within a few tens of kilometers near the coast and at elevations less than 25 m. Some taxa, notably terrestrial mammals, do not display gradients correlated with elevation and distance to the coast. The small sample sizes of these groups and their heterogeneous sampling raise concerns about whether sufficient data exists. Coupled with previous studies of these ecological gradients, their common presence over distances of tens to hundreds of kilometers and elevations of tens to hundreds of meters suggests they are likely important in the nonmarine fossil record. Because elevation and distance to the coast change predictably with cycles of accommodation and sediment flux, these ecological gradients are predicted to occur in the nonmarine stratigraphic record, especially through intervals that record transgression or regression. Such gradients will affect the local composition of species associations and occurrences, even in the absence of regional species origination, immigration, and extinction and in the absence of regional change in the structure of ecological gradients. The ordination of taxon counts in stratigraphically limited samples has great potential for establishing their existence.

Description of the data and file structure

We have submitted our R code for replicating the analyses of this paper. These scripts can be modified easily for use with other taxa in other regions. Four files of R scripts are included (workflow.R, taxa.R, coastalPlainGradients.R, and coastalPlain.R), and their use is described under Code/Software below. We have also supplied a .csv file containing the geographic coordinates of the study area.

We have also submitted supplemental figures (supplementalFigures.pdf) that could not be included in the manuscript owing to space considerations. The first five pages are scatterplots of NMS axis 1 scores vs. latitude for every taxon that was analyzed, similar to Figures 3 and 4 in the manuscript. The last five pages are maps of NMS axis 1 scores in 0.2° bins of latitude and longitude, similar to Figure 5 in the manuscript. The methods behind both sets of images are described in the manuscript.

We also include a description (statisticalSignificance.pdf) of how the statistical significance of the correlations was evaluated.

coastalPlainSEUS.csv
This file contains the longitude and latitude of points bounding the study area, presented in a counterclockwise direction, as required for downloads from GBIF. The file contains only two columns of data:

longitude: longitude in decimal degrees; negative sign indicates coordinates in western hemisphere
latitude: latitude in decimal degrees
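
As an illustration of how bounding points in this two-column layout could be checked and prepared for a GBIF geometry filter, the following is a minimal sketch in base R. The coordinates are made-up placeholders, not the values in coastalPlainSEUS.csv, and the helper names signedArea and toWkt are hypothetical; the submitted scripts may handle this step differently.

```r
# Hypothetical bounding points (NOT the values in coastalPlainSEUS.csv),
# laid out with the same two columns described above.
boundary <- data.frame(
  longitude = c(-84.0, -80.0, -76.0, -78.0),
  latitude  = c( 31.0,  31.0,  35.0,  36.0)
)
# With the real file in the working directory, one could instead use:
# boundary <- read.csv("coastalPlainSEUS.csv")

# Signed area by the shoelace formula: a positive value means the
# points are listed in counterclockwise order.
signedArea <- function(x, y) {
  n <- length(x)
  nextIndex <- c(2:n, 1)
  sum(x * y[nextIndex] - x[nextIndex] * y) / 2
}

# Build a WKT polygon string of "longitude latitude" pairs, repeating
# the first point at the end if the ring is not already closed.
toWkt <- function(x, y) {
  n <- length(x)
  if (x[n] != x[1] || y[n] != y[1]) {
    x <- c(x, x[1])
    y <- c(y, y[1])
  }
  paste0("POLYGON((", paste(x, y, collapse = ", "), "))")
}

isCounterclockwise <- signedArea(boundary$longitude, boundary$latitude) > 0
wkt <- toWkt(boundary$longitude, boundary$latitude)
print(isCounterclockwise)
print(wkt)
```

If isCounterclockwise were FALSE, reversing the row order of the points (for example with boundary[nrow(boundary):1, ]) would restore the counterclockwise direction.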

Sharing/Access information

Data downloaded and analyzed by these R scripts comes from the Global Biodiversity Information Facility (GBIF).

Code/Software

R is required to run the scripts in workflow.R, taxa.R, coastalPlainGradients.R, and coastalPlain.R. The scripts were written for version 4.1.3, but they also run on the most recent version of R as of this writing (4.4.0). All four files should be placed in the working directory for R. All code is provided under the CC0 1.0 Universal Public Domain Dedication.
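
Before running anything, it can help to confirm that the setup matches these requirements. The following is a minimal sketch in base R; the file names and version number come from the paragraph above, while the check itself is only illustrative and is not part of the submitted scripts.

```r
# Script names as listed above.
scripts <- c("workflow.R", "taxa.R", "coastalPlainGradients.R", "coastalPlain.R")

# TRUE if the running R is at least the version the scripts were written for.
versionOk <- getRversion() >= "4.1.3"

# Names of any scripts not found in the current working directory.
missingScripts <- scripts[!file.exists(scripts)]

if (!versionOk) {
  message("R ", as.character(getRversion()), " is older than 4.1.3")
}
if (length(missingScripts) > 0) {
  message("Not found in ", getwd(), ": ",
          paste(missingScripts, collapse = ", "))
}
```

If any files are reported as missing, either copy them into the working directory or change the working directory with setwd().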

The workflow.R file contains the sequence of commands that are called to perform the analyses and make the plots. Taxon-specific modifications to two of these commands are contained in taxa.R. The polygon in coastalPlain.R delimits the coastal plain of Georgia, South Carolina, and North Carolina, and it is necessary for downloading the data from GBIF. workflow.R is the principal file used for performing the analyses, and it contains comments describing the purpose of each set of commands. The analyses require several R packages, which are listed at the top of coastalPlainGradients.R along with the reasons they are needed.

To replicate the analysis of any particular group, find the values of taxon and filePrefix declared in taxa.R. Substitute these for lines 14–15 in workflow.R, then run the workflow code. Nothing needs to be changed in workflow.R other than these two lines.

The script will produce six output files. Five files begin with the name of the taxon. Using an analysis of the Pinopsida as an example, the workflow.R script will create the following files: Pinopsida.txt, PinopsidaDistance.pdf, PinopsidaElevation.pdf, PinopsidaLatitude.pdf, and PinopsidaMap.pdf. Pinopsida.txt is a log file of the analysis. PinopsidaDistance.pdf, PinopsidaElevation.pdf, and PinopsidaLatitude.pdf are scatterplots of NMS 1 scores vs. distance to the coast (in km), elevation (in m above sea level), and latitude, respectively. PinopsidaMap.pdf is the map of NMS 1 scores; all maps are included in the supplemental information. The initial name of these files (e.g., Pinopsida in this example) is set by the value of the taxon object obtained from the taxa.R file and declared on line 14 of workflow.R.
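
The naming pattern described above can be sketched in base R to check which of the five taxon-named files are present after a run. The file names follow the Pinopsida example in the text; the check itself is illustrative and is not part of workflow.R.

```r
# Example taxon from the text; use the value declared on line 14 of workflow.R.
taxon <- "Pinopsida"

# The five taxon-named outputs: one log file and four PDF plots.
expectedOutputs <- c(
  paste0(taxon, ".txt"),
  paste0(taxon, c("Distance", "Elevation", "Latitude", "Map"), ".pdf")
)

# Named logical vector: TRUE where the file exists in the working directory.
produced <- file.exists(expectedOutputs)
names(produced) <- expectedOutputs
print(produced)
```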

The sixth file created by workflow.R is a .zip file containing the downloaded GBIF data. You can use this file to avoid redownloading the data if you wish to repeat an analysis or conduct new analyses. Uncompress this .zip file to create a .csv file with the data; you might find it helpful to rename the file after your taxon. To use this data instead of redownloading the data in workflow.R, replace line 19 of workflow.R with the following command (changing the name of the .csv file to whatever you named it):

gbif <- read.table("pinopsida.csv", header=TRUE, row.names=1, sep=",")

In some cases, this .csv file may be malformed, requiring tedious manual cleanup. I have generally found it easier to redownload the file than to save it and reuse it.
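
One way to guard against a malformed saved file is to wrap the read.table call given above so that a parsing failure is reported rather than stopping the script. The following is a minimal sketch in base R; the helper name readSavedGbif is hypothetical, and the two small demonstration files are made up and are not real GBIF data.

```r
# Hypothetical helper: returns the data frame, or NULL if the file is
# missing or cannot be parsed by the read.table call given above.
readSavedGbif <- function(csvFile) {
  if (!file.exists(csvFile)) {
    return(NULL)
  }
  tryCatch(
    read.table(csvFile, header = TRUE, row.names = 1, sep = ","),
    error = function(e) NULL
  )
}

# Demonstration on two small made-up files.
goodFile <- tempfile(fileext = ".csv")
writeLines(c("gbifID,species,decimalLatitude",
             "1,Pinus taeda,32.5",
             "2,Pinus palustris,31.9"), goodFile)

# This file has rows with more fields than the header, so parsing fails.
badFile <- tempfile(fileext = ".csv")
writeLines(c("gbifID,species,decimalLatitude",
             "1,Pinus taeda,32.5,extra,fields",
             "2,Pinus palustris"), badFile)

good <- readSavedGbif(goodFile)
bad <- readSavedGbif(badFile)
print(is.null(good))
print(is.null(bad))
```

A NULL result would then be the signal to redownload the data with the original line 19 of workflow.R rather than attempting manual cleanup.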

Downloading occurrences from GBIF via R

In developing these analyses, we encountered some difficulties in downloading data from GBIF, and we found two resources particularly helpful: Jon Waller's post on Getting Occurrence Data from GBIF and the API for occurrence downloads from GBIF.