Data from: Quantifying size-related biases in the preservation and description of Cretaceous and Paleogene gastropods
Data files
May 22, 2026 version files, 3.64 MB
- Excellent_Preservation.csv (610.22 KB)
- Fair_Preservation.csv (131.55 KB)
- Final_Dataset.csv (1.99 MB)
- Formation_Data.csv (22.83 KB)
- Good_Preservation.csv (318.37 KB)
- indets.csv (421.94 KB)
- Poor_Preservation.csv (108.68 KB)
- README.md (34.04 KB)
Abstract
Diversification of benthic mollusks in the Mesozoic and Cenozoic was a key component in the evolution of the modern marine fauna, but the effects of preservational biases on the apparent magnitude of diversification are not fully quantified. Specifically, the taxonomic radiation coincides with increased frequency of unlithified sedimentary deposits and increased quality of aragonitic fossil preservation, trends that could magnify the appearance of diversification by improving fossil recovery. Given that both lithification and aragonite dissolution are expected to preferentially hinder the recovery of small fossils, we explore the relationship between preservational quality and body size in Cretaceous and Paleogene gastropods. We measured shell size for over 7000 species occurrences illustrated in over 200 published sources; the literature-based approach allowed broad geographic and taxonomic sampling, although quality of preservation could not be assessed in as much detail as in field- and specimen-based studies. Three aspects of preservational quality were assessed for each measured specimen: overall shell completeness, surface pristineness, and features related to lithification and shell dissolution. We find that shells < 9–14 mm are underrepresented in the paleontological literature as preservational quality declines. An improvement in either the lithification or pristineness metric was associated with a decrease in mean shell size. Average preservational quality in sampled specimens increased through time, notably between the Cretaceous and Paleogene. These results support the contention that taphonomic biases affect observed biodiversity trends in gastropods by obscuring small taxa, but also suggest that the magnitude of this effect may be quantifiable in future studies.
Dataset DOI: 10.5061/dryad.t76hdr8gd
Description of the data and file structure
The Final_Dataset.csv file contains all 7836 gastropod specimens measured for this study, with information on their taxonomy, location, the length and width of both the shell and aperture, and preservation. This allowed us to analyze the relationship between body size and preservational quality in gastropods from the Cretaceous and Paleogene.
The indets.csv file is a subdivision of the full dataset, containing all 1475 specimens that had been given an indeterminate species assignment by the original source. This allowed us to see how preservational quality was associated with higher or lower percentages of uncertain species identifications (Figure 10).
The Excellent_Preservation.csv file is a subdivision of the full dataset, with all 2449 specimens with "excellent" preservational quality (i.e., C3L3P3).
The Good_Preservation.csv file is a subdivision of the full dataset, with all 1309 specimens with "good" preservational quality (i.e., C3L3P2 or C3L2P3).
The Fair_Preservation.csv file is a subdivision of the full dataset, with all 532 specimens with "fair" preservational quality (i.e., C3L3P1 or C3L2P2).
The Poor_Preservation.csv file is a subdivision of the full dataset, with all 433 specimens with "poor" preservational quality (i.e., C3L2P1, C3L1P1, C3L1P2, or C3L1P3).
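The grouping of score combinations into the four preservation subsets can be sketched in Python as a simple lookup. This is only an illustration, not the authors' code: the function names are hypothetical, and reading the code letters as C = COMPLETENESS, L = LITHIFICATION, and P = PRISTINENESS is an assumption based on the variable names. Only the combinations listed in the file descriptions are mapped; any other combination returns None.

```python
# Hypothetical lookup built only from the score combinations named in the
# file descriptions (all of them have a completeness score of 3).
CATEGORY_BY_CODE = {
    "C3L3P3": "excellent",
    "C3L3P2": "good",
    "C3L2P3": "good",
    "C3L3P1": "fair",
    "C3L2P2": "fair",
    "C3L2P1": "poor",
    "C3L1P1": "poor",
    "C3L1P2": "poor",
    "C3L1P3": "poor",
}


def preservation_code(completeness, lithification, pristineness):
    """Build a code such as 'C3L2P1' from three integer scores (1 to 3)."""
    return f"C{completeness}L{lithification}P{pristineness}"


def preservation_category(completeness, lithification, pristineness):
    """Return the category for a listed combination, or None if unlisted."""
    code = preservation_code(completeness, lithification, pristineness)
    return CATEGORY_BY_CODE.get(code)
```

For example, `preservation_category(3, 2, 3)` gives "good", while a specimen with a completeness score below 3 is not assigned to any of the four subset files by this lookup.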
The Formation_Data.csv file lists all 459 formations included in Final_Dataset.csv, along with each formation's country, age, approximate location in latitude/longitude, and the number of specimens in the dataset that come from that formation. This allowed us to create maps (Figure 2) showing the geographic distribution of specimens in our dataset.
Files and variables
File: Final_Dataset.csv
Description: Our final dataset of all measured specimens, containing information on their taxonomy, location, the length and width of both the shell and aperture, and information on preservation.
Variables
- REFERENCE: The (simple author and year) citation of the source from which the row of data was collected.
- PLATE_#: The plate number in the reference that contains the specimen being included in the dataset.
- FIGURE_#: The specific figure number of the specimen in the plate.
- GENUS: The genus name of the specimen, as listed in the original source.
- SUBGENUS: The subgenus name of the specimen if given, as listed in the original source.
- SPECIES: The species name of the specimen, as listed in the original source.
- SUBCLASS: The subclass name of the specimen, as listed in the original source.
- SUBCLASS_(STANDARDIZED): The subclass name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUPERORDER: The superorder name of the specimen, as listed in the original source.
- ORDER: The order name of the specimen, as listed in the original source.
- SUBORDER: The suborder name of the specimen, as listed in the original source.
- SUPERFAMILY: The superfamily name of the specimen, as listed in the original source.
- SUPERFAMILY_(STANDARDIZED): The superfamily name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- FAMILY: The family name of the specimen, as listed in the original source.
- FAMILY_(STANDARDIZED): The family name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUBFAMILY: The subfamily name of the specimen, as listed in the original source.
- LOCATION: The specific collection site or location of the specimen, as listed in the original source.
- COUNTRY_OR_STATE: The country (or state, within the United States of America) where the specimen was collected.
- FORMATION: The name of the geologic formation where the specimen was collected, as listed in the original source.
- APPROXIMATE_AGE: The age of the specimen as listed in the original source.
- SIMPLIFIED_AGE: The age of the specimen, grouped to fall into Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene (rather than any specific subdivisions of those epochs).
- LENGTH_(mm): The length of the specimen from the tip of the spire to the base (please see Figure 3 for a visual depiction of how this measurement was taken).
- MAX.WIDTH(mm): The maximum width of the specimen, taken from an apertural view (please see Figure 3 for a visual depiction of how this measurement was taken).
- GEOM_MEAN (mm): The geometric mean of the length and maximum width measurements for a specimen.
- APERTURE_LENGTH_INCLUDING_CANAL_(mm): The length of the specimen's aperture (if visible), from the top of the aperture to the bottom of the siphonal canal.
- APERTURE_WIDTH_(mm): The width of the specimen's aperture (if visible), from side to side from an apertural view.
- TYPE_OF_FOSSIL: Whether the specimen was an internal mold (IM), external mold (EM), or not a mold (NM).
- TYPE_OF_FOSSIL_2: More simply, whether the specimen was a mold (MOLD), or not a mold (NM).
- APERTURE_FILLED: Whether or not the specimen's aperture was filled with sediment, with a simple yes (Y), no (N), or not applicable due to the aperture not being visible (NA).
- temporary1: Temporary column. Equation code =IF(AB="MOLD",1,2), where AB refers to the TYPE_OF_FOSSIL_2 column and is followed by the specimen's row number (e.g., AB2 for the specimen in row 2). If the specimen is a mold, it is assigned a value of 1 for this column; otherwise, a value of 2.
- temporary2: Temporary column. Equation code =IF(AC="N",1,0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture is not filled, it is assigned a value of 1 for this column; otherwise, a value of 0.
- temporary3: Temporary column. Equation code =IF(AC="NA",NA(),0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture fill status is "NA", NA() is entered into this column; otherwise, a value of 0.
- LITHIFICATION: The lithification score of the specimen, on a scale from 1 (the worst) to 3 (the best). Equation code =SUM(AD:AF), where the number of the specimen's row appears beside AD and AF within the equation. This is the sum of the values in the three temporary columns.
- PRISTINENESS: The surface pristineness score of the specimen, on a scale from 1 (the worst) to 3 (the best).
- COMPLETENESS: The overall shell completeness of the specimen, on a scale from 1 (the worst) to 3 (the best).
- HOLOTYPE: Whether the specimen is a type specimen of any sort, and if so, which category it falls into.
- NOTES: Any other significant notes about the specimen, as applicable.
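The derived columns described above (GEOM_MEAN and LITHIFICATION) can be sketched in Python as follows. This is a minimal illustration of the spreadsheet logic as described in this README, not the authors' code, and the function names are hypothetical. The README does not state how the sum behaves when temporary3 is NA(), so returning None in that case is an assumption.

```python
import math


def geometric_mean(length_mm, max_width_mm):
    """Geometric mean of two measurements: the square root of their product."""
    return math.sqrt(length_mm * max_width_mm)


def lithification_score(type_of_fossil_2, aperture_filled):
    """Mirror the three temporary columns and their sum.

    type_of_fossil_2: "MOLD" or "NM"
    aperture_filled:  "Y", "N", or "NA"
    """
    # temporary1: 1 for a mold, otherwise 2
    temporary1 = 1 if type_of_fossil_2 == "MOLD" else 2
    # temporary2: 1 if the aperture is not filled, otherwise 0
    temporary2 = 1 if aperture_filled == "N" else 0
    # temporary3: NA() when the aperture is not visible, otherwise 0.
    # Assumption: an NA term makes the score undefined, represented as None.
    if aperture_filled == "NA":
        return None
    temporary3 = 0
    return temporary1 + temporary2 + temporary3
```

Under this reading, a mold with a filled aperture scores 1, a non-mold with an unfilled aperture scores 3, and the two intermediate cases score 2.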
File: indets.csv
Description: A subdivision of our full dataset, comprised only of specimens that had been given an indeterminate species assignment by the original source.
Variables
- REFERENCE: The (simple author and year) citation of the source from which the row of data was collected.
- PLATE_#: The plate number in the reference that contains the specimen being included in the dataset.
- FIGURE_#: The specific figure number of the specimen in the plate.
- GENUS: The genus name of the specimen, as listed in the original source.
- SUBGENUS: The subgenus name of the specimen if given, as listed in the original source.
- SPECIES: The species name of the specimen, as listed in the original source.
- SUBCLASS: The subclass name of the specimen, as listed in the original source.
- SUBCLASS_(STANDARDIZED): The subclass name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUPERORDER: The superorder name of the specimen, as listed in the original source.
- ORDER: The order name of the specimen, as listed in the original source.
- SUBORDER: The suborder name of the specimen, as listed in the original source.
- SUPERFAMILY: The superfamily name of the specimen, as listed in the original source.
- SUPERFAMILY_(STANDARDIZED): The superfamily name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- FAMILY: The family name of the specimen, as listed in the original source.
- FAMILY_(STANDARDIZED): The family name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUBFAMILY: The subfamily name of the specimen, as listed in the original source.
- LOCATION: The specific collection site or location of the specimen, as listed in the original source.
- COUNTRY_OR_STATE: The country (or state, within the United States of America) where the specimen was collected.
- FORMATION: The name of the geologic formation where the specimen was collected, as listed in the original source.
- APPROXIMATE_AGE: The age of the specimen as listed in the original source.
- SIMPLIFIED_AGE: The age of the specimen, grouped to fall into Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene (rather than any specific subdivisions of those epochs).
- LENGTH_(mm): The length of the specimen from the tip of the spire to the base (please see Figure 3 for a visual depiction of how this measurement was taken).
- MAX.WIDTH(mm): The maximum width of the specimen, taken from an apertural view (please see Figure 3 for a visual depiction of how this measurement was taken).
- GEOM_MEAN (mm): The geometric mean of the length and maximum width measurements for a specimen.
- APERTURE_LENGTH_INCLUDING_CANAL_(mm): The length of the specimen's aperture (if visible), from the top of the aperture to the bottom of the siphonal canal.
- APERTURE_WIDTH_(mm): The width of the specimen's aperture (if visible), from side to side from an apertural view.
- TYPE_OF_FOSSIL: Whether the specimen was an internal mold (IM), external mold (EM), or not a mold (NM).
- TYPE_OF_FOSSIL_2: More simply, whether the specimen was a mold (MOLD), or not a mold (NM).
- APERTURE_FILLED: Whether or not the specimen's aperture was filled with sediment, with a simple yes (Y), no (N), or not applicable due to the aperture not being visible (NA).
- temporary1: Temporary column. Equation code =IF(AB="MOLD",1,2), where AB refers to the TYPE_OF_FOSSIL_2 column and is followed by the specimen's row number (e.g., AB2 for the specimen in row 2). If the specimen is a mold, it is assigned a value of 1 for this column; otherwise, a value of 2.
- temporary2: Temporary column. Equation code =IF(AC="N",1,0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture is not filled, it is assigned a value of 1 for this column; otherwise, a value of 0.
- temporary3: Temporary column. Equation code =IF(AC="NA",NA(),0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture fill status is "NA", NA() is entered into this column; otherwise, a value of 0.
- LITHIFICATION: The lithification score of the specimen, on a scale from 1 (the worst) to 3 (the best). Equation code =SUM(AD:AF), where the number of the specimen's row appears beside AD and AF within the equation. This is the sum of the values in the three temporary columns.
- PRISTINENESS: The surface pristineness score of the specimen, on a scale from 1 (the worst) to 3 (the best).
- COMPLETENESS: The overall shell completeness of the specimen, on a scale from 1 (the worst) to 3 (the best).
- HOLOTYPE: Whether the specimen is a type specimen of any sort, and if so, which category it falls into.
- NOTES: Any other significant notes about the specimen, as applicable.
File: Excellent_Preservation.csv
Description: A subdivision of our full dataset, comprised only of specimens which we determined had excellent preservation (i.e., were C3L3P3).
Variables
- REFERENCE: The (simple author and year) citation of the source from which the row of data was collected.
- PLATE_#: The plate number in the reference that contains the specimen being included in the dataset.
- FIGURE_#: The specific figure number of the specimen in the plate.
- GENUS: The genus name of the specimen, as listed in the original source.
- SUBGENUS: The subgenus name of the specimen if given, as listed in the original source.
- SPECIES: The species name of the specimen, as listed in the original source.
- SUBCLASS: The subclass name of the specimen, as listed in the original source.
- SUBCLASS_(STANDARDIZED): The subclass name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUPERORDER: The superorder name of the specimen, as listed in the original source.
- ORDER: The order name of the specimen, as listed in the original source.
- SUBORDER: The suborder name of the specimen, as listed in the original source.
- SUPERFAMILY: The superfamily name of the specimen, as listed in the original source.
- SUPERFAMILY_(STANDARDIZED): The superfamily name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- FAMILY: The family name of the specimen, as listed in the original source.
- FAMILY_(STANDARDIZED): The family name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUBFAMILY: The subfamily name of the specimen, as listed in the original source.
- LOCATION: The specific collection site or location of the specimen, as listed in the original source.
- COUNTRY_OR_STATE: The country (or state, within the United States of America) where the specimen was collected.
- FORMATION: The name of the geologic formation where the specimen was collected, as listed in the original source.
- APPROXIMATE_AGE: The age of the specimen as listed in the original source.
- SIMPLIFIED_AGE: The age of the specimen, grouped to fall into Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene (rather than any specific subdivisions of those epochs).
- LENGTH_(mm): The length of the specimen from the tip of the spire to the base (please see Figure 3 for a visual depiction of how this measurement was taken).
- MAX.WIDTH(mm): The maximum width of the specimen, taken from an apertural view (please see Figure 3 for a visual depiction of how this measurement was taken).
- GEOM_MEAN (mm): The geometric mean of the length and maximum width measurements for a specimen.
- APERTURE_LENGTH_INCLUDING_CANAL_(mm): The length of the specimen's aperture (if visible), from the top of the aperture to the bottom of the siphonal canal.
- APERTURE_WIDTH_(mm): The width of the specimen's aperture (if visible), from side to side from an apertural view.
- TYPE_OF_FOSSIL: Whether the specimen was an internal mold (IM), external mold (EM), or not a mold (NM).
- TYPE_OF_FOSSIL_2: More simply, whether the specimen was a mold (MOLD), or not a mold (NM).
- APERTURE_FILLED: Whether or not the specimen's aperture was filled with sediment, with a simple yes (Y), no (N), or not applicable due to the aperture not being visible (NA).
- temporary1: Temporary column. Equation code =IF(AB="MOLD",1,2), where AB refers to the TYPE_OF_FOSSIL_2 column and is followed by the specimen's row number (e.g., AB2 for the specimen in row 2). If the specimen is a mold, it is assigned a value of 1 for this column; otherwise, a value of 2.
- temporary2: Temporary column. Equation code =IF(AC="N",1,0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture is not filled, it is assigned a value of 1 for this column; otherwise, a value of 0.
- temporary3: Temporary column. Equation code =IF(AC="NA",NA(),0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture fill status is "NA", NA() is entered into this column; otherwise, a value of 0.
- LITHIFICATION: The lithification score of the specimen, on a scale from 1 (the worst) to 3 (the best). Equation code =SUM(AD:AF), where the number of the specimen's row appears beside AD and AF within the equation. This is the sum of the values in the three temporary columns.
- PRISTINENESS: The surface pristineness score of the specimen, on a scale from 1 (the worst) to 3 (the best).
- COMPLETENESS: The overall shell completeness of the specimen, on a scale from 1 (the worst) to 3 (the best).
- HOLOTYPE: Whether the specimen is a type specimen of any sort, and if so, which category it falls into.
- NOTES: Any other significant notes about the specimen, as applicable.
File: Good_Preservation.csv
Description: A subdivision of our full dataset, comprised only of specimens which we determined had good preservation (i.e., were C3L3P2 or C3L2P3).
Variables
- REFERENCE: The (simple author and year) citation of the source from which the row of data was collected.
- PLATE_#: The plate number in the reference that contains the specimen being included in the dataset.
- FIGURE_#: The specific figure number of the specimen in the plate.
- GENUS: The genus name of the specimen, as listed in the original source.
- SUBGENUS: The subgenus name of the specimen if given, as listed in the original source.
- SPECIES: The species name of the specimen, as listed in the original source.
- SUBCLASS: The subclass name of the specimen, as listed in the original source.
- SUBCLASS_(STANDARDIZED): The subclass name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUPERORDER: The superorder name of the specimen, as listed in the original source.
- ORDER: The order name of the specimen, as listed in the original source.
- SUBORDER: The suborder name of the specimen, as listed in the original source.
- SUPERFAMILY: The superfamily name of the specimen, as listed in the original source.
- SUPERFAMILY_(STANDARDIZED): The superfamily name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- FAMILY: The family name of the specimen, as listed in the original source.
- FAMILY_(STANDARDIZED): The family name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUBFAMILY: The subfamily name of the specimen, as listed in the original source.
- LOCATION: The specific collection site or location of the specimen, as listed in the original source.
- COUNTRY_OR_STATE: The country (or state, within the United States of America) where the specimen was collected.
- FORMATION: The name of the geologic formation where the specimen was collected, as listed in the original source.
- APPROXIMATE_AGE: The age of the specimen as listed in the original source.
- SIMPLIFIED_AGE: The age of the specimen, grouped to fall into Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene (rather than any specific subdivisions of those epochs).
- LENGTH_(mm): The length of the specimen from the tip of the spire to the base (please see Figure 3 for a visual depiction of how this measurement was taken).
- MAX.WIDTH(mm): The maximum width of the specimen, taken from an apertural view (please see Figure 3 for a visual depiction of how this measurement was taken).
- GEOM_MEAN (mm): The geometric mean of the length and maximum width measurements for a specimen.
- APERTURE_LENGTH_INCLUDING_CANAL_(mm): The length of the specimen's aperture (if visible), from the top of the aperture to the bottom of the siphonal canal.
- APERTURE_WIDTH_(mm): The width of the specimen's aperture (if visible), from side to side from an apertural view.
- TYPE_OF_FOSSIL: Whether the specimen was an internal mold (IM), external mold (EM), or not a mold (NM).
- TYPE_OF_FOSSIL_2: More simply, whether the specimen was a mold (MOLD), or not a mold (NM).
- APERTURE_FILLED: Whether or not the specimen's aperture was filled with sediment, with a simple yes (Y), no (N), or not applicable due to the aperture not being visible (NA).
- temporary1: Temporary column. Equation code =IF(AB="MOLD",1,2), where AB refers to the TYPE_OF_FOSSIL_2 column and is followed by the specimen's row number (e.g., AB2 for the specimen in row 2). If the specimen is a mold, it is assigned a value of 1 for this column; otherwise, a value of 2.
- temporary2: Temporary column. Equation code =IF(AC="N",1,0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture is not filled, it is assigned a value of 1 for this column; otherwise, a value of 0.
- temporary3: Temporary column. Equation code =IF(AC="NA",NA(),0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture fill status is "NA", NA() is entered into this column; otherwise, a value of 0.
- LITHIFICATION: The lithification score of the specimen, on a scale from 1 (the worst) to 3 (the best). Equation code =SUM(AD:AF), where the number of the specimen's row appears beside AD and AF within the equation. This is the sum of the values in the three temporary columns.
- PRISTINENESS: The surface pristineness score of the specimen, on a scale from 1 (the worst) to 3 (the best).
- COMPLETENESS: The overall shell completeness of the specimen, on a scale from 1 (the worst) to 3 (the best).
- HOLOTYPE: Whether the specimen is a type specimen of any sort, and if so, which category it falls into.
- NOTES: Any other significant notes about the specimen, as applicable.
File: Fair_Preservation.csv
Description: A subdivision of our full dataset, comprised only of specimens which we determined had fair preservation (i.e., were C3L3P1 or C3L2P2).
Variables
- REFERENCE: The (simple author and year) citation of the source from which the row of data was collected.
- PLATE_#: The plate number in the reference that contains the specimen being included in the dataset.
- FIGURE_#: The specific figure number of the specimen in the plate.
- GENUS: The genus name of the specimen, as listed in the original source.
- SUBGENUS: The subgenus name of the specimen if given, as listed in the original source.
- SPECIES: The species name of the specimen, as listed in the original source.
- SUBCLASS: The subclass name of the specimen, as listed in the original source.
- SUBCLASS_(STANDARDIZED): The subclass name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUPERORDER: The superorder name of the specimen, as listed in the original source.
- ORDER: The order name of the specimen, as listed in the original source.
- SUBORDER: The suborder name of the specimen, as listed in the original source.
- SUPERFAMILY: The superfamily name of the specimen, as listed in the original source.
- SUPERFAMILY_(STANDARDIZED): The superfamily name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- FAMILY: The family name of the specimen, as listed in the original source.
- FAMILY_(STANDARDIZED): The family name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUBFAMILY: The subfamily name of the specimen, as listed in the original source.
- LOCATION: The specific collection site or location of the specimen, as listed in the original source.
- COUNTRY_OR_STATE: The country (or state, within the United States of America) where the specimen was collected.
- FORMATION: The name of the geologic formation where the specimen was collected, as listed in the original source.
- APPROXIMATE_AGE: The age of the specimen as listed in the original source.
- SIMPLIFIED_AGE: The age of the specimen, grouped to fall into Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene (rather than any specific subdivisions of those epochs).
- LENGTH_(mm): The length of the specimen from the tip of the spire to the base (please see Figure 3 for a visual depiction of how this measurement was taken).
- MAX.WIDTH(mm): The maximum width of the specimen, taken from an apertural view (please see Figure 3 for a visual depiction of how this measurement was taken).
- GEOM_MEAN (mm): The geometric mean of the length and maximum width measurements for a specimen.
- APERTURE_LENGTH_INCLUDING_CANAL_(mm): The length of the specimen's aperture (if visible), from the top of the aperture to the bottom of the siphonal canal.
- APERTURE_WIDTH_(mm): The width of the specimen's aperture (if visible), from side to side from an apertural view.
- TYPE_OF_FOSSIL: Whether the specimen was an internal mold (IM), external mold (EM), or not a mold (NM).
- TYPE_OF_FOSSIL_2: More simply, whether the specimen was a mold (MOLD), or not a mold (NM).
- APERTURE_FILLED: Whether or not the specimen's aperture was filled with sediment, with a simple yes (Y), no (N), or not applicable due to the aperture not being visible (NA).
- temporary1: Temporary column. Equation code =IF(AB="MOLD",1,2), where AB refers to the TYPE_OF_FOSSIL_2 column and is followed by the specimen's row number (e.g., AB2 for the specimen in row 2). If the specimen is a mold, it is assigned a value of 1 for this column; otherwise, a value of 2.
- temporary2: Temporary column. Equation code =IF(AC="N",1,0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture is not filled, it is assigned a value of 1 for this column; otherwise, a value of 0.
- temporary3: Temporary column. Equation code =IF(AC="NA",NA(),0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture fill status is "NA", NA() is entered into this column; otherwise, a value of 0.
- LITHIFICATION: The lithification score of the specimen, on a scale from 1 (the worst) to 3 (the best). Equation code =SUM(AD:AF), where the number of the specimen's row appears beside AD and AF within the equation. This is the sum of the values in the three temporary columns.
- PRISTINENESS: The surface pristineness score of the specimen, on a scale from 1 (the worst) to 3 (the best).
- COMPLETENESS: The overall shell completeness of the specimen, on a scale from 1 (the worst) to 3 (the best).
- HOLOTYPE: Whether the specimen is a type specimen of any sort, and if so, which category it falls into.
- NOTES: Any other significant notes about the specimen, as applicable.
File: Poor_Preservation.csv
Description: A subdivision of our full dataset, comprised only of specimens which we determined had poor preservation (i.e., were C3L2P1, C3L1P1, C3L1P2, or C3L1P3).
Variables
- REFERENCE: The (simple author and year) citation of the source from which the row of data was collected.
- PLATE_#: The plate number in the reference that contains the specimen being included in the dataset.
- FIGURE_#: The specific figure number of the specimen in the plate.
- GENUS: The genus name of the specimen, as listed in the original source.
- SUBGENUS: The subgenus name of the specimen if given, as listed in the original source.
- SPECIES: The species name of the specimen, as listed in the original source.
- SUBCLASS: The subclass name of the specimen, as listed in the original source.
- SUBCLASS_(STANDARDIZED): The subclass name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUPERORDER: The superorder name of the specimen, as listed in the original source.
- ORDER: The order name of the specimen, as listed in the original source.
- SUBORDER: The suborder name of the specimen, as listed in the original source.
- SUPERFAMILY: The superfamily name of the specimen, as listed in the original source.
- SUPERFAMILY_(STANDARDIZED): The superfamily name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- FAMILY: The family name of the specimen, as listed in the original source.
- FAMILY_(STANDARDIZED): The family name of the specimen, standardized to our modern, most up-to-date understanding of where the specimen falls within gastropod taxonomy.
- SUBFAMILY: The subfamily name of the specimen, as listed in the original source.
- LOCATION: The specific collection site or location of the specimen, as listed in the original source.
- COUNTRY_OR_STATE: The country (or state, within the United States of America) where the specimen was collected.
- FORMATION: The name of the geologic formation where the specimen was collected, as listed in the original source.
- APPROXIMATE_AGE: The age of the specimen as listed in the original source.
- SIMPLIFIED_AGE: The age of the specimen, grouped to fall into Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene (rather than any specific subdivisions of those epochs).
- LENGTH_(mm): The length of the specimen from the tip of the spire to the base (please see Figure 3 for a visual depiction of how this measurement was taken).
- MAX.WIDTH(mm): The maximum width of the specimen, taken from an apertural view (please see Figure 3 for a visual depiction of how this measurement was taken).
- GEOM_MEAN (mm): The geometric mean of the length and maximum width measurements for a specimen.
- APERTURE_LENGTH_INCLUDING_CANAL_(mm): The length of the specimen's aperture (if visible), from the top of the aperture to the bottom of the siphonal canal.
- APERTURE_WIDTH_(mm): The width of the specimen's aperture (if visible), from side to side from an apertural view.
- TYPE_OF_FOSSIL: Whether the specimen was an internal mold (IM), external mold (EM), or not a mold (NM).
- TYPE_OF_FOSSIL_2: More simply, whether the specimen was a mold (MOLD), or not a mold (NM).
- APERTURE_FILLED: Whether or not the specimen's aperture was filled with sediment, with a simple yes (Y), no (N), or not applicable due to the aperture not being visible (NA).
- temporary1: Temporary column. Equation code =IF(AB="MOLD",1,2), where AB refers to the TYPE_OF_FOSSIL_2 column and is followed by the specimen's row number (e.g., AB2 for the specimen in row 2). If the specimen is a mold, it is assigned a value of 1 for this column; otherwise, a value of 2.
- temporary2: Temporary column. Equation code =IF(AC="N",1,0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture is not filled, it is assigned a value of 1 for this column; otherwise, a value of 0.
- temporary3: Temporary column. Equation code =IF(AC="NA",NA(),0), where AC refers to the APERTURE_FILLED column and is followed by the specimen's row number (e.g., AC2 for the specimen in row 2). If the specimen's aperture fill status is "NA", NA() is entered into this column; otherwise, a value of 0.
- LITHIFICATION: The lithification score of the specimen, on a scale from 1 (the worst) to 3 (the best). Equation code =SUM(AD:AF), where the number of the specimen's row appears beside AD and AF within the equation. This is the sum of the values in the three temporary columns.
- PRISTINENESS: The surface pristineness score of the specimen, on a scale from 1 (the worst) to 3 (the best).
- COMPLETENESS: The overall shell completeness of the specimen, on a scale from 1 (the worst) to 3 (the best).
- HOLOTYPE: Whether the specimen is a type specimen of any sort, and if so, which category it falls into.
- NOTES: Any other significant notes about the specimen, as applicable.
File: Formation_Data.csv
Description: A list of each geologic formation in our dataset, their age, the number of samples coming from each, and their approximate locations.
Variables
- FORMATION: The name of each geologic formation in our dataset, as listed in the original source.
- COUNTRY: The country where the geologic formation is located.
- AGE: The age of the geologic formation, simplified as Early Cretaceous, Late Cretaceous, Paleocene, Eocene, or Oligocene.
- NUMBER_SAMPLES: The number of samples in our dataset listed as originating from that geologic formation.
- LATITUDE: The closest approximate latitude corresponding to that geologic formation.
- LONGITUDE: The closest approximate longitude corresponding to that geologic formation.
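As an illustration of how the NUMBER_SAMPLES variable relates to the full dataset, the sketch below counts rows per FORMATION value. It is a hedged Python example, not the authors' procedure: the function name is hypothetical, the formation names in the sample rows are invented placeholders, and skipping rows with an empty FORMATION cell is an assumption.

```python
from collections import Counter


def count_specimens_per_formation(rows):
    """Count specimens per formation from rows shaped like Final_Dataset.csv.

    Each row is a dict with at least a "FORMATION" key. Rows with a missing
    or empty formation name are skipped (an assumption for this sketch).
    """
    return Counter(
        row["FORMATION"].strip()
        for row in rows
        if (row.get("FORMATION") or "").strip()
    )


# Invented placeholder rows; these are not real records from the dataset.
sample_rows = [
    {"FORMATION": "Formation A"},
    {"FORMATION": "Formation B"},
    {"FORMATION": "Formation A"},
    {"FORMATION": ""},
]
counts = count_specimens_per_formation(sample_rows)
```

With the placeholder rows above, `counts["Formation A"]` is 2 and `counts["Formation B"]` is 1.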
Code/software
All code for this paper is included in the file "Final_R_Code_2026.R". The version of RStudio used was 2023.06.0+421. Loaded packages were tidyverse, ggplot2, and viridis. Most datasets are reproduced within the R code; however, the datasets for Figures 4 and 10 were too long to include. For Figure 4, please use the setwd() function where indicated to set the working directory to the folder containing "Final_Dataset.csv". For Figure 10, please use the setwd() function where indicated to set the working directory to the folder containing "indets.csv", and again where indicated for the folder containing "Final_Dataset.csv".
Our datasets are easily viewable in Microsoft Excel or Google Sheets.
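The CSV files can also be read programmatically. The following is a minimal Python sketch using only the standard library, offered as an alternative to a spreadsheet and not as part of the authors' R workflow. It assumes the column headers in the file match the names in this README exactly (for example, "GEOM_MEAN (mm)" and "SIMPLIFIED_AGE"); the function name and the sample values are invented for illustration.

```python
import csv
import io
from collections import defaultdict


def mean_size_by_age(csv_file):
    """Average the GEOM_MEAN (mm) column within each SIMPLIFIED_AGE group.

    csv_file is any open text file object. Rows whose size cell is empty or
    not numeric are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0])  # age -> [sum of sizes, count]
    for row in csv.DictReader(csv_file):
        raw = (row.get("GEOM_MEAN (mm)") or "").strip()
        try:
            size = float(raw)
        except ValueError:
            continue
        entry = totals[row["SIMPLIFIED_AGE"]]
        entry[0] += size
        entry[1] += 1
    return {age: total / n for age, (total, n) in totals.items()}


# Invented sample values standing in for the real file contents.
SAMPLE = io.StringIO(
    "SIMPLIFIED_AGE,GEOM_MEAN (mm)\n"
    "Eocene,10.0\n"
    "Eocene,20.0\n"
    "Paleocene,\n"
    "Late Cretaceous,8.0\n"
)
result = mean_size_by_age(SAMPLE)
```

To run this on the real data, the in-memory sample would be replaced with a file opened via `open("Final_Dataset.csv", newline="")`, adjusting the header names if they differ from those assumed here.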
Access information
Other publicly accessible locations of the data:
- N/A
Data was derived from the following sources:
- Please see the supplemental file "Supplemental_Data_Source_Literature.docx" for a full list, as our data were taken from figures in over 200 sources.
