Black death land abandonment drove European diversity losses
Abstract
The current prevailing perception is that human impacts on the biological realm have been overwhelmingly negative. Here, we test this narrative by considering the consequences for aspects of floristic diversity of the ‘Black Death era’ (1300–1400 CE), where one third of Europe’s population died within half a decade. Based on evidence from 109 pollen records spanning the Common Era, we find increasing floristic diversity from 0 CE to ~1300 CE as human populations increased, followed by rapid and substantial diversity reductions during the famine- and disease-driven human mortality events of the ‘Black Death era’. As human populations recovered following the mortality shock, diversity also recovered. This repository contains all the code used to investigate and answer these questions. All data for these analyses are freely available and, where possible, provided. Where reuse licences prohibit the republishing of data, citations are provided for the user to download the data. These analyses are very computationally demanding, and thus intermediate and output data products have been provided.
Authors: Jonathan D. Gordon*, Brennen Fagan, Jonathan Finch, Lindsey Gillson, Nicky Milner, Chris D. Thomas
*Corresponding author:
This work was funded by a Leverhulme Trust Research Centre - The Leverhulme Centre for Anthropocene Biodiversity, grant number: RC-2018-021 (JDG, BF, JF, LG, NM, CDT, all recipients).
Project description (dryad.tar)
Many of the analysis scripts for the manuscript 'Black Death land abandonment drove European diversity losses
' were run on the University of York's High Performance Computing cluster, Viking2.
Without using computing resources such as these, the computational time required to run these analyses in full is intractable, though the user may run the scripts in this folder on a reduced number of pollen records, with a reduced number of resamples. To ease the computational burden for those users looking to explore, reuse or repurpose the results and data, we have provided the most important data products in this repo., in /data_out.
Data
Raw data (in ./data_in)
- All pollen data are open access from Neotoma, accessed using the neotoma2 R package.
- REVEALS inputs available from Githumbi et al (2021), cited in-text.
- LRA code from Abraham et al (2014), cited in-text.
Processed data files (in ./data_out)
- Predicted ages (1,000 simulations) per pollen sample,
.data_out/CE_age_depth_mods_1kyr.RDS. - Final geochronological control table,
.data_out/CE_global_filtered_geochron_tables_1kyr.RDS. - Per record & sample REVEALS estimates,
.data_out/REVEALS_estimates_Neotoma_data.RDS. - All 1,000 diversity and composition resamples across the Common Era,
.data_out/BD_div_Viking_loaded.RDS. - Spatial partition of pollen records by Lang's European vegetation belts,
.data_out/pollen_record_LangVegZones.RDS. - Summarised GAM simulations,
.data_out/GAM_fits_summarised_2025-10-27. - Fitted models, predictions, fit and summaries underlying Fig. 3,
.data_out/Tree_rich_mods.RDS. - Fitted models, predictions, fit and summaries underlying Fig. 4,
.data_out/Lang_tree_rich_mods.RDS.
All analyses are run in R (R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna, Austria.URL
https://www.R-project.org/.)
Code
Run the scripts in the following order:
01_neotoma_download_Eur.R- Downloads pollen and geochronological data02_create_geochron_tables.R- Generates Common Era dataset of chron. controls03_HPC_runBchron.R- Computes age depth models04_HPC_load_age_depth_mods.R- Loads outputs from age-depth modelling into single file05_RV_file_format_Neotoma_data.R- Prepares pollen data for REVEALS06_RV_run_REVEALS_Neotoma_data.R- Runs REVEALS07_HPC_compute_diversity.R- Computes diversity and composition measures08_load_diversity_data.R- Loads data generated by HPC cluster09_Lang_PiP.R- Computes point (pollen record) in polygon (Lang's veg zones)10_HPC_GAM_fits_subset.R- Fits GAMs to each diversity dataset11_HPC_summarise_fit_draws.R- Loads and summarises GAM simulations12_metric_change_maps.R- Plots Fig. 213_plot_GAM_fits_subset.R- Plots distribution of GAMs (Figs. 1, 4 & SI 2)14_diversity_composition_models.R- Computes richness ~ tree relationships (Fig. 3)15_diversity_composition_models_LANG.R- Computes richness ~ tree relationships per veg zone (Fig. 4)16_plot_supplementary_figs.R- Plot SI Figs. 1 & 3
The .txt files used to send the HPC* scripts to a high performance computing cluster (BASH scripts) are provided for the user to use with their own HPC cluster and/or to give an indication of the memory usage/time allocation to run the full analysis in part locally.
Full data descriptions
| File | Description |
|---|---|
.data_out/CE_age_depth_mods_1kyr.RDS |
1,000 age simulations per level, per pollen record.Columnsdepths = depth of sample in sequence (in centremeters).median_age = median age in calibrated years before radiocarbon present across the 1,000 simulations.collunit = Neotoma unique collection unit ID.Columns 4 (draw_1) to 1004 (draw_1000) represent individual age simulations (in calibrated years before radiocarbon present) per level. |
./data_out/CE_filtered_geochron_tables.RDS |
Table of geochronological control points for those included records.Columnsdepth = depth of sample in sequence (in centremeters).thickness section thickness (in centremeters).agelimitolder = Upper age estimate (in years before radiocarbon present).chroncontrolid = Unique per-control ID.agelimityounger = Lower age estimate (in years before radiocarbon present).chroncontrolage = Central age estimate (in years before radiocarbon present).chroncontroltype = Type of chronological control (radiocarbon date, tephra, section top, etc.).chronologyid = Neotoma unique chronology ID.collunitid = Neotoma unique collection unit ID.datasetid = Neotoma unique pollen record ID.lat = Latitude in Decimal Degrees.long = Longitude in Decimal Degrees.calcurve = radiocarbon calibration curve.error = chronological control point error (SD).n_dates = Number of chronological control points per record.age_diff = Interval between successive chronological control points (in years).contin_run = Continuous run (Y/N, does sample meet chronological control sampling critera?).partial_record = Continuous run (T/F, does record meet chronological control sampling critera?).record_duration = total record duration (years) |
./data_out/REVEALS_estimates_Neotoma_data.RDS |
Past vegetation cover estimates per pollen sample, from the Regional Estimates of VEgetation Abundance from Large Sites model. Dataset is a list of 694 elements, with each element representing a pollen record. Each element is a dataframe of proportional cover estimates per taxon (each of which is represented by its own column) and all list elements have the same structure.Columns datasetid = Neotoma unique pollen record ID.br>depth = depth of sample in sequence (in centremeters).Columns 3 - 33 represent proportional vegetation estimates for individual taxa:"Abies.alba" "Alnus.glutinosa" "Amanranthaceae.Chenopodiaceae" "Artemisia" "Betula" "Buxus.sempervirens" "Calluna.vulgaris" "Carpinus.betulus" "Carpinus.orientalis" "Castanea" "Cerealia.t" "Corylus.avellana" "Cyperaceae" "Ericaceae" "Fagus.sylvatica" "Filipendula" "Fraxinus" "Juniperus" "Phillyrea" "Picea" "Pinus" "Pistacia" "Plantago.lanceolata.type" "Poaceae" "Quercus.deciduous" "Quercus.evergreen" "Rumex.acetosa.t" "Salix" "Secale" "Tilia" "Ulmus" |
.data_out/diversity_LUPi_resamples.RDS |
Full set of plant functional type estimates per sample and 1,000 richness, evenness, turnover resamples. Dataset is a list of two elements: div_vals = 1,000 richness, evenness, turnover resamples and pfts = plant functional type estimates.**Columns in div_vals:**analysis = Analysis run (REVEALS or RAW).resample_n = rarefaction value (number of pollen grains).datasetid = Neotoma unique pollen record ID.depth = depth of sample in sequence (in centremeters).age_draw_midpoint = midpoint age estimate between two levels of sequence (calibrated years before radiocarbon present).turn_norm = compositional change (Bray-Curtis).age_draw = age estimate (calibrated years before radiocarbon present).richness = richness (number of unique pollen types).evenness = evenness (dominance of abundances across pollen types, Pielou's J).n_chron_controls = Number of chronological control points per record.lat = Latitude in Decimal Degrees.long = Longitude in Decimal Degrees.n_samples = Number of samples (levels) per pollen sequence.pre_REVEALS_pollen_sum = Number of pollen grains per sample before REVEALS transformation of assemblages.cereals = REVEALS-modelled cereal counts.resample = resample (1-1000 simulations).**Columns in pfts:**resample_n = rarefaction value (number of pollen grains).datasetid = Neotoma unique pollen record ID.depth = depth of sample in sequence (in centremeters).age_draw = age estimate (calibrated years before radiocarbon present).PFT.definition = plant functional type description.total_count = count per PFT).resample = resample (1-1000 simulations). |
./data_out/pollen_record_LangVegZones.RDS |
Per pollen record spatial partition into Lang's vegetation zones.Columnsdatasetid = Neotoma unique pollen record ID.Vegetati_1 = Vegetation zone (minor).Veg_zones_new= Vegetation zone (major). |
./data_out/GAM_fits_summarised_2025-10-27.RDS |
Summaries of posterior simulation procedure. Dataset is a list of two elements, each of which has the same structure.Columnsresample_n = rarefaction value (number of pollen grains).region = region of Europe fits relate to.metric = pollen diversity/composition metric.cereal_dir = 14th century cereal farming trajectory.age_CE = years of the Common era..fitted = fitted smooth..lower = lower bound of interval..upper = upper bound of interval.resample_n = rarefaction value (number of pollen grains).width = probability coverage of interval.point = central tendency estimate type.interval = interval type |
./data_out/Tree_rich_mods.RDS |
{mgcv} model fit underlying Fig. 3 |
./data_out/Lang_tree_rich_mods |
{mgcv} model fits underlying Fig. 4 |
