Hiding in plain sight: The biomolecular identification of pinniped use in medieval manuscripts – MALDI and mtDNA data set
Data files
Apr 02, 2025 version files 6.45 GB
- PRI_LT7L_EL51_S10_L003_R1_001_cutadapt.clean.fastq.gz (764.01 MB)
- PRI_LT7L_EL52_S11_L003_R1_001_cutadapt.clean.fastq.gz (904.93 MB)
- PRI_LT7L_EL53_S12_L003_R1_001_cutadapt.clean.fastq.gz (1.08 GB)
- PRI_LT7L_EL54_S13_L003_R1_001_cutadapt.clean.fastq.gz (791.08 MB)
- PRI_LT7L_EL55_S14_L003_R1_001_cutadapt.clean.fastq.gz (1.26 GB)
- PRI_LT7L_EL57_S15_L003_R1_001_cutadapt.clean.fastq.gz (821.46 MB)
- PRI_LT7L_EL58_S16_L003_R1_001_cutadapt.clean.fastq.gz (801.44 MB)
- README.md (2.62 KB)
- RSOS-241090_eZooMS_data.zip (29.35 MB)
- seal_all_mtDNA.fa (85.56 KB)
Abstract
Medieval manuscripts still in their original bindings are rare. Taking advantage of the diversity of bindings in Cistercian libraries during the 12th and 13th centuries, this study focuses on the biocodicological analysis of medieval bindings, with particular emphasis on the use of sealskins. Using innovative methods such as eZooMS and aDNA analysis, this research identifies the leather used as pinniped (seal) skin and investigates its species and origin. In particular, the collagen-based eZooMS technique facilitated the classification of seven chemises into the pinniped clade, although species identification remained elusive, with one exception.
aDNA analysis was instrumental in verifying the origin of the sealskins, with most samples matching harbour and harp seals and sourced to populations in Scandinavia, Scotland, and Iceland or Greenland. This supports the notion of a robust medieval trade network that went well beyond local sourcing, linking the Cistercians to wider economic circuits that included fur trade with the Norse. The study highlights the use of an unexpected skin (seal) from an unexpected source (the northwestern Atlantic). The widespread use of sealskins in Cistercian libraries during the 12th and 13th centuries hints at broader trade networks which brought, for example, walrus ivory from the far north into continental Europe.
https://doi.org/10.5061/dryad.kprr4xhd6
RSOS-241090_eZooMS_data.zip
The folder contains MALDI-TOF data exported in txt format from eZooMS analysis of tawed skin samples. The first column indicates m/z and the second column indicates intensity.
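As a minimal sketch of how one of these exported spectra could be loaded, the Python below reads two-column text (m/z, then intensity) into a list of numeric pairs. It assumes the columns are separated by whitespace or tabs and that any non-numeric lines can be skipped; neither detail is confirmed by the description above. The function names `parse_spectrum` and `base_peak` and the example values are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def parse_spectrum(text: str) -> List[Peak]:
    """Parse two-column text (m/z, intensity) into a list of float pairs.

    Assumption: columns are separated by whitespace or tabs.
    """
    peaks: List[Peak] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or incomplete lines
        try:
            mz, intensity = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # skip header or comment lines, if the export has any
        peaks.append((mz, intensity))
    return peaks


def base_peak(peaks: List[Peak]) -> Peak:
    """Return the (m/z, intensity) pair with the highest intensity."""
    return max(peaks, key=lambda peak: peak[1])


# Made-up values for illustration only, not taken from the data set.
example = "1000.5\t10\n2000.25\t30\n3000.0\t20\n"
spectrum = parse_spectrum(example)
print(len(spectrum), base_peak(spectrum))
```

To read a real file from the unzipped folder, the same function can be applied to the file's contents, for example `parse_spectrum(open(path).read())`, where `path` is whichever exported txt file is of interest.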
List of the samples analysed with eZooMS (MALDI) and aDNA (mtDNA) of manuscripts from the abbeys of Clairvaux (Troyes library), Clairmarais (St Omer library) and Vauclair (Laon Library).
Sample number | Call number | ID technique |
---|---|---|
EL07 | Troyes 5 | eZooMS |
EL53 | Troyes 31 | eZooMS, aDNA |
EL51 | Troyes 35 | eZooMS, aDNA |
EL52 | Troyes 40,7 | eZooMS, aDNA |
EL55 | Troyes 40,8 | eZooMS, aDNA |
EL54 | Troyes 43,3 | eZooMS, aDNA |
EL57 | Saint-Omer 701 | eZooMS, aDNA |
EL58 | Saint-Omer, ms 37 | eZooMS, aDNA |
Laon01 | Ms 176 | eZooMS |
Laon02 | Ms 8bis | eZooMS |
Laon03 | Laon 166 | eZooMS |
seal_all_mtDNA.fa
Consensus DNA sequences in FASTA format for the samples used in the phylogenetic analysis presented in this study.
Krona visualizations (Zenodo)
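The sketch below shows one way to read a multi-record FASTA file such as seal_all_mtDNA.fa and to check how much of each consensus sequence is missing. It assumes that uncalled positions are written as `N`, which is common for consensus sequences but is not stated above. The function names `read_fasta` and `missing_fraction` are hypothetical.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Return {record name: sequence} from FASTA-formatted text.

    The record name is the first whitespace-delimited token after '>'.
    """
    chunks: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def missing_fraction(sequence: str) -> float:
    """Fraction of positions written as N (assumed to mean 'no call')."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


# Toy input for illustration; real use would pass the file contents instead.
toy = ">sampleA toy record\nACGT\nNN\n>sampleB\nGG\n"
for record_name, seq in read_fasta(toy).items():
    print(record_name, len(seq), round(missing_fraction(seq), 3))
```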
Krona visualization of the metagenomic data recovered from each sample presented in this study.
FastQ files
- PRI_LT7L_EL51_S10_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL51 in FASTQ format, adapter-trimmed and with host sequences removed.
- PRI_LT7L_EL52_S11_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL52 in FASTQ format, adapter-trimmed and with host sequences removed.
- PRI_LT7L_EL53_S12_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL53 in FASTQ format, adapter-trimmed and with host sequences removed.
- PRI_LT7L_EL54_S13_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL54 in FASTQ format, adapter-trimmed and with host sequences removed.
- PRI_LT7L_EL55_S14_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL55 in FASTQ format, adapter-trimmed and with host sequences removed.
- PRI_LT7L_EL57_S15_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL57 in FASTQ format, adapter-trimmed and with host sequences removed.
- PRI_LT7L_EL58_S16_L003_R1_001_cutadapt.clean.fastq.gz - Sequencing data for sample EL58 in FASTQ format, adapter-trimmed and with host sequences removed.
Samples
An initial focus was placed on manuscripts bound in leather with the hair left on, all dated between 1135 and 1250: 19 from the collection of Clairvaux manuscripts and three from two other Cistercian collections, Clairmarais and Vauclair. Covering several collections enhances the understanding of material composition and historical provenance across them.
Sampling was conducted on the flesh side of the leather to avoid contamination from keratin, ensuring only collagen fibres were collected. A representative set of nine manuscripts was selected for eZooMS analysis (including six from Clairvaux), increasing the accuracy of visual identification techniques for the broader collection. Subsequently, seven of these samples were sequenced to provide a comparative analysis [Table 3].
Sample number | Call number | ID technique |
---|---|---|
EL07 | Troyes 5 | eZooMS |
EL53 | Troyes 31 | eZooMS, aDNA |
EL51 | Troyes 35 | eZooMS, aDNA |
EL52 | Troyes 40,7 | eZooMS, aDNA |
EL55 | Troyes 40,8 | eZooMS, aDNA |
EL54 | Troyes 43,3 | eZooMS, aDNA |
EL57 | Saint-Omer 701 | eZooMS, aDNA |
EL58 | Saint-Omer, ms 37 | eZooMS, aDNA |
Laon01 | Ms 176 | eZooMS |
Laon02 | Ms 8bis | eZooMS |
Laon03 | Laon 166 | eZooMS |
Protein analysis
Samples were collected using PVC erasers, following the procedure outlined in Fiddyment et al. 2015. Using the established eZooMS technique, samples were first processed with MALDI-TOF MS. The choice of eZooMS is justified by the heritage nature of the collection studied, which requires the use of non-destructive (or micro-invasive) analysis techniques. It also has the advantage of not requiring transportation of the manuscripts from their current storage.
DNA analysis
Extraction
DNA was extracted from eraser crumb samples taken from each document, following the protocol of Teasdale et al., in a dedicated ancient DNA laboratory at BioArCh, University of York. Illumina sequencing libraries were prepared for each sample following the protocol of Meyer and Kircher as modified by Gamba. Libraries were then sequenced on two lanes of a HiSeq4000 at the University of Copenhagen.
Sequencing and bioinformatics
The two FastQ files for each sample were first quality checked using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and trimmed for adapter sequences using cutadapt. Sequences were then aligned to multiple reference genomes using FastQ Screen (supplementary table 1) and subsequently to the predicted source species using BWA (using recommended settings for aDNA). Alignments were filtered for PCR duplicates and low mapping quality reads using SAMtools, quality controlled using Qualimap 2 and visualised in IGV (Integrative Genomics Viewer). Final FASTA consensus sequences for each sample were produced using ANGSD and visualised using SeaView.
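To make the filtering step concrete, the sketch below applies the same two criteria (duplicate status and mapping quality) to alignments in SAM text format. It is an illustration of the logic only, not the SAMtools commands used in the study. In the SAM format, flag bit 0x400 marks a PCR or optical duplicate, bit 0x4 marks an unmapped read, and the fifth column holds the mapping quality. The threshold of 30 is an assumption, as the text above does not give one, and the sketch only drops reads that a duplicate-marking tool has already flagged. The names `keep_alignment` and `filter_sam` are hypothetical.

```python
UNMAPPED_FLAG = 0x4     # SAM flag bit: read is unmapped
DUPLICATE_FLAG = 0x400  # SAM flag bit: PCR or optical duplicate


def keep_alignment(sam_line: str, min_mapq: int = 30) -> bool:
    """Return True if a SAM alignment line passes both filters.

    min_mapq=30 is an assumed threshold, not taken from the study.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: mapping quality
    if flag & UNMAPPED_FLAG or flag & DUPLICATE_FLAG:
        return False
    return mapq >= min_mapq


def filter_sam(lines, min_mapq: int = 30):
    """Yield header lines unchanged and only the alignments that pass."""
    for line in lines:
        if line.startswith("@") or keep_alignment(line, min_mapq):
            yield line


# Toy alignments for illustration (11 mandatory SAM columns each).
toy = [
    "@HD\tVN:1.6\n",
    "r1\t0\tchrM\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\n",     # kept
    "r2\t1024\tchrM\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\n",  # flagged duplicate
    "r3\t0\tchrM\t200\t5\t4M\t*\t0\t0\tACGT\tIIII\n",      # low mapping quality
]
for line in filter_sam(toy):
    print(line, end="")
```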
Haplotype network and phylogenetic analysis
The likely geographical origin of the four harbour seal bindings was inferred through construction of a median-joining haplotype network using PopART and a reference dataset of 954 mtDNA haplotypes (422 bp) from contemporary North Atlantic harbour seals, compiled from previous studies and unpublished data (Olsen MT). The phylogenetic analysis of sample EL58, identified as harp seal, was carried out in MEGA7. EL58 was aligned with 53 contemporary North Atlantic harp seal mitochondrial genomes. In MEGA, the alignment was conducted using MUSCLE, and a Maximum Likelihood phylogeny was constructed based on the Hasegawa-Kishino-Yano model with 1000 bootstrap repetitions.
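A haplotype network is built from the number of differences between aligned sequences. The sketch below computes that basic quantity and reports which reference haplotypes lie closest to a query. It is not the median-joining algorithm implemented in PopART, only an illustration of the pairwise comparison that underlies it. It assumes sequences are already aligned to equal length and that `N`, `-` and `?` mark missing data to be ignored. The function names and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple

MISSING = set("N-?")  # assumed symbols for missing or gapped positions


def hamming(a: str, b: str) -> int:
    """Count mismatches between two aligned sequences, skipping missing data."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in MISSING and y not in MISSING
    )


def nearest_haplotypes(query: str, references: Dict[str, str]) -> Tuple[int, List[str]]:
    """Return the smallest distance and the names of all references at it."""
    distances = {name: hamming(query, seq) for name, seq in references.items()}
    best = min(distances.values())
    return best, sorted(name for name, d in distances.items() if d == best)


# Toy 8 bp haplotypes for illustration; not taken from the reference dataset.
references = {
    "hapA": "ACGTACGT",
    "hapB": "ACGTTCGT",
    "hapC": "TCGTACGA",
}
print(nearest_haplotypes("ACGTNCGT", references))
```

In this toy case the query matches two references equally well because its only informative difference falls on a missing position, which shows why short or gappy ancient sequences can be compatible with more than one haplotype.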
Metagenomic Profiling
Metagenomic profiling was undertaken on adapter-trimmed and de-duplicated reads by k-mer alignment, using the ConClave sorting method to resolve multi-mapping reads from KMA following the approach in CCMetagen. This used the NCBI nt database as a reference dataset, without unclassified environmental sequences. CCMetagen was found to produce a significantly reduced rate of false positives in comparison with other k-mer based classification methods. KMA was run with the following options in effect: “kma -1t1 -mem_mode -and -apm f”.