Raw data for predicting sample success for large-scale ancient DNA studies on marine mammals
Data files
Jan 13, 2021 version files 91.93 KB
Abstract
In recent years, non-human ancient DNA studies have begun to focus on larger sample sizes and whole genomes, offering the potential to reveal exciting and hitherto unknown answers to ongoing biological and archaeological questions. However, one major limitation to the feasibility of such studies is the substantial financial and time investments still required during sample screening, due to uncertainty regarding successful sample selection. This study investigates the effect of a wide range of sample properties including latitude, sample age, skeletal element, collagen preservation, and context on endogenous content and DNA damage profiles for 317 ancient and historic pinniped samples collected from across the North Atlantic. Using generalised linear and mixed-effect models, we found that a range of factors affected DNA preservation within each of the species under consideration. The most important findings were that endogenous content varied significantly according to context, the type of skeletal element, the collagen content and collection year. There also appears to be an effect of the sample’s geographic origin, with samples from the Arctic generally showing higher endogenous content and lower damage rates. Both latitude and sample age were found to have significant relationships with damage levels, but only for walrus samples. Sex, ontogenetic age and extraction material preparation were not found to have any significant relationship with DNA preservation. Overall, the skeletal element and sample context were found to be the most influential factors and should therefore be considered when selecting samples for large-scale ancient genome studies.
Methods
Full details regarding data collection and processing can be found in the Molecular Ecology Resources article.
Usage notes
Samples listed are all those included in the study, not only those that were successfully sequenced.
Where a sample was collected but did not provide a usable extraction, its values are not available. In such cases fields are filled with 'NOT SCREENED'.
Because the dataset covers a large number of samples from different species, collected by different individuals and prepared in two different labs, some information was never recorded and is therefore unavailable for certain samples. In such cases either 'NA' or 'Unknown' is used.
The first row provides a header for the table, with the second row providing a more detailed description of the data found in each column.
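The layout described in these notes (a header row, a description row, and three placeholder strings for unavailable values) can be handled when loading the table. The following is only a minimal sketch, assuming the table has been exported to CSV; the function name `read_sample_table` and the column names and values in the example are hypothetical and do not come from the data files.

```python
import csv
import io

# Placeholder strings that the usage notes describe for unavailable values.
MISSING_MARKERS = {"NOT SCREENED", "NA", "Unknown"}


def read_sample_table(handle):
    """Return (descriptions, records) from a table with two header rows.

    Hypothetical helper: row 1 is read as column names, row 2 as the
    longer column descriptions, and all remaining rows as sample records.
    """
    reader = csv.reader(handle)
    header = next(reader)           # row 1: column names
    description_row = next(reader)  # row 2: detailed column descriptions
    descriptions = dict(zip(header, description_row))

    records = []
    for row in reader:
        # Map each placeholder string to None so it is not mistaken for data.
        record = {
            name: (None if value.strip() in MISSING_MARKERS else value)
            for name, value in zip(header, row)
        }
        records.append(record)
    return descriptions, records


# Tiny made-up table; the column names and values are illustrative only.
example = io.StringIO(
    "sample_id,endogenous_content,skeletal_element\n"
    "Sample identifier,Fraction of endogenous reads,Element sampled\n"
    "S1,0.12,Unknown\n"
    "S2,NOT SCREENED,NA\n"
)
descriptions, records = read_sample_table(example)
```

Keeping the description row separate from the records avoids it being parsed as a sample, and mapping the placeholders to `None` makes it straightforward to exclude unscreened samples or unknown values before any numeric analysis.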