Ancient dog mitogenomes support the dual dispersal of dogs and agriculture into South America
Data files
May 12, 2025 version files (1.23 MB total)
- metadata.csv (20.43 KB)
- mtDNA_70indiv.fasta (1.21 MB)
- README.md (3.73 KB)
Abstract
Archaeological and palaeogenomic data show that dogs were the only domestic animals introduced during the early peopling of the Americas. Hunter-gatherer groups spread quickly toward the south of the continent, but it is unclear when dogs reached Central and South America. To address this issue, we generated and analysed 70 complete mitochondrial genomes from archaeological and modern dogs ranging from Central Mexico to Central Chile and Argentina, revealing the dynamics of dog populations. Our results demonstrate that pre-contact Central and South American dogs are all assigned to a specific clade that diverged after dogs entered North America. Specifically, the divergence time between North, Central, and South American dog clades is consistent with the spread of agriculture and the adoption of maize in South America between 7,000 and 5,000 years ago. An isolation-by-distance pattern best characterizes how dogs expanded into South America. We identify the arrival of new lineages of dogs in post-contact South America, likely of European origin, and their legacy in modern village dogs. Interestingly, the pre-contact Mesoamerican maternal origin of the Chihuahua has persisted in some modern individuals.
https://doi.org/10.5061/dryad.ffbg79d28
This dataset contains the raw data (mitochondrial genome alignments) associated with the paper.
Description of the data and file structure
mtDNA_70indiv.fasta contains the 70 aligned mitochondrial genomes. Sample names and contexts are reported in the associated metadata.csv file.
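As a minimal sketch of how the alignment could be loaded, the Python below parses FASTA text into a dictionary and checks that all sequences share one length, which is what one would expect from an alignment. The helper names `read_fasta` and `is_aligned` are hypothetical, the toy records are made up, and the assumption that the sequence name is the first whitespace-delimited token of each header is not confirmed by this README.

```python
from io import StringIO
from typing import Dict, List, TextIO


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Parse FASTA text into {sequence_name: sequence}."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumption: the name is the first token of the header line.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            # Sequences may be wrapped over several lines; collect the pieces.
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def is_aligned(records: Dict[str, str]) -> bool:
    """True when every sequence has the same length."""
    return len({len(seq) for seq in records.values()}) <= 1


# Toy stand-in for mtDNA_70indiv.fasta (names and sequences are invented).
demo = StringIO(">sampleA\nACGT-\nACG\n>sampleB\nACGTN\nAC-\n")
alignment = read_fasta(demo)
print(len(alignment), is_aligned(alignment))
```

With the real file, the same function could be called as `read_fasta(open("mtDNA_70indiv.fasta"))`; per the description above, 70 records would be expected. Whether the FASTA names match the "Sample ID" or the "Lab code" column of metadata.csv is not stated here and should be checked against the file.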
metadata.csv includes the following fields:
- Sample ID = Sample unique identifier connected to the curation catalog
- Lab code = Unique identifier created for the laboratory procedure (DNA extraction)
- Processing laboratory = Name of the laboratory in which the samples were processed (MNHN = Paris, France; PalaeoBARN = Oxford, UK)
- Country/Region = Geographical origin of the sample
- Archaeological site = Name of the archaeological site that yielded the samples
- Lat. = Latitude of the archaeological site, decimal coordinate
- Long. = Longitude of the archaeological site, decimal coordinate
- Archaeological context = Stratigraphic context in which the sample was found
- Category of deposit = Detail of the contextual origin of the sample
- Reference for archaeological context = Bibliographic reference for the archaeological context
- Age = Biological age of the individual, when known (adult / juvenile)
- Anatomical part = Anatomical part from which the sample was taken
- GMM code = Associated code for previous geometric morphometrics study, if applicable
- Cultural period = Corresponding cultural period
- Dating = Estimated calendar age, based on archaeological context
- Average date bp (for BEAST model) = Average date, expressed "before present", that has been used for the Bayesian model, if applicable
- AMS dating - lab code = Unique lab identifier for the direct AMS radiocarbon date, if applicable
- Mitochondrial capture = Sample used in mitochondrial capture, 1 (yes) or 0 (no)
- Screening endogenous DNA (%) = Percentage of endogenous DNA based on screening (low depth sequencing)
- Haystac identification = Species identification, based on the result of the competitive alignment performed with Haystac
- Total number of mitochondrial reads (post capture and/or shotgun sequencing) = Total number of mitochondrial reads obtained from mitochondrial capture (if "Mitochondrial capture" = 1) or shotgun sequencing
- Mitochondrial depth of coverage at q30 = Resulting depth of coverage of the mitochondrial genome, only accounting for reads of a minimum quality of 30
- Number of bases covered at q30 = Number of bases covered on the mitochondrial genome, only accounting for reads of a minimum quality of 30
- Mitochondrial breadth of coverage (%) = Resulting breadth of coverage of the mitochondrial genome, expressed in percent, only accounting for reads of a minimum quality of 30
- MapDamage2 C>T 5' in 1st position = Deamination at the 5' end of the DNA, expressed as the C-to-T transition frequency at the first position, calculated with MapDamage2
- Used to build Maximum Likelihood tree = Sample used to build Maximum Likelihood tree in corresponding publication, 1 (yes) or 0 (no)
- Reassigned phylogenetic position based on constrained ML tree = Phylogenetic position of the sample assessed from a constrained Maximum Likelihood tree in corresponding publication, 1 (yes) or 0 (no)
- Used in BEAST tree = Sample used to produce a time modelled Bayesian phylogenetic tree using BEAST2 in corresponding publication, 1 (yes) or 0 (no)
- Sampling contact = Name of the author(s) of the dataset to contact for further information on the physical sample and its context
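The 1/0 flag fields listed above lend themselves to simple filtering. The sketch below, using only the Python standard library, selects the rows flagged for the BEAST tree and collects their average dates. It assumes that metadata.csv is comma-delimited and that its header row spells the column names exactly as in the list above; both assumptions should be verified against the file. The helper names `load_metadata`, `flagged`, and `parse_number` are hypothetical, and the toy rows are invented.

```python
import csv
from io import StringIO
from typing import Dict, List, Optional, TextIO


def load_metadata(handle: TextIO) -> List[Dict[str, str]]:
    """Read the metadata table into a list of {column: value} rows."""
    return list(csv.DictReader(handle))


def flagged(rows: List[Dict[str, str]], column: str) -> List[Dict[str, str]]:
    """Keep rows whose 1 (yes) / 0 (no) flag column is set to 1."""
    return [row for row in rows if (row.get(column) or "").strip() == "1"]


def parse_number(value: Optional[str]) -> Optional[float]:
    """Convert a cell to float; blank or non-numeric cells become None."""
    try:
        return float((value or "").strip())
    except ValueError:
        return None


# Assumed column names, copied from the field list above.
ID_COL = "Sample ID"
BEAST_COL = "Used in BEAST tree"
DATE_COL = "Average date bp (for BEAST model)"

# Toy stand-in for metadata.csv (values are invented).
demo_csv = StringIO(
    f"{ID_COL},{BEAST_COL},{DATE_COL}\n"
    "toy-1,1,1200\n"
    "toy-2,0,\n"
    "toy-3,1,450\n"
)
rows = load_metadata(demo_csv)
beast_rows = flagged(rows, BEAST_COL)
tip_dates = {row[ID_COL]: parse_number(row[DATE_COL]) for row in beast_rows}
print(tip_dates)
```

The same `flagged` helper could be applied to the "Mitochondrial capture" or "Used to build Maximum Likelihood tree" columns, since they follow the same 1/0 convention.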
Sharing/Access information
None
Code/Software
None
- Manin, Aurélie; Debruyne, Regis; Lin, Audrey et al. (2025). Ancient dog mitogenomes support the dual dispersal of dogs and agriculture into South America. Proceedings of the Royal Society B: Biological Sciences. https://doi.org/10.1098/rspb.2024.2443
