15th century CE Bolivian maize reveals genetic affinities with ancient Peruvian maize
Data files
Jun 17, 2025 version files (292.44 MB total)
- final_full_snps.vcf (292.44 MB)
- README.md (3.69 KB)
Abstract
Previous archaeological and anthropological studies have demonstrated the myriad ways that cultural and political systems shape access to food and food preferences. However, few studies have conducted a biocultural analysis linking specific genotypic/phenotypic traits as evidence of cultural selection in ancient contexts. Here, we provide insight into this topic through ancient genome data from maize from Bolivia dating to ~500 BP, which was included as an offering with the mummified remains of a young girl. We analyzed it together with 16 archaeological maize samples spanning at least 5,000 years of evolution and 226 modern maize samples. Our phylogenetic analysis showed that the archaeological Bolivian maize (aBM) is genetically closest to the archaeological maize from ancient Peru, which in turn shared the most similarities with archaeological Peruvian maize. The genetic differentiation implies that the Inca state aided maize diversity. The ovule development process was under selection in modern maize compared with archaeological maize, which points to breeding programs aimed at enhancing seed quality and yield in modern maize. Our study provides insights into the complex biocultural role that Inca Empire expansion, including its economic, symbolic, and religious cultural practices, may have had in driving the expansion of maize diversity in South America.
We have submitted our final SNP data (final_full_snps.vcf), R scripts (Geographic-Map.R, common-xpehh_SelectionVScontrol.R, f3.R, PCA.R, and targetgene_within5K.R), and a shell script (mapDamage-array.sh).
Description of the data and file structure
“final_full_snps.vcf”: contains the SNP information for 17 archaeological samples.
ancientaBMComb: archaeological Bolivian maize sample.
Information for all of the following samples was obtained from previous publications:
https://doi.org/10.1073/pnas.2015560117
https://doi.org/10.1073/pnas.1609701113
ancientArica4: ancient Arica4 sample from Arica, Chile, coastal.
ancientArica5: ancient Arica5 sample from Arica, Chile, coastal.
ancientEG84: ancient EG84 sample from El Gigante rock shelter.
ancientEG85: ancient EG85 sample from El Gigante rock shelter.
ancientEG90: ancient EG90 sample from El Gigante rock shelter.
ancientSM10: ancient SM10 sample from Mexico: San Marcos cave, Tehuacan Valley.
ancientSM3: ancient SM3 sample from Mexico: San Marcos cave, Tehuacan Valley.
ancientSM5: ancient SM5 sample from Mexico: San Marcos cave, Tehuacan Valley.
ancientTehuacan162: ancient Tehuacan162 sample from Mexico: San Marcos cave, Tehuacan Valley.
ancientZ2: ancient Z2 sample from Brazil: Peruacu Valley, Boquete cave.
ancientZ6: ancient Z6 sample from Brazil: Peruacu Valley, Boquete cave.
ancientZ61: ancient Z61 sample from Peru: Ancash.
ancientZ64: ancient Z64 sample from Peru: Ancash.
ancientZ65: ancient Z65 sample from Peru: Inca.
ancientZ66: ancient Z66 sample from Argentina: Catamarca.
ancientZ67: ancient Z67 sample from Argentina: Jujuy.
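As a quick sanity check on the sample list above, the sample labels can be read from the VCF header. The following is a minimal sketch in Python using only the standard library. It assumes the file follows the usual VCF layout (nine fixed columns up to FORMAT, then one column per sample) and that the column names match the labels listed above; neither assumption is confirmed by this README. The helper name vcf_sample_names and the in-memory demo record are hypothetical.

```python
import io


def vcf_sample_names(lines):
    """Return the sample column names from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # The first nine columns are fixed by the VCF format
            # (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT);
            # every column after them is one sample.
            return fields[9:]
        if not line.startswith("#"):
            break  # reached data records without seeing a header line
    raise ValueError("no #CHROM header line found")


# Tiny in-memory stand-in for the real file: a made-up record with two
# sample columns, used only so the sketch runs without the 292 MB VCF.
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tancientaBMComb\tancientZ65\n"
    "1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t1/1\n"
)
names = vcf_sample_names(demo)
print(names)
```

To run it on the actual data, pass an open file handle instead, for example `with open("final_full_snps.vcf") as fh: names = vcf_sample_names(fh)`. If the header matches the description above, `len(names)` should be 17.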
Code/software
R and Linux.
R is required to run all .R scripts; the scripts were created using R version 4.4.1.
Linux is required to run the .sh shell scripts. All scripts are available on Zenodo under the Software Related Work link.
mapDamage-array.sh: Used to verify cytosine deamination profiles consistent with ancient DNA and to estimate the misincorporation frequencies for all ancient maize samples.
Geographic-Map.R: Uses latitude and longitude information to plot the samples on a geographic map. Geographic data visualization was carried out in R using the ggplot2, sf, leaflet, rnaturalearth, rnaturalearthdata, ggspatial, and rnaturalearthhires packages.
f3.R: Uses the admixr package, an R interface to the ADMIXTOOLS software suite, to detect evidence of admixture involving aBM.
PCA.R: Applies Principal Component Analysis (PCA) to all ancient and modern maize samples.
targetgene_within5K.R: Used to identify target genes within a 5 kb distance.
xpehh_controlVScase.R: Used to detect selection between ancient and modern maize populations.
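To illustrate the kind of lookup that targetgene_within5K.R describes, here is a minimal Python sketch of a 5 kb window search around a SNP. It is not the authors' R script. The function name genes_within, the gene records, and the coordinates are all hypothetical, and the sketch assumes that distance is measured from the SNP to the nearer gene boundary, with the 5,000 bp limit inclusive.

```python
def genes_within(snp_chrom, snp_pos, genes, window=5000):
    """Return IDs of genes lying within `window` bp of a SNP position."""
    hits = []
    for gene_id, chrom, start, end in genes:
        if chrom != snp_chrom:
            continue
        # Distance is 0 when the SNP falls inside the gene,
        # otherwise the gap to the nearer gene boundary.
        if snp_pos < start:
            distance = start - snp_pos
        elif snp_pos > end:
            distance = snp_pos - end
        else:
            distance = 0
        if distance <= window:
            hits.append(gene_id)
    return hits


# Hypothetical gene records: (gene ID, chromosome, start, end).
genes = [
    ("geneA", "1", 10_000, 12_000),
    ("geneB", "1", 20_000, 25_000),
    ("geneC", "2", 10_000, 12_000),
]
print(genes_within("1", 16_000, genes))
```

In this made-up example the SNP at position 16,000 on chromosome 1 is 4,000 bp from both geneA and geneB, so both are reported, while geneC is skipped because it sits on a different chromosome.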
Access information
Other publicly accessible locations of the data:
- Dryad
Data was derived from the following sources:
- Information for all samples except the archaeological Bolivian maize sample was obtained from previous publications:
- All the modern samples were obtained from the following publication: