Ancient genomes from eastern Kazakhstan reveal dynamic genetic legacy of Inner Eurasian hunter-gatherers
Data files
Sep 23, 2025 version; files total 75.61 MB
- DataProcessing_MPI_batch17_script_250225_TF1-2.txt: 35.34 KB
- Koken_ancients_21.geno: 27.13 MB
- Koken_ancients_21.ind: 556 B
- Koken_ancients_21.snp: 43.35 MB
- Koken_DATES_script_250425_2.txt: 3.06 KB
- Koken_DATES_script_250425.txt: 3.09 KB
- Koken_EN_2_imputation_250228.txt: 10.41 KB
- Koken_EN_qpAdm_OGs_script_250512.txt: 42.64 KB
- Koken_EN_qpAdm_script_241226.txt: 7.35 KB
- Koken_EN_run_ancIBD_250304.txt: 1.35 KB
- Koken_f3_f4_PCA_script_250305.txt: 9.71 KB
- Koken_Figure_2_map_Lat_Lon_coordinates.csv: 2.63 KB
- Koken_Figure_2_PCA.eval: 29.51 KB
- Koken_Figure_2_PCA.evec: 2.03 MB
- Koken_Figure_2_PCA.evec.txt.gz: 319.18 KB
- Koken_Figure_2_Radiocarbon_dates.csv: 9.01 KB
- Koken_MLBA_PMR_script_250508.txt: 5.60 KB
- Koken_MLBA_qpAdm_script_250522.txt: 17.02 KB
- Koken_Supplementary_Figure_12_PCA.eval: 16.09 KB
- Koken_Supplementary_Figure_12_PCA.evec: 2.05 MB
- Koken_Supplementary_Figure_12_PCA.evec.txt.gz: 327.65 KB
- Koken_Supplementary_Figure_7_compare_ancIBD_pedsim.zip: 2.18 KB
- Koken_Supplementary_Figure_7_pedsim_simulated_IBD.txt: 31.91 KB
- Koken_Supplementary_Figure_8_hapROH.csv: 1.64 KB
- README.md: 7.42 KB
- run_ancIBD_to_detect_IBD2_241228.txt: 1.38 KB
- Supplementary_Figure_15_DATES_Koken_MLBA_o1.Krasnoyarsk_MLBA_Irtysh_UpperOb_HG.txt: 33.07 KB
- Supplementary_Figure_15_DATES_Koken_MLBA_o2.Krasnoyarsk_MLBA_Irtysh_UpperOb_HG.txt: 33.07 KB
- Supplementary_Figure_15_DATES_Koken_MLBA.Sintashta_MLBA_Irtysh_UpperOb_HG.txt: 33.06 KB
- Supplementary_Figure_15_DATES_Krasnoyarsk_MLBA.Sintashta_MLBA_Irtysh_UpperOb_HG.txt: 33.07 KB
- Supplementary_Figure_15_DATES_Zevakinskiy_LBA.Krasnoyarsk_MLBA_Irtysh_UpperOb_HG.txt: 33.07 KB
Abstract
Dataset DOI: 10.5061/dryad.4xgxd25nx
Description of the data and file structure
This repository contains genotype data and scripts used for the analyses in "Ancient genomes from eastern Kazakhstan reveal dynamic genetic legacy of Inner Eurasian hunter-gatherers".
Files and variables
File: Koken_Supplementary_Figure_12_PCA.eval
Description: This file contains the eigenvalues from the PCA performed using present-day western Eurasian individuals (see Supplementary Figures 11, 12, and 14).
File: Koken_Figure_2_map_Lat_Lon_coordinates.csv
Description: This file contains the metadata used for plotting the map in Figure 2.
Variables
- Study Group: Population ID
- Latitude: Latitude of the excavated site
- Longitude: Longitude of the excavated site
- Age average: Average radiocarbon date of the excavated human remains (i.e., the estimated time period when the individuals lived, not their age at death)
- pch: pch parameter used to represent this population in R plots
- color: col or bg parameter used to represent this population in R plots
- IID: Individual ID
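As a rough illustration of how these variables might be consumed, the following Python sketch groups site coordinates by population. It assumes the CSV header row uses exactly the variable names listed above (`Study Group`, `Latitude`, `Longitude`, and so on); the function name `sites_by_group` and the example rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def sites_by_group(csv_text):
    """Group (latitude, longitude) pairs by the 'Study Group' column.

    Assumes the header names match the variable list in this README.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["Study Group"]].append(
            (float(row["Latitude"]), float(row["Longitude"]))
        )
    return dict(groups)


# Made-up rows for illustration only; not taken from the dataset.
example = (
    "Study Group,Latitude,Longitude,Age average,pch,color,IID\n"
    "GroupA,50.0,80.0,3500,21,red,ind1\n"
    "GroupA,50.5,80.5,3400,21,red,ind2\n"
    "GroupB,49.0,82.0,7000,22,blue,ind3\n"
)
print(sites_by_group(example))
```

To apply this to the real file, the text could be read with `open("Koken_Figure_2_map_Lat_Lon_coordinates.csv").read()`, provided the header matches the assumption above.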
File: Koken_Figure_2_PCA.evec.txt.gz
Description: This file contains the eigenvectors from the PCA performed using present-day Eurasian individuals (see Figure 2 and Supplementary Figures 11 and 14).
File: Koken_Figure_2_Radiocarbon_dates.csv
Description: This file contains the metadata, including radiocarbon dates, used for plotting Figure 2.
Variables
- Study Group: Population ID
- Order: The order in which populations should be added (e.g., to plots) in R
- Latitude: Latitude of the excavated site
- Longitude: Longitude of the excavated site
- Dates: Average radiocarbon date of the excavated human remains
- pch: pch parameter used to represent this population in R plots
- color: col or bg parameter used to represent this population in R plots
- IID: Individual ID
File: Koken_ancients_21.ind
Description: ind file in EIGENSTRAT format for the 21 individuals from the Koken site.
File: Koken_Figure_2_PCA.eval
Description: This file contains the eigenvalues from the PCA performed using present-day Eurasian individuals (see Figure 2 and Supplementary Figures 11 and 14).
File: Koken_Figure_2_PCA.evec
Description: This file contains the eigenvectors from the PCA performed using present-day Eurasian individuals (see Figure 2 and Supplementary Figures 11 and 14).
File: Koken_Supplementary_Figure_12_PCA.evec.txt.gz
Description: This file contains the eigenvectors from the PCA performed using present-day western Eurasian individuals (see Supplementary Figures 11, 12, and 14).
File: Koken_Supplementary_Figure_12_PCA.evec
Description: This file contains the eigenvectors from the PCA performed using present-day western Eurasian individuals (see Supplementary Figures 11, 12, and 14).
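For readers who want to load the eigenvector files outside of R, here is a minimal Python sketch. It assumes the `.evec` files follow the usual smartpca layout: a first line beginning with `#eigvals:` followed by the eigenvalues, then one whitespace-separated line per individual with the individual ID, the PC coordinates, and a population label in the last column. That layout is an assumption and should be checked against the actual files; the function name `parse_evec` and the example values are hypothetical.

```python
def parse_evec(text):
    """Parse smartpca-style .evec text.

    Returns (eigenvalues, rows), where each row is
    (individual_id, [pc1, pc2, ...], population_label).
    """
    eigvals, rows = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0].startswith("#"):
            # Header line: "#eigvals:" followed by the eigenvalues.
            eigvals = [float(x) for x in fields[1:]]
            continue
        coords = [float(x) for x in fields[1:-1]]
        rows.append((fields[0], coords, fields[-1]))
    return eigvals, rows


# Made-up content for illustration only; not taken from the dataset.
example_evec = (
    "   #eigvals:   5.0   3.0\n"
    "   ind1   0.01  -0.02  GroupA\n"
    "   ind2  -0.03   0.04  GroupB\n"
)
eigvals, rows = parse_evec(example_evec)
print(eigvals, rows)
```

The gzipped copies (`.evec.txt.gz`) could be read with `gzip.open(path, "rt").read()` from the standard library before passing the text to the parser.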
File: Koken_ancients_21.snp
Description: snp file in EIGENSTRAT format for the 21 individuals from the Koken site.
File: Koken_ancients_21.geno
Description: geno file in EIGENSTRAT format for the 21 individuals from the Koken site.
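The three EIGENSTRAT files are meant to be read together: the `.ind` file lists the individuals, the `.snp` file lists the SNPs, and the `.geno` file holds one line per SNP with one character per individual. The sketch below assumes the unpacked text encoding, in which each character is `0`, `1`, or `2` (an allele count) or `9` (missing), and computes the fraction of missing genotypes per individual. The function name and example lines are hypothetical.

```python
def missing_rate_per_individual(geno_lines):
    """Return the fraction of missing genotypes ('9') per individual.

    Each input line is one SNP; character i on a line belongs to the
    i-th individual listed in the matching .ind file.
    """
    counts = None
    n_snps = 0
    for line in geno_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if counts is None:
            counts = [0] * len(line)
        for i, genotype in enumerate(line):
            if genotype == "9":
                counts[i] += 1
        n_snps += 1
    return [c / n_snps for c in counts] if n_snps else []


# Three made-up SNPs for three made-up individuals.
rates = missing_rate_per_individual(["209", "019", "929"])
print(rates)
```

With the real data, the same function could be fed an open file handle, for example `missing_rate_per_individual(open("Koken_ancients_21.geno"))`, since it only iterates over lines.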
File: Supplementary_Figure_15_DATES_Koken_MLBA_o2.Krasnoyarsk_MLBA_Irtysh_UpperOb_HG.txt
Description: Results from the DATES program modeling Koken_MLBA_o2 as an admixture between Krasnoyarsk_MLBA and Irtysh_UpperOb_HG (see Figure 4 and Supplementary Figure 15).
File: Supplementary_Figure_15_DATES_Zevakinskiy_LBA.Krasnoyarsk_MLBA_Irtysh_UpperOb_HG.txt
Description: Results from the DATES program modeling Zevakinskiy_LBA as an admixture between Krasnoyarsk_MLBA and Irtysh_UpperOb_HG (see Figure 4 and Supplementary Figure 15).
File: Supplementary_Figure_15_DATES_Koken_MLBA_o1.Krasnoyarsk_MLBA_Irtysh_UpperOb_HG.txt
Description: Results from the DATES program modeling Koken_MLBA_o1 as an admixture between Krasnoyarsk_MLBA and Irtysh_UpperOb_HG (see Figure 4 and Supplementary Figure 15).
File: Supplementary_Figure_15_DATES_Koken_MLBA.Sintashta_MLBA_Irtysh_UpperOb_HG.txt
Description: Results from the DATES program modeling Koken_MLBA as an admixture between Sintashta_MLBA and Irtysh_UpperOb_HG (see Figure 4 and Supplementary Figure 15).
File: Supplementary_Figure_15_DATES_Krasnoyarsk_MLBA.Sintashta_MLBA_Irtysh_UpperOb_HG.txt
Description: Results from the DATES program modeling Krasnoyarsk_MLBA as an admixture between Sintashta_MLBA and Irtysh_UpperOb_HG (see Figure 4 and Supplementary Figure 15).
File: Koken_Supplementary_Figure_8_hapROH.csv
Description: Results from the hapROH program detecting runs of homozygosity (ROH) blocks (see Supplementary Figure 8).
File: Koken_Supplementary_Figure_7_pedsim_simulated_IBD.txt
Description: Results from the pedsim program simulating IBD segments for specified kinship relationships (see Supplementary Figure 7).
File: DataProcessing_MPI_batch17_script_250225_TF1-2.txt
Description: The script used for processing raw sequencing data from initial quality control to alignment and filtering.
File: Koken_EN_2_imputation_250228.txt
Description: The script used to run GLIMPSE2 to impute the two Koken EN individuals.
File: Koken_EN_run_ancIBD_250304.txt
Description: The script used to run the ancIBD program on Koken_EN1 and Koken_EN2.
File: run_ancIBD_to_detect_IBD2_241228.txt
Description: The script used to detect IBD2 between Koken_EN1 and Koken_EN2.
File: Koken_MLBA_PMR_script_250508.txt
Description: The script used to calculate pairwise mismatch rates (PMR) to investigate kinship among Koken individuals.
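As background, a pairwise mismatch rate is commonly defined as the proportion of SNPs, among those covered in both individuals, at which the two individuals carry different alleles. The sketch below illustrates that definition only. It assumes pseudo-haploid genotype strings in the EIGENSTRAT character encoding (`9` for missing), and it is not a reproduction of the script in this repository, which may filter or compute things differently.

```python
def pairwise_mismatch_rate(geno_a, geno_b):
    """Return (PMR, number of overlapping SNPs) for two genotype strings.

    Each string holds one character per SNP; '9' marks a missing call.
    SNPs missing in either individual are excluded from the comparison.
    """
    overlap = 0
    mismatches = 0
    for a, b in zip(geno_a, geno_b):
        if a == "9" or b == "9":
            continue
        overlap += 1
        if a != b:
            mismatches += 1
    rate = mismatches / overlap if overlap else float("nan")
    return rate, overlap


# Made-up genotypes for illustration only.
result = pairwise_mismatch_rate("02029", "02209")
print(result)
```

In kinship analyses the PMR of a pair is usually interpreted relative to the baseline PMR among unrelated individuals from the same population, so the raw value alone is not sufficient to assign a degree of relatedness.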
File: Koken_Supplementary_Figure_7_compare_ancIBD_pedsim.zip
Description: The script and input file used to run pedsim to investigate the potential kinship between the two Koken Early Neolithic individuals (see Supplementary Figure 7).
File: Koken_MLBA_qpAdm_script_250522.txt
Description: The script used to run qpAdm to investigate the admixture sources and proportions of the newly sequenced Koken MLBA individuals (see Figure 4).
File: Koken_EN_qpAdm_OGs_script_250512.txt
Description: The script used to run qpAdm to investigate the genetic profiles of the newly sequenced Koken EN individuals (see Figure 3 and Supplementary Figure 13).
File: Koken_EN_qpAdm_script_241226.txt
Description: The script used to run qpAdm to investigate the genetic profiles of the newly sequenced Koken EN individuals (see Figure 3).
File: Koken_f3_f4_PCA_script_250305.txt
Description: The script used to run PCA and to compute f3- and f4-statistics to investigate the genetic profiles of the newly sequenced Koken individuals (see Figure 2 and Supplementary Figures 9-12 and 14).
File: Koken_DATES_script_250425.txt
Description: The script used to run the DATES program for estimating admixture dates (see Figure 4 and Supplementary Figure 15).
File: Koken_DATES_script_250425_2.txt
Description: The script used to run the DATES program for estimating admixture dates (see Figure 4 and Supplementary Figure 15).
Access information
Other publicly accessible location of the data: https://github.com/CWJeongLab/Koken
