The burial complex of the Imdang-Joyeong site at Gyeongsan in southeastern Korea is notable for the large number of tombs constructed within ~100 years (4th-6th centuries CE) as well as the widespread practice of human sacrifice. Analyzing genome-wide data from 78 individuals, we detected 11, 23, and 20 pairs of first-, second-, and third-or-more-distant-degree relatives, respectively, revealing a dense kinship network in the Imdang-Joyeong society. We found 5 individuals born to closely related parents, suggesting the practice of consanguineous marriage among both grave owners and the sacrificed. We also observed adult female descendants buried together with their kin, in contrast to several recent archaeogenetic studies in Europe that report a strict pattern of female exogamy. We detected no discernible genetic difference between grave owners and the sacrificed. Our analysis provides novel bioarchaeological information on the burial customs and social structure of Three-Kingdoms period society in Korea.

Dataset DOI: 10.5061/dryad.fj6q57489

Description of the data and file structure

This repository contains genotype data, scripts, and output files used for the analyses in "Ancient genomes reveal an extensive kinship network and endogamy in a Three-Kingdoms period society in Korea".

Files and variables

File: calculate_PMR.7z

Description: Code used to calculate the pairwise mismatch rate (PMR) from pseudohaploid data. Used for Supplementary Data S4 and S10.

[run_pmr.sh]: Wrapper for running "Calculate_PMR_matrix.R" in a Unix shell.

[Calculate_PMR_matrix.R]: Code for calculating PMR from Eigenstrat-format files.
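For orientation, the PMR idea can be sketched in a few lines of Python. This is not the authors' R script. It assumes the standard Eigenstrat geno encoding (one line per SNP, one character per individual, 0/1/2 for the reference allele count and 9 for missing) and counts, for every pair of individuals, the fraction of jointly covered SNPs at which the two pseudohaploid calls differ. The function name and the toy genotypes are hypothetical.

```python
from itertools import combinations


def pairwise_mismatch_rate(geno_lines, n_individuals):
    """Return {(i, j): (pmr, n_overlap)} for every pair of individuals.

    geno_lines: iterable of strings, one per SNP, one character per individual
    (Eigenstrat geno encoding; '9' marks a missing call).
    """
    pairs = list(combinations(range(n_individuals), 2))
    mismatches = dict.fromkeys(pairs, 0)
    overlaps = dict.fromkeys(pairs, 0)
    for line in geno_lines:
        calls = line.strip()
        if not calls:
            continue  # skip blank lines
        for i, j in pairs:
            a, b = calls[i], calls[j]
            if a == "9" or b == "9":
                continue  # SNP not covered in both individuals
            overlaps[(i, j)] += 1
            if a != b:
                mismatches[(i, j)] += 1
    return {
        pair: (
            mismatches[pair] / overlaps[pair] if overlaps[pair] else None,
            overlaps[pair],
        )
        for pair in pairs
    }


# Toy data: 4 SNPs x 3 individuals (hypothetical values).
toy_geno = ["202", "009", "220", "292"]
result = pairwise_mismatch_rate(toy_geno, 3)
for pair, (pmr, n_overlap) in result.items():
    print(pair, pmr, n_overlap)
```

The overlap count is returned alongside each rate because a PMR based on few shared SNPs is noisy and is usually filtered before interpretation.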

File: eigenstrat_1240K_Imdang_Joyeong.7z

Description: 1240K Eigenstrat-format files of ancient individuals from the Imdang-Joyeong burial complex newly reported in this study. Consists of a geno file, a snp file, and an ind file.
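As a reading aid, the sketch below parses the two text components of an Eigenstrat dataset. It assumes the usual whitespace-delimited layouts: three columns in the ind file (sample ID, sex, population label) and six in the snp file (SNP ID, chromosome, genetic position, physical position, reference allele, alternative allele). The sample IDs, labels, and SNP names in the toy input are hypothetical and do not come from the archive.

```python
from typing import NamedTuple


class IndRecord(NamedTuple):
    sample_id: str
    sex: str
    population: str


class SnpRecord(NamedTuple):
    snp_id: str
    chromosome: str
    genetic_pos: float
    physical_pos: int
    ref: str
    alt: str


def parse_ind(lines):
    """Parse ind-file lines into IndRecord tuples, skipping blank lines."""
    return [IndRecord(*line.split()) for line in lines if line.strip()]


def parse_snp(lines):
    """Parse snp-file lines into SnpRecord tuples, skipping blank lines."""
    records = []
    for line in lines:
        if not line.strip():
            continue
        snp_id, chrom, gpos, ppos, ref, alt = line.split()
        records.append(SnpRecord(snp_id, chrom, float(gpos), int(ppos), ref, alt))
    return records


# Hypothetical toy input, for illustration only.
toy_ind = ["sampleA M GroupX\n", "sampleB F GroupX\n"]
toy_snp = ["snp1 1 0.020130 752566 G A\n", "snp2 1 0.022518 842013 T G\n"]

individuals = parse_ind(toy_ind)
snps = parse_snp(toy_snp)
print(len(individuals), "individuals;", len(snps), "SNPs")
```

The order of lines in the ind file matches the order of characters on each geno line, and the order of lines in the snp file matches the order of geno lines, so these two lists are enough to label a genotype matrix.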

File: run_admixtools2_f3_f4_server.7z

Description: R scripts for calculating f3 and f4 statistics with admixtools 2 on an HPC server, together with the f3 and f4 results used for figures S5 and S6 and Supplementary Data S6, S7, and S8.

[f4_compare.R], [outgroupf3.R]: R scripts for calculating f4 and f3 statistics, respectively.

csv files: f3 and f4 values calculated using [f4_compare.R] and [outgroupf3.R]; identical to the information in Supplementary Data S6, S7, and S8.

File: run_admixtools2_qpAdm_server.7z

Description: R script for running the qpAdm command in admixtools 2 on an HPC server, used for figure 5 and Supplementary Data S9.

[commandline_qpadm.sh]: Main script for running the analysis from the command line.

[runqpadm.wrapper.sh]: Helper script for running qpadm_parallel_250331.R.

[qpadm_parallel_250331.R]: R script for running the qpadm function in the admixtools 2 R library.

File: PCA.7z

Description: Eigenvalues, eigenvectors, the populations used in the PC calculation, and the main code for running smartPCA for figures 5 and S9.

[PCA_240516.eval.txt.gz]: Eigenvalues calculated through smartPCA.

[PCA_240516_X.evec.txt.gz]: Eigenvectors calculated through smartPCA. The number X corresponds to the matching _X.pops file, which specifies the populations used for the PC calculation.

[PCA_code.sh]: Script used to run smartPCA.

[PCA_240516_X.pops]: Modern population sets used to calculate the PCs with smartPCA.

File: run_GLIMPSE_code.7z

Description: Code for running GLIMPSE for ancient DNA imputation required for our IBD analysis.

[commandline_GLIMPSE.sh]: Main code to run the GLIMPSE pipeline.

[wrapper_XXX.sh]: Helper scripts used to prepare and run GLIMPSE pipeline.

[commandline_1000GP_panel.sh]: Script to prepare 1000GP_Phase3 data to use as a reference panel for GLIMPSE

File: run_ped-sim_compare_ancIBD.7z

Description: Code and results for simulating IBD sharing between close kin using ped-sim, for comparison with ancIBD results. Used for Figure S6.

[ped-sim_commandline.sh]: Script used to run ped-sim in a Unix environment.

[merge_IBD_blocks.R]: R script to merge consecutive IBD1 and IBD2 blocks to match ancIBD output.

[ancibd_background.seg]: Segment information (output) of ped-sim.

[ancibd_background.def]: Family relationships specified for ped-sim.

[ancibd_background.csv]: The seg file in csv format.

[1240K.240829.snp]: SNP file of 1240K sites used for filtering ped-sim results to match ancIBD output.
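The merging step performed by merge_IBD_blocks.R can be pictured as ordinary interval merging. The Python sketch below is not the authors' R code. It assumes a simplified, hypothetical segment tuple of (chromosome, start, end, type) and joins segments on the same chromosome whenever one begins where the previous one ends (or within an optional `max_gap`), regardless of whether each piece is labeled IBD1 or IBD2.

```python
def merge_adjacent_segments(segments, max_gap=0.0):
    """Merge touching or overlapping segments per chromosome.

    segments: iterable of (chromosome, start, end, ibd_type) tuples.
    The IBD type is ignored so that consecutive IBD1 and IBD2 blocks
    collapse into one block. Returns sorted (chromosome, start, end) tuples.
    """
    merged = []
    ordered = sorted((chrom, start, end) for chrom, start, end, _ in segments)
    for chrom, start, end in ordered:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2] + max_gap:
            last[2] = max(last[2], end)  # extend the current block
        else:
            merged.append([chrom, start, end])  # start a new block
    return [tuple(block) for block in merged]


# Hypothetical toy segments for one pair of individuals.
toy_segments = [
    (1, 10.0, 20.0, "IBD1"),
    (1, 20.0, 25.0, "IBD2"),
    (1, 25.0, 40.0, "IBD1"),
    (1, 60.0, 70.0, "IBD1"),
    (2, 5.0, 15.0, "IBD2"),
]
merged_blocks = merge_adjacent_segments(toy_segments)
print(merged_blocks)
```

In real use the segments would first be grouped by pair of individuals, since blocks belonging to different pairs must not be merged with one another.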

File: ancIBD_codes_results.7z

Description: Code and results for ancIBD, a Cytoscape file for network analysis and visualization, and an R script for statistical analysis of the IBD network. Used for figures 4, S6, S7, and S8.

[ancIBD_commandline.sh]: Code used to run ancIBD in a Unix environment.

[ancIBD_240720_res.tsv, ancIBD_240720_ch_all.tsv]: IBD sharing information inferred from ancIBD

[Imd_Joy_240914.cys]: Cytoscape file used to analyze and visualize IBD network.

[adultnode_network_statistics.csv]: CSV file exported from Cytoscape containing network statistics such as degree centrality. 1240Kcount is the number of 1240K SNPs that were genotyped, and all1240K is the total number of 1240K SNPs. All other values are unitless.

[network_analysis.R]: R script used to perform statistical analysis on network.

File: KIN_script.bash

Description: Code used to run KIN for figure 3, Supplementary Data S4 and S10.

File: ped-sim_inbreeding.7z

Description: Code and results for simulating ROH under four scenarios of inbreeding using ped-sim, used for figure S5.

[.def]: Files specifying kinship scenarios for simulation using ped-sim.

[.seg]: Segment files of IBD and ROH information simulated using ped-sim.

[run_pedsim_commandline.sh]: Script used to run ped-sim in a Unix environment.

File: hapROH.7z

Description: Code and result for estimating ROH from pseudohaploid data using hapROH, used for figures 2 and S4.

[Commandline_Imdang_hapROH_script.bash]: Main script for running hapROH

[run_hapROH.py, postprocess_hapROH.py]: Helper scripts for running hapROH

[Imdang_hapROH.combined.tsv]: ROH information of individuals generated by hapROH.

[individual_ROH]: Tables containing output of ROH information inferred through hapROH.

File: 1240K.KIN.bedfile.bed

Description: BED file used by KIN_script.bash. The file has five tab-delimited columns: chromosome, 0-based start position of the 1240K SNP, end position of the 1240K SNP, reference allele, and alternative allele.
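The five-column layout can be read with a short parser such as the one below. It is a minimal sketch with hypothetical names and toy coordinates. It also assumes the common BED convention in which the start is 0-based and the end is exclusive, so that the 1-based SNP position equals start + 1; that convention is not stated explicitly for this file and is worth verifying against the data.

```python
from typing import NamedTuple


class SiteRecord(NamedTuple):
    chromosome: str
    start0: int  # 0-based start position
    end: int     # end position as given in the file
    ref: str
    alt: str


def parse_site_bed(lines):
    """Parse five-column, tab-delimited site lines into SiteRecord tuples."""
    records = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(
                f"line {line_number}: expected 5 columns, found {len(fields)}"
            )
        chrom, start, end, ref, alt = fields
        records.append(SiteRecord(chrom, int(start), int(end), ref, alt))
    return records


# Hypothetical toy lines, not taken from the archive.
toy_bed = ["1\t99\t100\tA\tG\n", "2\t499\t500\tC\tT\n"]
sites = parse_site_bed(toy_bed)

# 1-based positions under the assumed 0-based, half-open convention.
one_based = [site.start0 + 1 for site in sites]
print(one_based)
```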

Code/software

All shell scripts are written with Bash in mind.

For high-performance computing, we used Slurm 22.05.11 as the scheduler.

All other program versions and software used are described in the Material and Methods section of our manuscript.

Access information

Other publicly accessible locations of the data:

https://github.com/CWJeongLab/Imdang_Joyeong

Data from: Ancient genomes reveal an extensive kinship network and endogamy in a Three-Kingdoms period society in Korea
