Data from: Inference of selective force on house mice genomes during secondary contact in East Asia
Data files
Mar 12, 2024 version, files total 166.16 MB
- README.md
- Supplemental_Data.zip
Abstract
The house mouse (Mus musculus), commensal to humans, has spread globally via human activities, leading to secondary contact between genetically divergent subspecies. This pattern of genetic admixture can provide insights into the selective forces at play in this well-studied model organism. Our analysis of 163 house mouse genomes, mainly from East Asia, revealed substantial admixture between the subspecies castaneus and musculus, particularly in Japan and southern China. We revealed, despite the admixture, that all Y chromosomes in the East Asian samples belonged to the musculus-type haplogroup, potentially explained by genomic conflict under sex ratio distortion due to varying copy numbers of ampliconic genes on sex chromosomes. We also investigated the influence of natural selection on the post-hybridization of the subspecies castaneus and musculus in Japan. Even though the genetic background of most Japanese samples closely resembles the subspecies musculus, certain genomic regions overrepresented the castaneus-like genetic components, particularly in immune-related genes. Furthermore, a large genomic block containing a vomeronasal/olfactory receptor gene cluster predominantly harbored castaneus-type haplotypes in the Japanese samples, highlighting the possible role of olfaction-based recognition in shaping hybrid genomes.
README: Title of Dataset
Inference of Selective Force on House Mice Genomes during Secondary Contact in East Asia
Analysis of 163 house mouse genomes, mainly from East Asia, revealed considerable admixture between the castaneus and musculus subspecies, especially in Japan and southern China. Despite the admixture, all Y chromosomes in the East Asian samples belonged to the musculus haplogroup, which could be explained by genomic conflict under sex ratio distortion due to differing copy numbers of ampliconic genes on the sex chromosomes.
This dataset contains configuration and script files for the simulation of X-chromosome, Y-chromosome, and mitochondrial genome introgression under sex ratio distortion, as well as for the demographic inference and admixture simulation of Mus musculus.
First Author Information
Name: Kazumichi Fujiwara
Institution: (1) Graduate School of Information Science and Technology, Hokkaido University/(2) Mouse Genomics Resource Laboratory, National Institute of Genetics
Address: (1) Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan/(2) 1111 Yata, Mishima, Shizuoka, 411-8540, Japan
ORCID: https://orcid.org/0000-0002-7840-4676
Email: kazumichi.m.fujiwara@gmail.com
Corresponding author Information/Principal Investigator
Name: Naoki Osada
Institution: Graduate School of Information Science and Technology, Hokkaido University
Address: Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan
ORCID: https://orcid.org/0000-0003-0180-5372
Email: nosada@ist.hokudai.ac.jp
Information about funding sources that supported our research
The Ministry of Education, Culture, Sports, Science and Technology (MEXT) KAKENHI grants 18H05511, 23H04846, and 18H05508
Description of the data and file structure
/alpha_simulation/5pop_gf.est
Estimation file defining the parameters used in fastsimcoal2.
/alpha_simulation/5pop_gf.tpl
Template file defining the demographic model of house mouse subspecies used in fastsimcoal2.
/alpha_simulation/5pop_gf_MSFS.obs
Observed minor site frequency spectrum (MSFS) file of house mouse subspecies used in fastsimcoal2.
/alpha_simulation/alpha_simulation.py
Python script that generates 20-kbp genomic fragments from samples using the maximum-likelihood estimates and estimates alpha 100,000 times.
/sex_ratio_distortion_simulation/XYsimulation.py
Python script used as a module by simulation_main.py; it performs the sex-ratio distortion simulation with gene flow.
/sex_ratio_distortion_simulation/simulation_main.py
Main wrapper Python script that simulates sex-ratio distortion using seven command-line parameters: RAA, RXA, RMA, RXY, alpha, Nm, and r.
/sex_ratio_distortion_simulation/simulation_repeat.py
Sub-wrapper Python script that simulates sex-ratio distortion using seven command-line parameters: RAA, RXA, RMA, RXY, alpha, Nm, and r.
/sex_ratio_distortion_simulation/results
Contains the result files used in the manuscript.
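After unpacking Supplemental_Data.zip, the layout listed above can be checked with a short script. This is a minimal sketch, not part of the dataset: the function name `missing_paths` and the idea that the listed paths sit directly under the extraction folder are assumptions, so adjust the root directory to wherever the archive was actually unpacked.

```python
from pathlib import Path

# Paths as listed in this README, relative to the extraction root.
# Assumption: the archive unpacks with these paths directly under the root.
EXPECTED_PATHS = [
    "alpha_simulation/5pop_gf.est",
    "alpha_simulation/5pop_gf.tpl",
    "alpha_simulation/5pop_gf_MSFS.obs",
    "alpha_simulation/alpha_simulation.py",
    "sex_ratio_distortion_simulation/XYsimulation.py",
    "sex_ratio_distortion_simulation/simulation_main.py",
    "sex_ratio_distortion_simulation/simulation_repeat.py",
    "sex_ratio_distortion_simulation/results",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_paths("Supplemental_Data")` returns an empty list when every listed file and the results directory are present; the folder name here is a placeholder.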
Sharing/Access information
Links to other publicly accessible locations of the data:
- to be updated
Data was derived from the following sources:
- PRJDB11969, PRJDB11027, PRJEB9450, PRJEB11742, PRJEB14167, PRJEB2176, PRJDB16017
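The accession prefixes indicate which archive registered each BioProject: PRJDB accessions are issued by DDBJ and PRJEB accessions by ENA. The sketch below groups the accessions listed above by archive; the helper name `group_by_archive` is illustrative and not part of the dataset.

```python
from collections import defaultdict

# BioProject accessions listed in this README.
ACCESSIONS = [
    "PRJDB11969", "PRJDB11027", "PRJEB9450", "PRJEB11742",
    "PRJEB14167", "PRJEB2176", "PRJDB16017",
]

# Registering archive by accession prefix.
PREFIX_TO_ARCHIVE = {"PRJDB": "DDBJ", "PRJEB": "ENA", "PRJNA": "NCBI"}


def group_by_archive(accessions):
    """Group BioProject accessions by the archive implied by their prefix."""
    groups = defaultdict(list)
    for acc in accessions:
        archive = PREFIX_TO_ARCHIVE.get(acc[:5], "unknown")
        groups[archive].append(acc)
    return dict(groups)
```

Because the INSDC archives exchange records, a project registered at one archive can usually be found through the others as well.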
Code/Software
The scripts were written in Python 3.
The dependency packages are as follows:
matplotlib
numpy
msprime
tskit
pandas
IPython
allel (https://github.com/cggh/scikit-allel)
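Before running the scripts, it may help to confirm that the packages above can be imported in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is an assumption for illustration. Note that the `allel` module is distributed on PyPI as `scikit-allel`.

```python
import importlib.util

# Import names of the dependencies listed in this README.
REQUIRED_MODULES = [
    "matplotlib", "numpy", "msprime", "tskit", "pandas", "IPython", "allel",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules()` returns an empty list when every dependency is installed; any names it returns still need to be installed.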