PRDM9-mediated reproductive isolation was first described in the progeny of Mus musculus musculus (MUS) PWD/Ph and Mus musculus domesticus (DOM) C57BL/6J inbred strains. These male F1 hybrids fail to complete chromosome synapsis and arrest meiosis at prophase I, due to incompatibilities between the Prdm9 gene and the hybrid sterility locus Hstx2. We identified fourteen alleles of Prdm9 in exon 12, which encodes the DNA-binding domain of the PRDM9 protein, in outcrossed wild mouse populations from Europe, Asia, and the Middle East; eight of these alleles are novel. The same Prdm9 allele was found in all mice bearing introgressed t-haplotypes, which encompass Prdm9 and inversions preventing recombination with wild-type Chr 17. We asked whether seven novel Prdm9 alleles in MUS populations and the t-haplotype allele in one MUS and three DOM populations induce Prdm9-mediated reproductive isolation. The results show that only combinations of the dom2 allele of DOM origin and the MUS msc1 allele ensure complete infertility of intersubspecific hybrids outside the context of inbred mouse strains. The results further indicate that the erasure of PRDM9 msc1 binding motifs may be shared by MUS mice from populations with different Prdm9 alleles, implying that erased PRDM9 binding motifs may be uncoupled from their corresponding PRDM9 zinc finger arrays at the population level. Our data corroborate the model of Prdm9-mediated hybrid sterility beyond inbred strains of mice and suggest that sterility alleles of Prdm9 may be rare.
Data from: Natural variation in the zinc-finger-encoding exon of Prdm9 affects hybrid sterility phenotypes in mice
Data files
Oct 30, 2024 version files 405.83 MB
- AsynapsisData.txt (16.99 KB)
- Fertility_Data.xls (649.22 KB)
- pwmscan_mm10_cst01.bed (21.46 MB)
- pwmscan_mm10_dom02.bed (8.67 MB)
- pwmscan_mm10_dom03.bed (9.14 MB)
- pwmscan_mm10_dom04.bed (8.84 MB)
- pwmscan_mm10_dom05.bed (10.08 MB)
- pwmscan_mm10_dom06.bed (18.66 MB)
- pwmscan_mm10_dom07.bed (14 MB)
- pwmscan_mm10_dom08.bed (10.08 MB)
- pwmscan_mm10_dom09.bed (19.24 MB)
- pwmscan_mm10_dom10.bed (8.46 MB)
- pwmscan_mm10_dom11.bed (9.39 MB)
- pwmscan_mm10_dom12.bed (17.45 MB)
- pwmscan_mm10_HumB.bed (3.75 MB)
- pwmscan_mm10_mmt01.bed (5.46 MB)
- pwmscan_mm10_msc01.bed (15.52 MB)
- pwmscan_mm10_msc02.bed (19.20 MB)
- pwmscan_mm10_msc03.bed (17.38 MB)
- pwmscan_mm10_msc04.bed (20.51 MB)
- pwmscan_mm10_msc05.bed (9.12 MB)
- pwmscan_mm10_msc06.bed (11.72 MB)
- pwmscan_mm10_msc07.bed (8.73 MB)
- pwmscan_mm10_msc08.bed (43.17 MB)
- pwmscan_mm10_msc09.bed (24.31 MB)
- pwmscan_mm10_msc10.bed (11.94 MB)
- pwmscan_mm10_msc11.bed (33.03 MB)
- pwmscan_mm10_msc12.bed (25.83 MB)
- README.md (21.33 KB)
Abstract
Title of Dataset: Natural variation in Prdm9 affecting hybrid sterility phenotypes
Author/Principal Investigator Information
Name: Dr. Linda Odenthal-Hesse
ORCID: 0000-0002-5519-2375
Institution: Max Planck Institute for Evolutionary Biology, Plön
Address: August-Thienemann Str. 2, 24306 Ploen, Germany
Email: odenthalhesse@evolbio.mpg.de
Author/Associate or Co-investigator Information
Name: Professor Jiri Forejt
ORCID: 0000-0002-2793-3623
Institution: Laboratory of Mouse Molecular Genetics, Institute of Molecular Genetics, Czech Academy of Sciences
Address: IMG Biocev, Průmyslová 595, 252 50 Vestec, Czech Republic
Email: jiri.forejt@img.cas.cz
Author/Alternate Contact Information
Name: Dr. Khawla Fathi Nefe Abu Alia
ORCID: 0000-0002-7534-4014
Institution: Max Planck Institute for Evolutionary Biology, Plön
Address: August-Thienemann Str. 2, 24306 Ploen, Germany
Email: abualia@evolbio.mpg.de
Date of data collection: September 5th, 2016 - December 12th, 2022
Geographic location of data collection: Ploen, Germany (54.1613° N, 10.4259° E) and Vestec, Czech Republic (49.9805° N, 14.5049° E)
Information about funding sources that supported the collection of the data: the Max Planck Society, the DFG (grant No. OD112/1-1 to LOH), the DAAD (57334341 to JF and LOH), and the Czech Science Foundation (grant No. 20-04075S to JF) provided funding for this project
DATA & FILE OVERVIEW
File List:
- Fertility_Data.xls: table with genotyping information sorted by wild mouse Prdm9 allele. Each sheet covers several crossing schemes, which are outlined on the sheet itself, together with matched fertility phenotyping and genotyping information for Prdm9, Hstx2 and t-haplotype markers.
- AsynapsisData.txt: table with the raw counts of chromosomal asynapsis used to determine the percentage of asynaptic cells.
- pwmscan_mm10_*.bed: genome-wide predictions of DNA binding for each PRDM9 variant, computed using PWMScan by Giovanna Ambrosini et al. https://doi.org/10.1093/bioinformatics/bty127
Additional related data collected that was not included in the current data package: fasta files and electropherograms of resequencing data as well as microsatellite trace files and AFLP traces are available upon request.
Are there multiple versions of the dataset? No
DATA-SPECIFIC INFORMATION FOR: BED files: pwmscan_mm10_cst01.bed, pwmscan_mm10_HumB.bed, pwmscan_mm10_dom02.bed, pwmscan_mm10_dom03.bed, pwmscan_mm10_dom04.bed, pwmscan_mm10_dom05.bed, pwmscan_mm10_dom06.bed, pwmscan_mm10_dom07.bed, pwmscan_mm10_dom08.bed, pwmscan_mm10_dom09.bed, pwmscan_mm10_dom10.bed, pwmscan_mm10_dom11.bed, pwmscan_mm10_dom12.bed, pwmscan_mm10_mmt01.bed, pwmscan_mm10_msc01.bed, pwmscan_mm10_msc02.bed, pwmscan_mm10_msc03.bed, pwmscan_mm10_msc04.bed, pwmscan_mm10_msc05.bed, pwmscan_mm10_msc06.bed, pwmscan_mm10_msc07.bed, pwmscan_mm10_msc08.bed, pwmscan_mm10_msc09.bed, pwmscan_mm10_msc10.bed, pwmscan_mm10_msc11.bed, pwmscan_mm10_msc12.bed
BED format is a text file format used to store genomic regions as coordinates and annotations.
METHODOLOGICAL INFORMATION
The supplied BED files are the output of the genome-wide prediction of DNA binding for each PRDM9 variant, computed using PWMScan by Giovanna Ambrosini et al. https://doi.org/10.1093/bioinformatics/bty127. Predictions were run with the following inputs. Target database or sequence set: genome assembly Mus musculus (March 2012, GRCm38/mm10). Weight matrix: supplied as a Custom Weight Matrix. In the field “Paste Matrix”, we supplied the transformed positional weight matrices of the PRDM9 DNA-binding motifs (reverse complements), computed using the Polynomial Kernel Method by Persikov et al. (2009) and Persikov and Singh (2014) on translated nucleotide sequences of the alleles in this study and in the study of Mukaj et al. (2020). The BED files are named as follows: pwmscan (for the method), mm10 (for the genome assembly used), and the name of the PRDM9 variant/allele as assigned by the International Committee on Standardized Genetic Nomenclature for Mice (MGI), or the name of the GenBank annotation for the human variant B.
For example, pwmscan_mm10_msc12.bed is the BED output file of PWMScan on the mm10 reference genome for mouse PRDM9 variant msc12, and pwmscan_mm10_HumB.bed is the BED output file of PWMScan on the mm10 reference genome for human PRDM9 variant B.
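The naming scheme and the BED layout described above could be handled along the following lines. This is a minimal sketch, not part of the original analysis: the helper names `parse_filename` and `read_bed` are hypothetical, and the sketch only assumes the three mandatory BED columns (chromosome, start, end), keeping any further columns as uninterpreted strings, since the exact column layout of the PWMScan output is not described here.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple

# Pattern for the naming scheme pwmscan_<assembly>_<variant>.bed
FILENAME_PATTERN = re.compile(r"^pwmscan_(?P<assembly>[^_]+)_(?P<variant>[^.]+)\.bed$")


class BedRecord(NamedTuple):
    chrom: str
    start: int              # BED start coordinate (0-based)
    end: int                # BED end coordinate (exclusive)
    extra: Tuple[str, ...]  # any further columns, left uninterpreted


def parse_filename(name: str) -> Optional[dict]:
    """Split a file name into assembly and PRDM9 variant (hypothetical helper)."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None


def read_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield one record per data line, skipping blank and header lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]), tuple(fields[3:]))
```

To read a file from disk, an open file handle can be passed directly, for example `with open("pwmscan_mm10_msc12.bed") as handle: records = list(read_bed(handle))`. For the larger files it may be preferable to iterate over the generator rather than build a list.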
DATA-SPECIFIC INFORMATION FOR: Fertility_Data.xls
People involved with sample collection, processing, analysis and/or submission: All individual mice were assigned a unique identifier number by the mouse house team led by Christine Pfeifle. Heike Harre determined body weight and weight of paired testes, and performed sperm counting. Nicole Thomsen performed genotyping of sperm count slurry samples with the help of Tugce Cimen, Christina Nimke, Olga Eitel, Tjorben Nawroth, Florian Schroeder, Meri Nehlsen, Lukas Krueger and Laurin Seeger. Khawla Fathi Nefe Abu Alia generated the attached dataset by analysing the Prdm9 sequencing information, microsatellite traces of Hstx2 markers and t-haplotype AFLP genotyping, and matched the phenotyping information obtained by Heike Harre to this genotyping information using the unique identifiers of each mouse.
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data: Fertility parameters were measured in control and hybrid males at 60 to 120 days of age. We collected three quantitative phenotypes: body weight (BW), paired testes weight (TW), and the count of spermatozoa isolated from epididymal tissue, in millions (SC). To obtain spermatozoa, one whole epididymis, including caput, corpus, and cauda, was placed and cut in 1 ml of cold phosphate-buffered saline (PBS). The tube was vigorously shaken for 2 minutes, and the released spermatozoa were diluted 1:40 in PBS. We counted 10 µl of diluted spermatozoa in a Bürker chamber (0.1 mm chamber height), counting two replicates of 25 squares. To approximate the spermatozoa released from a pair of epididymides, we added the two replicate 25-square counts (A25 + B25) obtained from spermatozoa released from a single epididymis.
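The arithmetic on the two replicate counts can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: the function names are hypothetical, and the conversion to million per milliliter is deliberately left out because the chamber volume factor is not given in this description.

```python
def paired_count(a25: int, b25: int) -> int:
    """Sum of the two replicate 25-square counts (A25 + B25), used to
    approximate the spermatozoa released from a pair of epididymides."""
    return a25 + b25


def average_count(a25: int, b25: int) -> float:
    """Mean of the two replicate 25-square counts."""
    return (a25 + b25) / 2


def count_times_dilution(count: float, dilution_factor: float) -> float:
    """A raw count multiplied by the dilution factor (40 for a 1:40 dilution).
    Which raw count enters this product in the data table is an assumption
    left to the caller."""
    return count * dilution_factor
```

For example, replicate counts of 12 and 14 give a paired count of 26 and an average of 13.0; at the 1:40 dilution described above, the paired count scales to 1040.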
All F1 and F2 hybrid offspring used in the experiments were instead genotyped from the counted sperm sample taken from one of the epididymides. The slurry of isolated spermatozoa with epididymal tissues was processed with Proteinase K and 1% SDS with 0.01 M TCEP (Thermo Scientific 77720, 0.5 M). This results in a mixture of DNA that comes from somatic cells as well as sperm cells.
Prdm9 genotyping: The ZNF arrays of each mouse were PCR amplified similarly as in (BUARD et al. 2014) on 10-30 ng of genomic DNA in 12 µl reactions of the PCR buffer AJJ from (JEFFREYS et al. 1990), using a two-polymerase system with Thermo Taq-Polymerase (EP0405) and Stratagene Pfu Polymerase (600159) to ensure high-fidelity PCR. When offspring were heterozygous for two alleles of different lengths (the majority of cases), we separated the heterozygous bands by gel electrophoresis on Low Melting agarose (Thermo Fisher #R0801), excising the bands and eluting the DNA using Agarase (Thermo Fisher #EO0461). If two heterozygous bands were apparent, the excised and eluted product was used immediately in sequencing reactions after estimating the amount of DNA from the gel. If only one band was evident, alleles could not be separated by size. In that case, the purified PCR products were cloned using the TOPO TA Cloning Kit for Sequencing (Life Technologies no. 450030), following the manufacturer's specifications, before sequencing. We analyzed at least eight clones per sample.
Sequencing reactions of either eluted PCR product or picked clones were set up using BigDye 3.0 according to the manufacturer's protocol, then purified using X-terminator, and finally sequenced on a 3130x Genetic Analyser. Only PRDM9 variants with fewer than 12 ZNFs (sequence lengths of >1200 bp) could be sequenced up to their ends in both directions. To achieve accurate assembly of the ZNF-coding minisatellite, we assembled the forward and reverse sequences in Geneious software (version 10.2-11), guided by the fragment sizes estimated from PCR products on gels.
Genotyping of t-haplotypes and X-chromosomal haplotypes: The presence of the t-haplotype was tested using markers Tcp1 and Hba-4ps (PLANCHART et al. 2000), and X-chromosomal haplotypes across the refined Hstx2 interval were tested using primers from (LUSTYK et al. 2019). Each forward primer was labeled with either HEX or FAM and amplified using the ABI Multiplex Kit according to the manufacturer's protocol using primers from (PLANCHART et al. 2000).
Methods for processing the data:
Instrument- or software-specific information needed to interpret the data: Fragment lengths were analyzed by capillary electrophoresis using a 3730 DNA Analyser with POP7 polymer. Allele sizes were scored and binned using the Microsatellite plugin in Geneious v.10.2.
Describe any quality-assurance procedures performed on the data: All genotyping was done so that the experimenter was blind to the matching fertility phenotypes. Furthermore, after successful mating (> 5 male offspring), we culled all F0 males from all crosses to confirm initial genotyping. We counted two replicates of 25 squares. In cases when only a few (<10) spermatozoa were found, additional dilutions were prepared and counted.
Number of variables: thirty-one (plus one field for comments)
Number of cases/rows: 653 (across 15 sheets)
Variable List:
(A) crossing scheme: intersubspecific, intrasubspecific or intersubspecific humanized.
(B) parental cross: with the line/strain information of the dam and sire
(C) Offspring IDs: 8-digit unique mouse identifier number
(D) DOB: date of birth in European short format DD/MM/YYYY
(E) age: age of the mouse at phenotyping in days
(F) Sac date: date of sacrifice and phenotyping
(G) maternal Prdm9: maternal Prdm9 allele determined by genotyping offspring
(H) Paternal Prdm9: paternal Prdm9 allele determined by genotyping offspring
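(I) tcp1: genotype at the t-haplotype marker Tcp1
(J) Hba-4ps: genotype at the t-haplotype marker Hba-4ps
(K) Maternal genome: origin of the maternal genome
(L) Paternal genome: origin of the paternal genome
(M) Hstx2 origin: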
(N) ChrX:65,100,392-65,100,563: microsatellite allele (fragment length) at marker SX65100
(O) ChrX:67,841,601-67,841,743: microsatellite allele (fragment length) at marker SR51
(P) chrX:69,084,174-69,084,417: microsatellite allele (fragment length) at marker SX69084
(Q) body weight (g) BW: body weight of mouse
(R) testes weight (g) TW: weight of paired testes
(S) TW/BW: Testes weight as a proportion of body weight in %
(T) average sperm count SC (M/ml): sperm count in million per milliliter
(U) sperm count x dilution factor: raw sperm count multiplied by the dilution factor
(V) dilution amount (µl): amount of PBS into which the spermatozoa from the epididymides were released
(W) Dilution factor: factor by which the released spermatozoa were diluted before counting.
(X) paired 25 squares average count: average number of spermatozoa counted in 25 squares of a Bürker chamber
(Y) how many epididymides counted: the number of epididymides from which spermatozoa were released. This could be either one or two, as some mice had only one testis, and in some cases the second epididymis was used for a replicate count in Prague.
(Z) single count 25 squares: two replicates of 25 squares of the Bürker chamber were counted; this field denotes the count obtained in the first replicate
(AA) single count 25 squares replicate: two replicates of 25 squares of the Bürker chamber were counted; this field denotes the count obtained in the second replicate
(AB) mother name: Name assigned to the mother of the phenotyped F1 hybrid mouse
(AC) mother ID: 8-digit unique mouse identifier number of the mother (dam) used in this cross
(AD) father ID: 8-digit unique mouse identifier number of the father (sire) used in this cross
(AE) grandfather ID: 8-digit unique mouse identifier number of the wild mouse grandfather (grand sire) used in this cross
(AF) comment: field for additional information
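Two of the variables above are derived from other columns, and a sketch of how they could be recomputed as a consistency check is given here. The function names are hypothetical, and the sketch assumes that the sacrifice date (F) uses the same DD/MM/YYYY format as the date of birth (D), which is not stated explicitly.

```python
from datetime import date, datetime


def parse_european_date(text: str) -> date:
    """Parse a date in European short format DD/MM/YYYY."""
    return datetime.strptime(text.strip(), "%d/%m/%Y").date()


def age_in_days(dob: str, sac_date: str) -> int:
    """Age at phenotyping (column E), as days between DOB and Sac date."""
    return (parse_european_date(sac_date) - parse_european_date(dob)).days


def relative_testes_weight(testes_weight_g: float, body_weight_g: float) -> float:
    """TW/BW (column S): paired testes weight as a percentage of body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * testes_weight_g / body_weight_g
```

Parsing the dates explicitly avoids the day/month ambiguity that arises when spreadsheet software guesses a US-style format for values such as 05/09/2016.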
DATA-SPECIFIC INFORMATION FOR: AsynapsisData.txt
People involved with sample collection, processing, analysis and/or submission: All individual mice were assigned a unique identifier number by the mouse house team led by Christine Pfeifle. Emil Parvanov and Amisa Mukaj performed spreading of murine spermatocytes as well as staining for meiotic proteins, and determined the number of healthy cells, the number of asynapsed cells and the number of asynapses per asynapsed cell. Heike Harre determined body weight and weight of paired testes, and performed sperm counting. Nicole Thomsen performed genotyping of sperm count slurry samples with the help of Tugce Cimen, Christina Nimke, Olga Eitel, Tjorben Nawroth, Florian Schroeder, Meri Nehlsen, and Lukas Krueger. Khawla Fathi Nefe Abu Alia generated the attached dataset by analysing the Prdm9 sequencing information, microsatellite traces of Hstx2 markers and t-haplotype genotyping, and matched the phenotyping information obtained by Heike Harre to this genotyping information using the unique identifiers of each mouse.
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data: Spermatocyte nuclei were spread for immunohistochemistry as described in (ANDERSON et al. 1999), with the following modifications. First, a single-cell suspension of spermatogenic cells from the whole testis was prepared in 0.1 M sucrose solution. The sucrose-cell slurry, with added protease inhibitors (Roche 11836153001), was then dropped onto paraformaldehyde-treated glass slides. Glass slides were kept in a humidifying chamber for 3 hours at 4 °C to allow cells to spread and fix. Slides were briefly washed in distilled water and transferred to pure PBS before blocking in PBS with 5 vol% goat serum. Primary antibodies HORMAD2 (a gift from Attila Toth, rabbit polyclonal antibody, 1:700), SYCP3 (mouse monoclonal antibody, Santa Cruz, #74569, 1:50), γH2AX (ab2893, 1:1000), and CEN (autoimmune serum, AB-Incorporated, 15-235) were used for immunolabelling. Secondary antibodies goat anti-Mouse IgG-AlexaFluor568 (MolecularProbes, A-11031), goat anti-Rabbit IgG-AlexaFluor647 (MolecularProbes, A-21245), goat anti-Human IgG-AlexaFluor647 (MolecularProbes, A-21445), and goat anti-Rabbit IgG-AlexaFluor488 (MolecularProbes, A-11034) were used at 1:500 dilution at room temperature for one hour.
Instrument- or software-specific information needed to interpret the data: A Nikon Eclipse 400 microscope with a motorized stage control was used for image acquisition with a Plan Fluor objective, 60x (MRH00601). Images were captured with a DS-QiMc monochrome CCD camera and the NIS-Elements program (from Nikon). ImageJ software was used to process the images.
Describe any quality-assurance procedures performed on the data: All asynapsis phenotyping was done so that the experimenter was blind to the Prdm9 genotypes of the samples.
Number of variables: fifty-seven
Number of cases/rows: sixty-three
Variable List:
(A) crossing scheme: detailed description of the parental crossing scheme, with mothers (dam) named first, and fathers (sire) named second
(B) Mouse ID: 8-digit unique mouse identifier number
(C) DOB: date of birth in European short format DD/MM/YYYY
(D) age: age of the mouse at phenotyping in days
(E) Sac date: date of sacrifice and phenotyping
(F) maternal PRDM9: maternal Prdm9 allele determined by genotyping offspring
(G) Paternal PRDM9: paternal Prdm9 allele determined by genotyping offspring
(H) Chr17 haplotype: t/wt denotes heterozygosity for the t-haplotype and wild-type Chromosome 17.
(I) maternal genome: origin of maternal genome, either PWD = PWD/Ph strain, B6.DX1s = strain, B6 = C57BL/6J, wildDOM (wild mouse Mus musculus domesticus), wildMUS (wild mouse Mus musculus musculus)
(J) paternal genome: origin of paternal genome, either B6 = C57BL/6J, B6.DX1s = strain, wildDOM (wild mouse Mus musculus domesticus), wildMUS (wild mouse Mus musculus musculus)
(K) X-chromosomal haplotype: X-chromosomal haplotypes were either inferred in male offspring with known homozygous maternal X-chromosomal haplotype (i.e. PWD/Ph), or determined by genotyping offspring of mothers with heterozygous X-chromosomal haplotypes at three microsatellite markers SR51, SX65100 and SX69084
(L) body weight (g): body weight of mouse
(M) testes weight (g): weight of paired testes
(N) TW/BW: Testes weight as a proportion of body weight in %
(O) average sperm count SC (M/ml): sperm count in million per milliliter
(P) sperm count x dilution factor: raw sperm count multiplied by the dilution factor
(Q) dilution amount (µl): amount of PBS into which the spermatozoa from the epididymides were released
(R) Dilution factor: factor by which the released spermatozoa were diluted before counting.
(S) paired 25 squares average count: average number of spermatozoa counted in 25 squares of a Bürker chamber
(T) how many epididymides counted: the number of epididymides from which spermatozoa were released. This could be either one or two, as some mice had only one testis, and in some cases the second epididymis was used for a replicate count in Prague.
(U) single count 25 squares: two replicates of 25 squares of the Bürker chamber were counted; this field denotes the count obtained in the first replicate
(V) single count 25 squares replicate: two replicates of 25 squares of the Bürker chamber were counted; this field denotes the count obtained in the second replicate
(W) mother name: Name assigned to the mother of the phenotyped F1 hybrid mouse
(X) mother ID: 8-digit unique mouse identifier number of the mother (dam) used in this cross
(Y) father ID: 8-digit unique mouse identifier number of the father (sire) used in this cross
(Z) grandfather ID: 8-digit unique mouse identifier number of the wild mouse grandfather (grand sire) used in this cross
(AA) comments: mice with single testis, instead of two testes or additional counts performed in Prague are noted here
(AB) Number cells counted: Number of cells from spermatocyte spreads, which were assessed for asynapsis
(AC) Healthy cells number: Number of healthy appearing cells in spermatocyte spreads
(AD) Healthy cells %: The percentage of healthy cells from the total number of assessed cells from spermatocyte spreads
(AE) Total Number AS cells: The number of cells showing at least one chromosome asynapsis event, visualized by antibody staining for HORMAD2 protein
(AF) AS cells vs. total: The percentage of cells with at least one chromosomal asynapsis event, from the total number of assessed cells from spermatocyte spreads
(AG) 1: Number of times an asynapsis event was observed for a single homolog pair
(AH) 2: Number of times an asynapsis event was observed for two homolog pairs
(AI) 3: Number of times an asynapsis event was observed for three homolog pairs
(AJ) 4: Number of times an asynapsis event was observed for four homolog pairs
(AK) 5: Number of times an asynapsis event was observed for five homolog pairs
(AL) 6: Number of times an asynapsis event was observed for six homolog pairs
(AM) 7: Number of times an asynapsis event was observed for seven homolog pairs
(AN) 8: Number of times an asynapsis event was observed for eight homolog pairs
(AO) 9: Number of times an asynapsis event was observed for nine homolog pairs
(AP) 10: Number of times an asynapsis event was observed for ten homolog pairs
(AQ) 11: Number of times an asynapsis event was observed for eleven homolog pairs
(AR) 12: Number of times an asynapsis event was observed for twelve homolog pairs
(AS) 13: Number of times an asynapsis event was observed for thirteen homolog pairs
(AT) 14: Number of times an asynapsis event was observed for fourteen homolog pairs
(AU) 15: Number of times an asynapsis event was observed for fifteen homolog pairs
(AV) 16: Number of times an asynapsis event was observed for sixteen homolog pairs
(AW) 17: Number of times an asynapsis event was observed for seventeen homolog pairs
(AX) 18: Number of times an asynapsis event was observed for eighteen homolog pairs
(AY) 19: Number of times an asynapsis event was observed for nineteen homolog pairs
(AZ) 20: Number of times an asynapsis event was observed for twenty homolog pairs
(BA) 21: Number of times an asynapsis event was observed for twenty-one homolog pairs
(BB) 22: Number of times an asynapsis event was observed for twenty-two homolog pairs
(BC) Asynapsis per asynaptic cell: average number of homolog pairs involved in asynaptic events, across all asynaptic cells of the given individual
(BD) Asynapsis number per total cells: the proportion of asynaptic homolog pairs from the total number of cells analyzed for a given individual
(BE) % cells w/o sex body: the percentage of cells from the total number of cells, that do not have a fully formed sex body (cloud of HORMAD2 staining of the nonhomologous parts of XY sex chromosomes that are normally observed in physiologically progressing meiocytes)
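The summary columns (AF), (BC) and (BD) can in principle be recomputed from the per-class counts in columns (AG) to (BB). The sketch below shows one way to do this; it is an illustration, not the authors' script. It assumes that each of the columns 1 to 22 holds the number of cells in which that many homolog pairs were asynapsed, and the function name `asynapsis_summary` is hypothetical.

```python
from typing import Dict


def asynapsis_summary(cells_by_pairs: Dict[int, int], total_cells: int) -> Dict[str, float]:
    """Summarise asynapsis for one individual.

    cells_by_pairs maps k (number of asynapsed homolog pairs, 1..22) to the
    number of cells showing exactly k asynapsed pairs (assumed meaning of
    columns AG..BB). total_cells is the number of cells assessed (column AB).
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    asynaptic_cells = sum(cells_by_pairs.values())
    asynapsed_pairs = sum(k * n for k, n in cells_by_pairs.items())
    return {
        # (AF) percentage of cells with at least one asynapsis event
        "as_cells_percent": 100.0 * asynaptic_cells / total_cells,
        # (BC) average number of asynapsed pairs across asynaptic cells
        "asynapsis_per_asynaptic_cell": (
            asynapsed_pairs / asynaptic_cells if asynaptic_cells else 0.0
        ),
        # (BD) asynapsed pairs relative to all cells analysed
        "asynapsis_per_total_cells": asynapsed_pairs / total_cells,
    }
```

For example, an individual with 100 assessed cells, of which 10 show one, 5 show two and 5 show three asynapsed pairs, has 20 asynaptic cells and 35 asynapsed pairs in total. Comparing such recomputed values with the stored columns is a simple check that the class counts were read in the intended orientation.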