Data from: The XPF-like domain in SHOC1 required for homologous recombination and safeguarding autosome from meiotic silencing of unsynapsed chromatin
Data files
May 19, 2026 version files: 279.78 GB
- Hi-C_Shoc1_KI_R1.fastq.gz (138.05 GB)
- Hi-C_Shoc1_KI_R2.fastq.gz (141.43 GB)
- README.md (3.44 KB)
- SC_shoc1_ctrl_barcodes.tsv.gz (64.33 KB)
- SC_shoc1_ctrl_features.tsv.gz (304.32 KB)
- SC_shoc1_ctrl_matrix.mtx.gz (145.40 MB)
- SC_shoc1_KI_barcodes.tsv.gz (57.24 KB)
- SC_shoc1_KI_features.tsv.gz (304.32 KB)
- SC_shoc1_KI_matrix.mtx.gz (157.32 MB)
Abstract
During meiosis, ZMM proteins play essential roles in stabilizing recombination intermediates and promoting crossover (CO) formation. In mice, SHOC1 forms a trimeric complex with two other ZMM proteins, SPO16 and TEX11, to bind recombination intermediates after strand invasion. Although genetic variants of SHOC1 are clinically associated with male infertility, their conserved functions in human gametogenesis remain enigmatic. Here, we delineated species-specific divergences between the human and mouse SHOC1 complexes, and identified a missense variant within the XPF-like domain in SHOC1 (p.Q590R). This variant impaired DNA double-strand break repair by compromising its ability to bind branched DNA structures and the recruitment of crucial proteins to recombination intermediates, ultimately abolishing CO formation. Furthermore, the variant disrupted dynamic chromatin structure in pachytene spermatocytes and induced synapsis defects. Importantly, the XPF-like domain in SHOC1 was revealed to prevent autosome intrusion into the sex body compartment, thereby protecting critical autosomal loci from meiotic silencing of unsynapsed chromatin (MSUC). Overall, our study underscores the critical role of the XPF-like domain in human SHOC1 in CO formation and in protecting autosomes from MSUC.
Dataset DOI: 10.5061/dryad.3n5tb2rvx
Summary of Dataset
This dataset contains raw and processed sequencing data generated for the associated study investigating the role of the XPF-like domain of human SHOC1 in meiotic crossover formation and in protecting autosomes from meiotic silencing of unsynapsed chromatin (MSUC). Two complementary sequencing modalities are included:
- Hi-C sequencing, to investigate three-dimensional chromatin architecture and meiotic chromosome organization.
- Single-cell RNA sequencing (scRNA-seq) of mouse testicular tissue, to profile transcriptomic changes across spermatogenic cell populations.
Description of the Data and File Structure
1. Naming Conventions
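- Hi-C_: Hi-C sequencing files.
- SC_: single-cell RNA-seq files.
- Shoc1_ctrl (written shoc1_ctrl in the scRNA-seq file names): SHOC1 control (wild-type) sample.
- Shoc1_KI (written shoc1_KI in the scRNA-seq file names): SHOC1 XPF-like domain knock-in sample.
- R1: Read 1 of the paired-end Hi-C reads.
- R2: Read 2 of the paired-end Hi-C reads.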
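As an illustration of the naming scheme, the following Python sketch splits a file name into its components. The regular expression and the `parse_name` helper are hypothetical: they were inferred from the file names listed in this dataset and are not part of the original analysis.

```python
import re

# Pattern inferred from the file names in this dataset (an assumption,
# not an official specification of the naming scheme).
NAME_RE = re.compile(
    r"^(?P<assay>Hi-C|SC)_"                      # sequencing modality
    r"(?P<gene>[Ss]hoc1)_"                       # gene name, case varies
    r"(?P<genotype>ctrl|KI)_"                    # control or knock-in
    r"(?P<part>R1|R2|barcodes|features|matrix)"  # read number or 10x file
    r"\.(?P<ext>fastq|tsv|mtx)\.gz$"             # compressed file type
)


def parse_name(filename):
    """Return the naming-convention fields as a dict, or None if no match."""
    match = NAME_RE.match(filename)
    return match.groupdict() if match else None


for name in ("Hi-C_Shoc1_KI_R1.fastq.gz", "SC_shoc1_ctrl_matrix.mtx.gz"):
    print(name, "->", parse_name(name))
```

Files that do not follow the scheme, such as README.md, return `None`.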
2. File List and Descriptions
| File Name | Format | Description |
|---|---|---|
| Hi-C_Shoc1_KI_R1.fastq.gz | FASTQ (GZ) | Raw Hi-C sequencing reads (Read 1) for the Shoc1 KI sample. |
| Hi-C_Shoc1_KI_R2.fastq.gz | FASTQ (GZ) | Raw Hi-C sequencing reads (Read 2) for the Shoc1 KI sample. |
| SC_shoc1_ctrl_barcodes.tsv.gz | TSV (GZ) | List of cell barcodes identified in the Control scRNA-seq sample. |
| SC_shoc1_ctrl_features.tsv.gz | TSV (GZ) | List of genes/features (Ensembl gene IDs and symbols) for the Control sample. |
| SC_shoc1_ctrl_matrix.mtx.gz | MTX (GZ) | Sparse matrix containing gene expression UMI counts for the Control sample. |
| SC_shoc1_KI_barcodes.tsv.gz | TSV (GZ) | List of cell barcodes identified in the Shoc1 KI scRNA-seq sample. |
| SC_shoc1_KI_features.tsv.gz | TSV (GZ) | List of genes/features for the Shoc1 KI sample. |
| SC_shoc1_KI_matrix.mtx.gz | MTX (GZ) | Sparse matrix containing gene expression UMI counts for the Shoc1 KI sample. |
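Because the Hi-C reads are split across two large files, it can be worth confirming that both downloads are in the same mate order before alignment. The sketch below is one possible check, not a step from the original workflow. It assumes standard four-line FASTQ records and read IDs that agree between mates apart from an optional `/1` or `/2` suffix. The names `read_ids` and `mates_in_sync` are illustrative, and only the first `limit` records are inspected.

```python
import gzip
from itertools import islice


def read_ids(path, limit):
    """Yield the read ID of the first `limit` records in a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        # Assumption: each record spans exactly four lines.
        for i, line in enumerate(islice(handle, 4 * limit)):
            if i % 4 != 0:
                continue  # only the header line carries the read ID
            header = line.rstrip("\n")
            if not header.startswith("@"):
                raise ValueError(f"unexpected FASTQ header: {header!r}")
            fields = header[1:].split()
            read_id = fields[0] if fields else ""
            # Drop a trailing mate suffix so R1 and R2 IDs are comparable.
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def mates_in_sync(r1_path, r2_path, limit=10000):
    """Return True if the first `limit` records of R1 and R2 share IDs."""
    return list(read_ids(r1_path, limit)) == list(read_ids(r2_path, limit))


# Example call (paths refer to the files in this dataset):
# mates_in_sync("Hi-C_Shoc1_KI_R1.fastq.gz", "Hi-C_Shoc1_KI_R2.fastq.gz")
```

Limiting the check to the first records keeps it fast on files of this size, at the cost of not detecting truncation further into the file.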
Code/software
Hi-C data were processed using the Juicer pipeline (v1.6) with the mm10 reference genome. The resulting .hic files were converted to .cool format using hic2cool. Downstream analyses were performed with the cooltools package (v0.7.0), including contact probability (P(s)) calculation, compartment (eigenvector) analysis, and insulation score computation. TAD boundary pileups and COs-DSBs region pileups were generated with coolpup.py (v1.1.0) in local mode.
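For readers unfamiliar with the contact probability curve P(s), the toy sketch below averages each diagonal of a small binned contact matrix, which is the basic idea behind the curve. It is not the cooltools implementation, which also handles matrix balancing, masked bins and smoothing. The function name `contact_probability` and the example matrix are made up for illustration.

```python
def contact_probability(matrix, bin_size):
    """Mean contact count at each genomic separation for one chromosome.

    matrix   -- square, symmetric list of lists of contact counts
    bin_size -- bin width in base pairs
    Returns a list of (separation_bp, mean_count) tuples.
    """
    n = len(matrix)
    curve = []
    for offset in range(1, n):  # offset 0 is the main diagonal, skipped here
        diagonal = [matrix[i][i + offset] for i in range(n - offset)]
        curve.append((offset * bin_size, sum(diagonal) / len(diagonal)))
    return curve


# Toy 4 x 4 matrix at 10 kb resolution; counts decay with distance.
toy = [
    [10, 4, 2, 1],
    [4, 10, 4, 2],
    [2, 4, 10, 4],
    [1, 2, 4, 10],
]
print(contact_probability(toy, 10_000))
# -> [(10000, 4.0), (20000, 2.0), (30000, 1.0)]
```

On real data the curve is usually plotted on log-log axes, where changes in slope reflect differences in chromatin compaction.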
Processed scRNA-seq data: The matrix.mtx.gz, barcodes.tsv.gz, and features.tsv.gz files were generated using the Cell Ranger pipeline. These files are ready for import into analysis software such as Seurat (R) or Scanpy (Python).
Missing data: No data points were intentionally omitted. Absent entries in the sparse matrix represent zero UMI counts.
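To make the file layout concrete, the sketch below reads one barcodes/features/matrix triplet using only the Python standard library. It assumes the usual Cell Ranger conventions, which this README does not state explicitly: the matrix is in Matrix Market coordinate format with 1-based indices, rows are features and columns are barcodes. `load_10x` and `umi_count` are hypothetical helper names. The lookup helper shows how an absent entry is read as a zero count.

```python
import gzip
import os


def read_lines(path):
    """Return the lines of a gzipped text file without trailing newlines."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle]


def load_10x(directory, prefix):
    """Load barcodes, features and sparse counts for one sample prefix."""
    barcodes = read_lines(os.path.join(directory, prefix + "barcodes.tsv.gz"))
    features = [
        line.split("\t")
        for line in read_lines(os.path.join(directory, prefix + "features.tsv.gz"))
    ]
    counts = {}
    with gzip.open(os.path.join(directory, prefix + "matrix.mtx.gz"), "rt") as handle:
        # Skip the '%%MatrixMarket' banner and any '%' comment lines.
        line = handle.readline()
        while line.startswith("%"):
            line = handle.readline()
        n_rows, n_cols, _n_entries = (int(x) for x in line.split())
        if (n_rows, n_cols) != (len(features), len(barcodes)):
            raise ValueError("matrix shape does not match features x barcodes")
        for line in handle:
            if not line.strip():
                continue
            row, col, value = line.split()
            # Convert 1-based file indices to 0-based Python indices.
            counts[(int(row) - 1, int(col) - 1)] = int(value)
    return barcodes, features, counts


def umi_count(counts, feature_index, barcode_index):
    """UMI count for one gene in one cell; absent entries are zeros."""
    return counts.get((feature_index, barcode_index), 0)


# Example call (prefix taken from the file names in this dataset):
# barcodes, features, counts = load_10x(".", "SC_shoc1_ctrl_")
```

A dictionary of tuples is memory-hungry for matrices of this size, so this is only meant to show the structure. For real analyses, a dedicated reader such as Scanpy's `read_10x_mtx` (which accepts a `prefix` argument in recent versions) or Seurat's `Read10X` is likely the better choice, although the prefixed file names may need to be handled or the files renamed.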
Access information
Other publicly accessible locations of the data:
- WT Hi-C dataset: GEO GSE109344
Data was derived from the following sources:
- WT Hi-C data from GSE109344; SHOC1 KI Hi-C data generated in this study.
