Data from: Multicohort analysis unveils role for axon guidance pathways linking small for gestational age to spirometric restriction
Data files
Mar 03, 2026 version files (221.98 MB total)
- README.md (3.86 KB)
- Sheep_data_clean_rm.rds (221.98 MB)
Abstract
Children born small for gestational age (SGA) are at risk for the development of metabolic, cardiovascular, respiratory, and neurodevelopmental diseases and premature mortality, but the underlying mechanisms are unknown. We analyzed blood proteomic data from several birth cohorts to elucidate molecular mechanisms related to SGA and subsequent lung function outcomes in later life. We found that one-third of children born SGA manifest a distinct molecular endotype characterized by axon guidance protein dysregulation in cord blood. In later-life peripheral blood, these proteins were inversely related to concurrent spirometric restriction. We obtained orthogonal evidence from GWAS data and an experimental sheep model to show that axon guidance genes are associated with spirometry measurements (FEV1/FVC) at genome-wide significance and are commonly expressed during fetal development of multiple organs. Our findings provide new insights into the developmental origins of chronic diseases and pave the way for further study of axon guidance cues in multiorgan morbidity. This Dryad repository contains the scRNA-Seq dataset for the sheep model of the linked manuscript.
Dataset DOI: 10.5061/dryad.5mkkwh7h3
Description of the data and file structure
Overview
This repository accompanies the publication entitled "Multicohort analysis unveils role for axon guidance pathways linking small for gestational age to spirometric restriction", currently being prepared for submission. It contains the data generated from the sheep model of fetal growth restriction.
Data
Proteomics
Proteomic data were generated with the SomaScan platform (SomaLogic, Boulder, CO, USA), which allows high-throughput, simultaneous quantification of thousands of proteins (1). The term ‘aptamer’ refers to the short, single-stranded oligonucleotides that bind to target proteins with high affinity. The fluorescence-based detection of aptamer abundance is reported in relative fluorescence units (RFU), which are proportional to the protein concentration in the sample. Proteomics data were generated from cord blood and later life (LL) peripheral blood and are available in the GitHub repository associated with this manuscript.
Single cell RNA-Seq (scRNA-Seq)
scRNA-Seq experiments were conducted on fetal samples collected from the brain, heart, and lung between 130 and 135 days of gestational age from pregnant ewes (Ovis aries, Columbia-Rambouillet breed) exposed to alternating cycles of elevated ambient temperature to induce fetal growth restriction.
Single-cell suspensions were sequenced with the 10x Genomics Universal 3' Gene Expression workflow. Cell Ranger (version 9.0.0) was used to pre-process the raw FASTQ files against the Oar_rambouillet_v1.0 reference genome. Sample quality control and analysis were largely conducted with the Seurat (version 5.0.3) toolkit (2), and samples were integrated with Harmony (3).
The scRNA-Seq data are organized as a Seurat object saved as an .rds file. Raw counts are stored in the counts layer of the RNA assay, and relevant experimental/QC variables are stored in the meta.data slot.
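The object layout described above can be inspected along the following lines. This is a minimal sketch rather than the authors' own code: the file path assumes the .rds file sits in the working directory, the assay is assumed to be named "RNA" with a "counts" layer (as the description suggests), and the SeuratObject package (version 5 or later) is assumed to be installed.

```r
# Assumption: the Dryad file was downloaded into the working directory.
rds_path <- "Sheep_data_clean_rm.rds"

if (file.exists(rds_path) && requireNamespace("SeuratObject", quietly = TRUE)) {
  library(SeuratObject)

  # Load the saved Seurat object.
  sheep <- readRDS(rds_path)

  # Raw counts (genes x cells) from the RNA assay.
  # Assumption: assay "RNA" with a layer called "counts".
  counts <- LayerData(sheep, assay = "RNA", layer = "counts")
  print(dim(counts))

  # Per-cell experimental/QC variables.
  meta <- sheep@meta.data
  print(head(meta))

  # Number of cells per organ and experimental condition.
  print(with(meta, table(Organ, Condition)))
} else {
  message("File or SeuratObject package not found; nothing was loaded.")
}
```

Loading only SeuratObject is enough to read the object and its metadata; the full Seurat package is needed only for downstream analysis steps.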
Code
Data QC and analysis for this study were predominantly conducted in the R (version 4.3.2) environment. R Markdown files, rendered to HTML, are available in the GitHub repository associated with this manuscript.
References
1. Kraemer S, Schneider DJ, Paterson C, Perry D, Westacott MJ, Hagar Y, et al. Crossing the Halfway Point: Aptamer-Based, Highly Multiplexed Assay for the Assessment of the Proteome. J Proteome Res. 2024;23(11):4771-88.
2. Hao Y, Stuart T, Kowalski MH, Choudhary S, Hoffman P, Hartman A, et al. Dictionary learning for integrative, multimodal, and scalable single-cell analysis. Nat Biotechnol. 2024;42(2):293-304.
3. Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, et al. Fast, sensitive, and accurate integration of single-cell data with Harmony. Nat Methods. 2019;16(12):1289-96.
File: Sheep_data_clean_rm.rds
Description: Seurat object containing raw counts and relevant metadata
Variables (object metadata)
- orig.ident: Original identity
- nCount_RNA: Total transcript (UMI) count per cell
- nFeature_RNA: Number of features (genes) detected per cell
- percent.mt: Percentage of counts per cell that map to mitochondrial genes
- Doublets_comb: Doublet/singlet status (all singlets)
- Sample: Sample name
- Lane: Sequencing lane
- Sheep: Sheep donor
- Condition: Experimental condition (FGR/control)
- Organ: Organ the sample was generated from
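To make the three QC variables concrete, the sketch below computes them from a toy count matrix in base R, following the conventional Seurat definitions. The matrix values and gene names (including the single mitochondrial gene `MT_GENE`) are made up for illustration; how the authors identified mitochondrial genes in the sheep annotation is not stated here.

```r
# Toy genes x cells count matrix (values are invented for illustration).
counts <- matrix(
  c(5, 0, 3,
    0, 2, 1,
    4, 0, 0,
    1, 8, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("GENE_A", "GENE_B", "GENE_C", "MT_GENE"),
    c("cell1", "cell2", "cell3")
  )
)

# Hypothetical set of mitochondrial genes.
mito_genes <- "MT_GENE"

# Total counts per cell.
nCount_RNA <- colSums(counts)

# Number of genes with at least one count per cell.
nFeature_RNA <- colSums(counts > 0)

# Percentage of each cell's counts that come from mitochondrial genes.
percent.mt <- 100 * colSums(counts[mito_genes, , drop = FALSE]) / nCount_RNA

qc <- data.frame(nCount_RNA, nFeature_RNA, percent.mt)
print(qc)
```

In the deposited object these values are already stored in the metadata, so they can be used directly for filtering or plotting without recomputation.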
Code/software
Data QC and analysis were predominantly conducted in the R environment (version 4.3.2). The specific packages used are listed in the R Markdown files provided on GitHub.
Access information
Other publicly accessible locations of the data:
- github.com/jamesfread/CADRE_manuscript
- Raw data files and any other associated data not included in this repository are available from the authors upon reasonable request.
