Skip to main content
Dryad

Causal effect of familial short stature on three quantitative traits in Taiwan

Cite this dataset

Chiou, Jian-Shiun et al. (2022). Causal effect of familial short stature on three quantitative traits in Taiwan [Dataset]. Dryad. https://doi.org/10.5061/dryad.r4xgxd28k

Abstract

Objectives: With the accumulation of genetic basis for familial (genetic) short stature (FSS), the genetic association of FSS with health-related outcomes remains to be elucidated. In this study, we aimed to investigate the FSS genetic architecture and its causal effect on three quantitative traits in Taiwan.

Methods: We conducted an FSS genome-wide association study (GWAS) analysis (1,640 FSS cases and 22,372 controls). We performed a GWAS meta-analysis for the Taiwanese meta-height from the Taiwan Biobank (TWB)_height (N = 67,452) and the China Medical University Hospital (CMUH)_height GWAS summary statistics (N = 88,854). We calculated three polygenic risk scores (PRSs) of SNPs (P < 5 x 10-8) for FSS and Taiwanese meta-height with/without FSS, respectively. We explored the associations between three PRSs and the measured height, respectively. We also performed Mendelian randomization (MR) analysis for the causal effect of FSS and Taiwanese meta-height with/without FSS, on anthropometric, bone mineral density (BMD), and female reproductive traits.

Results: FSS GWAS identified 172 SNPs in 4 genomic regions, reported in height (P < 5 x 10-8). Higher FSS genetic scores correlate with an increased risk of short stature and height reduction tendency (p < 0.001). The causal effect showed that a higher risk of FSS was associated with decreased body height, but increased body mass index, and body fat (p < 0.001). However, higher genetic scores of Taiwanese meta-height with/without FSS correspond with increased body height, body weight, hip circumference, and age at menarche, but decreased BMD_T-score, BMD_Z-score, and stiffness index (p < 0.001).

Conclusion: This study contributes to the FSS and height genetic features and their causal effects on three quantitative traits in individuals of Han Chinese ancestry in Taiwan. 

Methods

Study design: Figure 1 shows an overview of the study design. For a single GWAS discovery, a GWAS was performed to identify FSS susceptible loci, with 1,640 cases and 22,372 controls of Han Chinese ancestry in Taiwan.

For a multi-GWAS discovery, two-stage GWAS meta-analyses were conducted. In stage 1, the effect of each SNP was combined from the Taiwan Biobank (TWB)_height GWAS summary statistics (N = 67,452) and the CMUH_height GWAS summary statistics (N = 88,854) (not shown) using a fixed-effect meta-analysis of beta values. A Taiwanese meta-height GWAS summary statistics (N = 156,306) was then obtained. In stage 2, the effect of each SNP was also combined from the Taiwanese meta-height and our FSS GWAS summary statistics (Case: 1,640; control: 22,372) as previously described. A GWAS summary statistics of Taiwanese meta-height + FSS (N = 180,318) was also obtained.

GWAS summary statistics of FSS and Taiwanese meta-height with/without FSS were then applied in the evaluation cohorts. The evaluation cohorts included the FSS testing group (Case: 410; control: 5,594) and the TWB_height testing + validation groups (N = 28,909). These three GWAS summary statistics were applied to optimize polygenic risk score (PRS), to investigate the PRS associations with the risk of short stature and/or height tendency, and to perform the MR analysis in the anthropometric, BMD, and female reproductive traits.  

FSS GWAS study population: In this study, we selected 142,935 participants from CMUH, Taichung, Taiwan (Appendix Figure 1). Individuals who with the following criteria were excluded: (1) individuals who were included in our previous FSS genetic study (2) (N = 62), (2) individuals with skeletal dysplasia (ICD-9-CM code: 756.9) (N = 100), (3) individuals with dysmorphic feature (ICD-9-CM code: 738.19) (N = 88), (4) individuals with constitutional delay of growth and puberty (ICD-9-CM code: 259.0) (N = 3), (5) individuals with abnormal IGF-1 serum level within 1 year after FSS diagnosed (N = 0), (6) individuals with abnormal thyroid function within 1 year after FSS diagnosed (N = 375), and (7) individuals with abnormal puberty onset within 1 year after FSS diagnosed (N = 303). These resulted in 142,004 study subjects. The study subjects included 2,050 FSS cases and 139,954 controls.

The 2,050 FSS cases (ICD-9-CM code: 783.43 or ICD-10-CM code: R62.52) diagnosed by pediatric endocrinologists were individuals of Han Chinese ancestry in Taiwan. The 139,954 controls were further processed according to the exclusion criteria: (1) individuals with their age less than 18 years old (N = 10,644), (2) individuals without height information (N = 18,486), and (3) individuals with a height less than 75th percentile (N = 82,858). Finally, these resulted in 2,050 cases and 27,966 controls. Then we used a simple random sampling method to assign the training and testing groups at an 8:2 ratio. The training group (Case: 1,640; control: 22,372) comprised 80% of the total study population and was applied for the GWAS analysis (case-control study) under an additive genetic model, adjusted with gender and the first 10 principal component analyses (PCAs) (Appendix Figure 1, Appendix Figure 2, and Figure 2A), using the PLINK software (version 1.9, 2.0) (28). A significant P-value threshold of genome-wide association P < 5.00E-8 was used for the additive test.

The testing group (Case: 410; control: 5,594) comprised 20% of the total study population, served as one of the evaluation cohorts, was applied to optimize PRS and to investigate the PRS associations with risk of short stature using logistic regression analysis (Appendix Figure 3). The Human Studies Committee of CMUH in Taichung, Taiwan approved this study (approval number: CMUH107-REC3-074 and CMUH110-REC3-005).

Quality control of the genetic data: In this study, imputed GWAS data were extracted from CMUH, Taichung, Taiwan. The SNP quality control (QC) and individual QC procedures were performed before FSS GWAS analysis. The following SNPs were excluded in the SNP QC procedure: (1) SNPs with MAF < 0.001, (2) SNPs with HWE p-value of < 1 × 10-6, and (3) SNP with a missing call rate of > 5%. The resulting SNPs were used to perform ancestry PCA for the population structure analysis after SNP QC. Furthermore, the following individuals were excluded from the individual QC procedure: (1) individuals who did not fit the sex check: male: 0.75 and female: 0.25, (2) heterozygosity rate > ±3 standard deviation (SD), (3) individual with a missing call rate > 5%, (4) individuals with their divergent ancestry of PCA >±5 SD, (5) individual with kinship > 0.0884. DNA contamination, evidence of relatedness, or participants of non-Chinese ancestry, were excluded.

PRS calculation using the Bayesian polygenic prediction approach (PRS-CS): GWAS summary statistics of FSS and Taiwanese meta-height with/without FSS were used as the training data sets, respectively, for PRS-CS together with Asians in the 1000 Genomes Phase 3 project as the linkage disequilibrium reference panel. The PRSs were then calculated using Asian-specific posterior weights and PLINK software (versions 1.9 and 2.0) (28) in the evaluation cohorts (validation data set). The evaluation cohorts included the FSS testing group (Case: 410; control: 5,594) and the TWB_height testing + validation groups (N = 28,909).  

For the evaluation cohort- the FSS testing group (Case: 410; control: 5,594), FSS PRSs were generated according to the all SNPs and the SNPs of FSS GWAS summary statistics-associated P-value thresholds (P < 5 x 10-6, P < 5 x 10-7, and P < 5 x 10-8) using Asian-specific posterior weights and PLINK software (versions 1.9 and 2.0).

For the evaluation cohort- the TWB_height testing + validation groups (N = 28,909), PRSs of FSS, and Taiwanese meta-height with/without FSS were generated according to the SNPs of their GWAS summary statistics using Asian-specific posterior weights and PLINK software (versions 1.9 and 2.0). Data centering and standardization were performed for the PRS data.

PRS calculation using the clumping and P-value threshold approach: Similarity, FSS GWAS summary statistics were also used as the training data set for the clumping and P-value threshold approach. The SNPs of FSS GWAS summary statistics-associated P-value thresholds (P < 5 x 10-6, P < 5 x 10-7, and P < 5 x 10-8) were then subjected to the clumping procedure (within the range of 250,000 base pairs of the index SNP, where SNPs were removed when r2 > 0.1), according to the estimated linkage disequilibrium (LD) among the SNPs in the FSS testing group (Case: 410; control: 5,594). After clumping, FSS PRSs were generated according to the P-value thresholds (P < 5 x 10-6, P < 5 x 10-7, and P < 5 x 10-8 ) using PRSice and PLINK software (versions 1.9 and 2.0). Data centering and standardization were performed for the PRS data.

TWB phenotypic and genetic data: Database of TWB_height testing + validation groups (N = 28,909) including phenotypic and genetic data was also applied as one of the evaluation cohorts for the linear regression association between measured height (cm) and PRS and for MR analysis. In this study, the phenotypic data included anthropometric, BMD, and female reproductive quantitative traits. The anthropometric trait included body height (cm), body weight (kilogram), body mass index (BMI), waist circumference (WC) (cm), hip circumference (HC) (cm), waist-hip ratio (WHR), and body fat (%). Body mass index (BMI) was calculated as BMI = body weight/body height2. The WHR was calculated as WHR = WC/ HC. The BMD trait included BMD_T-score, BMD_Z-score, and stiffness index. The female reproductive trait included age at menarche (years old), age at menopause (years old), and reproductive life span (age at menopause minus age at menarche; years). The genetic data included the imputed GWAS data obtained from the TWB_height testing + validation groups (N = 28,909). The SNP QC and individual QC procedures were described as previously reported. 

Statistical analyses: Imputed genotype data was applied for the GWAS analysis (case-control study) under an additive genetic model, adjusted with gender and the first 10 PCAs, using the PLINK software (version 1.9, 2.0) (28). LocusZoom was used to plot the resulting top genetic loci (Appendix Table 1) (Appendix Table 2). A Manhattan plot was used to show the GWAS result (Figure 2).

Meta-analysis was performed in TWB_height GWAS (N = 67,452) and CMUH_height GWAS (N = 88,854) (not shown) summary statistics using GWAMA (fixed effect model). Furthermore, meta-analysis was also performed in Taiwanese meta-height GWAS (N = 156,306) and FSS GWAS (Case: 1,640; control: 22,372) summary statistics using GWAMA (fixed effect model).

For the association between measured height (cm) (phenotype) and genetically determined PRS, individuals with calculated PRS were used as continuous variables in a linear regression with their corresponding measured height (cm) as the dependent variable, respectively (Figure 3). The PRS was calculated according to the SNPs using Asian-specific posterior weights and PLINK software (versions 1.9 and 2.0). Data centering and standardization were performed for the PRS data.

For the beta-value association between Taiwanese meta-height and FSS, SNPs with beta-value in FSS were used as continuous variables in a linear regression with their corresponding beta-value in Taiwanese meta-height as the dependent variable, respectively (Appendix Figure 4). The beta-values between Taiwanese meta-height and FSS were from 198 lead SNPs of our meta-analysis result of Taiwanese meta-height + FSS (Table 1).

MR analysis was applied with three models- the Inverse variance weighted (IVW), Weighted median, and MR-Egger models (Table 2). The intercept, standard error (SE), and P-value for the MR-Egger intercept were used to show the directional pleiotropy. For Cochran’s heterogeneity test, the IVW Q-value, IVW P-value, Egger Q-value, and Egger P-value were applied to show the heterogeneity (Table 3). Beta-value, 95% confidence interval (CI), and P-value were expressed as the unit per one-SD increase in three PRSs, respectively. An MR package was performed in R software 3.6.0 (39). The Python, R packages, and PLINK software (versions 1.9 and 2.0) were used for all statistical analyses. 

Funding

China Medical University, Award: CMU-110-S-17

China Medical University Hospital, Award: DMR-111-062

National Science and Technology Council, Award: MOST 109-2320-B-039-035-MY3

China Medical University, Award: CMU-110-S-24

China Medical University, Award: CMU110-MF-115

China Medical University, Award: CMU110-MF-49

China Medical University, Award: CMU111-MF-105

China Medical University, Award: CMU111-SR-158

China Medical University, Award: CMU111-MF-21

China Medical University Hospital, Award: DMR-111-153

National Science and Technology Council, Award: MOST 111-2813-C-039-208-B

National Science and Technology Council, Award: MOST 111-2314-B-039-063-MY3

National Science and Technology Council, Award: MOST 111-2314-B-039-064-MY3

National Science and Technology Council, Award: MOST 111-2410-H-039-002-MY3