X-linked multi-ancestry meta-analysis reveals tuberculosis susceptibility variants
Data files
Jun 05, 2024 version files 62.97 MB
Abstract
Globally, tuberculosis (TB) presents with a clear male bias that cannot be completely accounted for by environment, behaviour, socioeconomic factors, or the impact of sex hormones on the immune system. This suggests that genetic and biological differences, which may be mediated by the X chromosome, further influence the observed male sex bias. The X chromosome is heavily implicated in immune function and yet has largely been ignored in previous association studies. Here we report the first multi-ancestry X chromosome-specific meta-analysis of TB susceptibility. We identified X-linked TB susceptibility variants using seven genotyping data sets and 20,255 individuals from diverse genetic ancestries. Sex-specific effects were also identified in polygenic heritability between males and females, along with enhanced concordance in the direction of genetic effects for males but not females. These sex-specific genetic effects were supported by a sex-stratified and combined meta-analysis conducted using the X chromosome-specific XWAS software and a multi-ancestry analysis using the MR-MEGA software. Seven significant associations were identified. Across all data sets, two were identified in the overall analysis (rs6610096, rs7888114) and one in the female-specific analysis (rs4465088). In the ancestry-specific meta-analyses, three significant associations were identified for males in the Asian cohorts (rs1726176, rs5939510, rs1726203) and one for females in the African cohort (rs2428212). Several genomic regions previously associated with TB susceptibility were reproduced in this study, along with strong ancestry-specific effects. These results support the hypothesis that the X chromosome and sex-specific effects could significantly impact the observed male bias in TB incidence rates globally.
README: X-linked multi-ancestry meta-analysis reveals tuberculosis susceptibility variants
https://doi.org/10.5061/dryad.2z34tmpv5
For this X chromosome-specific, sex-stratified meta-analysis, multiple analyses were conducted. First, we performed sex-stratified association analysis on each individual dataset using the XWAS software, and the results were then combined in multiple meta-analyses (also using XWAS). The results for the individual datasets are not available in this repository but can be requested through the corresponding authors listed in the published manuscript. The results of the meta-analyses are available in this repository. A combined meta-analysis and a sex-stratified meta-analysis were conducted, and both were further stratified by source population subgroup. The following meta-analyses were conducted:
1. A combined meta-analysis using data across all populations for males and females.
2. A sex-stratified meta-analysis including data across all populations.
3. A combined meta-analysis for the Asian, Euroasian and African populations.
4. Sex-stratified meta-analysis for the Asian, Euroasian and African populations.
These analyses identified novel genetic variants with strong sex-specific effects. While previous X-linked associations were not replicated in this study, the analysis revealed associations in genomic regions that overlap with those reported in previous studies.
Description of the data and file structure
Files included in this repository:
1. Plink_male_female_combined_meta_analysis_all_cohorts.meta
a. Meta-analysis containing results from all datasets from males and females (not stratified by sex or ancestry).
b. Produced using PLINK software
2. XWAS_female_ALL_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the females of all datasets (not stratified by ancestry).
b. Produced using the XWAS software
3. XWAS_male_ALL_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the males of all datasets (not stratified by ancestry).
b. Produced using the XWAS software
4. XWAS_female_chinese_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the females of all datasets of Asian ancestry.
b. Produced using the XWAS software
5. XWAS_male_chinese_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the males of all datasets of Asian ancestry.
b. Produced using the XWAS software
6. XWAS_female_euroasian_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the females of all datasets of Asian and European ancestry.
b. Produced using the XWAS software
7. XWAS_male_euroasian_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the males of all datasets of Asian and European ancestry.
b. Produced using the XWAS software
8. XWAS_female_african_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the females of all datasets of African ancestry.
b. Produced using the XWAS software
9. XWAS_male_african_cohorts_meta_analysis.meta
a. Meta-analysis containing results from the males of all datasets of African ancestry.
b. Produced using the XWAS software
The files contain the association testing results for all variants on the X chromosome.
For the meta-analysis output files the column descriptions are as follows:
CHR: Chromosome number ‘23’ representing the X chromosome
BP: The base pair position of the genetic variants (build 37 locations)
SNP: SNP names presented as ‘CHR:BP’
A1: Major allele
A2: Minor allele
N: Number of studies in the meta-analysis for each variant
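P: P-value of the association testing (fixed-effects meta-analysis)
P(R): P-value of the association testing under the random-effects model
OR: Odds ratio of the association testing for each variant (fixed-effects estimate)
OR(R): Odds ratio under the random-effects model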
Q: P-value for Cochran’s Q test of heterogeneity; the Q statistic is calculated as the weighted sum of squared differences between individual study effects and the pooled effect across studies
I: The I² statistic describes the percentage of variation across studies that is due to heterogeneity rather than chance
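The heterogeneity measures described above can be illustrated with a minimal sketch of inverse-variance fixed-effects pooling. This is not the code used by PLINK or XWAS; the function name `fixed_effect_meta` and the input values are hypothetical, and the per-study effects are assumed to be log odds ratios with known standard errors.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effects pooling of per-study log odds ratios.

    betas: per-study log(OR) estimates; ses: their standard errors.
    Returns the pooled OR, Cochran's Q statistic, and I^2 as a percentage.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q: weighted sum of squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of total variation attributable to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return math.exp(pooled), q, i2

# Made-up inputs for illustration only (not values from this dataset).
or_pooled, q, i2 = fixed_effect_meta([0.0, 0.4], [0.1, 0.1])
```

With these two hypothetical studies, the effects disagree strongly relative to their standard errors, so I² is high; identical effects would give Q = 0 and I² = 0.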
Sharing/Access information
The meta-analysis results can be downloaded from this repository. The original raw data and summary statistics of the association testing of the individual files are not available due to ethical and data sharing constraints. These files can be requested through the corresponding authors listed in the published manuscript.
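Once downloaded, a results file can be read with a few lines of standard-library Python. The sketch below assumes the files are whitespace-delimited with a single header row, as PLINK-style `.meta` output typically is, and that missing values appear as `NA`. The helper names `read_meta` and `significant`, the default threshold of 5e-8, and the sample rows are illustrative assumptions, not part of the repository.

```python
import io

def read_meta(handle):
    """Parse a whitespace-delimited results table with a header row into dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(header, fields)))
    return rows

def significant(rows, p_column="P", threshold=5e-8):
    """Keep rows whose p-value in p_column is below threshold.

    Rows with a missing column or a non-numeric value (e.g. 'NA') are skipped.
    The default threshold is an assumption, not the one used in the study.
    """
    kept = []
    for row in rows:
        try:
            p = float(row[p_column])
        except (KeyError, ValueError):
            continue
        if p < threshold:
            kept.append(row)
    return kept

# Synthetic example rows, not real results from the repository.
sample = io.StringIO(
    " CHR BP SNP A1 A2 N P P(R) OR OR(R) Q I\n"
    " 23 1000 23:1000 A G 5 1e-9 2e-6 1.30 1.28 0.40 10.5\n"
    " 23 2000 23:2000 C T 4 0.20 0.25 1.02 1.02 0.80 0.00\n"
    " 23 3000 23:3000 T C 3 NA NA NA NA NA NA\n"
)
hits = significant(read_meta(sample))
```

To apply this to a downloaded file, pass an open file handle instead of the in-memory sample, for example `read_meta(open("XWAS_female_ALL_cohorts_meta_analysis.meta"))`.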
Data was derived from the following sources:
- International Tuberculosis Host Genetics Consortium
Code/Software
Imputation of the individual data was done using the Impute2 software. Quality control of the data was done using PLINK and XWAS software. Association testing of the individual files and subsequent meta-analysis were also performed using the PLINK and XWAS software.
Methods
This analysis includes 7 of the 17 published (and unpublished) GWAS of TB (with HIV-negative cohorts) conducted prior to 2022. It excludes data from Iceland and Vietnam, as those studies declined to share data. It excludes data from China, Korea, Peru, and Japan, as data-sharing agreements could not be finalized in time for this analysis. The Indonesian data was not suitable for reliable imputation, and the Moroccan data was family-based and thus also not suitable for this meta-analysis. Data from Thailand, Japan, Estonia and Germany were excluded as they did not have X chromosome genotyping. Finally, genotyped TB cases and controls are also available in the UK Biobank, but these data were excluded from this analysis to avoid biasing the results, because genetic association studies on such highly selected datasets need to be undertaken with caution.
Included individuals were genotyped on a variety of genotyping arrays. Raw genotyping data were available for eight datasets; for the remainder, association testing summary statistics were obtained for use in the meta-analysis. For the datasets with raw genotyping data, quality control (QC) was done using PLINK (v1.9), followed by pre-phasing using SHAPEIT and imputation using Impute2 with the 1000 Genomes Phase 3 reference panel. QC and imputation were done as described previously; briefly, we used a minor allele frequency filter of 0.025 and an individual and SNP missingness filter of 0.1. The Hardy-Weinberg equilibrium threshold was set at a Bonferroni-corrected p-value according to the number of SNPs tested (0.05/number of SNPs), and samples where sex could not be determined from genotyping were also removed. Imputed data were filtered at a quality score of 0.3, prior to individual and genotype filtration steps. Prior to QC and imputation, allele orientation was corrected using Genotype Harmoniser version 1.4.15, and the genome build of all datasets was checked for consistency (GRCh37) and updated if necessary, using the liftOver software from the UCSC genome browser. The final imputed datasets again went through a QC process, but this time in a sex-stratified manner using the XWAS software and XWAS pipeline. Sex-stratified GWAS analysis of the individual datasets was then done using the XWAS software. The results from the sex-stratified GWAS were then combined in a sex-stratified and combined meta-analysis using the XWAS software. The results of the sex-stratified and combined meta-analysis are uploaded here.
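The variant-level QC thresholds described above can be sketched as a simple filter. This is only an illustration of the stated cut-offs (minor allele frequency 0.025, missingness 0.1, Hardy-Weinberg p-value of 0.05 divided by the number of SNPs tested), not the PLINK or XWAS commands actually run. The function names, the dictionary keys `maf`, `missing` and `hwe_p`, and the example variants are hypothetical.

```python
def hwe_threshold(n_snps, alpha=0.05):
    """Bonferroni-corrected Hardy-Weinberg p-value threshold: alpha / number of SNPs."""
    return alpha / n_snps

def passes_qc(variant, n_snps, maf_min=0.025, miss_max=0.1):
    """Return True if a variant meets all three variant-level thresholds.

    variant is a dict with hypothetical keys:
      'maf'     minor allele frequency
      'missing' fraction of samples with a missing genotype
      'hwe_p'   Hardy-Weinberg equilibrium test p-value
    """
    return (
        variant["maf"] >= maf_min
        and variant["missing"] <= miss_max
        and variant["hwe_p"] >= hwe_threshold(n_snps)
    )

# Made-up variants for illustration only.
variants = {
    "common_ok": {"maf": 0.10, "missing": 0.01, "hwe_p": 0.5},
    "too_rare": {"maf": 0.01, "missing": 0.01, "hwe_p": 0.5},
    "hwe_fail": {"maf": 0.20, "missing": 0.02, "hwe_p": 1e-9},
    "too_missing": {"maf": 0.20, "missing": 0.15, "hwe_p": 0.5},
}
n_snps_tested = 100_000  # assumed count, giving a threshold of 0.05 / 100000
kept = [name for name, v in variants.items() if passes_qc(v, n_snps_tested)]
```

Because the Hardy-Weinberg threshold scales with the number of SNPs tested, the same variant can pass in a small dataset and fail in a larger one.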