This README file was generated on 17/07/2020 by Lina Cai

-------------------
GENERAL INFORMATION
-------------------

This README file accompanies the data release for Cai et al. (2020) Genome-wide association analysis of type 2 diabetes in the EPIC-InterAct study.

Title of Data file 1: meta_result_EPICInterAct_T2D_GWAS_summary_stats_logit_n22326_08Jul2020_LCAI.txt.gz
The file is in tab-delimited TXT format. Each row contains the summary statistics for a single genetic variant from a fixed-effects genome-wide meta-analysis of type 2 diabetes across two genotyping arrays in the EPIC-InterAct study using logistic regression (adjusted for age, sex, study centre and the first 4 genetic principal components).

Title of Data file 2: CoxReg_on_T2D_Mahajan370SNPs_available_in_InterAct_Jul2020_LCAI.txt
The file is in tab-delimited TXT format. Each row contains the summary statistics for the association of a single genetic variant with incident type 2 diabetes in EPIC-InterAct using Prentice-weighted Cox regression, adjusted for age (as the underlying time scale), sex, study centre and the first 4 genetic principal components.

Information about funding sources or sponsorship that supported the collection of the data:
The EPIC-InterAct study is funded by the EU FP6 Programme [grant number Integrated Project LSHM_CT_2006_037197]

------------------
ACCESS INFORMATION
------------------

Please do not use this data to attempt to identify individuals who participated in the study.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

Methods for processing the data (please refer to manuscript for complete methods):
- Genome-wide associations were assessed in each genotyping platform among participants from the EPIC-InterAct study using logistic regression models, adjusted for age, sex, study centre and the first 4 principal components.
- All individuals included were unrelated and of European ancestry.
- SNPs with minor allele frequency (MAF) > 0.5%, imputation information score > 0.4, Hardy-Weinberg Equilibrium p value > 1x10^-6 and association effect standard error < 10 from each genotyping platform were included in the meta-analysis.
- Inverse-variance based meta-analysis of the results from the two GWAS was performed using METAL.
- Top signals of known T2D loci were selected and their effect estimates were tested using a Prentice-weighted Cox regression model, and compared with results from the logistic regression model.
- The effect estimates were in high concordance and the correlation coefficient was 0.98.

Software used to analyse the data:
- QUICKTEST V0.98 was used to perform the genome-wide association analyses.
- METAL was used to perform the inverse-variance weighted meta-analysis across GWAS results from the two genotyping arrays.
- STATA 14 and R 3.4.0 were used to perform additional relevant analyses.

--------------------------
DATA-SPECIFIC INFORMATION
--------------------------

Title of Data file 1: meta_result_EPICInterAct_T2D_GWAS_summary_stats_logit_n22326_08Jul2020_LCAI.txt.gz
Number of variables: 15 variables/columns included in the data file
Number of rows: 8,924,492 rows (one genetic variant on each row)
Column list:
rsID = reference SNP cluster ID based on the HRC reference panel
MarkerName_hg19 = composed of chromosome_position_MajorAllele_MinorAllele; reference genome GRCh37 (hg19)
Allele1 = effect allele
Allele2 = other allele
Freq1 = effect allele frequency
Effect = effect estimate from logistic regression -- log(OddsRatio)
StdErr = standard error of the effect estimate from logistic regression
P.value = p value of the meta-analysis association
Direction = direction of effect for the effect allele in each of the two genotyping platform analyses
HetISq, HetChiSq, HetDf, HetPVal = heterogeneity measures from the meta-analysis across the two genotyping platforms in the EPIC-InterAct study
TotalSampleSize = total sample size included in the analyses
EffSampleSize = effective sample size, calibrated for the case-control study design using the formula provided by METAL: N_eff = 4/(1/N_case + 1/N_control)
Specialized formats or other abbreviations used:
The list of SNPs was ranked by ascending P.value and then by alphabetical order of rsID

Title of Data file 2: CoxReg_on_T2D_Mahajan370SNPs_available_in_InterAct_Jul2020_LCAI.txt
Number of variables: 15 variables/columns included in the data file
Number of rows: 370 rows (one genetic variant on each row)
Column list:
MarkerName_hg19 = composed of chromosome_position_MajorAllele_MinorAllele; reference genome GRCh37 (hg19)
rsID = reference SNP cluster ID based on the HRC reference panel
chr = chromosome
pos = position
effect_allele = effect allele
other_allele = other allele
freq = effect allele frequency
beta = effect estimate from Prentice-weighted Cox regression -- log(HazardRatio)
se = standard error of the effect estimate (beta)
HR = hazard ratio
HR_95CI_lb = lower bound of the 95% confidence interval for HR
HR_95CI_ub = upper bound of the 95% confidence interval for HR
p = p value of the effect estimate
info = imputation quality statistic
n = sample size

-----------------------------------
Contact Information (Name // Email)
-----------------------------------

For any queries, please contact Prof. Nicholas Wareham // nick.wareham@mrc-epid.cam.ac.uk
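
----------------------------
EXAMPLE USAGE (ILLUSTRATIVE)
----------------------------

The column definitions for Data file 1 can be put to work with a short script. The following is a minimal Python sketch and is not part of the original analysis (which used QUICKTEST, METAL, STATA and R). It assumes that the header names in Data file 1 match the column list in this README exactly. The function names, the 5e-8 default threshold and the z = 1.96 normal approximation for the confidence interval are illustrative choices, not taken from the manuscript. The demo rows are synthetic and do not come from the data files.

```python
import csv
import gzip
import io
import math


def effective_sample_size(n_case, n_control):
    """N_eff = 4 / (1/N_case + 1/N_control), the formula quoted for EffSampleSize."""
    return 4.0 / (1.0 / n_case + 1.0 / n_control)


def odds_ratio_with_ci(effect, std_err, z=1.96):
    """Convert Effect (log odds ratio) and StdErr into (OR, lower, upper).

    Uses a normal approximation; z=1.96 gives an approximate 95% interval.
    """
    return (
        math.exp(effect),
        math.exp(effect - z * std_err),
        math.exp(effect + z * std_err),
    )


def significant_rows(handle, p_threshold=5e-8):
    """Yield rows (as dicts) of a tab-delimited table whose P.value is below p_threshold."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if float(row["P.value"]) < p_threshold:
            yield row


def open_summary_stats(path):
    """Open the gzip-compressed Data file 1 as a text stream."""
    return gzip.open(path, "rt", newline="")


if __name__ == "__main__":
    # Synthetic two-row table with made-up values, only to show the calls.
    demo = io.StringIO(
        "rsID\tEffect\tStdErr\tP.value\n"
        "rs_example_1\t0.10\t0.01\t1e-20\n"
        "rs_example_2\t0.01\t0.02\t0.6\n"
    )
    for row in significant_rows(demo):
        print(row["rsID"], odds_ratio_with_ci(float(row["Effect"]), float(row["StdErr"])))
```

To run the same filter on the released file, pass the stream returned by open_summary_stats("meta_result_EPICInterAct_T2D_GWAS_summary_stats_logit_n22326_08Jul2020_LCAI.txt.gz") to significant_rows instead of the in-memory demo table. Rows are read one at a time, so the full file of 8,924,492 variants does not need to fit in memory.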