
Validation of the Swedish diabetes regrouping scheme in adult-onset diabetes in China

Cite this dataset

Zhou, Zhiguang et al. (2020). Validation of the Swedish diabetes regrouping scheme in adult-onset diabetes in China [Dataset]. Dryad.



Background Re-classification of diabetes is vital in providing precise management and reducing the risk of diabetes complications. This study aimed to validate the practicality of the Swedish diabetes re-grouping scheme in Chinese adults with newly diagnosed diabetes. We conducted a cross-sectional survey of 15772 patients with adult-onset, newly diagnosed diabetes in China from April 2015 to October 2017. The cluster analysis used in the Swedish study was applied to re-group our patients. Glutamate decarboxylase antibodies (GADA), age at onset, body mass index (BMI), hemoglobin A1c (HbA1c), and homoeostatic model assessment 2 estimates of β-cell function (HOMA2-B) and insulin resistance (HOMA2-IR) were used to perform the TwoStep and k-means clustering. Characteristics of the clusters were compared between the patients from this study and those from the Swedish study.

Results Our patients clustered into five subgroups: 6.2% were in the severe autoimmune diabetes (SAID) subgroup, 24.8% in the severe insulin deficient diabetes (SIDD) subgroup, 16.6% in the severe insulin resistance diabetes (SIRD) subgroup, 21.6% in the mild obesity-related diabetes (MOD) subgroup and 30.9% in the mild age-related diabetes (MARD) subgroup. The proportion of the SIDD subgroup was higher than in the Swedish population. In general, Chinese patients were younger and had lower BMI, higher HbA1c, lower HOMA2-B and HOMA2-IR, and higher insulin use but lower metformin use than the Swedish patients.

Content The data contain the figures and tables describing the characteristics and variable distributions of Chinese patients with diabetes in each of the five clusters. Comparisons between the Chinese and Swedish patients with diabetes are also presented.

Conclusion The Swedish diabetes regrouping scheme is applicable to adult-onset diabetes in China, where a high proportion of patients have severe insulin deficient diabetes. Further validation against long-term diabetes complications remains warranted in future studies.


Study population

We conducted a national, multicenter, cross-sectional survey from April 2015 to October 2017. A total of 46 tertiary care hospitals in 24 major cities across mainland China agreed to participate in the survey. Outpatients with diabetes were consecutively recruited from the departments of endocrinology of the participating hospitals during the survey period. The study protocol was approved by the ethics review committees/institutional review boards of the participating hospitals or centers. The fieldwork was conducted in accordance with the World Medical Association’s Declaration of Helsinki, and written informed consent was obtained from all participants before data collection. The inclusion criteria were as follows: 1) diabetes diagnosed by the World Health Organization (WHO) 1999 criteria (1); 2) age at onset ≥ 18 years; 3) diabetes duration < 1 year. The exclusion criteria were as follows: 1) pregnancy or gestational diabetes mellitus (GDM); 2) acute diseases that would interfere with glucose metabolism; 3) malignancies; 4) clinical suspicion of other specific types of diabetes. Demographic, anthropometric, medical history and lifestyle information was collected by trained research nurses in face-to-face interviews using a uniform questionnaire. Medication information was retrieved from medical records.


After overnight fasting, venous blood samples were drawn for measurement of fasting plasma glucose (FPG), HbA1c, lipids (triglycerides (TGs), high density lipoprotein cholesterol (HDL-C), low density lipoprotein cholesterol (LDL-C)) and fasting C-peptide (FCP), all measured locally by standard methods. Two-hour postprandial plasma glucose (PPG) and C-peptide (PCP) were assayed using postprandial blood samples. GAD antibody (GADA) assays were performed at the core laboratory (Central South University) by radioligand assay, measured in duplicate as previously described (2). GADA positivity was defined as a titer ≥ 18 units/ml, a cut-off derived from the 99th percentile of 405 healthy subjects, and high-titer GADA was defined as a titer ≥ 180 units/ml, as previously reported (2). The estimates of beta cell function (HOMA2-B) and insulin resistance (HOMA2-IR) were calculated with the updated Homeostasis Model Assessment (HOMA2) calculator (3), based on FCP and FPG values.
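The two GADA cut-offs above can be expressed as a small classification rule. The following is a minimal sketch, not the laboratory's own code: the function name `classify_gada` and the three category labels are hypothetical, and only the thresholds (18 and 180 units/ml) come from the text. Note that a high-titer result is also GADA positive.

```python
# Thresholds taken from the text; everything else here is illustrative.
GADA_POSITIVE_CUTOFF = 18.0     # units/ml
GADA_HIGH_TITER_CUTOFF = 180.0  # units/ml


def classify_gada(titer_units_per_ml: float) -> str:
    """Return 'negative', 'positive' or 'high-titer' for a GADA titer.

    'high-titer' is a subset of GADA positive; it is reported separately
    here only to make both cut-offs visible.
    """
    if titer_units_per_ml < 0:
        raise ValueError("titer must be non-negative")
    if titer_units_per_ml >= GADA_HIGH_TITER_CUTOFF:
        return "high-titer"
    if titer_units_per_ml >= GADA_POSITIVE_CUTOFF:
        return "positive"
    return "negative"


def is_gada_positive(titer_units_per_ml: float) -> bool:
    """Binary GADA status, as used for the cluster model variable."""
    return classify_gada(titer_units_per_ml) != "negative"
```

Both cut-offs are inclusive, matching the "≥" in the definitions.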

Estimation of cardiovascular risks with Framingham Risk Score (FRS)

The Framingham algorithm has been validated for prediction of cardiovascular disease (CVD) in the Chinese population with adequate performance (4). This study used the score to calculate the 10-year risk of developing CVD or its components (i.e., coronary heart disease, cerebrovascular disease, peripheral vascular disease and heart failure) based on multivariable risk factors including sex, age, HDL-C, total cholesterol, systolic blood pressure (SBP), antihypertensive treatment, smoking and diabetes status (5).

Cluster analysis

We replicated the two clustering methods used by Ahlqvist et al. (6). Five continuous variables, i.e., body mass index (BMI), age at onset of diabetes, HbA1c, HOMA2-B and HOMA2-IR, and a binary variable denoting the presence or absence of glutamic acid decarboxylase antibody (GADA) were chosen as model variables. Cluster analysis was conducted on values standardized to a mean of 0 and a standard deviation (SD) of 1. Men and women were clustered separately to avoid sex-associated stratification. Patients with extreme outliers (>5 SDs from the mean) were excluded. TwoStep clustering was performed in SPSS version 25 for 2 to 15 clusters using log-likelihood as the distance measure and Schwarz’s Bayesian criterion as the clustering criterion. K-means clustering was performed with a k value of 4 using the KMeans function (n_init=100, max_iter=1000) in the sklearn package in Python version 3.6.1. Only individuals with negative GADA were included. Cluster stability was assessed by the Jaccard similarity of resampled clusters to the original clusters, computed with the clusterboot function (B = 2000, bootmethod = 'boot', clustermethod = kmeansCBI, k = 4) of the fpc package in R version 3.5.2; values greater than 0.75 were regarded as stable.
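The standardization, outlier exclusion and k-means steps described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names (`standardize`, `drop_extreme_outliers`, `cluster_one_sex`, `jaccard`) are hypothetical, the input is assumed to be a numeric matrix holding the continuous variables for one sex, and the text does not say whether values were re-standardized after outlier removal (they are not here). Only the `KMeans` settings (4 clusters, `n_init=100`, `max_iter=1000`) come from the text. The `jaccard` helper shows the set-level index on which the stability assessment is based; it does not reimplement `clusterboot`.

```python
import numpy as np
from sklearn.cluster import KMeans


def standardize(X):
    """Scale each column to mean 0 and SD 1 (sample SD, ddof=1)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def drop_extreme_outliers(Z, max_sd=5.0):
    """Keep rows whose standardized values all lie within max_sd SDs."""
    keep = (np.abs(Z) <= max_sd).all(axis=1)
    return Z[keep], keep


def cluster_one_sex(X, k=4, seed=0):
    """Standardize, exclude outliers, then run k-means for one sex.

    Returns the cluster labels of the retained rows and the boolean
    mask of retained rows. The seed is an assumption added for
    reproducibility; the text does not specify one.
    """
    Z, keep = drop_extreme_outliers(standardize(X))
    model = KMeans(n_clusters=k, n_init=100, max_iter=1000, random_state=seed)
    return model.fit_predict(Z), keep


def jaccard(members_a, members_b):
    """Jaccard similarity of two clusters given as collections of IDs."""
    a, b = set(members_a), set(members_b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Synthetic stand-in data (not study data): 200 patients, 5 variables.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    labels, keep = cluster_one_sex(X)
    print(len(labels), int(keep.sum()))
```

Running the function once per sex and then combining the labels mirrors the separate clustering of men and women described in the text.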

T-distributed stochastic neighbor embedding (t-SNE) (7), computed with the TSNE function (n_components = 2) in the sklearn package in Python version 3.6.1, was used to visualize the five clusters in two dimensions and to verify the clustering results.
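A two-dimensional t-SNE embedding of this kind can be sketched as below. The wrapper name `embed_2d`, the fixed seed and the synthetic input are assumptions for illustration; only `n_components=2` is taken from the text, and all other `TSNE` settings are left at the library defaults.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_2d(Z, seed=0):
    """Project standardized variables to 2D with t-SNE.

    The default perplexity (30) requires more than 30 samples.
    """
    Z = np.asarray(Z, dtype=float)
    return TSNE(n_components=2, random_state=seed).fit_transform(Z)


if __name__ == "__main__":
    # Synthetic stand-in data (not study data).
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(100, 6))
    embedding = embed_2d(Z)
    print(embedding.shape)
```

The resulting coordinates can then be plotted and colored by cluster label to check visually whether the clusters separate.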

Statistical analysis

SPSS 25.0 software (IBM Corporation, New York, NY, USA) was used for data analysis unless otherwise specified. Continuous data were shown as means ± SDs when normally distributed, or as medians (interquartile ranges) otherwise. Categorical variables were expressed as numbers (percentages). Comparisons of variables among groups were performed using analysis of variance (ANOVA), and Dunnett's test with cluster 5 as the reference group was used to adjust for multiple comparisons. The Mann-Whitney U test was used for variables for which normality was rejected. The χ2 test or Fisher’s exact test, as appropriate, was used to compare categorical variables among groups and to compare patient cluster distributions between our cohort and the Swedish cohort.



1. Alberti KG and Zimmet PZ. Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: diagnosis and classification of diabetes mellitus provisional report of a WHO consultation. Diabet Med. 1998;15(7):539-53.

2. Huang G, Yin M, Xiang Y, Li X, Shen W, Luo S, Lin J, Xie Z, Zheng P and Zhou Z. Persistence of glutamic acid decarboxylase antibody (GADA) is associated with clinical characteristics of latent autoimmune diabetes in adults: a prospective study with 3-year follow-up. Diabetes Metab Res Rev. 2016;32(6):615-22.

3. NCD Risk Factor Collaboration (NCD-RisC). Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4.4 million participants. Lancet. 2016;387(10027):1513-30.

4. Liu J, Hong Y, D'Agostino RB Sr, Wu Z, Wang W, Sun J, Wilson PW, Kannel WB and Zhao D. Predictive value for the Chinese population of the Framingham CHD risk assessment tool compared with the Chinese Multi-Provincial Cohort Study. JAMA. 2004;291(21):2591-9.

5. D'Agostino RB Sr, Vasan RS, Pencina MJ, Wolf PA, Cobain M, Massaro JM and Kannel WB. General cardiovascular risk profile for use in primary care: the Framingham Heart Study. Circulation. 2008;117(6):743-53.

6. Ahlqvist E, Storm P, Karajamaki A, Martinell M, Dorkhan M, Carlsson A, Vikman P, Prasad RB, Aly DM, Almgren P, Wessman Y, Shaat N, Spegel P, Mulder H, Lindholm E, Melander O, Hansson O, Malmqvist U, Lernmark A, Lahti K, Forsen T, Tuomi T, Rosengren AH and Groop L. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol. 2018;6(5):361-369.

7. van der Maaten L and Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008;9:2579-605.