Dr (Colon Cancer)

Cite this dataset

Jiang, QunGuang; Fu, Xiaorui; Duanmu, Jinzhong; Li, Taiyuan (2019). Dr (Colon Cancer) [Dataset]. Dryad.


Colon adenocarcinoma (COAD) is the most common colon cancer and exhibits high mortality. Because of their association with cancer progression, long noncoding RNAs (lncRNAs) have become prognostic biomarkers. Using clinical information and lncRNA expression profiles from The Cancer Genome Atlas (TCGA) database, this study aims to construct a prognostic lncRNA signature to estimate patient prognosis. In the training cohort, prognosis-related lncRNAs were selected from differentially expressed lncRNAs by univariate Cox analysis. Least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox analysis were then employed to identify prognostic lncRNAs, from which the prognostic signature was constructed. The prognostic model calculated a risk score for each COAD patient and split the patients into low-risk and high-risk groups. Compared with the low-risk group, the high-risk group had a significantly poorer prognosis. The prognostic signature was then validated in the validation cohort and the entire cohort. Receiver operating characteristic (ROC) curve analysis and the c-index were computed in the entire cohort. Moreover, the prognostic lncRNA signature was combined with clinicopathological risk factors to construct a nomogram for predicting COAD prognosis in the clinic. Finally, 7 lncRNAs (CTC-273B12.10, AC009404.2, AC073283.7, RP11-167H9.4, AC007879.7, RP4-816N1.7, RP11-400N13.2) were identified and validated in the different cohorts. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the mRNAs co-expressed with the 7 prognostic lncRNAs suggested 4 significantly up-regulated pathways: the AGE-RAGE signaling pathway, focal adhesion, ECM-receptor interaction, and the PI3K/Akt signaling pathway. In summary, our study verified that these 7 lncRNAs can serve as biomarkers to predict the prognosis of COAD patients and to help design personalized treatment.
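
The risk-score step described above can be illustrated with a minimal sketch. It assumes the common convention that a Cox-based risk score is the sum of each signature lncRNA's expression value multiplied by its model coefficient, and that patients are split at the median score; the description does not state the exact formula or cutoff, so both are assumptions. The function names, the gene labels `lnc_A` and `lnc_B`, and all numbers are hypothetical toy values, not the coefficients or expression data from this dataset.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_cutoff(scores, cutoff=None):
    """Label each patient 'high' or 'low' risk.

    The default cutoff is the median score, which is an assumption here,
    not something stated in the dataset description.
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
toy_coefs = {"lnc_A": 0.5, "lnc_B": -0.25}
toy_expr = {
    "P1": {"lnc_A": 2.0, "lnc_B": 4.0},
    "P2": {"lnc_A": 4.0, "lnc_B": 0.0},
    "P3": {"lnc_A": 0.0, "lnc_B": 4.0},
    "P4": {"lnc_A": 8.0, "lnc_B": 4.0},
}

toy_scores = {pid: risk_score(expr, toy_coefs) for pid, expr in toy_expr.items()}
toy_groups = split_by_cutoff(toy_scores)
print(toy_scores)
print(toy_groups)
```

In practice the coefficients would come from the fitted multivariate Cox model, and the cutoff chosen on the training cohort would be reused unchanged on the validation cohort.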


From the official TCGA website, we obtained RNA-Seq data (level 3, HTSeq-FPKM) together with the relevant clinical information for 473 colon adenocarcinoma patients. Approval from an ethics committee was not needed because the clinical data and the RNA-Seq expression data were obtained from TCGA. The quality of the clinical data was then assessed, and some patients were excluded from our study for the following reasons: (a) clinical prognostic information was not available; (b) the patient died within the first month after diagnosis; (c) death was caused by other diseases or accidents.
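
The three exclusion criteria can be expressed as a small filter. This is only a sketch under stated assumptions: the record fields `os_days`, `dead`, and `other_cause` are hypothetical names rather than actual TCGA clinical column names, and "first month" is approximated as 30 days.

```python
def exclusion_reason(patient, min_days=30):
    """Return 'a', 'b', or 'c' for an excluded patient, or None if the patient is kept.

    Field names and the 30-day threshold are assumptions made for illustration.
    """
    # (a) clinical prognostic information is not available
    if patient.get("os_days") is None or patient.get("dead") is None:
        return "a"
    if patient["dead"]:
        # (b) death within the first month after diagnosis
        if patient["os_days"] < min_days:
            return "b"
        # (c) death caused by other diseases or accidents
        if patient.get("other_cause", False):
            return "c"
    return None


# Hypothetical records, not taken from the dataset.
toy_patients = [
    {"id": "P1", "os_days": 900, "dead": False},
    {"id": "P2", "os_days": None, "dead": None},
    {"id": "P3", "os_days": 12, "dead": True},
    {"id": "P4", "os_days": 400, "dead": True, "other_cause": True},
    {"id": "P5", "os_days": 650, "dead": True, "other_cause": False},
]

kept = [p["id"] for p in toy_patients if exclusion_reason(p) is None]
print(kept)
```

Returning the reason code rather than a plain boolean makes it easy to tally how many patients each criterion removed.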