DOI: 10.1145/3107411.3108211
poster

Challenges in Prediction of different Cancer Stages using Gene Expression Profile of Cancer Patients

Published: 20 August 2017

ABSTRACT

Despite the plethora of gene expression-based cancer biomarkers in the scientific literature, few make their way to the clinic. Several past efforts to predict cancer biomarkers have had very limited success. One of the challenges in cancer biology is to predict cancer at an early stage. The success of various therapies for cancer patients depends on correct identification of the stage or progression of the cancer. Despite tremendous progress in genomics and proteomics, the performance of stage classification has not improved substantially. Recently our group developed CancerCSP, a server with prediction models for discriminating early- and late-stage clear cell renal cancer (ccRCC) samples based on the gene expression profile. We achieved a maximum accuracy of 72.64% with an ROC value of 0.81, even though we tried state-of-the-art techniques to improve the performance of our models. This raises the question of why the models fail to discriminate early- and late-stage ccRCC patients with high accuracy. In this poster, we analyze ccRCC samples obtained from The Cancer Genome Atlas (TCGA) data portal to understand why the stage classification models fail.

First, we performed a bin-wise analysis of the top 20 genes that discriminate early- and late-stage samples with the highest ROC (single-gene models using an expression threshold). The expression of each gene overlapped significantly between early- and late-stage samples. Although the number of early- and late-stage samples varied across gene expression bins, this variation was not sufficient to classify both types of samples with high accuracy. For example, the gene NR3C2 had a maximum ROC of 0.67 at an expression (log RSEM) threshold of 7.61. Nearly 70% of early-stage patients were above this threshold, which made it an average expression marker, but nearly 55% of late-stage patients were also above it, which increased the false positives.

Second, hierarchical clustering of ccRCC samples using 64 gene expression features selected with Weka showed weak concordance with pathological stage. k-means clustering of patients into four groups produced four separable clusters, but these clusters were not associated with pathological stage. These observations led to the conclusion that molecular parameters do not always comply with histopathological features.

Third, we identified patients who were not predicted correctly by any of the four machine-learning algorithms (SVM, Random Forest, SMO and Naïve Bayes); there were many such samples. The false positives and false negatives belonged to explicit clusters obtained through clustering. This further points to the interspersed nature of the data with respect to the histopathological stages of cancer. We conclude that the gene expression profile is not adequate to classify different stages of cancer samples.
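The single-gene threshold analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes early stage is the positive class and that higher expression points to early stage, and the function names (`roc_auc`, `fraction_above`, `balanced_accuracy`) and the toy expression values are hypothetical. Only the threshold of 7.61 and the approximate 70% and 55% figures come from the abstract.

```python
from typing import Sequence


def roc_auc(early: Sequence[float], late: Sequence[float]) -> float:
    """ROC AUC for one gene, with early stage as the positive class.

    Computed as the fraction of (early, late) pairs in which the early-stage
    sample has the higher expression; ties count as half a correct pair.
    """
    correct = 0.0
    for e in early:
        for l in late:
            if e > l:
                correct += 1.0
            elif e == l:
                correct += 0.5
    return correct / (len(early) * len(late))


def fraction_above(values: Sequence[float], threshold: float) -> float:
    """Fraction of samples whose expression is strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)


def balanced_accuracy(tpr: float, fpr: float) -> float:
    """Mean of sensitivity (tpr) and specificity (1 - fpr)."""
    return (tpr + (1.0 - fpr)) / 2.0


if __name__ == "__main__":
    # Toy log RSEM values, invented for illustration only (not TCGA data).
    early = [8.4, 7.9, 7.7, 7.2, 8.1]
    late = [7.8, 7.0, 7.65, 6.9, 8.0]
    threshold = 7.61  # threshold reported for NR3C2 in the abstract

    print("AUC on toy data:", roc_auc(early, late))
    print("early above threshold:", fraction_above(early, threshold))
    print("late above threshold:", fraction_above(late, threshold))

    # Rounded figures from the abstract: ~70% of early-stage and ~55% of
    # late-stage patients lie above the threshold.
    print("balanced accuracy:", balanced_accuracy(0.70, 0.55))
```

With the rounded figures from the abstract, the balanced accuracy works out to about 0.575, only slightly better than chance, which is consistent with the reported overlap in expression between early- and late-stage samples. This number is a rough derivation from the approximate percentages, not a result stated by the authors.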

        • Published in

          ACM-BCB '17: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
          August 2017
          800 pages
          ISBN: 9781450347228
          DOI: 10.1145/3107411

          Copyright © 2017 Owner/Author

          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          ACM-BCB '17 Paper Acceptance Rate: 42 of 132 submissions, 32%
          Overall Acceptance Rate: 254 of 885 submissions, 29%