DOI: 10.1145/332306.332328

Tissue classification with gene expression profiles

Published: 8 April 2000

ABSTRACT

Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine two sets of gene expression data measured across sets of tumor and normal clinical samples. One set consists of 2,000 genes, measured in 62 epithelial colon samples [1]. The second consists of ≈ 100,000 clones, measured in 32 ovarian samples (unpublished, extension of data set described in [26]).

We examine the use of scoring methods that measure the separation of tumors from normals using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of leave-one-out cross validation (LOOCV) experiments on the two data sets, employing SVM [8], AdaBoost [13] and a novel clustering-based classification technique. As tumor samples can differ from normal samples in their cell-type composition, we also perform LOOCV experiments using appropriately modified sets of genes, attempting to eliminate the resulting bias.
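
The pipeline described above (score each gene by how well it separates the two classes, keep the best-scoring genes, then estimate accuracy by LOOCV) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the method of the paper: the score counts the fewest errors made by any single expression threshold, a nearest-centroid rule stands in for SVM, AdaBoost and the clustering-based classifier, and the data and all function names (`separation_score`, `select_genes`, `nearest_centroid`, `loocv_accuracy`, `make_toy_data`) are hypothetical.

```python
import random


def separation_score(values, labels):
    """Fewest errors any single threshold on this gene makes (0 = perfect split)."""
    order = sorted(zip(values, labels))
    n = len(order)
    # Threshold below every value: all samples are predicted tumor (1),
    # so every normal sample (0) counts as an error.
    errors = sum(1 for lab in labels if lab == 0)
    best = min(errors, n - errors)
    for i, (val, lab) in enumerate(order):
        # Raising the threshold past this sample flips its prediction to 0.
        errors += 1 if lab == 1 else -1
        # Only score thresholds that fall between distinct expression values.
        if i + 1 == n or order[i + 1][0] != val:
            # n - errors is the error count of the reversed rule.
            best = min(best, errors, n - errors)
    return best


def select_genes(X, y, k):
    """Indices of the k genes with the lowest (best) separation score."""
    n_genes = len(X[0])
    scores = [separation_score([row[g] for row in X], y) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: (scores[g], g))[:k]


def nearest_centroid(train_X, train_y, x, genes):
    """Label of the class centroid closest to x, using only the selected genes."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy; genes are re-selected inside every fold."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        genes = select_genes(train_X, train_y, k)
        correct += nearest_centroid(train_X, train_y, X[i], genes) == y[i]
    return correct / len(X)


def make_toy_data(n_samples=20, n_genes=30, n_informative=3, seed=0):
    """Synthetic profiles: the first n_informative genes are shifted in class 1."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[rng.uniform(-1, 1) + (5 * lab if g < n_informative else 0)
          for g in range(n_genes)] for lab in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(loocv_accuracy(X, y, k=3))
```

Note that gene selection is repeated inside each fold, so the held-out sample never influences which genes are chosen. Selecting genes once on the full data set before cross validation would tend to give an optimistic accuracy estimate.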

We demonstrate a success rate of at least 90% in tumor vs. normal classification, using sets of selected genes, both with and without members related to cellular contamination. These results are insensitive to the exact selection mechanism, over a certain range.

References

  1. U. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra, D. Mack, and A. J. Levine. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Nat. Acad. Sci. USA, 96:6745-6750, 1999.
  2. A. Ben-Dor, R. Shamir, and Z. Yakhini. Clustering gene expression patterns. Journal of Computational Biology, 6:281-297, 1999.
  3. C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, U.K., 1995.
  4. M. P. S. Brown, W. N. Grundy, D. Lin, N. Cristianini, C. Sugnet, T. S. Furey, M. Ares Jr., and D. Haussler. Knowledge-based analysis of microarray gene expression data using support vector machines. Technical Report UCSC-CRL-99-09, UC Santa Cruz, 1999.
  5. C. J. C. Burges. A tutorial on Support Vector Machines for pattern recognition. Data Mining and Knowledge Discovery, 2:121-167, 1998.
  6. S. Chu, J. DeRisi, M. Eisen, J. Mulholland, D. Botstein, P. Brown, and I. Herskowitz. The transcriptional program of sporulation in budding yeast. Science, 282:699-705, 1998.
  7. P. A. Clarke, M. George, D. Cunningham, I. Swift, and P. Workman. Analysis of tumor gene expression following chemotherapeutic treatment of patients with bowel cancer. In Proc. Nature Genetics Microarray Meeting 99, page 39, Scottsdale, Arizona, 1999.
  8. C. Cortes and V. Vapnik. Support vector machines. Machine Learning, 20:273-297, 1995.
  9. J. DeRisi, V. Iyer, and P. Brown. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science, 282:699-705, 1997.
  10. R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. John Wiley & Sons, New York, 1973.
  11. M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein. Cluster analysis and display of genome-wide expression patterns. Proc. Nat. Acad. Sci. USA, 95:14863-14868, 1998.
  12. B. Everitt. Cluster Analysis. Edward Arnold, London, third edition, 1993.
  13. Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Computer and System Sciences, 55:119-139, 1997.
  14. T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, J. P. Mesirov, M. Gaasenbeek, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286:531-537, 1999.
  15. V. R. Iyer, M. B. Eisen, D. T. Ross, G. Schuler, T. Moore, J. C. F. Lee, J. M. Trent, L. M. Staudt, J. Hudson, M. S. Boguski, D. Lashkari, D. Shalon, D. Botstein, and P. O. Brown. The transcriptional program in the response of human fibroblasts to serum. Science, 283:83-87, 1999.
  16. Kim lab home page. http://cmgm.stanford.edu/~kimlab/.
  17. J. Khan, R. Simon, M. Bittner, Y. Chen, S. B. Leighton, T. Pohida, P. D. Smith, Y. Jiang, G. C. Gooden, J. M. Trent, and P. S. Meltzer. Gene expression profiling of alveolar rhabdomyosarcoma with cDNA microarrays. Cancer Research, 1998.
  18. R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proc. Fourteenth International Joint Conference on Artificial Intelligence (IJCAI '95), pages 1137-1143. Morgan Kaufmann, San Francisco, Calif., 1995.
  19. D. J. Lockhart, H. Dong, M. C. Byrne, M. T. Follettie, M. V. Gallo, M. S. Chee, M. Mittmann, C. Wang, M. Kobayashi, H. Horton, and E. L. Brown. DNA expression monitoring by hybridization of high density oligonucleotide arrays. Nature Biotechnology, 14:1675-1680, 1996.
  20. L. Mason, P. Bartlett, and J. Baxter. Direct optimization of margins improves generalization in combined classifiers. In Advances in Neural Information Processing Systems 11. MIT Press, Cambridge, Mass., 1999.
  21. C. M. Perou, S. S. Jeffrey, M. van de Rijn, C. A. Rees, M. B. Eisen, D. T. Ross, A. Pergamenschikov, C. F. Williams, S. X. Zhu, J. C. F. Lee, D. Lashkari, D. Shalon, P. O. Brown, and D. Botstein. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc. Nat. Acad. Sci. USA, 96:9212-9217, 1999.
  22. B. D. Ripley. Pattern Recognition and Neural Networks. Cambridge University Press, 1996.
  23. R. E. Schapire. The strength of weak learnability. Machine Learning, 5:197-227, 1990.
  24. R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 26:1651-1686, 1998.
  25. T. H. Schiedeck, S. Christoph, M. Duchrow, and H. P. Bruch. Detection of hL6-mRNA: new possibilities in serologic tumor diagnosis of colorectal carcinomas. Zentralbl Chir, 123(2):159-162, 1998.
  26. M. Schummer, W. Ng, R. Bumgarner, P. Nelson, B. Schummer, L. Hassell, L. R. Baldwin, B. Karlan, and L. Hood. Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas. Gene, 238:375-385, 1999.
  27. J. Swets. Measuring the accuracy of diagnostic systems. Science, 240:1285-1293, 1988.
  28. V. Vapnik. Statistical Learning Theory. John Wiley & Sons, New York, 1999.
  29. X. Wen, S. Fuhrman, G. S. Michaels, D. B. Carr, S. Smith, J. L. Barker, and R. Somogyi. Large-scale temporal gene expression mapping of central nervous system development. Proc. Nat. Acad. Sci. USA, 95:334-339, 1998.
  30. Y. Y. Xiang, D. Y. Wang, M. Tanaka, M. Suzuki, E. Kiyokawa, H. Igarashi, Y. Naito, Q. Shen, and H. Sugimura. Expression of high-mobility group-I mRNA in human gastrointestinal adenocarcinoma and corresponding non-cancerous mucosa. Int. J. Cancer, 74(1):1-6, Feb 1997.

Published in

RECOMB '00: Proceedings of the fourth annual international conference on Computational molecular biology
April 2000
329 pages
ISBN: 1581131860
DOI: 10.1145/332306

Copyright © 2000 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 148 of 538 submissions, 28%
