Dimension Reduction-Based Penalized Logistic Regression for Cancer Classification Using Microarray Data
Source: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Volume 2, Issue 2 (April 2005)
Pages: 166-175
Year of Publication: 2005
ISSN:1545-5963
Publisher: IEEE Computer Society Press, Los Alamitos, CA, USA
DOI Bookmark: 10.1109/TCBB.2005.22

ABSTRACT

This paper presents the use of penalized logistic regression for cancer classification from microarray expression data. Two dimension reduction methods are each combined with penalized logistic regression so that both classification accuracy and computational speed are enhanced. Two other machine-learning methods, support vector machines and least-squares regression, are used for comparison. The proposed methods perform at least as well as both. They also have the advantage that the output probability is given explicitly and the regression coefficients are easier to interpret. Several other aspects pertinent to applying these methods to cancer classification, such as the selection of penalty parameters and components, are also discussed.
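
The general idea in the abstract, reducing many gene-expression features to a few components and then fitting a penalized logistic regression on them, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the two dimension reduction methods or the form of the penalty, so the sketch assumes a partial least squares (PLS1) projection and a ridge (L2) penalty. The data are synthetic, and every name (`pls_fit`, `pls_scores`, `fit_ridge_logistic`, `lam`, `k`) is hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(a):
    # Numerically stable logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def pls_fit(X, y, k):
    """PLS1 on centered X and y: return k (weight, loading) pairs."""
    X = [row[:] for row in X]  # copy, because X is deflated below
    p = len(X[0])
    comps = []
    for _ in range(k):
        # Weight vector: covariance of each gene with the class label.
        w = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]  # component scores
        tt = dot(t, t)
        load = [sum(row[j] * ti for row, ti in zip(X, t)) / tt for j in range(p)]
        for row, ti in zip(X, t):  # deflate X by the fitted component
            for j in range(p):
                row[j] -= ti * load[j]
        comps.append((w, load))
    return comps


def pls_scores(x, comps):
    """Project one centered sample, deflating it as in training."""
    x = x[:]
    scores = []
    for w, load in comps:
        t = dot(x, w)
        scores.append(t)
        x = [xj - t * lj for xj, lj in zip(x, load)]
    return scores


def fit_ridge_logistic(Z, y, lam, steps=500, lr=0.5):
    """Gradient descent on mean log-loss + (lam/2)*||beta||^2 (intercept unpenalized)."""
    n, k = len(Z), len(Z[0])
    b0, beta = 0.0, [0.0] * k
    for _ in range(steps):
        resid = [sigmoid(b0 + dot(beta, z)) - yi for z, yi in zip(Z, y)]
        g0 = sum(resid) / n
        g = [sum(r * z[j] for r, z in zip(resid, Z)) / n for j in range(k)]
        b0 -= lr * g0
        beta = [b - lr * (gj + lam * b) for b, gj in zip(beta, g)]
    return b0, beta


# Synthetic stand-in for microarray data: 200 "genes", 20 of them informative.
random.seed(0)
P, INFORMATIVE, K = 200, 20, 2


def sample(label):
    return [random.gauss(2.0 * label if j < INFORMATIVE else 0.0, 1.0) for j in range(P)]


y_train = [i % 2 for i in range(40)]
y_test = [i % 2 for i in range(20)]
X_train = [sample(c) for c in y_train]
X_test = [sample(c) for c in y_test]

means = [sum(col) / len(col) for col in zip(*X_train)]
y_bar = sum(y_train) / len(y_train)


def center(x):
    return [a - m for a, m in zip(x, means)]


comps = pls_fit([center(x) for x in X_train], [c - y_bar for c in y_train], K)
T_train = [pls_scores(center(x), comps) for x in X_train]
# Scale each component to unit spread so the fixed step size is safe.
scale = [math.sqrt(sum(t[j] ** 2 for t in T_train) / len(T_train)) for j in range(K)]


def to_features(x):
    return [t / s for t, s in zip(pls_scores(center(x), comps), scale)]


b0, beta = fit_ridge_logistic([to_features(x) for x in X_train], y_train, lam=0.1)
probs = [sigmoid(b0 + dot(beta, to_features(x))) for x in X_test]
test_acc = sum((p > 0.5) == bool(c) for p, c in zip(probs, y_test)) / len(y_test)
print(f"coefficients: {beta}, held-out accuracy: {test_acc:.2f}")
```

Each entry of `probs` is an explicit class probability, and `beta` holds one coefficient per component, which reflects the two advantages the abstract mentions. The penalty `lam` and the number of components `K` are set arbitrarily here; in practice they would need to be selected, for example by cross-validation.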

