Sparse optimal scoring for multiclass cancer diagnosis and biomarker detection using microarray data ☆
Gene expression data sets hold the promise to provide cancer diagnosis on the molecular level. However, using all the gene profiles for diagnosis may be suboptimal. Detection of the molecular signatures not only reduces the number of genes needed for discrimination purposes, but may elucidate the roles they play in the biological processes. Therefore, a central part of diagnosis is to detect a small set of tumor biomarkers which can be used for accurate multiclass cancer classification. This task calls for effective multiclass classifiers with built-in biomarker selection mechanism.We propose the sparse optimal scoring (SOS) method for multiclass cancer characterization. SOS is a simple prototype classifier based on linear discriminant analysis, in which predictive biomarkers can be automatically determined together with accurate classification. Thus, SOS differentiates itself from many other commonly used classifiers, where gene preselection must be applied before classification. We obtain satisfactory performance while applying SOS to several public data sets.
Journal: Computational Biology and Chemistry - Volume 32, Issue 6, December 2008, Pages 417–425