
CE-PLoc: An ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition

Paper ID: 15224
Volume ID: 1393
Publish Year: 2011
Pages: 12
File Format: PDF
Full-Text: Available
Abstract

Precise information about where a protein is located in a cell facilitates understanding of the protein's function and its interactions in the cellular environment. This information also helps in the study of specific metabolic pathways and other biological processes. We propose an ensemble approach called “CE-PLoc” for predicting subcellular locations based on the fusion of individual classifiers. The proposed approach uses features obtained from both dipeptide composition (DC) and amphiphilic pseudo amino acid composition (PseAAC) based feature extraction strategies. For a selected base learner, different feature spaces are obtained by varying the dimensionality of the PseAAC features. First, the performance of individual learning mechanisms (support vector machine, nearest neighbor, probabilistic neural network, and covariant discriminant) trained on PseAAC based features is analyzed. Next, classifiers are developed using the same learning mechanism but trained on PseAAC based feature spaces of varying dimensions. Combining these classifiers through a voting strategy improves prediction performance. Prediction performance is further enhanced by developing CE-PLoc as a combination of different learning mechanisms trained on both the DC based feature space and PseAAC based feature spaces of varying dimensions. The predictive performance of the proposed CE-PLoc is evaluated on two benchmark datasets of protein subcellular locations using accuracy, the Matthews correlation coefficient (MCC), and Q-statistics. Using the jackknife test, prediction accuracies of 81.47% and 83.99% are obtained for the 12-location and 14-location datasets, respectively. In the independent dataset test, prediction accuracies are 87.04% and 87.33% for the 12-class and 14-class datasets, respectively.
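As a rough illustration of two ingredients named in the abstract, the sketch below computes a dipeptide composition vector for a sequence and fuses class labels from several classifiers by voting. The function names, the normalization by the number of valid dipeptides, and the tie-breaking rule are assumptions made for this example; the abstract does not specify how CE-PLoc weights votes or breaks ties, so this should not be read as the authors' implementation.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: pairs containing a non-standard residue are skipped,
    and counts are normalized by the number of valid pairs.
    """
    seq = sequence.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(
        p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def majority_vote(predictions):
    """Fuse class labels from several classifiers by plurality vote.

    Assumption: ties go to the label that appears first in the list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical usage: one short sequence and three classifier outputs.
features = dipeptide_composition("ACAC")
fused = majority_vote(["nucleus", "cytoplasm", "nucleus"])
```

In a full pipeline, each base classifier would be trained on its own feature space (DC or PseAAC of a given dimension), and only the predicted labels would be passed to the voting step.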

Highlights
► We present a novel ensemble approach called “CE-PLoc” for predicting subcellular locations.
► The approach uses dipeptide and PseAAC based feature extraction strategies.
► The performance of the proposed CE-PLoc is evaluated on two benchmark datasets.
► CE-PLoc is observed to perform better than existing approaches.

Keywords
Ensemble classifier; Protein subcellular location; Pseudo amino acid composition; Dipeptide composition
Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 35, Issue 4, 10 August 2011, Pages 218–229
Subjects
Physical Sciences and Engineering > Chemical Engineering > Bioengineering