
Using ensemble methods to deal with imbalanced data in predicting protein–protein interactions

Paper ID: 15207
Volume ID: 1391
Publish Year: 2012
Pages: 6
File Format: PDF
Full-Text: Available
Abstract

Among proteins, interacting pairs are usually far fewer than non-interacting ones, so an imbalanced data problem arises in protein–protein interaction (PPI) prediction. In this article, we introduce two ensemble methods to address this problem. Both methods combine a cluster-based under-sampling technique with fusion classifiers. We evaluate the ensemble methods on a dataset from the Database of Interacting Proteins (DIP) using 10-fold cross-validation. All the prediction models achieve an area under the receiver operating characteristic curve (AUC) of about 95%. Our results show that the ensemble classifiers are quite effective in predicting PPIs; we also draw some conclusions about the performance of ensemble methods for PPIs on imbalanced data. The prediction software and all datasets used in this work are freely available at http://cic.scu.edu.cn/bioinformatics/Ensemble_PPIs/index.html.
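The general idea in the abstract above (cluster the majority class, draw balanced subsets from the clusters, train one classifier per subset, fuse their scores, and report AUC) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`cluster_undersample`, `train_ensemble`, `auc`), the proportional per-cluster sampling rule, the score-averaging fusion, and the nearest-centroid base classifier are all hypothetical choices made here to keep the example dependency-free. The paper's base classifiers (the keywords mention SVM and ANN) and its feature encoding are not reproduced.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points, k, iters=10, rng=None):
    """Tiny k-means; returns k clusters (lists of points, possibly empty)."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Keep the old center when a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


def cluster_undersample(negatives, n_keep, k, rng):
    """Draw roughly n_keep negatives, spread over clusters by cluster size.

    Assumption: proportional sampling per cluster; the paper may differ.
    """
    sample = []
    for c in kmeans(negatives, k, rng=rng):
        if not c:
            continue
        quota = max(1, round(n_keep * len(c) / len(negatives)))
        sample.extend(rng.sample(c, min(quota, len(c))))
    return sample


def train_centroid(pos, neg):
    """Stand-in base classifier: higher score means closer to the positives."""
    cp, cn = mean(pos), mean(neg)
    return lambda x: sq_dist(x, cn) - sq_dist(x, cp)


def train_ensemble(pos, neg, n_models=5, k=3, seed=0):
    """Train one base model per balanced subset and fuse by averaging scores."""
    rng = random.Random(seed)
    models = [
        train_centroid(pos, cluster_undersample(neg, len(pos), k, rng))
        for _ in range(n_models)
    ]
    return lambda x: sum(m(x) for m in models) / len(models)


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic, imbalanced toy data (10 negatives per positive); not DIP data.
    rng = random.Random(1)
    neg = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
    pos = [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(20)]
    score = train_ensemble(pos, neg)
    labels = [0] * len(neg) + [1] * len(pos)
    print("training-set AUC:", auc([score(x) for x in neg + pos], labels))
```

The demo scores the training data only to show the call sequence. A faithful evaluation would wrap `train_ensemble` in 10-fold cross-validation, as the abstract describes, and average the held-out AUC across folds.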

Highlights
► Two ensemble methods are proposed to overcome the imbalanced data problem in PPIs.
► These methods combine a cluster-based under-sampling technique and fusion classifiers.
► The performance of these methods is analyzed with different base classifiers.
► An easy-to-use web server has been developed.

Keywords
Abbreviations: PPIs, protein–protein interactions; DIP, Database of Interacting Proteins; AC, auto covariance; SVM, support vector machine; ANN, artificial neural network; ROC, receiver operating characteristic; AUC, area under ROC curve
Protein–protein interaction; Ense
Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 36, February 2012, Pages 36–41
Subjects
Physical Sciences and Engineering > Chemical Engineering > Bioengineering