
Multi objective SNP selection using pareto optimality

Paper ID: 15151
Volume ID: 1382
Publish Year: 2013
Pages: 6
File Format: PDF
Full-Text: Available
Abstract

Biomarker discovery is a challenging task in bioinformatics, especially for high-dimensional problems such as SNP (single nucleotide polymorphism) datasets. Various feature selection methods can be applied to this task. Typically, these methods use the features and class labels of the training samples to select feature subsets with maximal classification accuracy. Although finding such class-discriminative features is crucial, selecting SNPs that also maximize other properties inherent to population genetics, such as the correlation between the genetic diversity and the geographical distance of ethnic groups, can be equally important. In this work, a multi-objective optimization technique, Pareto optimality, is used to select SNP subsets that offer both high classification accuracy and high correlation between genomic and geographical distances. The discriminatory power of an SNP is determined using mutual information, and its contribution to the genomic–geographical correlation is estimated using its loadings on principal components. Combining these objectives, the proposed method identifies SNP subsets that, on the Human Genome Diversity Project (HGDP) SNP dataset, discriminate ethnic groups better than those obtained with mutual information alone and yield higher correlation than those obtained with principal components alone.
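
The two-objective selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes genotypes are coded as small integers (for example 0/1/2) in a samples-by-SNPs matrix, scores each SNP by its mutual information with the class label and by the sum of its absolute loadings on the leading principal components, and keeps the SNPs that are not dominated on both scores. The function names, the loading score, and the choice of two components are all assumptions made for the example.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    xs, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    ys, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xs.size, ys.size))
    np.add.at(joint, (xi, yi), 1.0)           # contingency table
    joint /= joint.sum()                      # joint probabilities
    px = joint.sum(axis=1, keepdims=True)     # marginal of x
    py = joint.sum(axis=0, keepdims=True)     # marginal of y
    nz = joint > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def pc_loading_scores(G, n_components=2):
    """Assumed score: sum of absolute loadings of each SNP on the leading PCs."""
    Gc = G - G.mean(axis=0)                   # centre each SNP column
    _, _, vt = np.linalg.svd(Gc, full_matrices=False)
    return np.abs(vt[:n_components]).sum(axis=0)

def pareto_front(scores):
    """Indices of rows not dominated by any other row (all columns maximised)."""
    keep = []
    for i in range(scores.shape[0]):
        at_least_as_good = np.all(scores >= scores[i], axis=1)
        strictly_better = np.any(scores > scores[i], axis=1)
        if not np.any(at_least_as_good & strictly_better):
            keep.append(i)
    return keep

def select_snps(G, labels, n_components=2):
    """Return indices of SNPs on the Pareto front of (MI, PC-loading score)."""
    mi = np.array([mutual_information(G[:, j], labels) for j in range(G.shape[1])])
    load = pc_loading_scores(G, n_components)
    return pareto_front(np.column_stack([mi, load]))
```

A Pareto front avoids having to pick a weighting between the two objectives; an SNP is kept whenever no other SNP is at least as good on both scores and strictly better on one.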

Highlights
► Twelve ethnic groups selected from the HGDP dataset are used to demonstrate multi-objective SNP selection.
► Pareto optimality is used to select SNPs offering high accuracy and high correlation between genomic and geographical distances.
► Pairwise genomic distances of ethnic groups are highly correlated with their geographical distances.
► Chromosome 11 was found to possess SNPs yielding both high correlation and high accuracy values.
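
The correlation between pairwise genomic and geographical distances mentioned in the highlights could be computed along the following lines. This is a sketch under stated assumptions: the page does not say which distance measures the paper uses, so the example takes the Euclidean distance between per-group mean genotype vectors as the genomic distance, the great-circle (haversine) distance between hypothetical group coordinates as the geographical distance, and the Pearson coefficient over all group pairs as the correlation.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between points given in degrees."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlmb = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2 * radius_km * np.arcsin(np.sqrt(a))

def genomic_geographic_correlation(G, groups, coords):
    """Pearson correlation between pairwise genomic and geographic distances.

    G      : samples-by-SNPs genotype matrix
    groups : group label of each sample
    coords : dict mapping group label -> (latitude, longitude) in degrees
             (hypothetical input; not specified by the source)
    """
    G = np.asarray(G, dtype=float)
    groups = np.asarray(groups)
    names = sorted(coords)
    # Assumed genomic distance: Euclidean distance between group mean genotypes.
    centroids = np.array([G[groups == g].mean(axis=0) for g in names])
    iu = np.triu_indices(len(names), k=1)     # each unordered pair once
    gen = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)[iu]
    lat = np.array([coords[g][0] for g in names], dtype=float)
    lon = np.array([coords[g][1] for g in names], dtype=float)
    geo = haversine_km(lat[:, None], lon[:, None], lat[None, :], lon[None, :])[iu]
    return float(np.corrcoef(gen, geo)[0, 1])
```

Restricting `G` to a candidate SNP subset before calling the function gives the correlation achieved by that subset, which is the second quantity the method tries to keep high.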

Keywords
Feature selection; Principal component analysis (PCA); Mutual information (MI); Genomic–geographical distance; Human Genome Diversity Project SNP dataset
Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 43, April 2013, Pages 23–28
Subjects
Physical Sciences and Engineering; Chemical Engineering; Bioengineering