PCA Based on Graph Laplacian Regularization and P-Norm for Gene Selection and Clustering

  • Chun Mei Feng
  • Ying Lian Gao
  • Jin Xing Liu
  • Chun Hou Zheng
  • Jiguo Yu

Research output: Contribution to journal › Article › peer-review

40 Scopus citations

Abstract

In modern molecular biology, identifying characteristic genes from gene expression data is both a research hotspot and a persistent difficulty. Traditional principal component analysis (PCA), a matrix decomposition method that minimizes reconstruction error, uses a quadratic error function, which is known to be sensitive to outliers and noise. Hence, a PCA method that remains reliable when outliers and noise are present is needed. In this paper, we develop a novel PCA method for the matrix decomposition problem that enforces a P-norm on the error function and adds a graph-Laplacian regularization term; we call it PgLPCA. The heart of the method, designed to reduce the influence of outliers and noise, is a new error function based on the non-convex proximal P-norm. In addition, the Laplacian regularization term is used to capture the intrinsic geometric structure in the data representation. To solve the minimization problem, we develop an efficient optimization algorithm based on the augmented Lagrange multiplier method. The method is used to select characteristic genes and to cluster samples from rapidly growing biological data, and it achieves higher accuracy than the compared methods.
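As a rough illustration of the two ingredients named in the abstract, the sketch below evaluates one plausible objective of this kind: an entrywise p-th power penalty on the reconstruction error plus a graph-Laplacian smoothness term. The abstract does not state the exact formulation, so the objective form, the k-nearest-neighbour graph construction, the function names `knn_graph_laplacian` and `objective`, and the parameters `p`, `alpha`, and `k` are all assumptions made for illustration. The sketch does not reproduce the paper's augmented Lagrange multiplier solver.

```python
import numpy as np

def knn_graph_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetric k-NN graph.

    X is assumed to be (genes x samples); graph nodes are the samples (columns).
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W

def objective(X, U, V, L, p=0.5, alpha=1.0):
    """Assumed objective: sum |X - U V^T|^p + alpha * tr(V^T L V).

    U is (genes x r), V is (samples x r). With 0 < p < 1 the error term is
    non-convex and grows more slowly on large residuals than a squared error.
    """
    residual = X - U @ V.T
    fit = np.sum(np.abs(residual) ** p)
    smooth = np.trace(V.T @ L @ V)
    return fit + alpha * smooth

# Toy usage on random data (not real gene expression data).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))   # 5 genes, 8 samples
U = rng.normal(size=(5, 2))
V = rng.normal(size=(8, 2))
L = knn_graph_laplacian(X, k=3)
value = objective(X, U, V, L, p=0.5, alpha=1.0)
```

With a squared error, a single residual of size 10 contributes 100 to the objective; with p = 0.5 it contributes about 3.16, which is the sense in which such a penalty is less dominated by outliers.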

Original language: English
Article number: 7891019
Pages (from-to): 257-265
Number of pages: 9
Journal: IEEE Transactions on Nanobioscience
Volume: 16
Issue number: 4
State: Published - Jun 2017
Externally published: Yes

Keywords

  • Gene selection
  • Laplacian embedding
  • non-convex proximal P-norm
  • principal component analysis
