Robust Principal Component Analysis Regularized by Truncated Nuclear Norm for Identifying Differentially Expressed Genes

  • Ya Xuan Wang
  • Ying Lian Gao
  • Jin Xing Liu
  • Xiang Zhen Kong
  • Hai Jun Li

Research output: Contribution to journal, Article, peer-review

8 Scopus citations

Abstract

Identifying differentially expressed genes among thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method for identifying differentially expressed genes. The RPCA method uses the nuclear norm to approximate the rank function. However, theoretical studies have shown that minimizing the nuclear norm penalizes all singular values simultaneously, so it may not be the best approximation of the rank function. The truncated nuclear norm is defined as the sum of the smallest singular values, excluding the largest few, and may approximate the rank function better than the nuclear norm. In this paper, a novel method is proposed by replacing the nuclear norm in RPCA with the truncated nuclear norm; it is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered sparse signals, the differentially expressed genes are viewed as sparse perturbation signals. Thus, the differentially expressed genes can be identified from the sparse matrix. Experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in identifying differentially expressed genes.
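
The two quantities the abstract relies on can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter `r` (the number of largest singular values left out), the genes-as-rows orientation, and the row-sum scoring of the sparse matrix are all assumptions made here for illustration. The sketch only compares the nuclear norm with the truncated nuclear norm on a synthetic low-rank matrix and ranks genes from an already-given sparse matrix; it does not solve the TRPCA optimization problem.

```python
import numpy as np


def nuclear_norm(X):
    """Sum of all singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()


def truncated_nuclear_norm(X, r):
    """Sum of the singular values of X after dropping the r largest.

    np.linalg.svd returns singular values in descending order,
    so s[r:] holds the min(m, n) - r smallest ones.
    """
    s = np.linalg.svd(X, compute_uv=False)
    return s[r:].sum()


def rank_genes(S, top_k):
    """Rank genes by the total absolute value of their row in S.

    Assumption: rows of S are genes and columns are samples.
    This scoring rule is illustrative, not taken from the paper.
    """
    scores = np.abs(S).sum(axis=1)
    order = np.argsort(scores)[::-1]  # largest score first
    return order[:top_k]


# Synthetic data (hypothetical): a rank-2 matrix plus a sparse perturbation.
rng = np.random.default_rng(0)
L = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 20))
S = np.zeros((50, 20))
S[3, :] = 5.0    # gene 3 is perturbed in every sample
S[7, :5] = 2.0   # gene 7 is perturbed in five samples

print("nuclear norm of L:          ", nuclear_norm(L))
print("truncated nuclear norm, r=2:", truncated_nuclear_norm(L, 2))
print("top-2 genes from S:         ", rank_genes(S, 2))
```

For an exactly rank-2 matrix the truncated nuclear norm with `r = 2` is zero up to rounding, while the nuclear norm grows with the magnitude of the two leading singular values. This is the sense in which the truncated norm tracks the rank more closely.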

Original language: English
Article number: 7968372
Pages (from-to): 447-454
Number of pages: 8
Journal: IEEE Transactions on Nanobioscience
Volume: 16
Issue number: 6
State: Published - Sep 2017
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Differentially expressed genes
  • TCGA data
  • robust principal component analysis
  • sparse constraint
  • truncated nuclear norm
