Application of a deep matrix factorization model on integrated gene expression data

  • Yong Jing Hao
  • Mi Xiao Hou
  • Ying Lian Gao
  • Jin Xing Liu
  • Xiang Zhen Kong

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Background: Non-negative Matrix Factorization (NMF) has been widely applied to gene expression data. However, most NMF-based methods have a single-layer structure, which may perform poorly on complex data. Deep learning, with its carefully designed hierarchical structure, has shown significant advantages in learning data features. Objective: In bioinformatics, the aims are to discover differentially expressed genes in gene expression data and to obtain better sample clustering results, which can provide a reference for the prevention and treatment of cancer. Method: In this paper, we apply a deep NMF method called Deep Semi-NMF to integrated gene expression data. In each layer, the coefficient matrix is directly decomposed into the basis matrix and coefficient matrix of the next layer. We apply this factorization model to The Cancer Genome Atlas (TCGA) genomic data. Results: The experimental results demonstrate the superiority of the Deep Semi-NMF method in identifying differentially expressed genes and clustering samples. Conclusion: The Deep Semi-NMF model decomposes a matrix into multiple factor matrices whose product approximates the original matrix. It can improve the clustering performance of samples while identifying more accurate key genes for disease treatment.
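
The layer-wise decomposition described in the Method can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the standard Semi-NMF multiplicative updates, covers only greedy layer-by-layer factorization (no joint fine-tuning of all layers), and runs on a random toy matrix rather than TCGA data. The function names, layer sizes, iteration count, and the argmax-based cluster assignment are all hypothetical choices made for illustration.

```python
import numpy as np

def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (features x samples) as Z @ H with H >= 0, Z unconstrained."""
    rng = np.random.default_rng(seed)
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        # Least-squares solution for the unconstrained basis matrix Z.
        Z = X @ np.linalg.pinv(H)
        # Split Z^T X and Z^T Z into positive and negative parts.
        A, B = Z.T @ X, Z.T @ Z
        Ap, An = (np.abs(A) + A) / 2, (np.abs(A) - A) / 2
        Bp, Bn = (np.abs(B) + B) / 2, (np.abs(B) - B) / 2
        # Multiplicative update that keeps the coefficient matrix H non-negative.
        H *= np.sqrt((Ap + Bn @ H) / (An + Bp @ H + eps))
    # Refit Z so that it matches the final H.
    return X @ np.linalg.pinv(H), H

def deep_semi_nmf_layerwise(X, layer_sizes, **kwargs):
    """Factor each layer's coefficient matrix into the next layer's Z and H."""
    Zs, H = [], X
    for k in layer_sizes:
        Z, H = semi_nmf(H, k, **kwargs)
        Zs.append(Z)
    return Zs, H

# Toy data: 50 "genes" x 30 "samples" (random values, illustration only).
X = np.random.default_rng(1).standard_normal((50, 30))
Zs, H = deep_semi_nmf_layerwise(X, layer_sizes=(10, 4))

# The product Z1 @ Z2 @ H approximates X.
X_hat = np.linalg.multi_dot(Zs + [H])
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# One simple (assumed) way to read clusters off the last coefficient matrix.
labels = H.argmax(axis=0)
```

Here each column of the final H is a low-dimensional, non-negative representation of one sample. Taking the largest entry per column is only one possible way to assign clusters; running k-means on the columns of H is a common alternative.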

Original language: English
Pages (from-to): 359-367
Number of pages: 9
Journal: Current Bioinformatics
Volume: 15
Issue number: 4
DOIs
State: Published - 2020
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Clustering
  • Deep semi-NMF
  • Feature selection
  • Gene expression data
  • NMF
  • TCGA
