Research Article

Applying the Fisher score to identify Alzheimer's disease-related genes

Published: June 27, 2016
Genet. Mol. Res. 15(2): gmr8798 DOI: 10.4238/gmr.15028798

Abstract

Researchers can mine gene expression microarray data from Alzheimer's disease (AD) studies to find AD-related genes. However, small sample sizes, high dimensionality, and high noise levels make it difficult to extract accurate and biologically meaningful information from gene expression profiles. In this paper, we present a novel approach that uses AD microarray data to identify morbigenous (disease-causing) genes. The Fisher score, a classical feature selection method, is used to evaluate the importance of each gene. Genes with a large between-class variance and a small within-class variance are selected as candidate morbigenous genes. Results on an AD dataset show that the proposed approach is effective for gene selection: satisfactory accuracy can be achieved using only a small number of selected genes.
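
The selection criterion described in the abstract can be illustrated with a minimal sketch. It assumes the common textbook form of the Fisher score, in which each gene's score is the class-size-weighted between-class variance of its expression divided by the class-size-weighted within-class variance. The exact formula used in the paper is not shown in this excerpt, and the function names (`fisher_scores`, `top_genes`) and the toy expression values are hypothetical, chosen only for illustration.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y, eps=1e-12):
    """Return one Fisher score per gene (assumed textbook definition).

    X   : list of samples, each a list of expression values (one per gene)
    y   : list of class labels, one per sample (e.g. 0 = control, 1 = AD)
    eps : small constant that avoids division by zero for constant genes
    """
    n_genes = len(X[0])
    labels = sorted(set(y))
    scores = []
    for j in range(n_genes):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # weighted spread of the class means around the overall mean
        within = 0.0   # weighted spread of the samples inside each class
        for label in labels:
            values = [v for v, lab in zip(column, y) if lab == label]
            n_k = len(values)
            between += n_k * (fmean(values) - overall_mean) ** 2
            within += n_k * pvariance(values)
        scores.append(between / (within + eps))
    return scores


def top_genes(scores, k):
    """Indices of the k highest-scoring genes, best first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data (hypothetical): 4 samples x 3 genes, two classes.
# Gene 0 separates the classes cleanly; genes 1 and 2 do not.
X_toy = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.1],
    [3.0, 4.0, 2.0],
    [3.2, 6.0, 2.1],
]
y_toy = [0, 0, 1, 1]

toy_scores = fisher_scores(X_toy, y_toy)
toy_ranking = top_genes(toy_scores, k=3)
```

In this toy example, gene 0 receives by far the largest score because its class means differ while its values within each class are tightly clustered. On real microarray data, the highest-ranked genes would then be passed to a classifier to check how accuracy changes with the number of genes kept.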

About the Authors