A comparative study on gene selection methods for tissues classification on large scale gene expression data
Farzana Kabir Ahmad
Deoxyribonucleic acid (DNA) microarray technology is a recent invention that
provides substantial opportunities to measure large numbers of gene expressions
simultaneously. However, interpreting large-scale gene expression data remains a
challenging issue because of its inherent “high dimensional, low sample size” nature.
Microarray data typically involve thousands of genes, n, measured on a very small
number of samples, p, which complicates the data analysis process. For this reason,
feature selection methods, also known as gene selection methods, have become necessary
for selecting significant genes that offer the maximum discriminative power between
cancerous and normal tissues. Feature selection methods can be grouped into three
basic categories: a) filter methods; b) wrapper methods; and c) embedded methods.
Among these, filter gene selection methods provide an easy way to identify
informative genes and can readily reduce large-scale microarray datasets.
Although filter-based gene selection techniques have been commonly used in
analyzing microarray datasets, these techniques have been tested separately in
different studies. Therefore, this study aims to investigate and compare the effectiveness
of four popular filter gene selection methods, namely Signal-to-Noise Ratio (SNR),
Fisher Criterion (FC), Information Gain (IG) and t-Test, in selecting informative genes that
can distinguish cancerous from normal tissues. In this experiment, a common classifier,
the Support Vector Machine (SVM), is trained on the selected genes. The gene selection
methods are tested on three large-scale gene expression datasets, namely a breast
cancer dataset, a colon dataset and a lung dataset. The study found that IG and
SNR are more suitable for use with SVM. Furthermore, SVM performance remained
relatively unaffected unless a very small number of genes was selected.
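As a rough illustration of how a filter method ranks genes, the sketch below scores each gene independently from its expression values in the two tissue classes and keeps the highest-scoring ones. The abstract does not state the exact formulas used in the study, so the definitions here are the commonly cited ones and should be read as assumptions: SNR as |mean difference| / (sum of standard deviations), the Fisher Criterion as (mean difference)^2 / (sum of variances), and an unequal-variance t statistic. Information Gain is omitted because it needs a discretization step the abstract does not describe, and the SVM stage is not shown. The function names, gene names and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def filter_scores(class_a, class_b):
    """Score one gene from its expression values in two tissue classes.

    Assumed textbook definitions, not necessarily those used in the study.
    Genes with zero variance in both classes would need special handling.
    """
    mu_a, mu_b = mean(class_a), mean(class_b)
    sd_a, sd_b = stdev(class_a), stdev(class_b)  # sample standard deviations
    n_a, n_b = len(class_a), len(class_b)
    diff = mu_a - mu_b
    return {
        "snr": abs(diff) / (sd_a + sd_b),
        "fisher": diff ** 2 / (sd_a ** 2 + sd_b ** 2),
        "t": abs(diff) / sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b),
    }


def rank_genes(expression, labels, method="snr", top_k=2):
    """Return the top_k gene names ranked by the chosen filter score.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of 0/1 class labels, one per sample (1 = cancerous)
    """
    scores = {}
    for gene, values in expression.items():
        cancer = [v for v, y in zip(values, labels) if y == 1]
        normal = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = filter_scores(cancer, normal)[method]
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data (hypothetical): three genes measured on six samples.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_a": [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],  # clearly separates the classes
    "gene_b": [2.0, 3.0, 2.5, 2.1, 2.9, 2.6],  # nearly identical in both
    "gene_c": [1.0, 4.0, 2.0, 3.0, 0.5, 2.5],  # noisy, weak separation
}
top = rank_genes(expression, labels, method="snr", top_k=2)
```

In a full pipeline, the expression values of the retained genes would then be passed to a classifier such as an SVM, and the number of retained genes varied to see how classification accuracy responds.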
Affiliation:
- Universiti Utara Malaysia, Malaysia