Supervised clustering based on a multi-objective genetic algorithm
Thananant, Vipa¹; Auwatanamongkol, Surapong²
Supervised clustering organizes data instances into clusters on the basis of similarities
between the data instances as well as class labels for the data instances. Supervised
clustering seeks to meet multiple objectives, such as compactness of clusters, homogeneity
of data in clusters with respect to their class labels, and separateness of clusters. With
these objectives in mind, a new supervised clustering algorithm based on a multi-objective
crowding genetic algorithm, named SC-MOGA, is proposed in this paper. The algorithm
searches for the optimal clustering solution that simultaneously achieves the three objectives
mentioned above. SC-MOGA performs very well on small datasets, but on large datasets
it may fail to converge to an optimal solution or may take a very long time to converge.
Hence, a data sampling method based on the Bisecting K-Means algorithm is also
introduced to find representatives for supervised clustering.
This method groups the data instances of a dataset into small clusters, each containing
data instances with the same class label. Data representatives are then randomly selected
from each cluster. The experimental results show that SC-MOGA with the proposed data
sampling method is very effective. It outperforms three previously proposed supervised
clustering algorithms, namely SRIDHCR, LK-Means and SCEC, in terms of four cluster
validity indexes. The results also show that the proposed data sampling method
not only reduces the number of data instances to be clustered by SC-MOGA, but also
enhances the quality of the clustering results.
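
The data sampling step described above can be sketched roughly as follows. This is a minimal illustration in pure Python, not the authors' implementation: the function names (`bisect`, `sample_representatives`), the `max_size` stopping threshold, the `per_cluster` count, and the rule of splitting any cluster that is impure or too large are all assumptions made for the example. The abstract only states that the data are grouped by Bisecting K-Means into small single-class clusters and that representatives are then drawn at random from each cluster.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(X, idx):
    """Mean of the points X[i] for i in idx, returned as a tuple."""
    n = len(idx)
    return tuple(sum(col) / n for col in zip(*(X[i] for i in idx)))


def bisect(X, idx, rng, iters=20):
    """Split idx (len >= 2) into two non-empty groups with plain 2-means."""
    a, b = rng.sample(idx, 2)
    c0, c1 = tuple(X[a]), tuple(X[b])
    left, right = [], []
    for _ in range(iters):
        left, right = [], []
        for i in idx:
            closer_to_c0 = sq_dist(X[i], c0) <= sq_dist(X[i], c1)
            (left if closer_to_c0 else right).append(i)
        if not left or not right:
            # Degenerate case (e.g. duplicate points): fall back to halving.
            half = len(idx) // 2
            return idx[:half], idx[half:]
        new0, new1 = centroid(X, left), centroid(X, right)
        if (new0, new1) == (c0, c1):
            break  # assignments are stable
        c0, c1 = new0, new1
    return left, right


def sample_representatives(X, y, max_size=5, per_cluster=1, seed=0):
    """Bisect until every cluster has one class label and at most
    max_size members (assumed stopping rule, max_size >= 1), then
    draw per_cluster random representatives from each cluster."""
    rng = random.Random(seed)
    pending = [list(range(len(X)))] if X else []
    clusters = []
    while pending:
        idx = pending.pop()
        is_pure = len({y[i] for i in idx}) == 1
        if is_pure and len(idx) <= max_size:
            clusters.append(idx)
        else:
            pending.extend(bisect(X, idx, rng))
    reps = []
    for idx in clusters:
        reps.extend(rng.sample(idx, min(per_cluster, len(idx))))
    return sorted(reps), clusters


# Toy usage: two well-separated classes.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
y = [0, 0, 0, 1, 1, 1]
reps, clusters = sample_representatives(X, y, max_size=3)
```

The returned `reps` are indices into the original dataset; under this sketch, a supervised clustering algorithm such as SC-MOGA would then be run only on those representatives rather than on all instances.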
Affiliation:
- National Institute of Development Administration, Thailand
- National Institute of Development Administration, Thailand