Improving Semi-Supervised Classification using Clustering

J. Arora; M. Tushir; R. Kashyap

Research Article

@ARTICLE{10.4108/eai.29-7-2019.159793,
    author={J. Arora and M. Tushir and R. Kashyap},
    title={Improving Semi-Supervised Classification using Clustering},
    journal={EAI Endorsed Transactions on Scalable Information Systems},
    volume={7},
    number={25},
    publisher={EAI},
    journal_a={SIS},
    year={2019},
    month={8},
    keywords={Semi-Supervised Clustering, Naive Bayes Classification, Probability, Fuzzy C-means},
    doi={10.4108/eai.29-7-2019.159793}
}

J. Arora¹, M. Tushir²,*, R. Kashyap³

1: Assistant Professor, Dept. of Information Technology, MSIT, Affiliated to GGSIPU, Delhi, India
2: Professor, Dept. of Electrical & Electronics Engineering, MSIT, Affiliated to GGSIPU, Delhi, India
3: Graduate, Dept. of Information Technology, MSIT, Affiliated to GGSIPU, Delhi, India

*Contact email: meenatushir@yahoo.com

Abstract

Supervised classification techniques depend broadly on the availability of labeled data. However, collecting labeled data is a tedious and costly process. To reduce this effort and improve classification performance, this paper proposes a new framework that combines a basic classification technique with semi-supervised clustering. Semi-supervised clustering algorithms aim to increase clustering accuracy by effectively exploiting the supervision available from a limited amount of labeled data, and they help to label the unlabeled data. In this paper, semi-supervised clustering is integrated with the Naive Bayes classification technique, which helps to train the classifier better. To evaluate the performance of the proposed technique, we conduct experiments on several real-world benchmark datasets. The experimental results show that the proposed approach surpasses the competing approaches in both accuracy and efficiency.
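As an illustration only, the pipeline described in the abstract might be sketched as below. The sketch assumes a seeded fuzzy c-means variant (labeled points initialise the cluster centers and keep fixed memberships) followed by a Gaussian Naive Bayes classifier trained on the resulting pseudo-labels. These design choices, the function names `seeded_fuzzy_cmeans`, `fit_gaussian_nb` and `predict_gaussian_nb`, and the toy data are all assumptions made for this example, not the authors' exact algorithm.

```python
import math

def seeded_fuzzy_cmeans(points, labeled, n_classes, m=2.0, iters=50):
    """Fuzzy c-means seeded from labeled points (illustrative assumption).

    points  : list of feature vectors (lists of floats)
    labeled : dict {index into points: class id}; needs >= 1 point per class
    Returns one hard (pseudo) label per point.
    """
    dim = len(points[0])
    # Seed each center with the mean of that class's labeled points.
    centers = []
    for c in range(n_classes):
        seeds = [points[i] for i, y in labeled.items() if y == c]
        centers.append([sum(p[d] for p in seeds) / len(seeds) for d in range(dim)])
    u = []
    for _ in range(iters):
        # Membership update; labeled points stay clamped to their class.
        u = []
        for i, p in enumerate(points):
            if i in labeled:
                u.append([1.0 if c == labeled[i] else 0.0 for c in range(n_classes)])
                continue
            dists = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Center update: mean of all points weighted by membership^m.
        for c in range(n_classes):
            weights = [row[c] ** m for row in u]
            wsum = sum(weights)
            centers[c] = [sum(w * p[d] for w, p in zip(weights, points)) / wsum
                          for d in range(dim)]
    return [max(range(n_classes), key=lambda c: row[c]) for row in u]

def fit_gaussian_nb(points, labels, n_classes):
    """Estimate a prior plus per-feature mean and variance for each class."""
    dim = len(points[0])
    model = []
    for c in range(n_classes):
        rows = [p for p, y in zip(points, labels) if y == c]
        prior = len(rows) / len(points)
        means = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        # A small constant keeps the variance strictly positive.
        variances = [sum((r[d] - means[d]) ** 2 for r in rows) / len(rows) + 1e-9
                     for d in range(dim)]
        model.append((prior, means, variances))
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for sample x."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for xi, mu, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        return score
    return max(range(len(model)), key=lambda c: log_posterior(model[c]))

# Toy data (hypothetical): two well-separated groups, one labeled point each.
points = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9], [0.9, 1.1],
          [5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3]]
labeled = {0: 0, 4: 1}

pseudo = seeded_fuzzy_cmeans(points, labeled, n_classes=2)  # label unlabeled data
model = fit_gaussian_nb(points, pseudo, n_classes=2)        # train on all points
print(pseudo, predict_gaussian_nb(model, [1.0, 1.0]))
```

In this sketch only two of the eight points carry labels; the clustering step supplies pseudo-labels for the rest, so the classifier is fitted on the full set rather than on the two labeled samples alone.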

Keywords: Semi-Supervised Clustering, Naive Bayes Classification, Probability, Fuzzy C-means

Received: 2019-05-10
Accepted: 2019-07-18
Published: 2019-08-05
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.29-7-2019.159793

Copyright © 2019 J. Arora et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.