EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUSTERING

Authors

Abdullah Bawakid
University of Jeddah, Saudi Arabia

Abstract

In this paper we describe a new unsupervised algorithm for automatic document clustering with the aid of Wikipedia. In contrast to related algorithms in the field, our algorithm uses only two aspects of Wikipedia, namely its category network and its article titles. We do not use the inner content of Wikipedia articles, nor their inner links or inter-links. The implemented algorithm was evaluated in a document clustering experiment. Our findings indicate that the Wikipedia features used in our framework yield competitive results, especially when compared against other models in the literature that employ the inner content of Wikipedia articles.

Keywords

Wikipedia, Documents Clustering, Wikipedia Categories