Copyright © 2016, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Chapter 12
DOI: 10.4018/978-1-5225-0075-9.ch012
ABSTRACT
Document clustering, which involves concepts from the fields of information retrieval, automatic topic
extraction, natural language processing, and machine learning, is one of the most popular research areas
in data mining. Due to the large amount of information in electronic form, fast and high-quality cluster
analysis plays an important role in helping users to effectively navigate, summarize, and organize this
information into useful data. There are a number of techniques in the literature that efficiently provide
solutions for document clustering. However, during the last decade, researchers started to use metaheuristic
algorithms for the document clustering problem because of the limitations of the existing traditional
clustering algorithms. In this chapter, the authors give a brief review of various research papers
that apply different metaheuristic algorithms to document or text clustering.
INTRODUCTION
The exponential growth in the volume of text documents is accelerated by a noticeable increase in digital libraries
and repositories, social networking applications, company-wide intranets, and digitized personal information
such as blog articles and emails. The effective use of computers and the Internet adds billions
of electronic documents to the search area. This increase in both the volume and the variety of text documents
requires advances in methodology to automatically understand, process, and summarize the data.
Fast and high-quality cluster analysis plays an important role in helping users to effectively navigate,
summarize, and organize the large amount of information.
A Brief Review of Metaheuristics for Document or Text Clustering
Sinem Büyüksaatçı
Istanbul University, Turkey
Alp Baray
Istanbul University, Turkey