International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 11 | Nov -2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1420
Analysis of Document Clustering using Pseudo Dynamic Quantum
Clustering Approach
Sahinur Rahman Laskar¹, Bhagaban Swain²
¹,²Department of Computer Science & Engineering
Assam University, Silchar, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - In information processing fields such as
data mining, information retrieval, natural language
processing and machine learning, quantum computing
plays a vital role in extracting implicit, potentially
useful and previously unknown information from huge
sets of data. Two techniques based on quantum concepts,
one for document ranking and one for document clustering,
were proposed in [1] and [2]. The Quantum Clustering (QC)
technique used for information processing relies on the
time-independent Schrödinger equation to cluster the data,
whereas Dynamic Quantum Clustering (DQC) arises when the
dynamics of the system are computed by means of the
time-dependent Schrödinger equation. DQC is a recent
clustering technique based on physical intuition from
quantum mechanics, in which clusters are identified as the
minima of the potential function of the Schrödinger
equation. In this paper, we propose a novel approach,
Pseudo Dynamic Quantum Clustering, which uses the
time-dependent Schrödinger equation to cluster documents
and provides agreeable performance in terms of cluster
quality and computational efficiency in comparison with
the existing classical approach and the earlier proposed
approach [2].
Key Words: Clustering; Document Clustering; QC; DQC;
Pseudo Dynamic Quantum Clustering.
1. INTRODUCTION
The objective of a clustering technique [3],[4] is to
automatically classify and organize a large data
collection into a finite and discrete set of groups, rather
than to provide an accurate characterization of unobserved
samples generated from the same probability distribution.
This problem is ill-posed in the sense that any given set of
objects can be clustered in different ways with no clear
criterion for preferring one clustering over another. This
challenge led researchers to improve on classical
clustering algorithms, and thus the Quantum Clustering (QC)
technique [5],[6],[7] was developed. QC is a broad
technique within Quantum Information Processing (QIP);
it uses the Schrödinger equation to cluster data, but it
was not originally applied to documents. A technique
focused on document clustering using the time-independent
QC algorithm was proposed in [2]. Dynamic Quantum
Clustering (DQC) is a recent clustering method based on
the time-dependent Schrödinger equation; it is discussed
in section 3. DQC differs from QC in that it is computed
from the dynamics of the system rather than from a static
system. This paper extends the work in [2] by analysing
document clustering with the proposed new approach,
Pseudo Dynamic Quantum Clustering.
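For orientation, the static and dynamic equations contrasted above can be written side by side. This is only a sketch: it assumes the formulation commonly used in the quantum clustering literature (a Gaussian Parzen-window wave function of width σ) and units in which ħ = 1. The symbols ψ, σ, V, E and H are introduced here for illustration and may differ from the notation used later in the paper.

```latex
% Wave function built from the N data points x_i in d dimensions
% (assumed Gaussian kernel of width \sigma)
\[
\psi(\mathbf{x}) = \sum_{i=1}^{N}
  \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{x}_i \rVert^{2}}{2\sigma^{2}}\right)
\]

% Time-independent Schrodinger equation (QC): solve for the potential V
\[
H\psi \equiv \left(-\frac{\sigma^{2}}{2}\nabla^{2} + V(\mathbf{x})\right)\psi(\mathbf{x})
  = E\,\psi(\mathbf{x})
\quad\Longrightarrow\quad
V(\mathbf{x}) = E - \frac{d}{2}
  + \frac{1}{2\sigma^{2}\,\psi(\mathbf{x})}
    \sum_{i=1}^{N} \lVert \mathbf{x}-\mathbf{x}_i \rVert^{2}
    \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{x}_i \rVert^{2}}{2\sigma^{2}}\right)
\]

% Time-dependent Schrodinger equation (DQC): evolve the state under the same H
\[
i\,\frac{\partial \psi(\mathbf{x},t)}{\partial t} = H\,\psi(\mathbf{x},t)
\]
```

Under these assumptions, QC reads cluster centres directly from the minima of V, while DQC follows how states associated with the data points move toward those minima over time.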
The remainder of the paper is organized as
follows. In section 2, a brief description of the document
clustering concept is presented. Section 3 addresses the
method of dynamic quantum clustering based on the
time-dependent Schrödinger equation. The proposed method,
Pseudo Dynamic Quantum Clustering, is described in
section 4. Experiments and results on a standard data
set, together with a comparison with the existing method,
are given in section 5. Finally, concluding remarks and
future research directions are given in section 6.
2. DOCUMENT CLUSTERING
Document clustering [8], [9] organizes documents into
different groups called clusters, where the documents
in each cluster share some common features according
to a defined similarity measure. Document clustering
differs from document classification in that,
in document classification, the classes (and their
properties) are known a priori, and documents are
assigned to these classes; whereas, in document
clustering, the number, properties, or membership
(composition) of classes is not known in advance. Thus,
classification is an example of supervised machine
learning and clustering an example of unsupervised machine
learning. Well-known classical clustering techniques [10]
such as k-means and agglomerative hierarchical
clustering are available but limited in terms of cluster
quality, which motivates the development of
quantum-inspired clustering techniques, where clusters
are computed through the minima of the potential
function of the Schrödinger equation. This is
discussed in the next section.
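To make the notion of a "defined similarity measure" concrete, the following minimal Python sketch represents each document as a bag of term frequencies and compares documents by cosine similarity. Both choices are assumptions made for illustration, since the text above does not fix a particular representation or measure. The helper names `term_frequencies` and `cosine_similarity` and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Toy corpus (illustrative only): two related documents and one unrelated.
docs = [
    "quantum clustering uses the schrodinger equation",
    "the schrodinger equation drives quantum clustering",
    "football match ends in a draw",
]
vectors = [term_frequencies(d) for d in docs]

sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(sim_related, sim_unrelated)
```

A clustering algorithm, classical or quantum-inspired, would then group documents so that pairs with high similarity fall into the same cluster. In practice a weighted representation such as TF-IDF is often preferred over raw counts.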