A New Similarity Measure for the Profiles Management
Ahmed Belkhirat
Department of Information Systems
College of Computer and Information Sciences
King Saud University
Riyadh, Saudi Arabia
belkhirat@ksu.edu.sa
Abdelkader Belkhir
Dept. of computer science
USTHB University
Algiers, Algeria
belkhir@lsi-usthb.dz
Abdelghani Bouras
Industrial Engineering Dept.
College of engineering
King Saud University
Riyadh, Saudi Arabia
bouras@ksu.edu.sa
Abstract— A similarity measure is needed in the study of
several problems, such as multimedia adaptation,
behavior-based intrusion detection, and the adaptation of
web services. This article proposes a new similarity
measure that takes into account the properties shared by
two objects, the values of those properties, and the
weight of each property.
Keywords- similarity measure; Jaccard factor; user profile;
characteristic weight
I. INTRODUCTION
Object identification is a recurrent problem in several
applications. It is applied in security [1, 2], in adaptation
for multimedia systems and web services [3], and in
auto-configuration (home networks) [4].
Every object (user, web service, multimedia document) is
therefore represented by a profile that describes the object
[5] through its features. A similarity measure [6, 7] can then
be used to determine how closely two objects resemble each
other.
This resemblance can be determined by the properties that
two objects have in common. In practice, however, the
objects' properties are quantified, so the similarity measure
must be expressed in terms of both the shared properties and
their values.
In several cases, the relevance of certain object properties
must also be taken into account. When some properties
predominate in the description of an object, the similarity
measure must be refined to account for the properties, their
values, and their relevance.
This article is organized as follows. Section 2 presents a
measure inspired by the Jaccard similarity measure [5] that
takes into account the quantization or atomization of
properties. We demonstrate that the proposed measure
satisfies the properties of a similarity measure, then give
the equivalent distance and its properties. Section 3
presents a refinement of the measure from Section 2. In the
same manner, we verify the properties of this refined measure,
then provide the equivalent distance measure and verify its
properties. Section 4 presents two properties of our measure
compared with the usual measures. We then conclude on the
importance of the new similarity measure.
II. SIMILARITY MEASURE
Let P be a set of object profiles (individuals, documents,
web sites, ...). Each profile, denoted x, is described by n
characteristics $x_i$: $x = (x_1, x_2, \ldots, x_n)$. A similarity
measure, denoted sim, is a mapping from $P \times P$
($P \subseteq \mathbb{N}$) into $[0, 1]$:
$$\mathrm{sim} : P \times P \to [0, 1]$$
It satisfies the following properties:
(1) $\forall x, y \in P : \mathrm{sim}(x, y) \geq 0$
(2) $\forall x, y \in P : \mathrm{sim}(x, x) = \mathrm{sim}(y, y) \geq \mathrm{sim}(x, y)$
(3) $\forall x, y \in P : \mathrm{sim}(x, y) = \mathrm{sim}(y, x)$
Note that the more features two objects have in common, the
more similar they are. The similarity is maximal for two
identical objects. Conversely, it decreases as more of their
features differ.
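As an illustration, properties (1) to (3) can be checked numerically on a small set of profiles. The following is only a minimal sketch: it uses the classical Jaccard index on sets of feature names as a stand-in similarity, not the measure proposed in this article, and the names `jaccard_similarity`, `satisfies_similarity_properties` and the sample profiles are assumptions introduced for the example.

```python
from itertools import product


def jaccard_similarity(x, y):
    """Classical Jaccard index of two feature sets, a value in [0, 1]."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # assumed convention: two empty profiles are identical
    return len(x & y) / len(x | y)


def satisfies_similarity_properties(profiles, sim, tol=1e-12):
    """Check properties (1)-(3) on every ordered pair drawn from `profiles`."""
    for x, y in product(profiles, repeat=2):
        if sim(x, y) < 0:                        # (1) non-negativity
            return False
        if abs(sim(x, x) - sim(y, y)) > tol:     # (2) same self-similarity
            return False
        if sim(x, y) > sim(x, x) + tol:          # (2) self-similarity is maximal
            return False
        if abs(sim(x, y) - sim(y, x)) > tol:     # (3) symmetry
            return False
    return True


# Hypothetical profiles, each described by the set of features it has.
profiles = [
    frozenset({"video", "audio", "subtitles"}),
    frozenset({"video", "audio"}),
    frozenset({"text"}),
]

all_hold = satisfies_similarity_properties(profiles, jaccard_similarity)
```

A check of this kind only tests the properties on the sampled profiles. It can reveal a violation but does not replace a proof.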
A distance measure, denoted dist, is a mapping from
$P \times P$ into $[0, 1]$:
$$\mathrm{dist} : P \times P \to [0, 1]$$
It satisfies the following properties:
(4) $\forall x, y \in P : \mathrm{dist}(x, y) \geq 0$
(5) $\forall x, y \in P : \mathrm{dist}(x, x) = 0$
(6) $\forall x, y \in P : \mathrm{dist}(x, y) = \mathrm{dist}(y, x)$
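Properties (4) to (6) can be illustrated in the same numerical way. The sketch below assumes the common construction dist = 1 - sim, which this section does not state, and again uses the classical Jaccard index as a stand-in similarity. All function names and sample profiles are hypothetical.

```python
from itertools import product


def jaccard_similarity(x, y):
    """Classical Jaccard index of two feature sets, a value in [0, 1]."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # assumed convention: two empty profiles are identical
    return len(x & y) / len(x | y)


def distance_from_similarity(sim):
    """Build a distance as 1 - sim (an assumption, not taken from the text)."""
    def dist(x, y):
        return 1.0 - sim(x, y)
    return dist


def satisfies_distance_properties(profiles, dist, tol=1e-12):
    """Check properties (4)-(6) on every ordered pair drawn from `profiles`."""
    for x, y in product(profiles, repeat=2):
        if dist(x, y) < -tol:                     # (4) non-negativity
            return False
        if abs(dist(x, x)) > tol:                 # (5) zero distance to itself
            return False
        if abs(dist(x, y) - dist(y, x)) > tol:    # (6) symmetry
            return False
    return True


# Hypothetical profiles, each described by the set of features it has.
profiles = [
    frozenset({"video", "audio", "subtitles"}),
    frozenset({"video", "audio"}),
    frozenset({"text"}),
]

jaccard_distance = distance_from_similarity(jaccard_similarity)
all_hold = satisfies_distance_properties(profiles, jaccard_distance)
```

With this construction, properties (4) to (6) follow directly from (1) to (3) whenever the self-similarity sim(x, x) equals 1.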
2011 UKSim 13th International Conference on Modelling and Simulation
978-0-7695-4376-5/11 $26.00 © 2011 IEEE
DOI 10.1109/UKSIM.2011.55