Applying User Profiles in Transient Peer-to-Peer
Environment
Bertalan Forstner, Imre Kelényi and Hassan Charaf
Department of Automation and Applied Informatics
Budapest University of Technology and Economics
Budapest, Hungary
{bertalan.forstner, imre.kelenyi, hassan.charaf}@aut.bme.hu
Abstract—Semantic data is widely used in order to increase the
performance of Peer-to-Peer information retrieval networks. The
most efficient approaches construct user profiles in order to
describe the fields of interest of the users and to shape a semantic
overlay network. However, the characteristics of mobile devices
and the behavior of wireless Peer-to-Peer users require careful
consideration of the algorithms and protocols applied
in such an environment. In this paper we describe
our experiments with a special mobile Gnutella client that
collected information on mobile user behavior, together with
other distinctive features of the mobile environment. We also
propose an appropriate utilization of user profiles in transient
Peer-to-Peer systems.
Keywords- information retrieval; mobile devices; peer-to-peer
I. INTRODUCTION
The emerging demand for mobile file sharing solutions is
reflected in the popularity of the available applications.
Mobile market surveys also report a need for software
that helps smartphone users share the content they create,
such as images, videos, audio recordings or notes [1].
A natural candidate is Peer-to-Peer (P2P) technology,
which is widely used for such purposes in the desktop
environment. Over the last few years we have collected first-hand
experience with a Symbian-based mobile Gnutella client, the Symella
application [15], in order to learn the usage patterns and other
characteristics of this technology in the
mobile environment.
Early P2P protocols suffer from scalability issues: as the
number of nodes grows, the amount of network traffic (or other
resources) required to reach a reasonable hit rate increases
notably. The efforts dealing with this issue fall into two
significantly different approaches: structured and
unstructured.
The structured P2P protocols (for example [2][3][4][5])
specify strict rules for where documents are stored,
or define which other peers a node can connect to. Although
these networks usually have good scalability properties, and
their performance can be estimated quite accurately, they
become disadvantageous in networks with a strongly transient
character: they handle frequent changes in the network
population only with difficulty and at great resource expense.
The second approach examines unstructured networks such as
the basic Gnutella protocol [6]. In that case there is no rule for
where documents are stored, and the connections of
the nodes are controlled by a few simple rules. For that reason,
these systems have limited protocol overhead and can tolerate
nodes frequently entering and leaving the network.
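To make the traffic cost of an unstructured search concrete, the following minimal sketch counts the messages generated when a query is flooded through a small overlay. It assumes the simple flooding scheme commonly associated with the basic Gnutella protocol: each node forwards the query to every neighbor except the one it came from, until a time-to-live counter runs out. The function name `flood_query`, the adjacency-dictionary representation and the example topology are illustrative assumptions and are not taken from the paper.

```python
from collections import deque


def flood_query(graph, source, ttl):
    """Flood a query from `source` and return (nodes_reached, messages_sent).

    `graph` maps each node to a list of its neighbors (hypothetical format).
    A node forwards the query only the first time it receives it, and never
    sends it back to the neighbor it received it from.
    """
    seen = {source}
    messages = 0
    queue = deque([(source, None, ttl)])
    while queue:
        node, parent, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this node does not forward further
        for neighbor in graph[node]:
            if neighbor == parent:
                continue
            messages += 1  # one query message on this link
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, node, remaining - 1))
    return seen, messages


# Illustrative four-node ring overlay.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
reached, sent = flood_query(ring, source=0, ttl=2)
print(len(reached), "nodes reached with", sent, "messages")
```

Even in this tiny ring, node 2 receives the query twice, once from each side. Such duplicate deliveries are one reason the traffic of flooding-based search grows quickly with the size and connectivity of the network.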
Recently some systems have been developed to improve the
search performance of P2P networks; some of them try to
achieve this in a cooperative manner, with semantic overlay
networks. These build on the fact that the users' fields of
interest can be determined, and nodes with a
greater expected recall can be found. The first group of
these algorithms tries to achieve a better hit rate based on
run-time statistics [7][8][9]. The second and more efficient group
of content-aware Peer-to-Peer algorithms uses metadata
provided for the documents in the system. These metadata can
be keywords describing a document or any other means of
classifying different kinds of information. A simple example of
such metadata is the ID3 tags attached to
MP3 music files, describing the author, title, performer, album,
genre or other properties of the music. This information can
be used to deduce the fields of interest of the user, and then to find
users that share similar interests. Direct connections to such
nodes can increase the hit rate in a P2P network and decrease
the number of messages necessary to find the requested information.
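The idea of deducing interests from metadata and finding similar users can be illustrated with a short sketch. It is not the profile structure used in this paper, which this section does not specify. It assumes, purely for illustration, that a profile is a normalized histogram of the genre tags in a user's shared files and that two profiles are compared by cosine similarity. The names `build_profile`, `cosine_similarity` and `rank_peers`, and the sample data, are hypothetical.

```python
import math
from collections import Counter


def build_profile(genres):
    """Turn a list of genre tags into a normalized genre histogram."""
    counts = Counter(g.strip().lower() for g in genres if g and g.strip())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genre: count / total for genre, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity of two sparse profiles (0.0 if either is empty)."""
    dot = sum(weight * q.get(genre, 0.0) for genre, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def rank_peers(own_profile, peer_profiles):
    """Return peer names ordered from most to least similar interests."""
    return sorted(
        peer_profiles,
        key=lambda name: cosine_similarity(own_profile, peer_profiles[name]),
        reverse=True,
    )


# Illustrative data: genre tags as they might be read from shared MP3 files.
me = build_profile(["Rock", "rock", "Jazz"])
peers = {
    "peer_a": build_profile(["rock", "rock"]),
    "peer_b": build_profile(["classical"]),
}
print(rank_peers(me, peers))
```

A node could prefer direct connections to the peers at the head of such a ranking. Using only the genre field keeps the profile small, which matters on resource-constrained handsets, but any other tag or keyword set could be substituted.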
Although some of these approaches give quite good results in
the desktop environment, their performance and usability
decrease when deployed on mobile devices. The reason lies in
the special characteristics of mobile Peer-to-Peer systems.
The rest of this paper is organized as follows. Section II
summarizes the characteristics of the mobile
environment. In Section III we describe our user
modeling technique. In the last two sections we evaluate our
results, conclude our work and raise open questions.
II. CHARACTERISTICS OF THE MOBILE ENVIRONMENT
We developed a special version of the mobile Gnutella
client for users who volunteered to provide us with usage statistics
and semantic information collected by the application. We also
examined the properties of the handsets available on the market.
In this section, we summarize the results of our experiment.
The modified crawler client used a taxonomy that classified
the different music styles according to their ID3 tag. It logged
This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the ICC 2008 workshop proceedings.
978-1-4244-2052-0/08/$25.00 ©2008 IEEE