IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 41, NO. 5, SEPTEMBER 2011 711
Feature Relevance Network-Based Transfer Learning
for Indoor Location Estimation
Ho-Sik Seok, Kyu-Baek Hwang, Member, IEEE, and Byoung-Tak Zhang, Member, IEEE
Abstract—We present a new machine learning framework for
indoor location estimation. In many cases, locations can be easily
estimated using various traditional positioning methods and conventional
machine learning approaches based on signalling devices,
e.g., access points (APs). When environmental changes occur,
however, such traditional methods cannot be employed because the
data distribution changes. To circumvent this difficulty, we introduce
a feature relevance network-based method, which focuses
on interrelatedness among features. Feature relevance networks
are connected graphs representing concurrency of the signalling
devices such as APs. In the newly created relevance network, a test
instance and the prototype of a location are expanded until convergence.
The expansion cost corresponds to the distance between the
test instance and the prototype. Unlike other methods, our model is
nonparametric, making no assumptions about signal distributions.
The proposed method is applied to the 2007 IEEE International
Conference on Data Mining Data Mining Contest Task #2 (transfer
learning), a typical example of a situation in which the training
and test datasets were gathered during different periods. Using
the proposed method, we achieve an estimation accuracy
of 0.3238, which is better than the best result of the contest.
Index Terms—Feature relevance networks, indoor location esti-
mation, transfer learning.
I. INTRODUCTION
WE INTRODUCE a novel algorithm for indoor location
estimation with varying distributions. Unlike most machine
learning problems, where the distributions of training and test
data are assumed to be the same, the 2007 IEEE International
Conference on Data Mining (ICDM) Data Mining Contest (DMC)
Task #2 (transfer learning) [1] presents a challenging situation
in which the training and test instances were gathered during
separate periods. As a result, the estimation framework obtained
from traditional positioning methods, such as time difference of
arrival and roundtrip time of flight [2], cannot be used properly.
In addition, conventional machine learning methods are
not readily applicable for the same reason [3]. Consequently,
the accuracy of the location estimation drops from
0.8227 (without distribution change) to 0.3223 (with distribution
change).
Manuscript received January 3, 2010; revised June 16, 2010; accepted
August 16, 2010. Date of publication October 21, 2010; date of current
version August 19, 2011. This work was supported by the National Research
Foundation of Korea (NRF) grant funded by the Korean government (MEST)
(No. 2010-0017734), the IT R&D Program of MKE/KEIT (KI002138, MARS),
the Industrial Strategic Technology Development Program (10035348) funded
by the Korean government (MKE), and the BK21-IT Program. The ICT at Seoul
National University provides research facilities for this study. Kyu-Baek Hwang
was supported by the Soongsil University Research Fund. This paper was
recommended by Associate Editor H. Liu.
H.-S. Seok and B.-T. Zhang are with the School of Computer Science
and Engineering, Seoul National University, Seoul 151-744, Korea (e-mail:
hsseok@bi.snu.ac.kr; btzhang@bi.snu.ac.kr).
K.-B. Hwang is with the School of Computing, Soongsil University, Seoul
156-743, Korea (e-mail: kbwhang@ssu.ac.kr).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCC.2010.2076277
Most previous approaches in these situations have tried to
transform the parameters of statistically learned models [4],
[5]. In the case of the 2007 IEEE ICDM DMC Task #2, radio
signal strength (RSS) is not reliable and the number of available
data instances from the changed distribution is too small to
robustly transform the learned parameters. Therefore, it is nearly
impossible to deploy parameter-transfer-based approaches in a
smooth manner.
In order to alleviate this difficulty, we developed a novel
method built on interrelatedness among features. Intuitively, a
good feature representation is crucial for successful domain
adaptation [6], but distributional changes make it difficult to
find a proper representation of features. Our method focuses
on interfeature relationships to construct a plausible feature
representation, which is expected to be resilient to distributional
changes of feature values. The core assumption is that the nearer
access points (APs) are located to each other, the more likely they
are to be observed simultaneously. Based on this expectation, we search
for AP pairs that are highly adjacent to each other. Such AP pairs
form the edges of a graph structure, and the problem space
is then reconstructed using this graph. When a new test instance is
given, it is mapped onto the new problem space and expanded. More
precisely, the test instance and the prototype of a class are expanded
together until convergence. After that, the most plausible location
is chosen. In the demanding task of the 2007 IEEE ICDM DMC
Task #2, where training and test datasets are obtained from different
distributions and there are too few training instances from
the test environment, our method shows superior results compared
to those of previous approaches. We achieve an
accuracy of 0.5831 (upper bound) and 0.3238 (with the current
setting, which is better than the best previously achieved
performance).
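The graph construction and expansion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scans, and the `min_cooccurrence` threshold are hypothetical, and the number of expansion rounds is used only as a stand-in for the expansion cost, whose exact definition is not given in this section. The sketch assumes that each observation is simply the set of APs detected in one scan.

```python
from collections import Counter
from itertools import combinations


def build_relevance_graph(observations, min_cooccurrence=2):
    """Build an undirected AP co-occurrence graph.

    observations: iterable of sets of AP identifiers seen in one scan.
    min_cooccurrence: hypothetical threshold, not taken from the paper.
    Returns a dict mapping each AP to the set of its neighbours.
    """
    pair_counts = Counter()
    for aps in observations:
        # Count every unordered pair of APs seen in the same scan.
        for a, b in combinations(sorted(aps), 2):
            pair_counts[(a, b)] += 1

    graph = {}
    for (a, b), count in pair_counts.items():
        # Keep only pairs observed together often enough.
        if count >= min_cooccurrence:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def expand(ap_set, graph):
    """Expand a set of APs along graph edges until no new AP is added.

    Returns the expanded set and the number of rounds that added APs
    (an assumed proxy for the expansion cost).
    """
    current = set(ap_set)
    rounds = 0
    while True:
        neighbours = set()
        for ap in current:
            neighbours |= graph.get(ap, set())
        new = neighbours - current
        if not new:
            return current, rounds
        current |= new
        rounds += 1


# Toy scans (illustrative data only).
scans = [
    {"ap1", "ap2"},
    {"ap1", "ap2", "ap3"},
    {"ap2", "ap3"},
    {"ap4", "ap5"},
]
graph = build_relevance_graph(scans, min_cooccurrence=2)
expanded, rounds = expand({"ap1"}, graph)
```

In this toy setting, only AP pairs seen together at least twice become edges, so an instance that observes a single AP can still be related to nearby APs through the graph rather than through raw signal values.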
This paper is organized as follows. In Section II, we
review related work. Section III explains the 2007 IEEE
ICDM DMC problem and the proposed method. Section IV
presents experimental results. In Section V, we discuss the char-
acteristics of the proposed approach. Section VI summarizes the
paper.
II. RELATED WORKS
Recently, transfer learning has been receiving much attention.
Transfer learning emphasizes knowledge transfer across domains,
tasks, and distributions that are similar but not the same. In a
1094-6977/$26.00 © 2010 IEEE