A Scalable K-Anonymization Solution for Preserving
Privacy in an Aging-in-Place Welfare Intercloud
Antorweep Chakravorty, Tomasz Wiktor Wlodarczyk, Chunming Rong
Department of Computer and Electrical Engineering
University of Stavanger
Stavanger, Norway
{antorweep.chakravorty, tomasz.w.wlodarczyk, chunming.rong}@uis.no
Abstract—Aging-in-Place solutions are becoming increasingly
prevalent in our society. Modern big data technologies can
harness the enormous amounts of data generated by sensors
in smart homes to provide enabling services. Added care and
preventive services can be furnished through interoperability and
bidirectional dataflow across the value chain. However, while
sharing information enables better care, it also risks disclosing
the complete living behavior of individuals. In this paper, we
introduce and evaluate a novel scalable k-anonymization solution
based on the distributed MapReduce paradigm for preserving
the privacy of shared data in a welfare intercloud. Our
evaluation benchmarks both information loss and data quality
metrics and demonstrates better scalability and performance than
any other available solution.
Keywords—privacy; k-anonymization; hadoop; intercloud;
aging in place;
I. INTRODUCTION
The elderly population is projected to double in the coming
years. To maintain and improve the standard of healthcare
services and the quality of help, Aging-in-Place (AIP) technologies
[1]–[5] will play a crucial role. Traditional healthcare
services to residential homes could be extended to smart homes
that use sensor networks supported by data analytics to deliver
assistive services. One such initiative is the
Safer@Home intercloud [6] at the University of Stavanger.
In this project, solutions such as Hadoop [7] are used to collect
large amounts of sensor data from various homes centrally,
so that knowledge discovery algorithms can be run effectively
and preventive care established. Mynatt et al. [8]
point to the privacy and autonomy challenges created by
an AIP platform, since the data involved are extremely
sensitive and personal. At the same time, it is infeasible to
perform analytics on transformed data when it is
important to record granular events and to identify the
individuals to whom care needs to be furnished.
AIP services are complex and involve multi-disciplinary
stakeholders at different operational and financial levels.
Analysis results often need to be furnished to different actors
(doctors, specialists, nurses, researchers, the commune and third
parties) using different cloud services. For some of these actors
the presented information should be identifiable so as to
provide the right care to the right individuals, whereas for other
actors the data should be transformed without losing its truthfulness.
In an earlier work we introduced a privacy-preserving data
analysis framework [9] to maintain data utility, ensure security
and preserve privacy at different stages of the data lifecycle
(collection, storage, processing and sharing). We proposed using
k-anonymization [10] to protect the privacy of shared microdata.
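As a minimal illustration of the k-anonymity property that k-anonymization [10] enforces, the Python sketch below checks whether every combination of quasi-identifier values appears in at least k records. The function name, field names and sample records are hypothetical and are not taken from the Safer@Home data.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized records (illustrative only).
sample_records = [
    {"age": "70-79", "zip": "40**", "event": "fall"},
    {"age": "70-79", "zip": "40**", "event": "door_open"},
    {"age": "80-89", "zip": "41**", "event": "stove_on"},
    {"age": "80-89", "zip": "41**", "event": "fall"},
]

satisfies_2 = is_k_anonymous(sample_records, ["age", "zip"], k=2)
satisfies_3 = is_k_anonymous(sample_records, ["age", "zip"], k=3)
```

Here each (age, zip) combination is shared by two records, so the sample satisfies the property for k = 2 but not for k = 3.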
Our Contribution: Heuristic-based k-anonymization
algorithms do not scale to data spread across multiple nodes
in a cluster. Traditional implementations assume data held in
centralized storage, which is anonymized and then released. The data
collected from smart homes arrive at a volume,
velocity and frequency unsuitable for traditional relational
storage systems. NoSQL-based solutions are suited to
handling such data, but require a completely
different approach to anonymization. We present and
evaluate a novel distributed, iterative and scalable
MapReduce-based [11] k-anonymization solution, built upon an existing and
well-accepted multi-dimensional partitioning algorithm called
Mondrian [12], for sharing welfare data while preserving
the privacy of individuals and maintaining data utility.
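To make the partitioning idea concrete, the following is a minimal single-machine sketch, in Python, of greedy median partitioning in the style of Mondrian [12]. It is not the distributed MapReduce solution presented in this paper; it assumes numeric quasi-identifiers, chooses the split dimension by raw (unnormalized) value range, and uses hypothetical names and sample records.

```python
def mondrian(records, qi, k):
    """Recursively split records on numeric quasi-identifiers.

    Returns a list of partitions, each holding at least k records
    (provided the input itself holds at least k records).
    """
    def span(dim):
        values = [r[dim] for r in records]
        return max(values) - min(values)

    # Try dimensions from the widest to the narrowest value range.
    for dim in sorted(qi, key=span, reverse=True):
        values = sorted(r[dim] for r in records)
        median = values[len(values) // 2]
        left = [r for r in records if r[dim] < median]
        right = [r for r in records if r[dim] >= median]
        # A cut is allowable only if both sides keep at least k records.
        if len(left) >= k and len(right) >= k:
            return mondrian(left, qi, k) + mondrian(right, qi, k)
    return [records]  # no allowable cut remains


def generalize(partition, qi):
    """Replace each quasi-identifier value by the partition's (min, max) range."""
    ranges = {d: (min(r[d] for r in partition), max(r[d] for r in partition))
              for d in qi}
    return [{**r, **ranges} for r in partition]


# Hypothetical records (illustrative only).
records = [{"age": a, "zip": z} for a, z in [
    (65, 4010), (67, 4012), (70, 4015), (72, 4020),
    (80, 4100), (85, 4105), (88, 4110), (90, 4120),
]]

partitions = mondrian(records, ["age", "zip"], k=2)
released = [row for p in partitions for row in generalize(p, ["age", "zip"])]
```

Each resulting partition contains at least k records, and every record in a partition is released with the same generalized ranges, so the records within a partition are indistinguishable on the quasi-identifiers.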
Organization: The rest of the paper is structured as follows.
Section II gives an overview of the Safer@Home intercloud.
Section III provides a background on the different methods and
technologies used in developing our solution. The overview of
our Distributed Multidimensional Anonymization solution is
given in section IV, with its detailed design presented in
section V. Section VI evaluates the solution and the related
work is in section VII. The conclusion is in section VIII.
II. SAFER@HOME INTERCLOUD
The Safer@Home welfare intercloud [6] is a smart system
that supports integrated and assured AIP services for the elderly in
a smart home environment, based on recent advances in data-intensive
analysis, wireless communications, machine-to-machine
(M2M) service architecture, security and reliability,
and available broadband in a Fiber-To-The-Home (FTTH)
setting. The system extends and strengthens social networks of
healthcare services by integrating Internet of Things (IoT) in a
smart home with off-site professional service providers.
Supported by a big data analytics engine (a key factor behind the
recent revolution in big data processing that enables large-scale
online social networking), the platform supports intelligent
and scalable ICT-assisted decision-making, integrates and
assures different AIP services, such as: social interaction (via
e.g. video, forum) to prevent social isolation and loneliness,
monitoring services enabling prevention, safety services
reducing anxiety and fear, overall disease management and
2014 IEEE International Conference on Cloud Engineering
978-1-4799-3766-0/14 $31.00 © 2014 IEEE
DOI 10.1109/IC2E.2014.43