Research Article
Quasi-Identifier Recognition Algorithm for Privacy Preservation of Cloud Data Based on Risk Reidentification
Huda O. Mansour,1,2 Maheyzah M. Siraj,2 Fuad A. Ghaleb,1 Faisal Saeed,3 Eman H. Alkhammash,4 and Mohd A. Maarof1
1 Faculty of Engineering, School of Computing, Universiti Teknologi Malaysia (UTM), Johor 81310, Malaysia
2 Department of Computer Science, Faculty of Computer Science and Information Technology, University of Kassala, Kassala 31111, Sudan
3 College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
4 Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Correspondence should be addressed to Fuad A. Ghaleb; abdulgaleel@utm.my
Received 30 April 2021; Revised 26 June 2021; Accepted 9 August 2021; Published 26 August 2021
Academic Editor: Ihsan Ali
Copyright © 2021 Huda O. Mansour et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Cloud computing plays an essential role as a platform for outsourcing data for mining operations or other data processing,
especially for data owners who lack the resources or experience to execute data mining techniques. However, the
privacy of outsourced data is a serious concern. Most data owners use anonymization-based techniques to prevent
identity and attribute disclosure and thus avoid privacy leakage before outsourcing data for mining over the cloud. In addition,
data collection and dissemination in a resource-limited network such as a sensor cloud require efficient methods to reduce
privacy leakage. The main cause of identity disclosure is quasi-identifier (QID) linking. However, most research on
anonymization methods ignores the identification of proper QIDs. This reduces the validity of the anonymization methods used
and may thus lead to a failure of the anonymization process. This paper introduces a new quasi-identifier recognition algorithm
that reduces the identity disclosure resulting from QID linking. The proposed algorithm comprises two main stages: (1)
attribute classification (or QID recognition) and (2) QID dimension identification. The algorithm is based on the
reidentification risk rate of all attributes and the dimension of the QIDs, from which it determines the proper QIDs and their
suitable dimensions. The proposed algorithm was tested on a real dataset. The results demonstrated that the proposed
algorithm significantly reduces privacy leakage and maintains data utility compared to recent related algorithms.
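The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the risk measure (the mean of 1/equivalence-class size, a common uniqueness-based estimate), the threshold rule, and the function names `reidentification_risk`, `classify_attributes`, and `risk_by_dimension` are all assumptions made for illustration, as is the toy dataset.

```python
from collections import Counter
from itertools import combinations


def reidentification_risk(records, attrs):
    """Assumed risk measure: mean of 1/|equivalence class| over all records.

    Records sharing the same values on `attrs` form one equivalence class;
    a record that is unique on `attrs` contributes a risk of 1.
    """
    keys = [tuple(r[a] for a in attrs) for r in records]
    class_sizes = Counter(keys)
    return sum(1.0 / class_sizes[k] for k in keys) / len(records)


def classify_attributes(records, attributes, threshold):
    """Stage 1 (assumed rule): keep attributes whose own risk reaches the threshold."""
    return [a for a in attributes
            if reidentification_risk(records, [a]) >= threshold]


def risk_by_dimension(records, attributes, max_dim):
    """Stage 2 (assumed rule): for each dimension 1..max_dim, report the
    attribute combination with the highest risk and that risk value."""
    result = {}
    for dim in range(1, max_dim + 1):
        best = max(combinations(attributes, dim),
                   key=lambda c: reidentification_risk(records, c))
        result[dim] = (best, reidentification_risk(records, best))
    return result


# Hypothetical toy table; "disease" plays the role of the sensitive attribute
# and is therefore left out of the candidate list.
records = [
    {"age": 34, "zip": "81310", "sex": "F", "disease": "flu"},
    {"age": 34, "zip": "81310", "sex": "M", "disease": "flu"},
    {"age": 51, "zip": "31111", "sex": "F", "disease": "asthma"},
    {"age": 51, "zip": "21944", "sex": "F", "disease": "flu"},
]
candidates = ["age", "zip", "sex"]

print(classify_attributes(records, candidates, threshold=0.7))
print(risk_by_dimension(records, candidates, max_dim=2))
```

In this toy table, no single attribute identifies every record, but the pair (`zip`, `sex`) does. This is why a QID recognition step plausibly needs to examine combinations of attributes (the QID dimension) and not only attributes in isolation.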
1. Introduction
In the modern information age, many companies use
external providers to process or store their data or to obtain
services such as data mining. Unlimited computational
resources, reduced costs, freedom from maintenance burdens,
and no need to acquire proficiency in particular services
have all encouraged this shift. However, security and
privacy concerns still hinder the use of the features offered
by the cloud [1]. Numerous studies have shown that attackers
often expose information held by third-party services or
third-party clouds [2]. For example, Dropbox suffered a
security breach in October 2014. The attackers stole 700 user
passwords to obtain the cash value of Bitcoins (BTC). In 2015,
the information of more than 4 million users, including names,
dates of birth, addresses, e-mail addresses, phone numbers,
and other sensitive data, was leaked through the TalkTalk
service provider in the UK. In 2016, Time Warner, one of the
largest cable television companies in the United States,
announced that about 32 million user passwords and e-mail
addresses had been stolen by an attacker. In 2017, more than
200 million user records containing users' names, phone numbers,
Hindawi
Wireless Communications and Mobile Computing
Volume 2021, Article ID 7154705, 13 pages
https://doi.org/10.1155/2021/7154705