Int. J. Knowledge Engineering and Data Mining, Vol. 4, No. 2, 2017
Copyright © 2017 Inderscience Enterprises Ltd.
Detecting crime patterns from Swahili newspapers
using text mining
George Matto* and Joseph Mwangoka
School of Computational
and Communication Science and Engineering,
The Nelson Mandela African Institution of Science and Technology,
P.O. Box 447, Arusha, Tanzania
Email: mattog@nm-aist.ac.tz
Email: josephwam@gmail.com
*Corresponding author
Abstract: The Tanzania Police Force, like many other law enforcement agencies
in developing countries, relies mostly on manual methods, personal judgment, and
other inadequate tools to analyse the data in its crime databases. This approach
is prone to errors. Moreover, research shows that more than half
of all crimes committed in Tanzania are not reported to the police and are
therefore unlikely to be analysed by them. In this study, we use text mining
to extract crime patterns from sources of crime data outside police databases,
specifically four Swahili daily newspapers. Using a pattern mining model that
we developed, we extracted several crimes reported in the
newspapers, mapped the distribution of the mined crimes country-wide, and
used FP-growth to generate association rules between the mined
crimes. The results of this study will contribute to crime detection and
prevention strategies.
Keywords: crime; crime patterns; text mining; association rules; FP-growth.
Reference to this paper should be made as follows: Matto, G. and
Mwangoka, J. (2017) ‘Detecting crime patterns from Swahili newspapers using
text mining’, Int. J. Knowledge Engineering and Data Mining, Vol. 4, No. 2,
pp.145–156.
Biographical notes: George Matto is a PhD scholar at the School of
Computational and Communication Sciences and Engineering at the Nelson
Mandela African Institution of Science and Technology. He holds an MSc in Computer
Science and a BSc (Hons) in Computer Science. His areas of research include
database management systems, data mining, text mining, pattern recognition
and big data.
Joseph Mwangoka is a Senior Lecturer at the Nelson Mandela African
Institution of Science and Technology, Arusha, Tanzania. He received his PhD
degree from Tsinghua University, Beijing, China, in 2009. Until 2012, he
was a Senior Research Engineer at the Institute of Telecommunications,
Aveiro, Portugal. His research interests include data science, cognitive radio
technology, dynamic spectrum management, ICT4D/E, health informatics, and
cloud computing. He has co-authored a number of peer-reviewed book
chapters, journal articles and conference proceedings.