INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 5, ISSUE 12, DECEMBER 2016 ISSN 2277-8616
IJSTR©2016
www.ijstr.org
A Methodology In Processing Descriptive
Analytics Using MMDA Traffic Update Tweets,
Tokenization And Classification Tree In
Discovering Knowledge
Tristan Jay P. Calaguas, Menchita F. Dumlao
Abstract: Traffic in the National Capital Region (NCR) of the Philippines is one of the many problems facing the local government and the Filipino citizens
who reside in Metro Manila. A Filipino citizen working in Metro Manila wastes twenty-eight thousand hours in traffic, which results in lost productivity.
Long commutes caused by traffic also take individuals away from exercise, which results in fatigue and harms their health. Relatedly, owing to the lack of
exercise caused by traffic, one hundred seventy thousand Filipinos die each year from cardiovascular diseases, up from eighty-five thousand more than
twenty years ago, according to a 2009 study by the Department of Health (DOH).
Population increase is one of the many causes of traffic in Metro Manila. As the population grows, the volume of car riders and commuters on the road
increases, along with delivery trucks, pedicabs, jeeps, and provincial buses; this signals a high employment rate in the country, which in turn causes traffic.
To serve the public's needs, the Metropolitan Manila Development Authority (MMDA) is the government agency that provides Filipino citizens with updated
public traffic information. In past years, MMDA used telephone lines and television broadcasting to disseminate traffic information, which was very costly
to maintain and led the agency to adopt Twitter for posting traffic updates and advisories to the public. Since the agency disseminates information by
posting tweets, there is a need for a methodology for analyzing these tweets so that citizens can gain insight for deciding how to avoid specific times of
traffic in Metro Manila. Accordingly, the researchers adopt MMDA tweets as the primary data source and apply CRISP as the standard knowledge
discovery process for building a methodology for descriptive analytics. In this experimental research, several processes were used to convert the
semi-structured MMDA tweets into a structured data matrix. SQL was used for storing, retrieving, and pattern matching, while PHP string functions were
used to tokenize each tweet and transform it into an array so that the tokens could be stored in the database using an iterative structure. After loading
every token into its specific table, we obtained a data matrix comprising time, routed road, traffic status, and day information, which was used in data
mining to discover knowledge. Lastly, we used the J48 classification algorithm to classify the times at which traffic usually occurs on the many routed
roads of the NCR. As a result, we discovered that commuters experience traffic from 8:00 to 9:41 in the morning and from 1:00 in the afternoon to 8:00
in the evening on C5 northbound to southbound and EDSA northbound to southbound every Tuesday and Friday, with an accuracy of 75.72%.
Index Terms: tokenization, classification, tweets, traffic, methodology, knowledge discovery, update traffic
————————————————————
1 INTRODUCTION
Traffic in the National Capital Region (NCR) of the Philippines is
one of the many problems facing the local government and the
Filipino citizens who reside in Metro Manila. A Filipino citizen
working in Metro Manila wastes twenty-eight thousand hours in
traffic, which results in lost productivity [1]. Long commutes
caused by traffic [2] can take time away from exercise, which
results in fatigue for the individual [3]. Relatedly, owing to the
lack of exercise caused by traffic, one hundred seventy thousand
Filipinos die each year from cardiovascular diseases, up from
eighty-five thousand more than twenty years ago, according to a
2009 study by the Department of Health (DOH) [4].
Population increase is one of the many causes of traffic in Metro
Manila. As the population grows, the volume of car riders and
commuters on the road increases, along with delivery trucks,
pedicabs, jeeps, and provincial buses; this signals a high
employment rate in the country, which in turn causes traffic [5].
To serve the public's needs, the Metropolitan Manila Development
Authority (MMDA) is the government agency that provides Filipino
citizens with updated public traffic information. In past years,
MMDA used telephone lines and television broadcasting to
disseminate traffic information, which was very costly to
maintain and led the agency to adopt Twitter for posting traffic
updates and advisories to the public [6]. Since the agency
disseminates information by posting tweets, there is a need for a
methodology for analyzing these tweets so that citizens can gain
insight [7] for deciding how to avoid specific times of traffic in
Metro Manila. Accordingly, the researchers adopt MMDA tweets as
the primary data source and apply CRISP as the standard
knowledge discovery process for building a methodology for
descriptive analytics.
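As a rough illustration of the kind of structured record this methodology
aims for, the sketch below turns a single tweet into a row of time, road,
traffic status, and day. It is only a sketch under stated assumptions: the
abstract reports that PHP string functions and SQL were used, whereas
Python stands in here, and the tweet layout, the pattern, and the names
TWEET_PATTERN and tokenize_tweet are hypothetical and not taken from the
study.

```python
import re
from datetime import date

# Day names are spelled out so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical tweet layout, assumed only for this sketch:
#   "<road> <NB|SB>: <status> as of <h:mm AM/PM>"
TWEET_PATTERN = re.compile(
    r"^(?P<road>.+?)\s+(?P<direction>NB|SB):\s*"
    r"(?P<status>LIGHT|MODERATE|HEAVY)\s+as of\s+"
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2})\s*(?P<meridiem>AM|PM)$",
    re.IGNORECASE,
)


def tokenize_tweet(text, posted_on):
    """Return one data-matrix row for a tweet, or None if it does not match."""
    match = TWEET_PATTERN.match(text.strip())
    if match is None:
        return None
    # Convert the 12-hour clock to 24-hour form (12 AM -> 0, 12 PM -> 12).
    hour = int(match.group("hour")) % 12
    if match.group("meridiem").upper() == "PM":
        hour += 12
    return {
        "time": f"{hour:02d}:{match.group('minute')}",
        "road": f"{match.group('road').strip()} {match.group('direction').upper()}",
        "status": match.group("status").upper(),
        "day": DAY_NAMES[posted_on.weekday()],
    }


# Made-up example tweets; they are not drawn from the MMDA data set.
rows = [
    tokenize_tweet("C5 NB: HEAVY as of 8:15 AM", date(2016, 12, 6)),
    tokenize_tweet("EDSA SB: moderate as of 1:30 PM", date(2016, 12, 9)),
]
```

Each resulting row has the four attributes named in the abstract, so a
list of such rows could serve as input to a classifier. Tweets that do
not fit the assumed layout return None and can be filtered out before
loading.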
2 RELATED WORKS
An organization can gain insight from a huge amount of
information by performing the four tasks of the knowledge
discovery process: data collection, data cleaning, data
analysis, and data application [8]. Traditional methods offer
many ways to collect data for decision making, such as surveys,
observation, and interviews, but the problem with this
traditional approach
___________________________
Tristan Jay P. Calaguas is currently pursuing a doctorate
degree program in Information Technology at AMA
University, Philippines, and is currently an Information
Technology faculty member at The Philippines Women's
University. E-mail: calaguas26@yahoo.com
Dr. Menchita F. Dumlao is currently the Information
Technology Chairwoman and Research Director at
The Philippines Women's University, Philippines