Perception Analysis for University of San Carlos (USC) as an Educational Institution using Web Mining and Multinomial Naïve Bayes Algorithm

Angie M. Ceniza
University of San Carlos, School of Arts and Sciences, Department of Computer and Information Sciences

Christian V. Maderazo
University of San Carlos, School of Arts and Sciences, Department of Computer and Information Sciences

Mary Jane G. Sabellano
University of San Carlos, School of Arts and Sciences, Department of Computer and Information Sciences

Abstract:- People increasingly use blogs, forums, and other forms of web media to express their perceptions, either endorsing or criticizing the performance of an entity or organization. The aim of this research is to determine the public's perception of the University of San Carlos (USC) as an academic institution, specifically by classifying that perception as positive, negative, or neutral in polarity through Sentiment Analysis. In this paper we explore the use of Web Mining as the process of harvesting the hyperlink structure, and of the Multinomial Naïve Bayes Algorithm to evaluate web content as expressing positive, negative, or neutral perceptions. The experimental results show that the approach used in the research achieves 52.92% precision and 92.88% recall.

Keywords: Perceptions, Sentiment Analysis, Web Mining, Multinomial Naïve Bayes Algorithm

1. INTRODUCTION

Perception analysis is key to knowing one's reputation. For an educational institution, it is an advantage to know its reputation through public sentiment. According to Pang and Lee [31], "What other people think" is an important piece of information during the decision-making process. With this information, administrators will be able to evaluate the performance of the university as a provider of quality education. Sentiment analysis is the computational treatment of people's opinions, attitudes, and emotions towards an entity [25]. It helps individuals and organizations make decisions [31].
For gathering sentiments, web mining is one of the most suitable techniques, since most people nowadays use different forms of web media to express their ideas. Researchers use this technique to gather the information they need. Web data mining [23] aims to discover useful information or knowledge from the Web hyperlink structure, page content, and usage data. The Web mining process is similar to the data mining process; the difference is usually in the data collection. In traditional data mining, the data is often already collected and stored in a data warehouse. For Web mining, data collection can be a substantial task, especially for Web structure and content mining, which involves crawling a large number of target Web pages. The researchers gather all information on the Web related to the University of San Carlos – Cebu. The collected text, represented as a "bag-of-words", is then analyzed and evaluated by a Multinomial Naïve Bayes classifier [35]. Medhat, Hassan and Korashy [25] identified the algorithm as one of the most frequently used for solving sentiment classification problems. Bhadane, Dalal and Doshi [1] likewise considered it a commonly used algorithm. For this reason, the researchers decided to apply this algorithm to identify the public's perceptions of the University of San Carlos as an academic institution.

2. REVIEW OF RELATED LITERATURE

Several lines of related work are reviewed in this section.

2.1 Perception Analysis

Most users publish personal messages about people, products, events, and interests. These opinionated messages are available for study. The classification of users' sentiments has been recognized as a significant research area. Sentiment Analysis has emerged as a rapidly expanding field of application and research in the area of information retrieval [6].
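To make the classification step described in the Introduction concrete, the following is a minimal sketch of a bag-of-words Multinomial Naïve Bayes classifier with three polarity classes. It is not the implementation used in this study: the whitespace tokenizer, the Laplace smoothing constant `alpha=1.0`, the class name `MultinomialNB`, and the six training sentences are all assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would also
    # strip punctuation, markup, and possibly stop words.
    return text.lower().split()


class MultinomialNB:
    """Illustrative Multinomial Naive Bayes over bag-of-words counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace (add-alpha) smoothing constant

    def fit(self, docs, labels):
        self.class_doc_counts = Counter(labels)   # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, doc):
        # log P(class) + sum over tokens of log P(token | class)
        vocab_size = len(self.vocab)
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        scores = {}
        for label, n_docs in self.class_doc_counts.items():
            score = math.log(n_docs / self.total_docs)
            total = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the study's corpus.
train_docs = [
    "excellent teachers and great campus",
    "great facilities and helpful staff",
    "terrible enrollment process and long lines",
    "poor service and terrible facilities",
    "enrollment starts in june",
    "the campus is in cebu",
]
train_labels = [
    "positive", "positive",
    "negative", "negative",
    "neutral", "neutral",
]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("great teachers and helpful staff"))
print(model.predict("terrible service and long lines"))
```

Scores are accumulated in log space to avoid numerical underflow on longer documents, and words unseen during training are skipped rather than penalized; both are common choices rather than details reported in this paper.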
In their published work, Pang and Lee [29] conducted a study of sentiment analysis that seeks to identify the viewpoint(s) underlying a text span. Pathak, Mane, Srivastava and Contractor [31] studied the perception of knowledge in an organization through an email network. Iskender and Bati [13] conducted Sentiment Analysis in the context of

International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181 Published by, www.ijert.org ICIDB - 2015 Conference Proceedings Volume 4, Issue 01 Special Issue - 2016