International Journal of Science and Research (IJSR)
ISSN (Online): 2319-7064
Index Copernicus Value (2013): 6.14 | Impact Factor (2015): 6.391
Volume 5 Issue 10, October 2016
www.ijsr.net
Licensed Under Creative Commons Attribution CC BY

Extraction of Aspects from Customer Reviews using Life Long Learning

Mily Lal 1, Akanksha Goel 2

1, 2 D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune 44, India

Abstract: Aspect extraction is a challenging task in opinion mining. This paper proposes a novel statistical approach aimed at substantially improving sentiment analysis of customer reviews. The approach identifies customer opinions within a lifelong learning framework and is implemented with LSA and association rule mining. Experimental results show the effectiveness of the proposed approach.

Keywords: aspect extraction, opinion mining, opinion word, Sentiment analysis, Life Long Learning

1. Introduction

The growth of the Web and its read-write nature have enabled more and more users to interact and share knowledge and information. Mining useful opinions and sentiments from the Web has become more challenging, because it requires a deep understanding of semantic similarity and aspect association in natural language [1]. This paper proposes to extract aspects from customer reviews under the framework of lifelong machine learning.

2. Related Work

There are two main approaches to aspect extraction: supervised and unsupervised. Hu and Liu (2004) [3] first proposed a technique to extract product aspects based on association rule mining. Double propagation (Qiu et al., 2011) [11] further developed the idea. Jin et al. (2009a, 2009b) [13][14] used a lexicalized HMM to extract product aspects and opinion expressions from reviews. Pang, Lee et al.
(2002, 2004) [2][8] applied supervised learning to sentiment analysis to determine whether it could be treated as topic-based categorization with positive and negative topics. Using the Apriori algorithm, Hu and Liu [3][5][25] generated all strong association rules to extract both implicit and explicit opinion features expressed in reviews.

This paper proposes to use the statistical approach of LSA (Latent Semantic Analysis) as the base and to improve its results dramatically through aspect recommendation. The proposed recommendations are based on semantic similarity and on aspect associations [1].

3. The Algorithm

3.1 Extracting Base Aspects

The Stanford POS Tagger is used to extract nouns and noun phrases, as they mostly represent aspects. This works very well on a medium-sized corpus. For large corpora, however, the method may extract many nouns and noun phrases that are not product aspects, and its precision drops sharply.

SentiWordNet is used to determine the polarity of each modifier. SentiWordNet (SWN) is a lexical resource of sentiment information for English terms, introduced in [15] and designed to assist in opinion mining tasks. Each synonym set (synset) in SWN has a positive sentiment score, a negative sentiment score and an objectivity score. The three scores sum to one, so they indicate the relative strength of the positivity, negativity and objectivity of each synset. The drawback of using SWN is that it requires word sense disambiguation to find the correct sense of a word and its associated scores.

Table 1: Pattern used for extracting Aspects

3.2 Finding Aspect Similarity

Latent Semantic Analysis (LSA) is the proposed method for finding aspect similarity. LSA is a mathematical and statistical approach which claims that semantic information can be derived from a word-document co-occurrence matrix, and that words and documents can be represented as points in a (high-dimensional) Euclidean space.
Dimensionality reduction is an essential part of this derivation [20]. Similarity values close to 1 indicate very similar words, while values close to 0 indicate very dissimilar words. Terms whose similarity score exceeds a chosen threshold are returned.

3.3 Associating Aspects

The proposed work is designed specifically to identify aspects that do not occur explicitly in review sentences. In addition, the approach discriminates between opinion words and aspect words, i.e., opinion words can occur only in rule antecedents, while rule consequents must be opinion aspects.

Paper ID: ART20162039
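
The antecedent/consequent constraint of Section 3.3 can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each review sentence has already been reduced to a set of opinion words and a set of explicit aspects, and it counts only single-word rules of the form opinion word -> aspect rather than running a full Apriori search. The function names, the thresholds `min_support` and `min_confidence`, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import product


def mine_rules(sentences, min_support=2, min_confidence=0.6):
    """Mine rules of the form opinion_word -> aspect.

    sentences: list of (opinion_words, aspects) pairs, each a set of strings.
    Antecedents are drawn only from opinion words and consequents only
    from aspects, mirroring the constraint described in Section 3.3.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    word_count = Counter()  # number of sentences containing each opinion word
    pair_count = Counter()  # number of sentences containing (opinion word, aspect)
    for opinion_words, aspects in sentences:
        word_count.update(opinion_words)
        pair_count.update(product(opinion_words, aspects))

    rules = {}
    for (word, aspect), support in pair_count.items():
        confidence = support / word_count[word]
        if support >= min_support and confidence >= min_confidence:
            rules[(word, aspect)] = confidence
    return rules


def infer_implicit_aspects(opinion_words, rules):
    """Return the aspects suggested by the rules for a sentence
    that contains opinion words but no explicit aspect."""
    return {aspect for (word, aspect) in rules if word in opinion_words}


# Toy, hand-made data (hypothetical): (opinion words, explicit aspects).
training = [
    ({"expensive"}, {"price"}),
    ({"expensive"}, {"price"}),
    ({"cheap"}, {"price"}),
    ({"cheap"}, {"price"}),
    ({"heavy"}, {"weight"}),
    ({"heavy"}, {"weight"}),
    ({"heavy", "cheap"}, {"battery"}),
]

rules = mine_rules(training)
# "This phone is too expensive" names no aspect explicitly.
print(infer_implicit_aspects({"expensive"}, rules))
```

On this toy data the rules that pass both thresholds are expensive -> price, cheap -> price and heavy -> weight, so a sentence containing only "expensive" is assigned the implicit aspect "price". Keeping opinion words out of the consequents means a rule can never predict another opinion word, which is the point of the constraint.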