Support Vector Machine Classifier for Sentiment Analysis of Feedback Marketplace with a Comparison Features at Aspect Level

Hario Laskito Ardi, Department of Information System, Diponegoro University, Semarang, Indonesia, harios1si@gmail.com
Eko Sediyono, Department of Information System, Diponegoro University, Semarang, Indonesia, ekosed1@yahoo.com
Retno Kusumaningrum, Department of Informatics, Diponegoro University, Semarang, Indonesia, retno@if.undip.ac.id

Abstract - Sentiment analysis is an interdisciplinary field spanning natural language processing, artificial intelligence, and text mining. Its central concern is polarity, that is, whether the sentiment expressed is positive or negative (Chen, 2012). This study applies a Support Vector Machine (SVM) classifier to 648 consumer reviews collected from a marketplace selling mobile phones. Three aspects emerge as carrying sentiment in marketplace feedback: service, delivery, and product. A slang dictionary of 552 words is used in the normalization process. Because classification accuracy is influenced by the feature extraction process, the study compares feature extraction methods to obtain the best classification result. In the comparison between n-gram and TF-IDF features with the SVM classifier, unigram features achieve the highest accuracy, 80.87%. The results indicate that, for aspect-level sentiment analysis, the combination of unigram features and an SVM classifier is the best of the models compared.

Keywords - Sentiment Analysis, Feature Extraction, N-Gram, TF-IDF, Support Vector Machine, Marketplace, Aspect.

I. INTRODUCTION

Sentiment analysis is a field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions as expressed in written language. Sentiment analysis techniques can support decisions in many scenarios. This research uses two class labels, positive and negative, because comments that appear on the internet can be positive or negative. Consumer interaction is considered a valuable source of information because people share and discuss their opinions about a particular topic freely.

A classification method that is now widely developed and applied is the Support Vector Machine (SVM). This method is rooted in statistical learning theory and has shown very promising results compared with other methods. Many researchers have reported that SVM is probably the most accurate method for text classification. It is also widely used in sentiment analysis.

The aspect level is also called the feature level [1]. Beyond looking at language constructs, the aspect level looks directly at the opinion itself. It is based on the idea that an opinion consists of a sentiment (positive or negative) and a target (of the opinion). Identifying the opinion target therefore makes the problems in sentiment analysis easier to understand.

For large data sets, SVM requires a very large amount of memory to allocate the kernel matrix. SVM training methods proposed for this large-memory setting include chunking and decomposition [2].

In sentiment analysis, a commonly used feature is the n-gram. In some of the literature, an n-gram is also interpreted as a sequence of characters cut from a word, from which a new word or meaning can emerge. Text feature extraction is an important step in text classification. Feature extraction determines which features are used by the classification technique and which are ignored. A large number of features results in a high-dimensional word vector.
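The n-gram features described above can be illustrated with a minimal sketch. It is not the feature extraction pipeline used in this study: the whitespace tokenizer, the helper names (`tokenize`, `word_ngrams`, `char_ngrams`, `vocabulary`, `vectorize`), and the two sample reviews are assumptions made purely for illustration. The sketch covers both readings of "n-gram" (word sequences and character sequences cut from a word) and shows how the vocabulary, and therefore the dimensionality of the word vector, grows as n-gram features are added.

```python
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def word_ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous run of n tokens (empty if too few tokens)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(word: str, n: int) -> List[str]:
    """Return every contiguous run of n characters cut from a single word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def vocabulary(docs: List[str], n: int) -> List[Tuple[str, ...]]:
    """Collect the sorted set of word n-grams seen across all documents."""
    vocab = set()
    for doc in docs:
        vocab.update(word_ngrams(tokenize(doc), n))
    return sorted(vocab)


def vectorize(doc: str, vocab: List[Tuple[str, ...]], n: int) -> List[int]:
    """Map a document to n-gram counts, one dimension per vocabulary entry."""
    counts = Counter(word_ngrams(tokenize(doc), n))
    return [counts[gram] for gram in vocab]


# Hypothetical reviews, invented for illustration; not data from the study.
reviews = [
    "fast delivery good product",
    "slow delivery bad service",
]

unigram_vocab = vocabulary(reviews, 1)
bigram_vocab = vocabulary(reviews, 2)

print("unigram dimensions:", len(unigram_vocab))
print("bigram dimensions:", len(bigram_vocab))
print("unigram vector of first review:", vectorize(reviews[0], unigram_vocab, 1))
print("character trigrams of 'phone':", char_ngrams("phone", 3))
```

On these two toy reviews the unigram vocabulary has 7 entries and the bigram vocabulary has 6, so using both together nearly doubles the vector length. This is the dimensionality growth noted above, here on a very small scale.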
The addition of relevant n-gram features can improve text classification performance. Besides n-gram feature extraction, feature weighting, which weights each feature according to its significance, is a step that can be explored to improve classification performance. The commonly used weighting scheme, Term Frequency - Inverse Document Frequency (TF-IDF), considers only how often a feature appears in a document and the number of documents containing that feature. SVM also works well on high-dimensional data sets, even though SVMs that use kernel techniques must map the data from its original dimensions to a relatively higher-dimensional space. The kernel technique makes it possible to define nonlinear decision boundaries,